HTML5 introduces the ability to cache content client-side so that often-used resources can be used without re-downloading them. This also enables a site to be viewed from the client when no network connection is available (i.e., offline viewing of the site).
In order for this to work, there are a few things one must do:
- Create a plain text file, the cache manifest, listing all of the resources that should be cached by the user agent (e.g., a web browser).
- Refer to that file in the opening html tag of every page that will use cached resources.
- Configure the web server so that the file is sent to the user agent with a specific MIME type: text/cache-manifest
- Regenerate the cache manifest any time you change the files in your site.
Step 1: Create the Cache Manifest
The cache manifest is a plain text file that lists each resource to cache. This file can be created with any plain text editor. A basic manifest contains a required header line indicating that it’s a cache manifest, followed by one resource per line.
CACHE MANIFEST
index.html
main.css
scripts/main.js
images/logo.png
images/banner.png
For sites using JavaScript libraries, such as jQuery, that can be a long list of files.
For simple manifests, where all of the files in a given subdirectory need to be cached, first change directory (cd) to the root folder of your site (where the index.html file is located). Then, the manifest can be created from the OS X terminal/Unix/Linux command prompt with this one-line command:
echo "CACHE MANIFEST" > cache.manifest; find . -type f | sed "s#^\./##" | grep -vi "ds_store" >> cache.manifest
Taking each part in turn:
echo "CACHE MANIFEST" > cache.manifest;
Output the string literal CACHE MANIFEST and direct the output to a new file named cache.manifest
find . -type f
Use the find command to identify all files (-type f), starting from the current directory (.)
| sed "s#^\./##"
Pipe (that is, “direct the output from the previous command”) the result into the stream editor command: sed. Using sed, replace a literal period (\.) followed by a forward slash (/) at the beginning of a line (^) with the empty string. NB: the empty string is denoted by the lack of content between the last two octothorpes (##); the octothorpes act here as delimiters that separate the command (s), the search pattern (^\./), and the replacement pattern (which is the empty string).
| grep -vi "ds_store"
Pipe the result into grep, whose name comes from the ed command g/re/p (globally search for a regular expression and print the matching lines). Using grep, keep only the lines that do NOT contain (-v) the case-insensitive (-i) string literal “ds_store”. Mac OS X stores metadata about files and folders in a hidden file named .DS_Store. This step removes those files from the manifest list.
>> cache.manifest
and append (>>) the results to the file named cache.manifest
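The same pipeline can be wrapped in a small shell function so it is easy to rerun whenever the site changes. This is only a sketch: the function name make_manifest and the sort step are illustrative choices, not part of the original command. It also differs from the one-liner in one respect, which is an assumption about what you want: it leaves cache.manifest out of its own listing, whereas the one-liner lists it because the file already exists when find runs.

```shell
#!/bin/sh
# make_manifest DIR: write DIR/cache.manifest listing every file under DIR.
# The function name and the sorting step are illustrative, not from the post.
make_manifest() {
  (
    # Work inside a subshell so the caller's directory is left unchanged.
    cd "$1" || exit 1
    {
      echo "CACHE MANIFEST"
      # Find regular files except the manifest itself, strip the
      # leading "./", drop .DS_Store entries, and sort for stable output.
      find . -type f ! -name 'cache.manifest' \
        | sed 's#^\./##' \
        | grep -vi 'ds_store' \
        | sort
    } > cache.manifest
  )
}
```

Running make_manifest . from the root folder of the site rewrites cache.manifest in place.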
Step 2: Refer to the Cache Manifest in your html opening tag
To do this, add a manifest attribute to the opening html tag of each HTML page that should use cached resources. Its value is the path to the manifest file:
<html manifest="cache.manifest">
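On a site with many pages it is easy to miss one. The sketch below lists the .html files that do not appear to carry the attribute. The function name pages_without_manifest is hypothetical, the check is a simple line-based text match (it assumes the html tag and its manifest attribute sit on one line), and it relies on grep -L, which GNU and BSD grep both provide.

```shell
#!/bin/sh
# pages_without_manifest DIR: print each .html file under DIR that has no
# line containing an <html ...> tag with a manifest attribute.
# Hypothetical helper; a line-based match, not a real HTML parser.
pages_without_manifest() {
  # grep -L prints the names of files in which no line matched;
  # -i makes the match case-insensitive (<HTML MANIFEST=...> also counts).
  find "$1" -type f -name '*.html' \
    -exec grep -iL '<html[^>]* manifest=' {} +
}
```

For example, pages_without_manifest . run from the site root prints nothing when every page already refers to the manifest.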
Step 3: Configure the web server so that the file is sent to the user agent with a specific MIME type: text/cache-manifest
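On an Apache server, add this directive to the server configuration or to an .htaccess file so that files ending in .appcache or .manifest are served with the correct type:
AddType text/cache-manifest appcache manifest
To confirm the setting, request the manifest's headers with curl and check the Content-Type line:
curl --head http://yourhost/cache.manifest
HTTP/1.1 200 OK
Date: Sun, 17 Jun 2012 17:06:22 GMT
Server: Apache/2.2.21 (Unix) DAV/2
Last-Modified: Sat, 16 Jun 2012 22:26:36 GMT
ETag: "a48bbe-2a-4c29e6cfd3f00"
Accept-Ranges: bytes
Content-Length: 42
Content-Type: text/cache-manifest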
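The curl check can also be scripted rather than read by eye. The following is a minimal sketch, assuming the response headers are piped in on standard input; the function name is_manifest_type is made up for this example, and yourhost is the same placeholder used in the curl command.

```shell
#!/bin/sh
# is_manifest_type: read HTTP response headers on stdin and succeed only
# if the Content-Type header is text/cache-manifest.
# Hypothetical helper name; header names are matched case-insensitively.
is_manifest_type() {
  # Strip carriage returns (HTTP header lines end in CRLF), then look
  # for the expected Content-Type value.
  tr -d '\r' | grep -iq '^Content-Type:[[:space:]]*text/cache-manifest'
}

# Usage (yourhost is a placeholder):
#   curl -s --head http://yourhost/cache.manifest | is_manifest_type \
#     && echo "MIME type OK"
```

The function only inspects the text it is given, so it can be tried against saved header output before pointing it at a live server.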
Don’t forget that iOS devices don’t like spaces in the file names listed in their cache manifest.
The command below extends what you’ve proposed to replace each space with %20:
echo "CACHE MANIFEST" > cache.manifest; echo "# generated: $(date)" >> cache.manifest; find . -type f | sed "s#^\./##" | sed "s/ /%20/g" | grep -vi "ds_store" >> cache.manifest
Appreciate this is an old post, but it’s also the first hit if you’re searching for how to automatically generate an HTML5 cache manifest (which I was).