Some notes about how different browsers implement their disk cache.

Konqueror:

{{{
kdelibs/kioslave/http/http_cache_cleaner.cpp
kdelibs/kioslave/http/http.cpp
}}}

The cache is stored in "~/.kde/cache-katherline/http/[0-9a-z]/". The top-level directories are named after the first letter of the host (judging by the example below, with a leading "www." dropped). This is a historical holdover from when processing a lot of files in one directory was harder.

Each file is stored by itself, so the file http://www.reddit.com/static/aupmod.png goes in:

r/www.reddit.com_static_aupmod.png_2cd5ba49

Each file is named: "host"_"file"_"fullUrlHash"

Each file contains the following (minus the text before each colon, so the first line is just '7'):

{{{
Version: 7
url: http://www.reddit.com/static/aupmod.png
Creation date: 1213180806
Expire date: 1213765446
ETag: 1207334405.0-334
Last Modified: Fri, 04 Apr 2008 18:40:05 GMT
File:
}}}

In KDE 4 the file is gzip'd for 90% savings.

Also, r/www.reddit.com_static_aupmod.png_2cd5ba49_freq contains the number of times the URL has been requested. It is updated in a fashion that is not lock safe and has a bunch of problems. Not something I would recommend copying.

A separate application goes through and cleans the cache, removing the oldest entries first. A very simple system.

Firefox:

{{{
mozilla/netwerk/cache/src/*
}}}

A detailed article on extracting information from the Firefox cache, and a good overview:
http://www.securityfocus.com/print/infocus/1832

Chrome:

http://sites.google.com/a/chromium.org/dev/developers/design-documents/disk-cache
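
Going back to the Konqueror layout described above, here is a rough Python sketch of the naming scheme and the header format, just to make the notes concrete. It is not taken from the kdelibs sources. The function and class names (cache_entry_name, CacheHeader, parse_header, is_expired) are made up for illustration, CRC32 is only a stand-in for whatever hash Konqueror really uses, and it assumes that each header field sits on its own line and that a leading "www." is skipped when picking the directory letter.

```python
import zlib
from dataclasses import dataclass
from urllib.parse import urlsplit


def cache_entry_name(url: str) -> str:
    """Build "<letter>/<host>_<file>_<hash>" for a URL, following the notes."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    # Assumption: "www." is ignored for the directory letter, which is what
    # the example in the notes (www.reddit.com -> "r/") suggests.
    bare = host[4:] if host.startswith("www.") else host
    directory = bare[:1] or "0"
    # Path separators become underscores: /static/aupmod.png -> _static_aupmod.png
    flat_path = parts.path.replace("/", "_")
    # Stand-in hash only: CRC32 is NOT Konqueror's real URL hash.
    url_hash = format(zlib.crc32(url.encode("utf-8")), "08x")
    return f"{directory}/{host}{flat_path}_{url_hash}"


@dataclass
class CacheHeader:
    version: int
    url: str
    creation_date: int  # seconds since the Unix epoch
    expire_date: int    # seconds since the Unix epoch
    etag: str
    last_modified: str


def parse_header(lines) -> CacheHeader:
    """Parse the six header lines, in the order listed in the notes."""
    version, url, created, expires, etag, modified = (
        line.rstrip("\n") for line in lines[:6]
    )
    return CacheHeader(int(version), url, int(created), int(expires), etag, modified)


def is_expired(header: CacheHeader, now: int) -> bool:
    """An entry is stale once the current time reaches its expire date."""
    return now >= header.expire_date


if __name__ == "__main__":
    # The hash part will differ from the 2cd5ba49 shown in the notes.
    print(cache_entry_name("http://www.reddit.com/static/aupmod.png"))
```

Parsing the header by position rather than by label matches the note that the labels are not actually stored in the file. The downside is that any change in field order needs a version bump, which is presumably what the leading version number is for.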