Accessing content from Google Chrome cache

If you want to know how Google Chrome stores files in its cache on your computer, the code is open source and the design is documented here: http://www.chromium.org/developers/design-documents/network-stack/disk-cache

Otherwise, you can use NirSoft’s ChromeCacheView to access it.

Chrome Cache Default Folder Location

First, if you want to know the exact location of the Chrome cache folder, it is inside your local application data folder:

[Local Application Data]\Google\Chrome\User Data\Default\Cache

In Windows Vista:
C:\Users\[USERNAME]\AppData\Local\Google\Chrome\User Data\Default\Cache

In Windows XP:
C:\Documents and Settings\[USERNAME]\Local Settings\Application Data\Google\Chrome\User Data\Default\Cache
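
If you would rather build that path in a script than type it out, here is a minimal Python sketch. The function name chrome_cache_dir is my own, and it assumes the LOCALAPPDATA environment variable is set, which is the case on Vista and later. On XP that variable is normally missing, so pass the folder in by hand.

```python
import os
from pathlib import PureWindowsPath


def chrome_cache_dir(local_app_data=None):
    """Build the default Chrome cache path under a local application data folder.

    local_app_data: e.g. r"C:\\Users\\[USERNAME]\\AppData\\Local". If omitted,
    the LOCALAPPDATA environment variable is used (an assumption; it is
    usually not defined on Windows XP).
    """
    base = local_app_data or os.environ.get("LOCALAPPDATA")
    if not base:
        raise RuntimeError("No local application data folder given or found")
    # PureWindowsPath only joins path parts; it never touches the disk.
    return PureWindowsPath(base, "Google", "Chrome", "User Data", "Default", "Cache")
```

For the XP layout you would call it with the "Local Settings\Application Data" folder of your user profile as the argument.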

Index and Data Files

In there you will find an index file and 4 data files. The index file catalogs the cached files. The data files store the actual binary content. If a cached file is larger than the entry size limit of every data file, it is stored as a standalone, separate file, similar to how Internet Explorer does it. Separate cache files are named like “f_00000xx”, with the xx representing an arbitrary number.
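
To get a quick overview of what is in the folder, you could sort the files into those three groups with a few lines of Python. This is only a sketch: the function name is made up, and the name patterns are assumptions based on what I see in my own cache folder (data files called data_0, data_1 and so on, separate files starting with f_ followed by hexadecimal digits).

```python
import re
from pathlib import Path

# Assumed name patterns, not taken from any official specification.
DATA_FILE = re.compile(r"^data_\d+$")
SEPARATE_FILE = re.compile(r"^f_[0-9a-fA-F]+$")


def classify_cache_files(cache_dir):
    """Group the files in a cache folder as index, data, separate or other.

    Returns a dict mapping each group name to a list of (name, size) tuples.
    """
    groups = {"index": [], "data": [], "separate": [], "other": []}
    for path in sorted(Path(cache_dir).iterdir()):
        if not path.is_file():
            continue
        if path.name == "index":
            kind = "index"
        elif DATA_FILE.match(path.name):
            kind = "data"
        elif SEPARATE_FILE.match(path.name):
            kind = "separate"
        else:
            kind = "other"
        groups[kind].append((path.name, path.stat().st_size))
    return groups
```

The sizes come along for free, which helps when you start hunting for a particular file among the separate ones.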

Simple ways to determine file types

By file size
If you’re looking for a video file, you can usually assume it’ll be among the biggest files. Images, particularly JPEGs, tend to fall in a similar range of sizes, generally mid-range at about 50KB to 300KB. Of course, I cannot guarantee these numbers; it’s just my observation. The smallest files are usually text-based. Some examples are HTML, PHP, JavaScript and XML files.

By binary header
Using the file size to predict the file type is the first step in locating the file you’re looking for. Of course, it’s only a guess. A more precise method is to look at the file header on a binary level. That can easily be done by opening the cache file in a hex editor and checking the first few bytes: a JPEG starts with FF D8 FF, for example, and a PNG starts with 89 50 4E 47. With a little research and some organization, you will be able to differentiate between the many different file types.
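
The header check is easy to automate. Below is a small Python sketch that compares the first bytes of a file against a handful of well-known signatures. The list is deliberately short and the function names are my own, so treat it as a starting point rather than a complete detector.

```python
# A few well-known file signatures ("magic bytes"); extend as needed.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"%PDF-", "pdf"),
    (b"FLV\x01", "flv"),
    (b"\x1f\x8b", "gzip"),
]


def sniff_type(header):
    """Guess a file type from the first bytes of a file."""
    for magic, kind in SIGNATURES:
        if header.startswith(magic):
            return kind
    # Text-based files have no fixed signature; look for common openers.
    text = header.lstrip().lower()
    if text.startswith((b"<!doctype", b"<html", b"<?xml")):
        return "markup"
    return "unknown"


def sniff_file(path, length=32):
    """Read the first bytes of a file on disk and guess its type."""
    with open(path, "rb") as f:
        return sniff_type(f.read(length))
```

Running sniff_file over each separate “f_” file gives a rough type next to every name, which narrows the search considerably before you open anything in a hex editor.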

Brute Force Extraction Method #1

However, the cache files embedded in the data files aren’t as easy to pinpoint. You’ll have to parse the data files. Each data file has an 8KB header, and the entry blocks follow the header. The entry block size is predefined in the header. The blocks don’t store just the cache file itself, but some other identifying information as well. At times, a block contains only a text string, which is used as a resource key and is nothing more than an identifier. Therefore, you’ll need to parse each block/entry from a data file to get to the actual binary content of a cache file. The complete set of info for one entry might be scattered across multiple data files. Google has implemented it this way to access variables/binary content more efficiently. It works well for performance, but isn’t user-friendly.
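
As a rough illustration of the brute-force idea, the Python sketch below skips the 8KB header, cuts the rest of a data file into fixed-size blocks and scans for a few file signatures. It does not parse the real entry structures. The block size is something you have to supply yourself (it is not read from the header here), and all of the names are hypothetical.

```python
HEADER_SIZE = 8192  # the 8KB header at the start of each data file

# A few file signatures to look for; extend as needed.
MAGIC = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF8": "gif",
}


def iter_blocks(data, block_size, header_size=HEADER_SIZE):
    """Yield (offset, block) pairs for everything after the header.

    block_size is supplied by the caller; this sketch does not decode it
    from the header.
    """
    for start in range(header_size, len(data), block_size):
        yield start, data[start:start + block_size]


def find_signatures(data, header_size=HEADER_SIZE):
    """Return sorted (offset, kind) pairs for signatures found after the header."""
    hits = []
    for magic, kind in MAGIC.items():
        pos = data.find(magic, header_size)
        while pos != -1:
            hits.append((pos, kind))
            pos = data.find(magic, pos + 1)
    return sorted(hits)
```

You would read a data file with open(path, "rb").read() and pass the bytes in. A hit only tells you where something that looks like an image begins; since an entry may be spread over several blocks or files, you still have to work out where it ends.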

Brute Force Extraction Method #2

If you just want to see a list of all the cached files, you could use Chrome’s built-in “about” URI. Type about:cache into the address bar and you’ll get a full list of all the indexed cached files. The problem, however, is that it’s not really user-friendly and there’s no direct way to extract the file itself. When you click on a file link, Chrome shows the binary data as a hex dump among other debug information, including the HTTP header. That can be an additional way to make sure the file is what you want. You can right-click on the file link and choose “Save Link As”, but that saves the debug page, not the actual file.

You could always brute-force copy the binary data. That wouldn’t be any different from manipulating the cache files with a hex editor. For example, if there was an image you wanted to save from the cache, click on the link and then copy each hexadecimal byte exactly into a new blank file in the hex editor. Of course, you need to know what you’re doing; otherwise you could seriously damage your cache files or other system files.
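
Copying bytes by hand gets old quickly, so here is a Python sketch that turns a pasted hex dump back into binary. It assumes each dump line looks like an offset, a colon, up to 16 two-digit hex bytes and then a text column. That layout is my assumption about the debug page, so check it against what your version of Chrome actually shows. The function name is made up.

```python
import re

HEX_BYTE = re.compile(r"^[0-9a-fA-F]{2}$")
HEX_OFFSET = re.compile(r"^[0-9a-fA-F]+$")


def hexdump_to_bytes(text, width=16):
    """Rebuild binary data from hex dump lines such as
    '00000000: 89 50 4e 47 0d 0a 1a 0a  .PNG....'.

    Lines without a hexadecimal offset followed by ':' are ignored.
    """
    out = bytearray()
    for line in text.splitlines():
        offset, sep, rest = line.partition(":")
        if not sep or not HEX_OFFSET.match(offset.strip()):
            continue
        count = 0
        for token in rest.split():
            # Stop at the text column or once a full row has been read.
            if count == width or not HEX_BYTE.match(token):
                break
            out.append(int(token, 16))
            count += 1
    return bytes(out)
```

Paste only the hex dump of the file body, not the one for the HTTP header, and write the result out with open("image.png", "wb"). One caveat: on a final line shorter than 16 bytes, a text column that happens to begin with two hex-looking characters would be misread as an extra byte, so compare the result against the expected file size.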
