Channel: gooli.org » Python

Extracting bookmark icons (favicons) from Firefox

Favicons are the little icons that sites provide and that end up in your bookmark list when you bookmark a site. I was wondering where all these icons were stored when I ran across a post explaining that Firefox keeps them as base64-encoded strings inside the bookmarks.html file.
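
To make that concrete: each stored value is a data: URI, that is, a short prefix naming the image type followed by the base64 payload. Here is a minimal Python 3 sketch of splitting and decoding one such value. The payload is made up for illustration (four bytes that look like the start of an .ico file, not a complete icon), and `split_data_uri` is a hypothetical helper name, not part of Firefox or the standard library.

```python
import base64


def split_data_uri(value):
    """Split a base64 data: URI into (mime_type, raw_bytes).

    Hypothetical helper for illustration; it only handles the
    "data:<mime>;base64,<payload>" form.
    """
    prefix, _, payload = value.partition(",")
    if not prefix.startswith("data:") or not prefix.endswith(";base64"):
        raise ValueError("not a base64 data: URI")
    mime_type = prefix[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)


# Made-up payload: the 4 bytes 00 00 01 00, base64-encoded
sample = "data:image/x-icon;base64,AAABAA=="
mime_type, raw = split_data_uri(sample)
print(mime_type, raw)
```

Reading the image type from the prefix, rather than comparing against one fixed string, would also let a script notice icons stored under a type other than image/x-icon. Whether a given bookmarks.html contains any is something to check against your own file.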

I wanted to get access to all those icons and wrote a small Python script to help me do that.

I used Python's built-in base64 module to decode the icon data and the wonderful BeautifulSoup library to parse the bookmarks.html file.

The resulting code snippet is quite short:

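# Python 2, BeautifulSoup 3
import base64
import re
from BeautifulSoup import BeautifulSoup

# Prefix that Firefox puts before the base64 data in each "icon" attribute
HEADER = "data:image/x-icon;base64,"

f = open("bookmarks.html")
page = BeautifulSoup(f)
f.close()
for tag in page.findAll("a"):
    try:
        iconData = tag["icon"]
    except KeyError:
        # This bookmark has no icon
        continue
    # tag.string is None when the link text is empty or contains nested tags
    title = tag.string or "untitled"
    print title
    if iconData.startswith(HEADER):
        iconData = iconData[len(HEADER):]
        iconBinaryData = base64.decodestring(iconData)
        # Replace characters that are unsafe in file names
        iconFilename = re.sub(r"[^a-zA-Z0-9_\-.' ]", "_", title) + ".ico"
        out = open(iconFilename, "wb")
        out.write(iconBinaryData)
        out.close()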

Using BeautifulSoup, I loop over all the <a> tags in the bookmarks.html file. For each tag, I read the “icon” attribute, strip the small header that Firefox puts before the actual icon data, and decode the remaining base64 string with the base64 module. Tags without an “icon” attribute are skipped.
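
For anyone on Python 3, the same loop can be sketched with only the standard library. This is a port under stated assumptions, not the script above: it swaps BeautifulSoup for `html.parser.HTMLParser`, uses `base64.b64decode` (since `base64.decodestring` no longer exists in current Python 3), and assumes bookmarks.html is UTF-8 encoded. The names `IconCollector`, `extract_icons`, `safe_filename` and `save_icons` are made up for this example.

```python
import base64
import re
from html.parser import HTMLParser

HEADER = "data:image/x-icon;base64,"


class IconCollector(HTMLParser):
    """Collect (title, icon_bytes) pairs from <a icon="..."> tags."""

    def __init__(self):
        super().__init__()
        self.icons = []       # finished (title, bytes) pairs
        self._pending = None  # icon bytes of the <a> tag being read
        self._text = []       # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # HTMLParser lowercases tag and attribute names
        icon = dict(attrs).get("icon")
        if icon and icon.startswith(HEADER):
            self._pending = base64.b64decode(icon[len(HEADER):])
            self._text = []

    def handle_data(self, data):
        if self._pending is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._pending is not None:
            title = "".join(self._text).strip() or "untitled"
            self.icons.append((title, self._pending))
            self._pending = None


def safe_filename(title):
    # Same character whitelist as the original script
    return re.sub(r"[^a-zA-Z0-9_\-.' ]", "_", title) + ".ico"


def extract_icons(html_text):
    parser = IconCollector()
    parser.feed(html_text)
    parser.close()
    return parser.icons


def save_icons(bookmarks_path):
    # Assumption: the bookmarks file is UTF-8 encoded
    with open(bookmarks_path, encoding="utf-8") as f:
        icons = extract_icons(f.read())
    for title, data in icons:
        with open(safe_filename(title), "wb") as out:
            out.write(data)
    return len(icons)
```

Calling `save_icons("bookmarks.html")` writes one .ico file per matching bookmark into the current directory and returns how many were written. Bookmarks that share a title will overwrite each other, just as in the original script.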

Maybe somebody will find it useful someday.
