To tag songs properly, I need an authoritative source for song titles. The MusicBrainz database is an excellent candidate, but it has three major issues:
- The core of their track identification system is a proprietary online service known as MusicDNS. I can generate audio fingerprints for my files and submit them to get matching GUIDs. The system works well enough, but it is completely closed, and I don't like that.
- MusicBrainz is an album-structured database: it compiles track lists for albums. That's fine if you are using it as a replacement for something like CDDB, but many of the tracks I want to identify have never been on an album.
- MusicBrainz uses PostgreSQL to run their site. DreamHost, my provider, only has MySQL available.
The proprietary nature of the system is something that I'm willing to accept for the time being. The other two issues can be solved by taking the existing database dump and running it through a script to produce a MySQL dump. The data that is important to me:
A Python script, load_mbdump.py, then converts the PostgreSQL tables to MySQL format. Once the tables have been loaded into MySQL, clean_db.php can be used to reformat the entries a bit for use.
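
To give a rough idea of what such a conversion involves, here is a minimal sketch. It is not the actual load_mbdump.py. It assumes the dump files are in PostgreSQL's tab-separated COPY text format, where `\N` marks a NULL, and it turns one dump line into a MySQL INSERT statement. The function names and the `track` table name are hypothetical, chosen only for illustration.

```python
def unescape_copy_field(field):
    """Decode one field of PostgreSQL COPY text format (assumed input format)."""
    if field == r"\N":
        return None  # COPY text format writes NULL as \N
    escapes = {"t": "\t", "n": "\n", "r": "\r", "\\": "\\"}
    out = []
    i = 0
    while i < len(field):
        ch = field[i]
        if ch == "\\" and i + 1 < len(field):
            nxt = field[i + 1]
            # Known escapes are translated; anything else keeps the escaped char.
            out.append(escapes.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def mysql_literal(value):
    """Render a decoded value as a MySQL string literal, or NULL."""
    if value is None:
        return "NULL"
    escaped = (
        value.replace("\\", "\\\\")
        .replace("'", "\\'")
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return "'" + escaped + "'"


def copy_line_to_insert(table, line):
    """Convert one COPY-format line into an INSERT statement for `table`."""
    fields = line.rstrip("\n").split("\t")
    values = ", ".join(mysql_literal(unescape_copy_field(f)) for f in fields)
    return "INSERT INTO `%s` VALUES (%s);" % (table, values)


if __name__ == "__main__":
    # Hypothetical dump line: id, title, and a NULL column.
    print(copy_line_to_insert("track", "1\tIt's a Song\t\\N\n"))
```

Every value is emitted as a quoted string, which keeps the sketch short; MySQL converts quoted numbers to the column type on insert. A real converter would also need to translate the table definitions themselves, which this sketch does not attempt.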