ID3 Specifications

The ID3 site has a good overview of ID3. There is a type of graph that I would like to reproduce and so I'll try:

Well, it's a pretty graph, but really better suited for showing proportions than a technical spec. I guess I'll leave it here for reference…

The fields are null padded to their appropriate length. The genre is stored as a one-byte code. Initially there were 80 possible genres, and this list was later extended by Nullsoft (codes are shown in hexadecimal):

00	Blues
01	Classic Rock
02	Country
03	Dance
04	Disco
05	Funk
06	Grunge
07	Hip-Hop
08	Jazz
09	Metal
0A	New Age
0B	Oldies
0C	Other
0D	Pop
0E	R&B

0F	Rap
10	Reggae
11	Rock
12	Techno
13	Industrial
14	Alternative
15	Ska
16	Death Metal
17	Pranks
18	Soundtrack
19	Euro-Techno
1A	Ambient
1B	Trip-Hop
1C	Vocal
1D	Jazz+Funk

1E	Fusion
1F	Trance
20	Classical
21	Instrumental
22	Acid
23	House
24	Game
25	Sound Clip
26	Gospel
27	Noise
28	Alternative Rock
29	Bass
2A	Soul
2B	Punk
2C	Space

2D	Meditative
2E	Instrumental Pop
2F	Instrumental Rock
30	Ethnic
31	Gothic
32	Darkwave
33	Techno-Industrial
34	Electronic
35	Pop-Folk
36	Eurodance
37	Dream
38	Southern Rock
39	Comedy
3A	Cult
3B	Gangsta

3C	Top 40
3D	Christian Rap
3E	Pop/Funk
3F	Jungle
40	Native US
41	Cabaret
42	New Wave
43	Psychedelic
44	Rave
45	Showtunes
46	Trailer
47	Lo-Fi
48	Tribal
49	Acid Punk
4A	Acid Jazz

4B	Polka
4C	Retro
4D	Musical
4E	Rock & Roll
4F	Hard Rock
50	Folk
51	Folk-Rock
52	National Folk
53	Swing
54	Fast Fusion
55	Bebop
56	Latin
57	Revival
58	Celtic
59	Bluegrass

5A	Avantgarde
5B	Gothic Rock
5C	Progressive Rock
5D	Psychedelic Rock
5E	Symphonic Rock
5F	Slow Rock
60	Big Band
61	Chorus
62	Easy Listening
63	Acoustic
64	Humour
65	Speech
66	Chanson
67	Opera
68	Chamber Music

69	Sonata
6A	Symphony
6B	Booty Bass
6C	Primus
6D	Porn Groove
6E	Satire
6F	Slow Jam
70	Club
71	Tango
72	Samba
73	Folklore
74	Ballad
75	Power Ballad
76	Rhythmic Soul
77	Freestyle

78	Duet
79	Punk Rock
7A	Drum Solo
7B	Acapella
7C	Euro-House
7D	Dance Hall
7E	Goa
7F	Drum & Bass
80	Club-House
81	Hardcore
82	Terror
83	Indie
84	BritPop
85	Negerpunk
86	Polsk Punk

87	Beat
88	Christian Gangsta
89	Heavy Metal
8A	Black Metal
8B	Crossover
8C	Contemporary Christian
8D	Christian Rock
8E	Merengue
8F	Salsa
90	Thrash Metal
91	Anime
92	JPop
93	SynthPop

ID3v1.1

This is essentially the same as v1, but the last two bytes of the 30-byte comment field are taken to represent the track number. The first of those two bytes is a null, which terminates the comment and guarantees that a v1 parser doesn't try to read the track number as text. The second byte is the track number, stored as a binary value (not an ASCII digit).
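As a rough sketch of how the v1 and v1.1 layouts could be read, here is a small Python parser. It assumes the usual 128-byte layout at the end of the file ("TAG", then 30-byte title, artist, and album, a 4-byte year, a 30-byte comment, and a 1-byte genre) and assumes Latin-1 text, since ID3v1 does not actually name an encoding. The function name `parse_id3v1` is my own.

```python
def parse_id3v1(tag: bytes) -> dict:
    """Parse a 128-byte ID3v1 or ID3v1.1 tag (a sketch, not a full reader)."""
    if len(tag) != 128 or tag[:3] != b"TAG":
        raise ValueError("not an ID3v1 tag")

    def text(raw: bytes) -> str:
        # Fields are null padded: keep everything before the first null.
        # Latin-1 is an assumption; ID3v1 does not specify an encoding.
        return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

    comment = tag[97:127]
    track = None
    # ID3v1.1: second-to-last comment byte is null, last byte is the track.
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]
        comment = comment[:28]

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
        "comment": text(comment),
        "track": track,      # None for a plain v1 tag
        "genre": tag[127],   # numeric code into the genre list above
    }
```

In practice the tag would come from the last 128 bytes of the file, for example by seeking to `-128` from the end before reading.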

ID3v2

The rigidity of ID3v1 should be readily apparent. To deal with this, as well as some other issues, ID3v2 was introduced. It is a frame-based format that allows for quite a bit of flexibility and extensibility.

ID3v2.2

ID3v2.2 Tag

The tag is a container for several frames that hold the actual information. It begins with a ten-byte header.

ID3v2.2 Frame

The tag is composed of several frames which contain the actual information. Each frame begins with its own header.

ID3v2.4

The various frames are pretty straightforward, but there are a couple changes in 2.4 that are particularly interesting to me:

TPE[1-4] Frame

Because of the amount of data I am dealing with in this program, I am planning on backing it with a database. One of the irritating things that I would be doing is taking songs with multiple artists and separating the artists. As it is, when I search for songs by Ciara, I don't get duets which include Ciara because the name doesn't match. If I have the names separated out, I can search better.

In ID3v2.3, doing this requires some method for taking a list of artists and combining them into a single string. A good method is to separate each artist from the next with a comma except for the last two, which are separated by "and." This way we get John, Paul and Ringo. What happens with Crosby, Stills and Nash though? That is a single band and I don't want to separate those artists. Also, Jay-Z and Linkin Park should be split, but C and C Music Factory should not. ID3v2.4 lets a text frame hold several values separated by a null terminator, so multiple artists can be stored unambiguously, which is nice.
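To make the ambiguity concrete, here is a small Python sketch comparing the two approaches. Both function names are hypothetical, and the v2.4 version assumes the text has already been decoded, so the separator between values is a single null character.

```python
def split_artists_v23(value: str) -> list[str]:
    """Naive split of an 'A, B and C' string.

    It cannot tell a list of artists from a band whose name has that shape.
    """
    head, sep, last = value.rpartition(" and ")
    if not sep:
        return [value]
    return [part.strip() for part in head.split(",")] + [last.strip()]


def split_artists_v24(value: str) -> list[str]:
    """Split a decoded ID3v2.4 text value on its null separators."""
    return [part for part in value.split("\x00") if part]
```

The naive splitter handles "John, Paul and Ringo" as hoped but also breaks "Crosby, Stills and Nash" into three artists. With null-separated values, "Jay-Z\x00Linkin Park" yields two artists, while "Crosby, Stills and Nash" stays a single entry.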

SYLT Frame

Synchronized Lyrics and Text lets me set up karaoke, which is cool, and singalong, which is even cooler. One of my goals in life is to be able to understand French rap and I'd love to be able to have the lyrics play in time with the music. Additionally the program for syncing the lyrics could be used for creating ETCO transition points for the slideshow program. I'd just need some sort of interface to record them in conjunction with keyboard events or something like that.

Synchronized Lyrics/Text Header (SYLT)
Text Encoding	$00 • ISO-8859-1 [ISO-8859-1]. Terminated with $00. $01 • UTF-16 [UTF-16] encoded Unicode [UNICODE] with BOM. All strings in the same frame SHALL have the same byteorder. Terminated with $00 00. $02 • UTF-16BE [UTF-16] encoded Unicode [UNICODE] without BOM. Terminated with $00 00. $03 • UTF-8 [UTF-8] encoded Unicode [UNICODE]. Terminated with $00.
Language	ISO-639-2 three byte language code
Time Stamp Format	$01 • Absolute time, 32 bit sized, using MPEG [MPEG] frames as unit $02 • Absolute time, 32 bit sized, using milliseconds as unit
Content Type	$00 • other $01 • lyrics $02 • text transcription $03 • movement/part name (e.g. "Adagio") $04 • events (e.g. "Don Quixote enters the stage") $05 • chord (e.g. "Bb F Fsus") $06 • trivia/'pop up' information $07 • URLs to webpages $08 • URLs to images
Content Descriptor	text string according to encoding

Each syllable is positioned chronologically. The example given in the spec is: (note the placement of spaces and the newline (0A) character)

I like this system more than the CD+G used in normal karaoke machines. That allows for arbitrary bitmaps to be displayed on screen, which is more versatile, but which destroys the actual text. If all my songs included these tags, I could search for songs based on a lyric from the song.
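As an illustration of why the text survives, here is a hedged Python sketch that builds and reads back the body of a SYLT frame (the frame header itself is left out). It fixes the encoding to UTF-8 ($03), the time stamp format to milliseconds ($02), and the content type to lyrics ($01). The function names, the syllables, and the timings are all made up for the example; this is not the example from the spec.

```python
import struct


def build_sylt_body(language: str, descriptor: str, syllables) -> bytes:
    """Build a SYLT frame body from (text, milliseconds) pairs."""
    body = bytearray([0x03])             # text encoding: UTF-8
    body += language.encode("ascii")     # three-byte ISO-639-2 code
    body += bytes([0x02, 0x01])          # timestamps in ms, content = lyrics
    body += descriptor.encode("utf-8") + b"\x00"
    for text, ms in syllables:
        # Each entry: terminated text followed by a 32-bit timestamp.
        body += text.encode("utf-8") + b"\x00" + struct.pack(">I", ms)
    return bytes(body)


def parse_sylt_body(body: bytes):
    """Parse a body produced by build_sylt_body (UTF-8 only)."""
    if body[0] != 0x03:
        raise ValueError("this sketch only handles UTF-8 ($03)")
    language = body[1:4].decode("ascii")
    end = body.index(b"\x00", 6)
    descriptor = body[6:end].decode("utf-8")
    pos = end + 1
    syllables = []
    while pos < len(body):
        end = body.index(b"\x00", pos)
        text = body[pos:end].decode("utf-8")
        (ms,) = struct.unpack(">I", body[end + 1:end + 5])
        syllables.append((text, ms))
        pos = end + 5
    return language, descriptor, syllables
```

Joining the text of every entry gives back the plain lyric, which is what a lyric search would index.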

UFID Frame

The Unique File Identifier frame is interesting to me because, though I really like the idea, I am not at all pleased with MusicBrainz's interface. This frame could be used to hold a MusicBrainz ID.

Unique File Identifier Header (UFID)
Owner Identifier	ISO-8859-1 encoded URI (preferably mailto:), terminated with $00
Identifier	up to 64 bytes binary data
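A minimal sketch of the UFID body in Python, following the two fields above: a null-terminated owner identifier followed by up to 64 bytes of binary data. The function names are my own, and the frame header is left out.

```python
def build_ufid_body(owner: str, identifier: bytes) -> bytes:
    """Build a UFID frame body: owner URI, null terminator, identifier."""
    if not owner:
        raise ValueError("owner identifier must not be empty")
    if len(identifier) > 64:
        raise ValueError("identifier is limited to 64 bytes")
    return owner.encode("latin-1") + b"\x00" + identifier


def parse_ufid_body(body: bytes):
    """Split a UFID frame body back into (owner, identifier)."""
    owner, sep, identifier = body.partition(b"\x00")
    if not sep:
        raise ValueError("missing owner terminator")
    return owner.decode("latin-1"), identifier
```

For a MusicBrainz ID, the owner would be whatever URI the tagging software uses for MusicBrainz (I believe `http://musicbrainz.org`, but treat that as an assumption) and the identifier would be the ID as ASCII bytes.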

API

Synchronization

As far as tagging is concerned, the important part of an MPEG audio frame header is its first 11 bits, all set to 1, which mark a frame sync. If a tag contains those bits and the file is read by a player that doesn't recognize ID3v2 tags, the tag data would be incorrectly interpreted as MPEG frame data. To avoid this, if a tag is marked as unsynchronized, all occurrences of 11111111 111xxxxx are replaced with 11111111 00000000 111xxxxx. This process is performed after any compression and undone before the frames are interpreted.

The catch is that the sequence 11111111 00000000 could already occur in the data before unsynchronization, and upon resynchronization its 00000000 would be erroneously removed. To avoid this, 11111111 00000000 is replaced with 11111111 00000000 00000000 in the unsynchronization process.
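The two rules can be sketched in Python as a pair of functions. The names are my own, and this is only an illustration of the byte stuffing described above: it does nothing special for data that ends in $FF.

```python
def unsynchronize(data: bytes) -> bytes:
    """Insert $00 after every $FF that is followed by 111xxxxx or by $00."""
    out = bytearray()
    for i, byte in enumerate(data):
        out.append(byte)
        if byte == 0xFF and i + 1 < len(data):
            nxt = data[i + 1]
            # 0xE0 is 11100000, so nxt >= 0xE0 means the top three bits are set.
            if nxt >= 0xE0 or nxt == 0x00:
                out.append(0x00)
    return bytes(out)


def resynchronize(data: bytes) -> bytes:
    """Undo unsynchronize(): drop the $00 that follows each $FF."""
    out = bytearray()
    i = 0
    while i < len(data):
        out.append(data[i])
        if data[i] == 0xFF and i + 1 < len(data) and data[i + 1] == 0x00:
            i += 1  # skip the stuffed zero byte
        i += 1
    return bytes(out)
```

After `unsynchronize`, no $FF byte is directly followed by a byte with its top three bits set, and `resynchronize` restores the original data, including any $FF $00 pairs it contained.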

ID3v2.2 Tag Header
Identifier	1-3	"ID3"
Version	4-5	major version ($02), revision ($00)
Flags	6	[uc000000] u: set if the data is unsynchronized; c: set if the data is compressed
Size	7-10	[0xxxxxxx]{4} 28 bits for the size of the tag after unsynchronization, excluding the header
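Here is a hedged Python sketch of reading the ten-byte tag header. The function name and the returned dictionary keys are my own; the size is decoded as four bytes that each contribute 7 bits, as the [0xxxxxxx]{4} pattern indicates.

```python
def parse_tag_header(header: bytes) -> dict:
    """Parse the ten-byte ID3v2.2 tag header (a sketch)."""
    if len(header) < 10 or header[:3] != b"ID3":
        raise ValueError("not an ID3v2 tag header")
    major, revision, flags = header[3], header[4], header[5]

    size = 0
    for byte in header[6:10]:
        if byte & 0x80:
            raise ValueError("size bytes must have their top bit clear")
        size = (size << 7) | byte   # each byte contributes 7 bits

    return {
        "major": major,
        "revision": revision,
        "unsynchronized": bool(flags & 0x80),  # the 'u' flag
        "compressed": bool(flags & 0x40),      # the 'c' flag
        "size": size,                          # excludes the header itself
    }
```

For example, the size bytes `00 00 02 01` decode to 2 × 128 + 1 = 257, not the 513 that a plain big-endian read would give.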

ID3v2.2 Frame Header
Identifier	1-3	[A-Z0-9]{3} ([XYZ][A-Z0-9]{2} for experimental use)
Size	4-6	size of the frame, excluding the header
Encoding
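A sketch of reading a frame header in Python, assuming the six-byte ID3v2.2 layout: a three-character identifier followed by a three-byte big-endian size (a plain integer, unlike the 7-bit bytes of the tag size). The function name is my own.

```python
import re

# Three characters, each an uppercase letter or a digit.
FRAME_ID = re.compile(rb"[A-Z0-9]{3}")


def parse_v22_frame_header(header: bytes):
    """Return (identifier, size, experimental) for an ID3v2.2 frame header."""
    if len(header) < 6:
        raise ValueError("frame header is six bytes long")
    frame_id = header[:3]
    if not FRAME_ID.fullmatch(frame_id):
        raise ValueError("invalid frame identifier")
    size = int.from_bytes(header[3:6], "big")
    # Identifiers starting with X, Y, or Z are reserved for experimental use.
    experimental = frame_id[:1] in (b"X", b"Y", b"Z")
    return frame_id.decode("ascii"), size, experimental
```

An identifier made of three zero bytes fails the check, which is a convenient way to notice that the frames have ended and padding has begun.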