iTunes 4 has arrived, and with it, the Music Store, AAC encoding and decoding, and artwork to go with your tunes.
Very cool. So how does it work? What follows is some stuff that might be handy for developing helper apps, or just poking around.
Legal CYA note: I'm pretty sure this is all in line with the Music Store terms of service - this isn't a derivative work since this just examines the very-public ID3 and QuickTime formats, and I'm not looking at ways to get around the security technology in iTunes 4. In fact, the Music Store doesn't work for me (it's never satisfied with my account information), so I don't have any DRM'ed files to play with anyways.
Since AAC is an MPEG-4 offshoot, it shouldn't surprise anyone that the AAC files now used by iTunes use the QuickTime file format, which was adopted for the MPEG-4 file format. In an OnJava article a few months ago, I examined this format and provided an all-Java parser for it.
Here's what a typical .m4a file looks like when run through the parser:
    ftyp (32 bytes)
    moov (73049 bytes) - 3 children
      mvhd (108 bytes)
      trak (54805 bytes) - 2 children
        tkhd (92 bytes)
        mdia (54705 bytes) - 3 children
          mdhd (32 bytes)
          hdlr (34 bytes) [/soun - ]
          minf (54631 bytes) - 3 children
            smhd (16 bytes)
            dinf (36 bytes) - 1 child
              dref (28 bytes)
            stbl (54571 bytes) - 5 children
              stsd (103 bytes)
              stts (24 bytes)
              stsc (40 bytes)
              stsz (51908 bytes)
              stco (2488 bytes)
      udta (18128 bytes) - 1 child
        meta (18120 bytes)
    mdat (4819123 bytes)
The big picture is that there's a moov (a "moo-vee", get it?) with a single audio track. The structure here, with the stbl sample table describing where to find chunks of samples in the raw mdat media data, is pretty typical.
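For anyone who wants to poke at this layout without the Java parser, here is a minimal Python sketch of the same idea: read a 4-byte big-endian size and a 4-byte type, then recurse into atoms that are known to be containers. The CONTAINERS set, the walk_atoms and atom names, and the tiny synthetic file are all made up for illustration; this is not the OnJava parser.

```python
import struct

# Atoms this sketch treats as pure containers, i.e. their children start
# right after the 8-byte header. The list mirrors the dump above.
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"dinf", b"stbl", b"udta"}


def walk_atoms(buf, start=0, end=None, depth=0):
    """Yield (depth, type, size) for every atom found in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", buf, pos)
        if size < 8 or pos + size > end:
            break  # sizes 0 and 1 have special meanings; the sketch stops here
        yield depth, kind, size
        if kind in CONTAINERS:
            yield from walk_atoms(buf, pos + 8, pos + size, depth + 1)
        pos += size


def atom(kind, payload=b""):
    """Build a synthetic atom (size, type, payload) for demonstration."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


# A tiny fake file with the same overall shape as the dump above.
sample = (
    atom(b"ftyp", b"M4A " + bytes(4))
    + atom(b"moov", atom(b"mvhd", bytes(4)) + atom(b"udta", atom(b"meta", bytes(12))))
    + atom(b"mdat", bytes(16))
)

for depth, kind, size in walk_atoms(sample):
    print("  " * depth + f"{kind.decode('latin-1')} ({size} bytes)")
```

The size field counts the 8-byte header too, which is why the numbers in the real dump add up: udta is 18128 bytes because its one child, meta, is 18120 bytes, plus 8 for udta's own header.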
But notice that udta atom. That's "user data", a wild and wooly place to hide anything you darn well like. My parser doesn't think that meta is a container and the parsing stops there... meta isn't described in the QT file format doc, so maybe it's an MPEG-4 atom or maybe just a proprietary Apple format.
At any rate, the contents are clearly our music metadata. Here's some character data from the right column of a hexdump -C:
    .>.ilst...;.nam.
    ..3data........A
    eris' Theme (Orc
    hestrated Versio
    n)...%.ART....da
    ta........Nobuo 
    Uematsu...8.alb.
    ..0data........F
    inal Fantasy VII
     Reunion Tracks.
    ...gnre....data.
    ............ trk
    n....data.......
    .............cpi
    l....data.......
    ......tmpo....da
    ta.............2
    .too...*data....
    ....iTunes v4.0,
     QuickTime 6.2..
    ..----....mean..
    ..com.apple.iTun
    es....name....iT
The use of 4 bytes of size and 4-byte names like gnre makes this structure look like normal QuickTime atoms (the nam and alb are preceded by the same byte, 0xA9, so these are probably 4-byte name constants too), so maybe this is some MPEG-4 thing I don't know about (yes, I'm too cheap to buy the real MPEG-4 docs). At any rate, it looks eminently parsable.
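To back up the "eminently parsable" claim, here is a rough Python sketch that pulls text items out of an ilst atom. It leans on two assumptions that the hexdump suggests but this article does not confirm: that the body of meta starts with 4 bytes of version and flags before its child atoms, and that each data atom carries 8 bytes ahead of its payload (the eight dots between "data" and the text in the dump). The function names and the synthetic buffer are invented for the example, and the payload is decoded as UTF-8 on a guess.

```python
import struct


def atoms(buf, start, end):
    """Yield (type, body_start, body_end) for sibling atoms in buf[start:end]."""
    pos = start
    while pos + 8 <= end:
        size, kind = struct.unpack_from(">I4s", buf, pos)
        if size < 8 or pos + size > end:
            break
        yield kind, pos + 8, pos + size
        pos += size


def read_ilst_text(meta_body):
    """Return {item name: text} from the body of a meta atom.

    Assumptions (not confirmed by the article): meta_body begins with
    4 bytes of version/flags, and every data atom has 8 bytes before
    its payload.
    """
    tags = {}
    for kind, start, end in atoms(meta_body, 4, len(meta_body)):
        if kind != b"ilst":
            continue
        for name, item_start, item_end in atoms(meta_body, start, end):
            for sub, data_start, data_end in atoms(meta_body, item_start, item_end):
                if sub == b"data":
                    payload = meta_body[data_start + 8:data_end]
                    tags[name] = payload.decode("utf-8", errors="replace")
    return tags


def atom(kind, payload):
    """Build a synthetic atom for demonstration."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload


# Synthetic data shaped like the dump; the 8 bytes before each payload are
# zero-filled here, which may not match what iTunes actually writes.
nam = atom(b"\xa9nam", atom(b"data", bytes(8) + b"Aeris' Theme (Orchestrated Version)"))
art = atom(b"\xa9ART", atom(b"data", bytes(8) + b"Nobuo Uematsu"))
meta_body = bytes(4) + atom(b"ilst", nam + art)

print(read_ilst_text(meta_body))
```

As a sanity check on the layout, the synthetic nam atom comes out at 59 bytes, or 0x3B, which is the ";" sitting just before .nam in the hexdump. Binary items like trkn and tmpo would need their own decoding rather than a text decode.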
iTunes still uses version 2.2 of the ID3 MP3 metadata standard, as indicated by the first four bytes of a tagged file: 49 44 33 02, or ID3 followed by the version byte 0x02. According to the standard, a picture can live in a PIC frame, and if you look for it with a hex editor, that's exactly what you'll find. The PIC identifier will be followed by 8 bytes (three for frame size, one for text encoding, three for image format [iTunes always seems to use JPG], one for picture type [probably 0x00 for "other"]), then a null-terminated description string (it looks like iTunes doesn't use the string, so there's just a 0x00), and then the image data. Plenty easy to find or write once you look at the standard.
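As a sketch of how little code that takes, the following Python walks the frames of an ID3v2.2 tag and picks apart the first PIC frame using the layout described above. The find_pic_frame and synchsafe names and the fake tag are made up for the example, and the sketch only handles a description in text encoding 0 (a single 0x00 terminator), which is an assumption about what iTunes writes.

```python
def find_pic_frame(tag):
    """Return (image_format, picture_type, description, image_bytes) or None."""
    if tag[:4] != b"ID3\x02":
        return None  # not an ID3v2.2 tag
    # Bytes 6..9 of the header hold the tag size as four 7-bit values.
    tag_size = 0
    for b in tag[6:10]:
        tag_size = (tag_size << 7) | (b & 0x7F)
    pos, end = 10, min(10 + tag_size, len(tag))
    while pos + 6 <= end:
        frame_id = tag[pos:pos + 3]
        if frame_id == b"\x00\x00\x00":
            break  # reached padding
        size = int.from_bytes(tag[pos + 3:pos + 6], "big")
        body = tag[pos + 6:pos + 6 + size]
        if frame_id == b"PIC":
            image_format = body[1:4].decode("latin-1")  # body[0] is the text encoding
            picture_type = body[4]
            # Assumes encoding 0, so the description ends at the first 0x00.
            desc_end = body.index(b"\x00", 5)
            description = body[5:desc_end].decode("latin-1")
            return image_format, picture_type, description, body[desc_end + 1:]
        pos += 6 + size
    return None


def synchsafe(n):
    """Encode n as four bytes of 7 bits each, as the ID3v2 header expects."""
    return bytes((n >> shift) & 0x7F for shift in (21, 14, 7, 0))


# A fake tag: one title frame, then a PIC frame with an empty description.
image = b"\xff\xd8\xff\xe0" + b"not really a JPEG"
pic_body = b"\x00" + b"JPG" + b"\x00" + b"\x00" + image
frames = (
    b"TT2" + (6).to_bytes(3, "big") + b"\x00Title"
    + b"PIC" + len(pic_body).to_bytes(3, "big") + pic_body
)
tag = b"ID3\x02\x00\x00" + synchsafe(len(frames)) + frames

print(find_pic_frame(tag))
```

Writing artwork is the same thing in reverse: build the PIC body, prepend the 3-byte ID and 3-byte size, and remember to update the tag size in the header.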
Interesting thing I've noticed. Take a .gif file. Look at its header. The first six bytes should be GIF89a. Now use this file as your album art for a track in iTunes. Something interesting has happened.
I took Roxy Music's notorious and outrageous cover to Country Life, turned the JPEG into a GIF with GraphicConverter, and dropped it onto the 10 tracks of the Country Life album. Then I dug into one of the MP3s and found the PIC frame. After the header, GIF89a is not there. Instead, it's 0x89 followed by PNG. Looking further, we see the chunks IHDR, PLTE, IDAT... yep, it looks like PNG format to me. I found the same thing had happened in what is presumably the covr atom of the AAC file: a dropped GIF has become a PNG.
Which is interesting because this suggests iTunes has quietly performed a GIF to PNG conversion when I dropped in the artwork. Conspicuously or coincidentally, Jaguar's "Preview" app includes JPEG and PNG among the formats it can convert to, but does not export to GIF. Perhaps Apple has quietly decided to burn all GIFs. If so, my hat's off to them.
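The hex-editor check described above is easy to script. This Python sketch sniffs the leading bytes of an image blob (for instance, the data pulled from a PIC frame) and, for a PNG, lists the chunk types that follow the signature. The function names and the synthetic PNG are invented for the example; the fake chunks carry zeroed CRCs, which the sketch does not verify.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def sniff_image(data):
    """Guess an image format from its leading bytes."""
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "GIF"
    if data[:8] == PNG_SIGNATURE:
        return "PNG"
    if data[:2] == b"\xff\xd8":
        return "JPEG"
    return "unknown"


def png_chunk_types(data):
    """List chunk types in a PNG.

    After the 8-byte signature, each chunk is a 4-byte length, a 4-byte
    type, the chunk data, and a 4-byte CRC (not checked here).
    """
    types = []
    pos = 8
    while pos + 8 <= len(data):
        length, kind = struct.unpack_from(">I4s", data, pos)
        types.append(kind.decode("latin-1"))
        pos += 12 + length
    return types


def fake_chunk(kind, body=b""):
    """Build a chunk with a zeroed CRC, good enough for this demonstration."""
    return struct.pack(">I4s", len(body), kind) + body + bytes(4)


# A synthetic palette-based PNG skeleton like the one found in the MP3.
png = (
    PNG_SIGNATURE
    + fake_chunk(b"IHDR", bytes(13))
    + fake_chunk(b"PLTE", bytes(3))
    + fake_chunk(b"IDAT", bytes(10))
    + fake_chunk(b"IEND")
)

print(sniff_image(b"GIF89a" + bytes(10)))
print(sniff_image(png), png_chunk_types(png))
```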
One non-technical note, if you'll oblige me as I bust out the "when I was your age" cliches...
Some are complaining about the music selection in the iTunes Music Store, but frankly, I'm pretty impressed. When compact discs came out in the early 80's, there were virtually no pop or rock CD's available, and some people said the format was too good for anything but classical and maybe jazz. Pop CD's eventually did emerge, but some were a long time coming. In particular, the Beatles didn't hit CD until about 1987, by which time CD players were rather common in the US. Today, the Beatles aren't in the Music Store, but give it some time. Frankly, I'm more worried about the small-label artists, both the young punk bands on hole-in-the-wall labels, and the classic artists like Todd Rundgren, Elvis Costello, and Frank Zappa, whose stuff is on specialty reissue labels like Rhino and Rykodisc. Hopefully the Music Store will eventually stock their stuff too.
Chris Adamson is an author, editor, and developer specializing in iPhone and Mac.
oreillynet.com Copyright © 2006 O'Reilly Media, Inc.