
Benchmarking lame presets


As some of you may know, I maintain a very well-regulated audio collection. Some months ago I decided to drop MP3 entirely and from now on accept only FLAC files into this collection. The reasons for this decision can’t be described in one or two sentences, so a post on this topic may follow in due time.
However, due to the limited FLAC support in audio hardware (such as *PODs, car radios and so on), I still rely on transcoding my FLAC files to MP3 if I want to use them with such devices. Since CPU power hardly matters nowadays, that’s not a real problem: the transcoding process doesn’t consume much more time than the actual file transfer via USB 2.0. Naturally I use the best MP3 encoder available: LAME.
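For illustration, a transcoding step like this could be sketched as a small pipeline that decodes with `flac` and feeds the result straight into `lame`. This is only a sketch under assumptions: the function names are made up for this example, `-V2` is just one possible quality setting, and tags are not carried over.

```python
import subprocess


def transcode_commands(flac_path, mp3_path, quality=2):
    """Return the decoder and encoder argv lists for a FLAC -> MP3 pipeline."""
    # flac: -d decodes, -c writes the WAVE data to stdout, -s keeps it quiet
    decode = ["flac", "-d", "-c", "-s", flac_path]
    # lame: -V<n> selects the VBR quality (0 = best), "-" reads from stdin
    encode = ["lame", "--quiet", f"-V{quality}", "-", mp3_path]
    return decode, encode


def transcode(flac_path, mp3_path, quality=2):
    """Pipe `flac -d` into `lame` without a temporary WAVE file."""
    decode, encode = transcode_commands(flac_path, mp3_path, quality)
    decoder = subprocess.Popen(decode, stdout=subprocess.PIPE)
    encoder = subprocess.Popen(encode, stdin=decoder.stdout)
    decoder.stdout.close()  # the decoder gets a broken pipe if lame exits early
    encoder.wait()
    decoder.wait()
    if decoder.returncode or encoder.returncode:
        raise RuntimeError(f"transcoding {flac_path} failed")
```

Calling `transcode("track.flac", "track.mp3")` would then produce a V2 file, provided both tools are installed and on the PATH.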
LAME implements a long list of so-called presets, which have been developed further over the last few years. Without discussing the presets in detail (a corresponding discussion can be found here), presets are best understood as different quality levels at which the final MP3 is encoded. As these presets have gone through a process of refinement, different usage styles and labels have come up.
It seems obvious that new and old presets have been linked internally and that not every preset or codec switch implements its own algorithm. So I did a simple benchmark to find out which presets/options are actually unique and which are just links to another one. Of course I was also interested in the speed/efficiency ratio! For the test I compared the “standard” and “extreme” presets, i.e. the V2 and V0 presets, as these are the most commonly used. The results are quite clear and are listed below.

Setup:

  • codec: lame 3.98.2
  • OS: Linux 2.6.29
  • CPU: Intel L2300 2x (1500MHz, 2MB)
  • Testfile: WAVE (60MB), Tracklength: 5:25
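A run over the option sets compared in this post could be scripted roughly as follows. This is not the procedure actually used for the numbers here, just a minimal sketch: it measures wall-clock time, the helper names and the `bench-N.mp3` output names are invented for the example, and `lame` is assumed to be on the PATH.

```python
import hashlib
import os
import subprocess
import time

# Option sets from the comparison; the labels are just dictionary keys.
PRESETS = {
    "--vbr-new -V0": ["--vbr-new", "-V0"],
    "--vbr-new -V2": ["--vbr-new", "-V2"],
    "--vbr-old -V0": ["--vbr-old", "-V0"],
    "--vbr-old -V2": ["--vbr-old", "-V2"],
    "--preset fast extreme": ["--preset", "fast", "extreme"],
    "--preset fast standard": ["--preset", "fast", "standard"],
    "--preset extreme": ["--preset", "extreme"],
    "--preset standard": ["--preset", "standard"],
}


def measure(cmd, out_path):
    """Run cmd, then return (seconds, size in bytes, md5 hex digest) of out_path."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    elapsed = time.perf_counter() - start
    with open(out_path, "rb") as fh:
        digest = hashlib.md5(fh.read()).hexdigest()
    return elapsed, os.path.getsize(out_path), digest


def benchmark(wav_path, presets=PRESETS):
    """Encode wav_path once per option set and collect the measurements."""
    results = {}
    for number, (label, options) in enumerate(presets.items()):
        out_path = f"bench-{number}.mp3"  # hypothetical output name
        cmd = ["lame", "--quiet", *options, wav_path, out_path]
        results[label] = measure(cmd, out_path)
    return results
```

Two option sets that produce the same md5sum yield byte-identical files, which is the criterion used below to decide that two presets are really the same thing.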

Results:

preset                  time (s)  file size (bytes)  md5sum       header entry
--vbr-new -V0           29.40     10763184           6be8….ea37   -V0n
--vbr-new -V2           28.81     9258816            0915….cd2d   -V2n
--vbr-old -V0           39.80     10590624           151c….9071   -V0
--vbr-old -V2           41.20     8872680            f802….9073   -V2
--preset fast extreme   29.41     10763184           6be8….ea37   -V0n
--preset fast standard  26.38     9258816            0915….cd2d   -V2n
--preset extreme        39.21     10590624           151c….9071   -V0
--preset standard       40.69     8872680            f802….9073   -V2

The results look pretty clear. Internally, --vbr-new appears to be identical to --preset fast, and the same is true for --vbr-old and plain --preset: in each pair the file size, md5sum and header entry match. In other words, the old --preset scheme isn’t used on its own anymore; rather, every old-style preset has an equivalent in the new style.
I still own old MP3 files whose header fields say “aps” for “alt preset standard” or “ape” for “alt preset extreme”. However, MP3 files encoded with current versions of LAME are always marked in the modern -V style (see the header entries in the results), which is another indication that the old presets no longer have any relevance of their own.
Furthermore, the new algorithm used by --vbr-new and --preset fast is significantly faster (around 25-30%), at the cost of slightly lower compression efficiency: the resulting files are about 2-4% larger.
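The speed and size differences can be recomputed from the --vbr-new and --vbr-old rows of the results. The numbers are copied from the results; the function name is only for illustration.

```python
# (seconds, bytes) taken from the results
results = {
    "--vbr-new -V0": (29.40, 10763184),
    "--vbr-new -V2": (28.81, 9258816),
    "--vbr-old -V0": (39.80, 10590624),
    "--vbr-old -V2": (41.20, 8872680),
}


def compare(quality):
    """Return (percent time saved, percent size growth) of vbr-new vs. vbr-old."""
    new_time, new_size = results[f"--vbr-new {quality}"]
    old_time, old_size = results[f"--vbr-old {quality}"]
    time_saved = (1 - new_time / old_time) * 100
    size_growth = (new_size / old_size - 1) * 100
    return time_saved, size_growth


for quality in ("-V0", "-V2"):
    saved, growth = compare(quality)
    print(f"{quality}: {saved:.1f}% less time, {growth:.1f}% larger file")
# -V0: 26.1% less time, 1.6% larger file
# -V2: 30.1% less time, 4.4% larger file
```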

Of course, none of this is representative on a large scale, but it gives an idea of the differences and ratios between the preset types. All that said, FLAC should always be the first choice for audio encoding, since good music doesn’t deserve to be encoded lossily ;)

What the FLAC? A journey from MP3 to FLAC


One of my current idle tasks is to nurse and improve my music collection. A result of that work is a pretty mature specification of what my digital music collection should look like: properly tagged, with a consistent file and directory structure, of course.

My most recent quirk is the migration from MP3 to FLAC. It actually emerged during a nasty file system corruption caused by the experimental ext4 driver. As a consequence, a considerable number of recently created MP3 files had audible glitches.

While trying to find all the broken music files, I realized that MP3 wasn’t exactly designed for integrity checking - it’s lossy in every sense :?

FLAC is different - it’s lossless in every sense :) At the very least, a broken file can be identified as such: FLAC includes a 16-bit CRC for each frame and an MD5 signature of the unencoded audio data. It’s also open, fast and well supported by almost every software player out there. As disk space keeps getting cheaper and transcoding (e.g. FLAC → OGG) is even faster than the transfer speed of USB 2.0, there are few counterarguments left.
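To show why a per-frame CRC makes broken files detectable, here is a minimal sketch of a 16-bit CRC using the polynomial the FLAC format documents for frames (x^16 + x^15 + x^2 + 1, initial value 0). The "frame" is just stand-in bytes, not a real FLAC frame, and the function name is made up for the example.

```python
def crc16_flac(data: bytes) -> int:
    """CRC-16 with polynomial 0x8005 and initial value 0, processed MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


frame = bytes(range(64))      # stand-in for the bytes of one frame
stored = crc16_flac(frame)    # what an encoder would append to the frame

damaged = bytearray(frame)
damaged[10] ^= 0x04           # flip a single bit, as disk corruption might

print(crc16_flac(frame) == stored)           # True
print(crc16_flac(bytes(damaged)) == stored)  # False
```

In practice there is no need to do this by hand: `flac -t file.flac` decodes the file in test mode and reports such errors.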

How to keep Amarok statistics?

My favorite audio player is Amarok. Besides many other features it tracks which songs you like, when and how often you played them.

I didn’t want to lose those stats during the migration. This is where a nice feature comes into play: Amarok File Tracking (AFT). It enables Amarok to identify an audio track even if it is moved or renamed. Moving/renaming, isn’t that exactly what we’re doing? Right!

I asked an Amarok developer (Jeff Mitchell) how AFT could be abused, or rather leveraged, for my needs. He immediately fixed AFT to work with FLAC and gave me migration instructions:

  1. Use amarok_afttagger to embed identifiers into your MP3s.
  2. Make sure that those files are scanned into Amarok (full rescan), so
    the UIDs in the tags overwrite the ones that were auto-generated before.
  3. Use a script to transfer the identifiers into the FLACs.
  4. Remove the MP3s from Amarok’s collection, add the FLACs, and do a
    full rescan.

Step 3 might be the most difficult.
AFT uses the UFID frame in the case of an MP3 (ID3) and a Vorbis comment in the case of a FLAC.
To transfer the unique identifier from MP3 to FLAC, all you need to do is “copy/paste” that tag.

After you have successfully completed step 2, you can dump the UFID of an MP3 file using an ID3 tool like eyeD3:
12 Clarity.mp3 [ 7.96 MB ]
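--------------------------------------------------------------------------------
Time: 4:03 MPEG1, Layer III [ ~274 kb/s @ 44100 Hz - Joint stereo ]
--------------------------------------------------------------------------------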
ID3 v2.4:
title: Clarity artist: Jimmy Eat World
album: Clarity year: 1999
track: 12/13 genre: Rock (id 17)
Publisher/label: Capitol Records
Unique File ID: [Amarok 2 AFTv1 - amarok.kde.org] 62e164257a0c5c01761ac7740269d31c
Unique File ID: [http://musicbrainz.org] fec93932-51af-4f05-8aa1-3457bbc1e3b4

The identifier (62e164257a0c5c01761ac7740269d31c) can then be inserted into the corresponding FLAC file.
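If you want to pick the identifier out of that eyeD3 output programmatically, a small parser is enough. This is a sketch that assumes the “Unique File ID: [owner] value” line layout shown in the output; the function name is made up for this example.

```python
import re

# Owner string of the Amarok identifier, as shown in the eyeD3 output
AFT_OWNER = "Amarok 2 AFTv1 - amarok.kde.org"

UFID_LINE = re.compile(r"^Unique File ID: \[(?P<owner>[^\]]+)\] (?P<ident>\S+)$")


def parse_ufids(eyed3_output):
    """Map each UFID owner found in the eyeD3 output to its identifier."""
    ufids = {}
    for line in eyed3_output.splitlines():
        match = UFID_LINE.match(line.strip())
        if match:
            ufids[match["owner"]] = match["ident"]
    return ufids


sample = """\
Publisher/label: Capitol Records
Unique File ID: [Amarok 2 AFTv1 - amarok.kde.org] 62e164257a0c5c01761ac7740269d31c
Unique File ID: [http://musicbrainz.org] fec93932-51af-4f05-8aa1-3457bbc1e3b4
"""

print(parse_ufids(sample)[AFT_OWNER])
# 62e164257a0c5c01761ac7740269d31c
```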

Automation

To automate the transfer of the identifier, I wrote a small Python script (using Mutagen).
Provided that all files have been properly assigned a MusicBrainz track ID (e.g. using Picard), it can be used like this:

# cd /music/interpret/mp3-album
# amarok-ufid.py dump
# cp ufid.dump /music/interpret/flac-album
# cd /music/interpret/flac-album
# amarok-ufid.py apply

done!
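For the curious, the dump/apply idea can be sketched with Mutagen along the following lines. This is not the original amarok-ufid.py, only an illustration under assumptions: the JSON layout of `ufid.dump`, all function names, and especially the Vorbis comment name (the upper-cased UFID owner string) are guesses that should be checked against a FLAC file tagged by amarok_afttagger.

```python
import json

AFT_OWNER = "Amarok 2 AFTv1 - amarok.kde.org"  # UFID owner shown by eyeD3
MB_OWNER = "http://musicbrainz.org"            # UFID owner written by Picard
AFT_FLAC_KEY = AFT_OWNER.upper()               # ASSUMPTION: Vorbis comment name


def read_mp3_ids(path):
    """Return (MusicBrainz track ID, Amarok UID) for one MP3, None if missing."""
    from mutagen.id3 import ID3  # lazy import: build_mapping works without Mutagen
    owners = {f.owner: f.data.decode("ascii") for f in ID3(path).getall("UFID")}
    return owners.get(MB_OWNER), owners.get(AFT_OWNER)


def build_mapping(id_pairs):
    """Turn (track ID, Amarok UID) pairs into a dict, skipping incomplete ones."""
    return {mbid: uid for mbid, uid in id_pairs if mbid and uid}


def dump(mp3_paths, dump_file="ufid.dump"):
    """Write the track ID -> Amarok UID mapping of the MP3s to dump_file."""
    mapping = build_mapping(read_mp3_ids(path) for path in mp3_paths)
    with open(dump_file, "w") as fh:
        json.dump(mapping, fh, indent=2)


def apply_dump(flac_paths, dump_file="ufid.dump"):
    """Copy the Amarok UID into every FLAC whose track ID appears in the dump."""
    from mutagen.flac import FLAC
    with open(dump_file) as fh:
        mapping = json.load(fh)
    for path in flac_paths:
        audio = FLAC(path)
        mbid = (audio.get("musicbrainz_trackid") or [None])[0]
        if mbid in mapping:
            audio[AFT_FLAC_KEY] = mapping[mbid]
            audio.save()
```

Matching on the MusicBrainz track ID rather than on file names is what makes the MP3 and FLAC versions of a track line up, which is why the files need to be tagged with Picard first.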
