
Stick Figure Guide to AES

On Ben Laurie’s blog I found a really nice comic explaining the Advanced Encryption Standard :) Find it here and take your time reading it. It’s worth it!

Benchmarking lame presets

As some of you may know, I maintain a very well-regulated audio collection. Some months ago I decided to drop mp3 entirely and from now on accept only FLAC files into this collection. The reasons for this decision can’t be given in one or two sentences, so a post on this topic may follow in due time.
However, due to the limited FLAC support in audio hardware (such as *PODs, car radios and so on) I still rely on transcoding my FLAC files to mp3 if I want to use them with such devices. Since CPU power is plentiful nowadays, that’s not a real problem: the transcoding process doesn’t consume much more time than the actual file transfer via USB 2.0. Naturally, I use the best mp3 codec available: lame.
Lame implements a long list of so-called presets, which have been developed further over the last years. Without discussing the presets in detail (a corresponding discussion can be found here), a preset is best understood as a quality level at which the final mp3 is encoded. As these presets have gone through a process of refinement, different usages and labels have come up.
It seems obvious that new and old presets are linked internally and that not every preset or codec switch implements its own algorithm. So I ran a simple benchmark to find out which presets/options are actually unique and which are just aliases for another one. Of course I was also interested in the speed/efficiency ratio! For the test I compared the “standard” and the “extreme” presets, i.e. V2 and V0, as these are the most commonly used. The results are quite clear and are listed below.

Setup:

  • codec: lame 3.98.2
  • OS: Linux 2.6.29
  • CPU: Intel L2300 2x (1500MHz, 2MB)
  • Testfile: WAVE (60MB), Tracklength: 5:25

Results:

preset                    time consumed   file size (bytes)   md5sum        header entry
--vbr-new -V0             29.40 s         10763184            6be8….ea37    -V0n
--vbr-new -V2             28.81 s         9258816             0915….cd2d    -V2n
--vbr-old -V0             39.80 s         10590624            151c….9071    -V0
--vbr-old -V2             41.20 s         8872680             f802….9073    -V2
--preset fast extreme     29.41 s         10763184            6be8….ea37    -V0n
--preset fast standard    26.38 s         9258816             0915….cd2d    -V2n
--preset extreme          39.21 s         10590624            151c….9071    -V0
--preset standard         40.69 s         8872680             f802….9073    -V2

The results look pretty obvious. Judging by the identical file sizes, checksums and header entries, --vbr-new is internally identical to --preset fast, and the same is true for --vbr-old and --preset. In other words, the old --preset scheme is no longer a separate code path: every old-style preset has an equivalent in the new -V style.
I still own old mp3 files whose header fields say “aps” for “alt preset standard” or “ape” for “alt preset extreme”. However, mp3 files encoded with current versions of lame are always marked in the modern -V style (-V2, -V0n and so on), which is another indication that the old presets no longer have a distinct meaning.
Furthermore, the new algorithm used by --vbr-new and --preset fast is significantly faster (around 25-30% less encoding time) at a slight loss in compression efficiency (files about 2-4% larger).
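
To make the ratios concrete, here is a small sketch that recomputes them from the --vbr-new and --vbr-old rows of the table. The numbers are copied from the results above; the helper name compare is only chosen for this illustration.

```python
# Timings (seconds) and file sizes (bytes) copied from the results table.
results = {
    "--vbr-new -V0": (29.40, 10763184),
    "--vbr-new -V2": (28.81, 9258816),
    "--vbr-old -V0": (39.80, 10590624),
    "--vbr-old -V2": (41.20, 8872680),
}


def compare(level):
    """Return (time saved, size increase) of --vbr-new vs. --vbr-old in percent."""
    new_time, new_size = results[f"--vbr-new {level}"]
    old_time, old_size = results[f"--vbr-old {level}"]
    time_saved = (1 - new_time / old_time) * 100
    size_increase = (new_size / old_size - 1) * 100
    return time_saved, size_increase


for level in ("-V0", "-V2"):
    saved, bigger = compare(level)
    print(f"{level}: {saved:.1f}% less time, {bigger:.1f}% larger file")
```

With a single test file and a single run per preset, these figures are only a rough indication, not a proper measurement.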

Of course this is not representative on a large scale, but it gives an idea of the differences and ratios between the preset types. All that said, FLAC should always be the first choice for audio encoding, since good music doesn’t deserve lossy encoding ;)

What the FLAC? A journey from MP3 to FLAC

One of my current idle tasks is to nurse and improve my music collection. A result of that work is a pretty mature specification of what my digital music collection should look like: properly tagged, with a consistent file and directory structure of course.

My most recent quirk is the migration from MP3 to FLAC. It actually emerged during a nasty file system corruption caused by the experimental ext4 driver. As a consequence, a considerable number of recently created MP3 files had audible glitches.

While I tried to find all the broken music files, I realized that MP3 wasn’t exactly designed with integrity checks in mind. It’s lossy in every sense :?

FLAC is different: it’s lossless in every sense :) At the very least, a broken file can be identified as such: FLAC stores a 16-bit CRC for each frame and an MD5 signature of the unencoded audio data. It’s also open, fast and well supported by almost every player out there. As disk space keeps getting cheaper and transcoding (e.g. FLAC → OGG) is even faster than the transfer speed of USB 2.0, there are few counterarguments remaining.
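
As a rough illustration of how a per-frame checksum exposes damage, here is a minimal sketch of a 16-bit CRC with the parameters the FLAC format documentation gives for the frame footer (polynomial x^16 + x^15 + x^2 + 1, i.e. 0x8005, initial value 0). The frame bytes are a made-up stand-in, not real FLAC data, and the function name is only chosen for this example.

```python
def crc16_flac(data: bytes) -> int:
    """CRC-16 with polynomial 0x8005, initial value 0, processed MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8005) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


frame = bytes(range(64))        # hypothetical stand-in for the bytes of one frame
stored_crc = crc16_flac(frame)  # the value an encoder would append to the frame

damaged = bytearray(frame)
damaged[10] ^= 0x04             # flip a single bit, as disk corruption might

# A mismatch between the recomputed and the stored CRC exposes the damage.
print(crc16_flac(bytes(damaged)) == stored_crc)  # prints False
```

In practice there is no need to do this by hand: the reference encoder can verify a file with flac -t.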

How to keep Amarok statistics?

My favorite audio player is Amarok. Besides many other features it tracks which songs you like, when and how often you played them.

During the migration I wouldn’t want to lose those stats. This is where a nice feature comes into play: Amarok File Tracking (AFT). It enables Amarok to identify an audio track, even if it is moved or renamed. Moving/Renaming, isn’t that exactly what we’re doing? Right!

I asked an Amarok developer (Jeff Mitchell) how AFT could be leveraged for my needs. He immediately fixed AFT to work with FLAC and gave me migration instructions:

  1. Use amarok_afttagger to embed identifiers into your MP3s.
  2. Make sure that those files are scanned into Amarok (full rescan), so
    the UIDs in the tags overwrite the ones that were auto-generated before.
  3. Use a script to transfer the identifiers into the FLACs.
  4. Remove the MP3s from Amarok’s collection, add the FLACs, and do a
    full rescan.

Step 3 is probably the most difficult one.
AFT uses the UFID frame in the case of an MP3 (ID3) and a Vorbis comment in the case of a FLAC.
To transfer the unique identifier from MP3 to FLAC, all you need to do is “copy/paste” that tag.

After you have successfully run step 2, you can dump the UFID of an MP3 file using an ID3 tool like eyeD3:
12 Clarity.mp3 [ 7.96 MB ]
--------------------------------------------------------------------------------
Time: 4:03 MPEG1, Layer III [ ~274 kb/s @ 44100 Hz - Joint stereo ]
--------------------------------------------------------------------------------
ID3 v2.4:
title: Clarity artist: Jimmy Eat World
album: Clarity year: 1999
track: 12/13 genre: Rock (id 17)
Publisher/label: Capitol Records
Unique File ID: [Amarok 2 AFTv1 - amarok.kde.org] 62e164257a0c5c01761ac7740269d31c
Unique File ID: [http://musicbrainz.org] fec93932-51af-4f05-8aa1-3457bbc1e3b4

The identifier (62e164257a0c5c01761ac7740269d31c) can then be inserted into the corresponding FLAC file.
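
If you only have eyeD3’s text output at hand, the identifier can be picked out with a few lines of Python. This is a minimal sketch that assumes the “Unique File ID: [owner] identifier” line format shown in the listing above; the function and constant names are made up for this example.

```python
import re

# Owner string of Amarok's AFT identifier, as shown in the eyeD3 listing.
AFT_OWNER = "Amarok 2 AFTv1 - amarok.kde.org"

# Matches lines such as "Unique File ID: [owner] identifier".
UFID_LINE = re.compile(r"^Unique File ID: \[(?P<owner>[^\]]+)\] (?P<ident>\S+)$")


def parse_ufids(eyed3_output: str) -> dict:
    """Map each UFID owner found in eyeD3's text output to its identifier."""
    ufids = {}
    for line in eyed3_output.splitlines():
        match = UFID_LINE.match(line.strip())
        if match:
            ufids[match.group("owner")] = match.group("ident")
    return ufids


sample = """\
Publisher/label: Capitol Records
Unique File ID: [Amarok 2 AFTv1 - amarok.kde.org] 62e164257a0c5c01761ac7740269d31c
Unique File ID: [http://musicbrainz.org] fec93932-51af-4f05-8aa1-3457bbc1e3b4
"""

ufids = parse_ufids(sample)
print(ufids[AFT_OWNER])
```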

Automation

To automate the transfer of the identifier, I wrote a small Python script (using Mutagen).
Provided that all files have been properly assigned a MusicBrainz track ID (e.g. using Picard), it can be used like this:

# cd /music/interpret/mp3-album
# amarok-ufid.py dump
# cp ufid.dump /music/interpret/flac-album
# cd /music/interpret/flac-album
# amarok-ufid.py apply
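
The script itself is not reproduced here. The sketch below only illustrates one plausible way the matching between both albums could work, assuming that the MusicBrainz track ID serves as the join key between the MP3s and the FLACs. All function names and the FLAC file name are hypothetical, and reading or writing the actual tags (e.g. with Mutagen) is left out.

```python
def build_dump(mp3_tags):
    """Build the dump from (musicbrainz_trackid, aft_uid) pairs read from the MP3s."""
    return {trackid: uid for trackid, uid in mp3_tags}


def plan_apply(dump, flac_tags):
    """Decide which AFT identifier each FLAC file should receive.

    flac_tags maps a FLAC file name to its MusicBrainz track ID.
    Returns (assignments, unmatched): file name -> AFT identifier, plus the
    list of files whose track ID does not occur in the dump.
    """
    assignments, unmatched = {}, []
    for filename, trackid in flac_tags.items():
        if trackid in dump:
            assignments[filename] = dump[trackid]
        else:
            unmatched.append(filename)
    return assignments, unmatched


# Identifiers taken from the eyeD3 listing; the file name is made up.
dump = build_dump([
    ("fec93932-51af-4f05-8aa1-3457bbc1e3b4", "62e164257a0c5c01761ac7740269d31c"),
])
assignments, unmatched = plan_apply(
    dump, {"12 Clarity.flac": "fec93932-51af-4f05-8aa1-3457bbc1e3b4"}
)
print(assignments, unmatched)
```

Keeping the unmatched files in a separate list makes it easy to spot tracks that were never tagged with Picard before any statistics get lost.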

done!

Hiding information steganographically

I just found this nice Unix tool to hide sensitive data in your multimedia files like JPEG images: steghide

It attaches the secret in compressed+encrypted form and even adds a checksum ;)

The usage is quite simple:

# steghide embed -cf hidden.jpg -ef secret.txt
# steghide extract -sf hidden.jpg

Iotop

One of the drawbacks of top is that it often doesn’t help to spot the processes that push the system load up.
High system loads are often caused by very I/O-intensive tasks.
And since I/O-intensive doesn’t necessarily mean CPU-intensive, those tasks may not even show up in top.

This is where Iotop comes into play. It is just a Python script that evaluates the per-task disk I/O accounting statistics exported by the Linux kernel (2.6.20+).
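
To get a feeling for the kind of counters involved, here is a minimal sketch that parses the text format of /proc/<pid>/io. Note that this is a different interface from the one Iotop uses (it talks to the kernel via taskstats/netlink); /proc/<pid>/io merely exposes per-task I/O counters in a form that is easy to read. The sample values are made up and the function names are hypothetical.

```python
def parse_proc_io(text: str) -> dict:
    """Parse the "key: value" lines of /proc/<pid>/io into a dict of integers."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters


def read_task_io(pid="self") -> dict:
    """Read the counters of a live process (Linux only, needs I/O accounting)."""
    with open(f"/proc/{pid}/io") as handle:
        return parse_proc_io(handle.read())


# Made-up sample in the format the kernel prints.
sample = """\
rchar: 323934931
wchar: 323929600
syscr: 632687
syscw: 632675
read_bytes: 0
write_bytes: 323932160
cancelled_write_bytes: 0
"""

counters = parse_proc_io(sample)
print(counters["read_bytes"], counters["write_bytes"])
```

A process like the one in the sample writes a lot to disk without necessarily using much CPU, which is exactly the case top tends to miss.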


This is what you need to enable in your kernel configuration:

General setup
[*] Export task/process statistics through netlink (EXPERIMENTAL)
[ ]   Enable per-task delay accounting (EXPERIMENTAL)
[*]   Enable extended accounting over taskstats (EXPERIMENTAL)
[*]     Enable per-task storage I/O accounting (EXPERIMENTAL)

Now THAT feels ever more like home…

Firefox 3 is finally ready to download (though all servers are more or less overwhelmed at the moment, due to the world record attempt).

From its first release on, it integrates very well into the various systems because someone at Mozilla (finally) had the look & feel in mind. I think that’s a big achievement, though I still wish that support for the various password managers like the Keychain or the KDE Wallet had made it into the final release.

To top it all off, the UI gets even more “mac” with Aroonax’s GrApple Theme from takebacktheweb.org. (Which is an awesome domain name by the way ;) )

So folks, have a nice day and light a candle, hoping for people around the world to finally abandon this so called “internet explorer” and get their copy of web-freedom.

word!

Songbird

Did anyone try Songbird? As a Linux user I’m always in search of a good audio player. Of course there is Amarok and of course there is Audacious, but in my eyes there is not much in between. Amarok is quite large and didn’t really work out at all on my 600MHz P3. On the other hand, Audacious is quite tiny but not made for handling large music archives like, .. let’s say.. mine ;)
Rather by accident I discovered a new audio player called Songbird. It is built on Mozilla’s XUL and seems to be a good compromise. To be honest, I haven’t tested it much so far. This might be related to my love-hate relationship with a certain other media player. Obviously the design is somewhat inspired by that specific product ;)
The current version 0.6 is still alpha, so please be patient. Maybe some day it will become a real alternative to the audio players mentioned above. At least I already love it for its logo :D

Linux kernel 2.6.25.6

Chris Wright just pushed version 2.6.25.6 to the current stable kernel tree.

It’s definitely worth checking out: there are 50 commits since 2.6.25.5, many of them backports of bugfixes from 2.6.26-rcx…

Happy compiling ;)

Parse HTML the Groovy way

In the last couple of weeks I often had to download a lot of files that had been submitted to a web-based teaching platform. Downloading all these files by hand is very annoying, so I wrote a short Groovy script. Although Groovy has great support for parsing well-formed XML-like data, it fails if you want to parse unstructured and nasty HTML code.

So I searched for a Java library containing an HTML parser and found TagSoup. This is a SAX-compliant HTML parser specialized in reformatting and cleaning up faulty HTML code.

This is <B>bold, <I>bold italic, </b>italic, </i>normal text

will be rewritten to

This is <b>bold, <i>bold italic, </i></b><i>italic, </i>normal text.

One advantage of TagSoup is the XPath-like query mechanism you get when it is plugged into Groovy’s XmlSlurper. The HTML code is parsed into an object structure representing the content, whose individual elements can then be accessed. One possible example could be:

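// Hand TagSoup's SAX parser to XmlSlurper so that malformed HTML can be read
def slurper = new XmlSlurper(new org.ccil.cowan.tagsoup.Parser())
def html = slurper.parse("an_example_file.html")
// Walk down to the table "attempts" inside the form in the div "content"
def table = html.body.div.find { it.@id == "content" }.form.table.
        find { it.@id == "attempts" }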

This retrieves the table “attempts” placed inside a form in the div “content”. While find() returns only the first match, the method findAll() retrieves all elements matching the given condition, for example a certain attribute value or the presence of certain child elements.

In the end I fell in love with TagSoup. It saves a lot of work when you have to access the HTML content of websites, portals or similar that are not able to send an XHTML 1.x compliant response. But that is another topic ;) .

Big Buck Bunny

Big Buck Bunny has just been released, shortly after Blender 2.46. The new open movie (released under CC 3.0) resulted from the “Peach” project, the second open movie project after project “Orange” (which resulted in Elephants Dream).

Both movies demonstrate the numerous astonishing features of Blender and generated loads of valuable feedback for the Blender developers. To keep up the young tradition, Blender already started the follow-up project called Apricot in December 2007. This time the main focus lies not only on modeling and rendering, but also on the 3D engine that Blender comes with.
