LAME

Mainline

Modifications and customizations

ATH Type 5

Neglect not and marginalize not frequencies above 15 kHz.

(I currently use MinGW GCC 3.4.4 or GCC 3.4 to compile binary executables. Pre-2004 binaries were built with MinGW GCC 2.95.3.)


Mainline

Download a command-line Windows compile of LAME 3.98alpha2 (including the optional "--athtype 5" patch). Compared to the 3.97 beta, this version includes a fix by R.H. to handle certain cases correctly when using "--vbr-new". Personally, I always favor using "--vbr-new" code whenever I encode in VBR, due to its efficiency.

Due to the increasing popularity of Linux, you can also grab an equivalent build of LAME 3.98alpha2 for Linux. Just extract the archive with tar -zxf, and toss the binary into your favorite "bin" directory.

Here's an old Windows compile of LAME 3.94beta, for historical comparison. (This executable includes neither custom patches nor LAME's assembly-optimized routines, and will be slower than a binary that does.)

Modifications and customizations

ATH Type 5

At one point, I was not satisfied with the existing LAME ATH (absolute threshold of hearing) curves at frequencies above 15 kHz. The existing curves seemed to devote relatively little attention to that region. (Plot an ATH curve on a logarithmic frequency scale, and imagine why.) Yet over one fifth of the available information bandwidth of a signal limited to 19 kHz, for example, lies above 15 kHz. Relatively low interest in this frequency region often leads to aggressive encoder behavior, which is justified at low bit rates, but not at open bit rates that seek to preserve practically everything humanly audible.

I proceeded to record measurements above 15 kHz in 500 Hz increments up to 22 kHz, and merged the measurements taken from several runs a few days apart. My measured thresholds (above 15 kHz) came out lower than the ATH types built into LAME would have suggested. However, I also found a threshold of hearing curve (data that ff123 credited to Adam, taken using a pair of Sony earbuds) that did roughly match. For comparison, I used a set of Monsoon MM-1000 flat-panel speakers, with careful speaker placement.

To build a complete ATH curve, I linked my data together with the lower frequency (less than 15 kHz) sections of several existing ATH curves.

Performance

Above 15000 Hz, roughly 10 dB separates the edge of ready detection from statistical indistinguishability. Although I set the ATH curve halfway between these levels, it is still typically a bit lower than the other ATH types of LAME. Thus, encoding some samples at variable bit rates will yield larger output files.

Trade-offs abound, and this modification may not yield the desired personal balance without tuning. Perceived quality per bit may well go down (because of the higher bit rate). Limited fixed bit rates may suffer; as MPEG layer-3 encoding seeks a balanced limit of perceived noise, it may reduce noise in the highest frequency bands, and allow more noise in the lower frequency bands that may be much more annoying. Unfortunately, no "relative annoyance threshold" (RAT) guides the encoding process at fixed bit-rate targets, only the assumption that louder perceived noise is always more annoying at any band, regardless of sample.

At lower bit rates, the encoder will perform a low-pass filter operation on the audio signal before encoding, thus rendering any ATH consideration pointless above that filter's cut-off frequency.

Note that at the high frequencies, the effects of the incident angle of the audio wave upon the ear, and objects placed in the vicinity of the aural field become much easier to perceive. (Bats have good reason for using ultrasonic frequencies for echolocation.) In other words, even something like speaker spacing or headphone type can mask or reveal whether an encoder is producing artifacts at these frequencies.

Improvement

For some, reducing the 15+ kHz ATH sensitivity, for example, by adding 5 dB to place the ATH at the edge of ready detection, might be worth a try to save bits. (I do not recommend simply using "--athlower" to accomplish this, as the lower frequencies may well be unacceptably affected.)
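As a rough illustration of that adjustment, the sketch below raises an ATH value by a fixed offset only above a chosen frequency, and leaves everything below it untouched. The helper name and its parameters are hypothetical; this is not code from LAME or from the patch, and the hard step at the boundary is a simplification.

```c
/* Hypothetical helper: desensitize the ATH above f_start_hz only.
 *
 * ath_db     - threshold (dB) from whatever ATH curve is in use
 * freq_hz    - frequency being evaluated
 * f_start_hz - boundary above which the offset applies (e.g. 15000)
 * offset_db  - amount added to the threshold (e.g. 5)
 *
 * Frequencies at or below f_start_hz are returned unchanged, so the
 * lower part of the curve is not affected. */
double ath_with_hf_offset(double ath_db, double freq_hz,
                          double f_start_hz, double offset_db)
{
    if (freq_hz > f_start_hz)
        return ath_db + offset_db;
    return ath_db;
}
```

For example, ath_with_hf_offset(10.0, 16000.0, 15000.0, 5.0) returns 15.0, while the same call at 1000 Hz returns 10.0 unchanged.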

You may also find this code useful, if you're serious about tweaking the ATH curve to fit your own experimental observations. Unlike the stock ATH formulas built into the LAME source code in "libmp3lame/util.c", you can plug experimental values directly into the table inside the ATHformula_jd function. (If you're comfortable with a hex editor, and know where to look, you can even modify the values directly in the executable binary without recompiling.)
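The general idea of a table-driven ATH can be sketched as follows. This is only an illustration under stated assumptions: the table values are placeholders (not my measurements, and not the table in the patch), the names ath_point, ath_table and ath_from_table are invented for the example, and plain linear interpolation is used here for simplicity, which is not necessarily how ATHformula_jd interpolates.

```c
#include <stddef.h>

/* One measured point: frequency in Hz, threshold in dB. */
struct ath_point {
    double freq_hz;
    double level_db;
};

/* PLACEHOLDER values for illustration only; substitute your own
 * measurements. Entries must be sorted by ascending frequency. */
static const struct ath_point ath_table[] = {
    { 15000.0, 20.0 },
    { 16000.0, 30.0 },
    { 17000.0, 45.0 },
    { 18000.0, 60.0 },
};

#define ATH_TABLE_LEN (sizeof ath_table / sizeof ath_table[0])

/* Return the threshold (dB) at freq_hz by linear interpolation
 * between the two nearest table points. Frequencies outside the
 * table are clamped to the first or last entry. */
double ath_from_table(double freq_hz)
{
    size_t i;

    if (freq_hz <= ath_table[0].freq_hz)
        return ath_table[0].level_db;
    if (freq_hz >= ath_table[ATH_TABLE_LEN - 1].freq_hz)
        return ath_table[ATH_TABLE_LEN - 1].level_db;

    for (i = 1; i < ATH_TABLE_LEN; i++) {
        if (freq_hz <= ath_table[i].freq_hz) {
            const struct ath_point *lo = &ath_table[i - 1];
            const struct ath_point *hi = &ath_table[i];
            double t = (freq_hz - lo->freq_hz)
                     / (hi->freq_hz - lo->freq_hz);
            return lo->level_db + t * (hi->level_db - lo->level_db);
        }
    }

    /* Not reached: the range checks above cover all inputs. */
    return ath_table[ATH_TABLE_LEN - 1].level_db;
}
```

With the placeholder table, ath_from_table(15500.0) lands midway between the 15 kHz and 16 kHz entries. Changing the curve is then a matter of editing the table rows rather than refitting a formula.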

Download

Here is a source patch for LAME 3.97 or 3.98 that restores this ATH curve (activated with "--athtype 5"). Although originally included in mainline LAME 3.92, it has since been removed in a reductionist code clean-up. (Specifying any unrecognized "--athtype" will select the default ATH.)

Download LAME 3.98alpha2 with ATH type 5, compiled for Windows, if you care to test it out.

For historical comparison, here's a Windows compile of LAME 3.94beta with ATH type 5, and the corresponding patch for 3.94.


Copyright © 2003-2005 jodarom -- last update: December 8, 2005