Audio Encoding: 8kHz 8-bit mu-law wav for IVR systems

OK, so I recently had to deliver some files to a client for their IVR phone system; i.e., the stuff you hear while you’re on hold. It was surprisingly troublesome to find software that will actually take audio down to 8-bit, since no one really cares about 8-bit audio anymore (except people using these systems). I tried many things. Sample-rate conversion (SRC) and dithering with iZotope RX (arguably the highest-quality software for SRC) gave an acceptable result, though it was noisy from the dithering, not to mention that I still ended up with a 16-bit file. Audacity kind of worked, but I had to go get an old version to make it 8-bit.

What ended up working best by a large margin was actually SoX (the sox command). I did it on Linux, but it appears to be cross-platform. I fed it 48 kHz, 16-bit wav files and let it do its work. The result was super clean and encoded perfectly.

Here’s a script that you can drop in the folder full of wav files you need to convert (after you install sox through your package manager of choice):

out=converted   # output folder name; any name works
mkdir -p "$out"
for f in *.wav; do
    sox "$f" -t wav -e mu-law -r 8000 -b 8 -c 1 "$out/$f"
done

If it worked, you’ll end up with a small wav file with a bitrate of 64kbps.
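The 64kbps figure follows directly from the format: sample rate times bits per sample times channels. Here is a small sketch of that arithmetic in shell; the 60-second clip length is just an illustrative assumption, and the size estimate ignores the wav header.

```shell
# Expected bitrate for 8 kHz, 8-bit, mono audio.
rate=8000      # samples per second
bits=8         # bits per sample
channels=1     # mono
bitrate=$((rate * bits * channels))
echo "expected bitrate: $bitrate bits/s"

# Rough size of a 60-second clip (audio data only, header not counted).
seconds=60     # illustrative clip length
bytes=$((bitrate / 8 * seconds))
echo "60 s of audio: $bytes bytes"
```

To check an actual converted file, the soxi tool that ships with sox should report 8000 for `soxi -r file.wav` and 1 for `soxi -c file.wav`, and `soxi -e file.wav` shows the encoding name.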

I know probably no one needs this, but it took some time to figure out, so hopefully someone who does stumbles upon it in the future. Also, if anyone would like a proper tutorial on more relevant pro audio stuff, let me know and I’ll start a series.

Here’s an AppleScript app that will auto-convert dropped files (includes sox).
It drops them into a newly created folder, just like the above bash script.


I think something like this would be good to use in home automation or a personal assistant or something. Since a whole slew of clips could be crammed into a small amount of storage, it could keep costs down.

At least theoretically since storage is cheap as chips nowadays.

That’s one reason the older systems used it. Cheap storage has removed that need, so the challenge nowadays is staying compatible with such an old system; hence the post.

Honestly, it just sounds so bad comparatively that I can’t think of a good reason to use it, unless you really gotta skimp on storage for some kind of mass production. Like… phone systems, for example. :P