Improving Linux Audio [UPDATED]

PhaseLockedLoop · October 29, 2018, 12:27am

Revision Date: October 30, 2018 (Europe: Paris), October 29, 2018 (America: Los Angeles), October 30, 2018 (US: Mountain)

Let me put this at the top. This will absolutely result in better quality than Windows, even with Windows having its settings and tweaks maxed out. I am not sure why this occurs. I think it has to do with driver programming.

So we like audio. We especially like GOOD quality audio, as much as that also depends on the source. Ideally, we would like our operating system to not interfere with it unless it's to resample it higher. In this guide I will go through my PulseAudio config to demonstrate the settings I have found most effective. The results also depend on your hardware and on sound card quality/capabilities.

AUDIOPHILES: I know I am leaving quite a bit of additional information out here. This is intended for the “average avid music listener” LOL

My config is at the bottom. The top one is the two channel default.

So let's begin with our PulseAudio config. There are two ways to do this. You can either edit your user-specific configuration (safer) or modify the system-wide config in /etc directly.

In any case, I prefer to do so in the /etc config, as it applies system wide and it's easy to make a backup.
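
The backup step can be sketched like this. It is only a minimal sketch assuming a POSIX shell; the helper name backup_conf and the .bak suffix are my own choices for illustration, not something PulseAudio requires.

```shell
# Hypothetical helper: copy a config file to <file>.bak before editing it.
# It refuses to overwrite an existing backup so the pristine copy survives.
backup_conf() {
    src="$1"
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    if [ -e "$src.bak" ]; then
        echo "backup already exists: $src.bak" >&2
        return 0
    fi
    cp -p "$src" "$src.bak"
}
```

For the system-wide file the equivalent one-liner would be sudo cp -p /etc/pulse/daemon.conf /etc/pulse/daemon.conf.bak. If you would rather take the user-specific route, PulseAudio also reads ~/.config/pulse/daemon.conf, so copying the /etc file there and editing the copy should work without root.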

sudo nano /etc/pulse/daemon.conf

The settings we will modify are the following (defaults shown):

default-sample-format = s16le
default-sample-rate = 48000
alternate-sample-rate = 44100
default-sample-channels = 2
default-channel-map = front-left,front-right
default-fragments = 4
default-fragment-size-msec = 25
resample-method = speex-float-1
enable-lfe-remixing = no
high-priority = yes
nice-level = -11
realtime-scheduling = yes
realtime-priority = 9
rlimit-rtprio = 9
daemonize = no

Let's talk about what those settings are before they are blindly modified.

Let's begin with default-sample-format:
This is the default sample format. The quality will be different for each sample format because each one represents the samples with slightly different mathematics. The s formats are signed fixed-point (integer) formats, whereas ulaw, alaw and u8 are older 8-bit PCM formats; I would avoid using them for obvious reasons… Floats offer a very high level of precision, however the result still depends on how precise your source is (assuming good hardware). For me, float32le appears to produce the highest quality of sound.

When choosing a sample format you must select it based upon your CPU architecture’s byte order, also called endianness: the le formats are little-endian and the be formats are big-endian. You can determine your CPU’s byte order with the command below:
lscpu | grep 'Byte Order'
Here is a list of sample formats:
u8, s16le, s16be, s24le, s24be, s24-32le, s24-32be, s32le, s32be, float32le, float32be, ulaw, alaw
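
To tie the byte order to the format name, here is a small sketch. The function name endian_suffix is hypothetical, and it assumes lscpu prints either "Little Endian" or "Big Endian" on its Byte Order line.

```shell
# Hypothetical helper: map the "Byte Order" line printed by lscpu to the
# suffix used in PulseAudio sample format names (le or be).
endian_suffix() {
    case "$1" in
        *"Little Endian"*) echo le ;;
        *"Big Endian"*)    echo be ;;
        *) echo "unrecognised byte order: $1" >&2; return 1 ;;
    esac
}

# Usage: build the float32 format name for this machine.
#   suffix=$(endian_suffix "$(lscpu | grep 'Byte Order')") && echo "float32$suffix"
```

On a typical x86 desktop this should give float32le, which matches what I use.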

What about default-sample-rate and alternate-sample-rate?
These determine the sampling rate used with the Analog to Digital Converter (ADC) or Digital to Analog Converter (DAC), plus an alternate rate. The higher the rate of the source, typically the better the quality of the audio, IF done correctly. So I definitely would like to set the bar high and would rather upsample than downsample. The sound system will automatically determine which to use, either the default or the alternate.

In my configuration, I have used 192000Hz. This is more than enough to get higher quality sound. We can speak more about sampling if enough people would like to understand some rudimentary signal processing (I WOULD LOVE TO GEEK OUT ABOUT IT)

The next settings are default-sample-channels and default-channel-map:

The number of channels to be sampled and their output mapping. I use a 5.1 system, so I have set the number of channels to 6 (five speakers plus the LFE channel) and the channel map to suit. If you have a 2.1 or 5.1 surround sound system, you will have to consult the pulse documentation to figure out how to tweak the configuration, or see mine for 5.1. It is just a case of identifying speakers and stuff.

The next settings I am modifying are default-fragments and default-fragment-size-msec:

Some hardware drivers require the hardware playback buffer to be subdivided into several fragments. These settings determine the number of fragments and the duration of a single fragment. The defaults are 4 and 25ms, so the total buffer is 100ms long. I have selected 2 and 125ms. If you have a good sound card you can ignore these settings, since most newer sound drivers support timer-based scheduling, which is a godsend. Gone are the days of locking up your audio system because of hardware buffer issues.
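
The buffer arithmetic above is just fragments times fragment size. A quick sketch to check your own numbers (buffer_ms is a made-up helper name):

```shell
# Total buffer length in ms = number of fragments * fragment size in ms.
buffer_ms() {
    echo $(( $1 * $2 ))
}

buffer_ms 4 25     # the defaults, prints 100
buffer_ms 2 125    # the values chosen in this guide, prints 250
```

So my values give a longer total buffer than the defaults, with fewer and larger fragments.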

Moving on to the next setting, we have resample-method:

The resampling algorithm to use when up or down sampling has to be applied to the signal. I have gone for broke and selected speex-float-10. It is a floating-point resampler, which offers better sound quality than the speex-fixed-* methods, however it is CPU intensive. On a single core it will use upwards of 8-12 percent instead of less than 1.

Available values:

trivial speex-float-0 speex-float-1 speex-float-2 speex-float-3 speex-float-4 speex-float-5 speex-float-6 speex-float-7 speex-float-8 speex-float-9 speex-float-10 speex-fixed-0 speex-fixed-1 speex-fixed-2 speex-fixed-3 speex-fixed-4 speex-fixed-5 speex-fixed-6 speex-fixed-7 speex-fixed-8 speex-fixed-9 speex-fixed-10 ffmpeg auto copy peaks
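
Not every PA build offers every method in that list, so it is worth checking before you set one. A minimal sketch, assuming pulseaudio --dump-resample-methods prints one method name per line; has_resample_method is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the method named in $1 appears as a whole
# line on standard input (one method name per line).
has_resample_method() {
    grep -qxF -- "$1"
}

# Usage on a machine with PulseAudio installed:
#   pulseaudio --dump-resample-methods | has_resample_method speex-float-10 \
#       && echo "speex-float-10 is available"
```

The whole-line match matters here, because speex-float-1 would otherwise also match speex-float-10.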

The next setting is enable-lfe-remixing:

This determines the upmixing and downmixing behaviour for the LFE channel. If disabled, LFE channels are ignored when remixing: the output LFE channel only gets a signal when an input LFE channel is available, and if no output LFE channel is available the signal on the input LFE channel is ignored. Enabling it is for moving the low frequencies to the subwoofer instead of having them mixed into the 2.0 and 4.0 speakers. I have a 5.1 system so I need it on. The low-frequency effects (LFE) channel is the name of an audio track specifically intended for deep, low-pitched sounds ranging from 3–120 Hz. No, it won't boost the bass on your headphones LOL.

We do need to make a small modification to the scheduling configuration of pulse daemon:

high-priority = yes
nice-level = -11
realtime-scheduling = yes
realtime-priority = 9
rlimit-rtprio = 9
daemonize = no

The high-priority setting makes the pulse daemon a high-priority process. I would advise against pushing real-time scheduling any further, which is what raising realtime-priority much higher does. This can cause the system to lock up, and high priority should suffice. I have personally seen it lock up a 16-core system, so BE ADVISED.

Let's create direct communication between pulse and the kernel driver. This should reduce our latency. Run the following command:

sudo nano /etc/asound.conf

The default configuration will be attempting to use pulseaudio by default. Change the pulse audio section, or create the config if it does not exist, and make sure it reads as follows:

# Use PulseAudio plugin hw
pcm.!default {
   type plug
   slave.pcm hw
}

With slave.pcm hw, the plug plugin communicates directly with the ALSA kernel driver. This means you have completely unaltered communication to the kernel audio driver. This will reduce the latency involved in higher quality conversion.

Then reboot your computer or restart Pulse and ALSA processes.
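
If you do not want to reboot, something like the following should do. Treat it as a sketch: it assumes the classic per-user daemon (not a systemd-managed one), and the function names restart_pulse and sample_spec are made up for this example.

```shell
# Restart the per-user PulseAudio daemon so daemon.conf is read again.
# Assumption: pulseaudio is started per user, not by systemd.
restart_pulse() {
    pulseaudio -k          # ask the running daemon to exit
    pulseaudio --start     # start a daemon if none is running by now
}

# Extract the active sample specification from `pactl info` output on stdin.
sample_spec() {
    sed -n 's/^Default Sample Specification: //p'
}

# Usage:
#   restart_pulse && pactl info | sample_spec
```

The sample_spec line should show the format, channel count and rate you set in daemon.conf. If it still shows the old values, the daemon is probably reading a different config file than the one you edited.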

Here is my configuration: (LFE is sub)

default-sample-format = float32le
default-sample-rate = 192000
alternate-sample-rate = 96000
default-sample-channels = 6
default-channel-map = front-left,front-right,rear-left,rear-right,front-center,lfe
default-fragments = 2
default-fragment-size-msec = 125
resample-method = speex-float-10
enable-lfe-remixing = yes
high-priority = yes
nice-level = -11
realtime-scheduling = yes
realtime-priority = 9
rlimit-rtprio = 9
daemonize = no

Melcar · October 29, 2018, 1:57am

I think your options for “resample-method” are deprecated in the latest PA.

pulseaudio --dump-resample-methods

That will give you the available resample methods of your current PA version. speex-float-10 is currently the highest quality one I think.
Also, at least for me, setting the default sample rate to 96000 or 192000 causes distortion in certain sound streams, most notably YouTube, and in some cases when you start playing more than one sound stream simultaneously. I guess it will depend on many things (software and hardware configuration), but this happens on both onboard (ALC1220) and discrete sound boards (Xonar DX).
If I’m not mistaken, the purpose of the alternate-sample-rate option is to avoid needless resampling. With the default configuration, all audio gets resampled to 44100, with PA using 48000 if it detects a sound stream at that rate. Obviously, this will be detrimental if you are doing some high quality audio production (in which case you shouldn’t be using PA), but for regular desktop use (movies, games, etc) those defaults are said to offer the best trade-offs in terms of quality and resources used. At least, that’s what I read when looking at the PA manuals. PA resampling also applies to all your input devices, I think. Can be a bit problematic if you have more than one sound device. Personally, I always set it as 48000/192000 (default/alternate).

PhaseLockedLoop · October 29, 2018, 9:05pm

It does depend on the card. Some cards can handle it (some Realtek chipsets found in high-end gaming boards, and Xonar cards); other cards cause aliasing, which is the distortion that you hear. It is often described as clipping… So choosing through experimentation is what I advise.

This might be a better way to do so. I might have to update my old LTS at this point and use the latest version of PA and see what works and what does not… I will change the guide accordingly

@Melcar Updated for latest PA version. I do not encounter distortion at the higher rates, however others might, so I definitely recommend people read what you said as well. Thank you for bringing that to my attention.