ALSA, exposed!

By rendaw

The clearest ALSA documentation in the universe.

ALSA, one of the last great mysteries of Linux, is notoriously hard to use, mostly because of the atrocious (read: almost entirely nonexistent) end-user documentation.

The internet is scattered with outdated, incorrect, incomplete, confused information written by monkeys at Linux terminals trying random ALSA configs in an attempt to get sound to come out. Being a representative of monkeys myself, I’ve also spent days (that is, multiples of 24 hours) trying to get ALSA to behave sanely.

I finally jumped into the source, straced, dug through configs, library documentation, searched my soul and now I’ve cracked this banana! And it’s actually not bad. Here’s everything you need to know, once and for all.

Any audio device (audio chip, USB audio hardware) is considered a card in ALSA.

cards have three identifiers:

  • Number: This is a number starting from 0 based on the order the kernel found the device. This is useless, because it may be different each boot (and will be different each boot if you plug in and unplug things).

    Edit: gen2brain notes you can control the number using kernel module options

  • ID: This is a (hopefully unique and consistent) text identifier for a card. My built in device has the id Generic.

  • Name: This is another text identifier, but you can’t use it anywhere. Maybe it will help you identify the device or something… but don’t count on it. My built in device has the name HD-Audio Generic.

When referring to cards in config, programs, etc. you can use the number and ID interchangeably.

See Listing your devices for how to find the identifiers for devices on your system.

devices are subdivisions of a card. For example, my built in audio device has 3 devices: an analog input + output, a digital output (maybe HDMI?), and an alt analog input. I can configure the analog input and alt input on the device to be microphone-in or line-in independently.

Same as with cards, devices have three identifiers: a number, an ID, and a name.

Unlike the card number, device numbers are generally consistent so feel free to use them in configs and other places.

See Listing your devices for how to find the identifiers for devices on your system.

I’m not sure what subdevices are other than yet another conceptual subdivision, but each device has at least one subdevice. My subdevices are all called “subdevice 0”. By default subdevice 0 is used everywhere, so you can mostly not worry about this.

A PCM is an object internal to ALSA that processes audio. PCMs have a (direction) stream which can be playback, capture, or both. PCMs can be chained together, and typically connect to hardware at one end, although they can also be used to route audio to/from a filesystem device, a server, or to drop audio entirely.

Named PCM definitions can be templatized, where arguments are provided when they’re referenced to dynamically define the PCM. For instance, the built-in hw PCM takes 3 arguments: "hw:CARD,DEVICE,SUBDEVICE" (more details in Default PCMs and CTLs).

A CTL is an object internal to ALSA that processes non-audio data. This is what you see in your mixer: volume controls, toggle controls, multiple-choice selections (like when you can change a device to use different ports), etc.

You can save and load CTL values using the alsactl CLI and modify the values with alsamixer and other mixing software.

Like PCMs, CTL definitions can also be templatized.

A slave wraps a PCM and allows you to set some audio stream details like sample rate. Generally a slave is just an extra step of indirection to a PCM and contains no useful data itself.

A client is a piece of software that uses ALSA, via PCMs and CTLs. Most clients use the default PCM/CTL, but some provide methods for explicitly selecting the PCM/CTL.

With mpv you can select a PCM named hello with --audio-device=alsa/hello, otherwise it will use the default PCM. Templatized PCMs also work, like --audio-device=alsa/hw:Generic.

Similarly, if you have a CTL named dog you can change the levels with alsamixer -D dog (the help text uses the word device incorrectly) and, templatized, alsamixer -D hw:Generic.
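As a rough sketch of what those two names could look like in a config file (the syntax is covered further down): hello and dog are the example names from above, and Generic is the card ID on my machine, so substitute your own. This is one assumed minimal setup, not something a stock config ships.

```conf
# A PCM named "hello" that talks straight to device 0 of the card "Generic".
pcm.hello {
    type hw
    card Generic
    device 0
}

# A CTL named "dog" for the mixer controls of the same card.
ctl.dog {
    type hw
    card Generic
}
```

With that in place, --audio-device=alsa/hello and alsamixer -D dog should have something to resolve.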

The easiest way to list devices is:

aplay -l

or

arecord -l

which produce output like this:

**** List of PLAYBACK Hardware Devices ****
card 1: Generic [HD-Audio Generic], device 0: ALC887-VD Analog [ALC887-VD Analog]
  Subdevices: 0/1
  Subdevice #0: subdevice #0
card 1: Generic [HD-Audio Generic], device 1: ALC887-VD Digital [ALC887-VD Digital]
  Subdevices: 1/1
  Subdevice #0: subdevice #0

The format is:

card CARD_NUMBER: CARD_ID [CARD_NAME], device DEVICE_NUMBER: DEVICE_ID [DEVICE_NAME]
  ...
  Subdevice #SUBDEVICE_NUMBER: SUBDEVICE_NAME

Alternatively, you can go directly to the /proc tree.

You can list cards with

cat /proc/asound/cards

This produces output like:

0 [USB ]: USB-Audio - Realtek Audio USB
          Generic Realtek Audio USB at usb-0000:03:00.0-6, high speed
1 [Generic ]: HDA-Intel - HD-Audio Generic
          HD-Audio Generic at 0xf7800000 irq 53

On the first line, 0 is the card number, USB (with the trailing spaces removed) is the card ID, and Realtek Audio USB is the card name. USB-Audio appears to be the name of the kernel driver handling the card.

Underneath /proc/asound/cardNUMBER/ you’ll see nodes like pcm1c/ and pcm2p/. 1 and 2 are the device number and p or c stands for playback or capture.

Using my system as an example, cat /proc/asound/card1/pcm0p/info shows:

card: 1
device: 0
subdevice: 0
stream: PLAYBACK
id: ALC887-VD Analog
name: ALC887-VD Analog
subname: subdevice #0
class: 0
subclass: 0
subdevices_count: 1
subdevices_avail: 1

Underneath /proc/asound/card.../pcm.../ you’ll see nodes like sub0, sub1. 0 and 1 are the subdevice numbers.

In that directory, cat info and other nodes for details.

Each client loads /usr/share/alsa/alsa.conf at startup.

That config defines a number of other configs in @hooks which are all merged together, with later ones overriding earlier ones. On my system this pulls in:

  1. /etc/alsa.d/*.conf in alphanumeric order
  2. /etc/asound.conf
  3. ~/.asoundrc
  4. ~/.config/alsa/asoundrc

If you change the config, you need to restart each ALSA client for the changes to take effect.

The configuration is a tree, with top level keys:

  • pcm
  • ctl
  • pcm_slave
  • timer
  • rawmidi
  • hwdep

Each one is a dictionary mapping names to objects of the given type: pcm contains PCM definitions, ctl contains CTL definitions, etc. pcm and ctl are expected to have a key default for clients that don’t explicitly choose one (most software).

The config file itself consists of multiple statements of the form:

KEY1.KEY2.KEY3... VALUE

VALUE can be a "string", a number, a compound - a value that has multiple subproperties, or an absolute (top rooted) reference/alias to another value like pcm.default.

You can use { } with compounds to avoid writing the whole chain of keys in every statement:

pcm.a.b 4
pcm.a.c "hi"

is equivalent to

pcm.a {
  b 4
  c "hi"
}

This is also equivalent:

pcm.a {
  b 4
}
pcm.a {
  c "hi"
}

; and , end statements but they aren’t necessary. You can also put a = between the key and value if you really want to.
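A small sketch of the syntax rules so far. The top level key example is made up for illustration (as far as I can tell, ALSA simply never looks at keys it doesn't need), and dotted and braced end up holding the same two values.

```conf
# Dotted keys, one statement per line.
example.dotted.b 4
example.dotted.c "hi"

# The same two values written as a compound, with the optional "="
# and both optional statement terminators.
example.braced {
    b = 4;
    c = "hi",
}
```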

No values “exist” until you set them in the config, even if there’s a default value. ALSA uses knowledge of this “existence” to raise spurious errors when loading your config.

The errors are controlled by modifiers you can prefix on keys, like:

+a "hi"

The four modifiers are:

  • +

    (default if no modifier specified)

    Creates the config value if it doesn’t exist, and sets it. No config values exist until you specify them, so this is purely determined by other config file statements.

    If the value already exists, the new value must have the same type or else you’ll get an error like:

    ALSA lib conf.c:1446:(parse_def) KEY is not a TYPE

    For example, if you specify pcm.default instead of pcm.!default you’ll probably see

    ALSA lib conf.c:1446:(parse_def) default is not a compound

    since it’s already defined as an alias/reference, not a compound, in the generic packaged configurations.

  • -

    Sets the value, but doesn’t create it. If the value doesn’t exist and you try to set it with this, you’ll get the error:

    ALSA lib conf.c:1432:(parse_def) KEY does not exists
  • ?

    Only sets the value if it’s not already set. This is mostly used by package/distro maintainers that are providing default configurations.

  • !

    Creates, sets, and changes the type of the value.

TLDR: Just use the default until you get an error and then try !.
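Here is a sketch of the four modifiers acting on one value, read top to bottom. The keys example and level are hypothetical and mean nothing to ALSA; the comments describe what each statement should do according to the rules above, not something I have traced through the parser line by line.

```conf
# No modifier, so "+" is implied: creates example.level and sets it to 1.
example.level 1

# "-": overwrites the value, which is fine because it now exists.
example.-level 2

# "?": skipped, because example.level is already set.
example.?level 3

# "!": replaces the value and changes its type from a number to a string.
example.!level "loud"
```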

An @ begins a special statement, like @func or @args. How this works and the syntax seems to differ based on the symbol, so I won’t provide a general guide.

Named PCMs and CTLs can be parameterized to turn them into reusable templates.

For example, slave.pcm "hw:Dog,5" will instantiate the pcm.hw object as a template, where Dog is the first argument, 5 is the second, etc.

Arguments are declared with the special statement @args, which lists the positional argument names, followed by an @args.NAME block for each name in that list. I don’t know the details on this, but you should be able to find examples in /usr/share/alsa/alsa.conf and other config files.
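A minimal sketch of a template, modeled on how hw is declared in /usr/share/alsa/alsa.conf. The name analog, the argument names CARD and DEV, and the default values are my own choices for illustration, so compare against your stock config before relying on it.

```conf
pcm.analog {
    # Positional argument names, in order.
    @args [ CARD DEV ]

    # One block per argument: its type and the value used when omitted.
    @args.CARD {
        type string
        default "Generic"
    }
    @args.DEV {
        type integer
        default 0
    }

    # The actual PCM definition, with the arguments substituted in.
    type hw
    card $CARD
    device $DEV
}
```

With this loaded, "analog" should resolve to device 0 on Generic and "analog:Generic,1" to device 1.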

A compound containing just { @func getenv vars [ ENVVAR1 ENVVAR2 ... ] default VALUE } will turn into a string from the specified environment variable. Each environment variable is queried and the first to match is used, otherwise VALUE.
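For example, a sketch of a hw PCM whose card comes from the environment. The PCM name envcard and the variable MY_ALSA_CARD are made up; ALSA_CARD is a variable I believe the stock config also consults, but treat that as an assumption.

```conf
pcm.envcard {
    type hw
    card {
        @func getenv
        # Checked in order; the first one that is set wins.
        vars [ MY_ALSA_CARD ALSA_CARD ]
        # Used when none of the variables are set.
        default "Generic"
    }
}
```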

A named PCM config is defined as:

pcm.NAME { type TYPE ... }

and referred to like:

{ ... playback.pcm "NAME" ... }

(via the playback slave field), or defined inline without a name like:

{ ... playback.pcm { type TYPE ... } ... }

See more information in Slave config.
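To make the two styles concrete, here is a sketch using the asym type, which has playback and capture slave fields. The names analog_out and duplex are hypothetical, and Generic is the card ID on my machine.

```conf
# A named PCM.
pcm.analog_out {
    type hw
    card Generic
    device 0
}

pcm.duplex {
    type asym
    # Playback refers to the named PCM above.
    playback.pcm "analog_out"
    # Capture is defined inline, without a name.
    capture.pcm {
        type hw
        card Generic
        device 0
    }
}
```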

All configuration parameters depend on TYPE. All types are documented with their configuration parameters in https://www.alsa-project.org/alsa-doc/alsa-lib/pcm_plugins.html

A quick overview:

  • hw is how you connect a PCM to a device
  • dmix mixes multiple audio inputs; a piece of hardware can only be used by one client at a time unless you use this
  • asym allows you to use separate PCMs for capture and playback, rather than one PCM for both
  • dsnoop appears to allow multiple clients to read recorded audio from the same device at once

A CTL config is defined as:

ctl.NAME { type TYPE ... }

I couldn’t find documentation on the types here, but I’ve seen

  • hw - looks the same as the pcm hw type, takes card only though (not device or subdevice)
  • shm - takes a server socket, seems to be some sort of external control attachment thing.

A slave config is specified as:

pcm_slave.NAME { pcm PCM ... }

pcm can either be a string indicating a named PCM definition (ex: "hi" for pcm.hi), or the inline definition of the PCM itself. When using the name string, you can also provide arguments if the name is for a template (ex: "hw:0,0").

The valid configuration options are detailed at the top of https://www.alsa-project.org/alsa-doc/alsa-lib/pcm_plugins.html, in Slave definition.
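A sketch of a named slave and a PCM that uses it, following the pattern in the plugin documentation linked above (which spells the top level key pcm_slave). The names analog48k and resampled are hypothetical, and 48000 is just an example rate.

```conf
# A named slave: the PCM to wrap plus the stream details to force on it.
pcm_slave.analog48k {
    pcm "hw:Generic,0"
    rate 48000
}

# A plug PCM that converts whatever the client sends to fit the slave.
pcm.resampled {
    type plug
    slave analog48k
}
```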

A number of useful standard PCMs and CTLs are defined in the config file /usr/share/alsa/alsa.conf.

The most important of these is hw, which is a parameterized (template) hw-type PCM (and CTL) so you don’t have to define that yourself. hw has 3 parameters which default to 0 if you don’t specify them: CARD, DEVICE, and SUBDEVICE. For my built in audio, for instance, I use hw:Generic,0 or just hw:Generic.

pcm.!default "hw:CARD"
ctl.!default {
  type plug
  slave.ctl "hw:CARD"
}

pcm.!default {
  type dmix
  ipc_key 1
  slave.pcm "hw:CARD"
}

Note that ipc_key can be anything other than 0, which ALSA takes to mean you haven’t specified a value (yay C). It just needs to be a unique number (in case you have other ipc_keys in your config). Programs use this to see if another client has already created a dmix device by looking it up with ipc_key.

FWIW I don’t know how this works with the audio capture side, although there are no errors. The above link suggests you’re supposed to use asym and use dmix only for playback, dsnoop for capture.
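A sketch of that asym arrangement, with dmix on the playback side and dsnoop on the capture side. The names shared_out and shared_in and the ipc_key values are my own picks, hw:Generic,0 is my card, and I haven't verified this against every client, so treat it as a starting point.

```conf
pcm.!default {
    type asym
    playback.pcm "shared_out"
    capture.pcm "shared_in"
}

# Playback: mix all clients into the hardware.
pcm.shared_out {
    type dmix
    ipc_key 1024
    slave.pcm "hw:Generic,0"
}

# Capture: let several clients read from the hardware at once.
pcm.shared_in {
    type dsnoop
    ipc_key 1025
    slave.pcm "hw:Generic,0"
}
```

Each dmix and dsnoop gets its own ipc_key. Many configs found online also wrap both halves in a plug PCM for format conversion; that is left out here to keep the sketch short.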

Equalization isn’t built in, but there’s an external project, alsaequal, that will create a virtual card named equal you can use to equalize audio.

I don’t believe this or something similar is possible, at least using built in functionality, and I’m not aware of any extensions providing this either. dmix mixes everything at full volume.

  • ALSA lib pcm_dmix.c:1090:(snd_pcm_dmix_open) unable to open slave

    In my case, hw:0 (the default) didn’t have a playback stream; it’s capture only. Make sure you’re using the correct device and it has the type of stream you need (playback or capture).

  • ALSA lib pcm_direct.c:1821:(_snd_pcm_direct_get_slave_ipc_offset) Invalid type 'asym' for slave PCM

    This seems to be the asym variant of above. I had set the playback half of the asym to hw:0 which didn’t have a playback stream.

  • ALSA lib pcm_direct.c:1836:(_snd_pcm_direct_get_slave_ipc_offset) Invalid value for card

    I tried hw:3 which doesn’t exist.