Bounding Unicorns

Capturing Audio via ALSA from USB Interfaces on Linux

Alesis iMultiMix 8

[3895402.372731] usb 1-8: new full-speed USB device number 9 using xhci_hcd
[3895402.524261] usb 1-8: New USB device found, idVendor=08bb, idProduct=2900, bcdDevice= 1.00
[3895402.524276] usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[3895402.524283] usb 1-8: Product: USB Audio CODEC 
[3895402.524288] usb 1-8: Manufacturer: Burr-Brown from TI              
[3895402.554962] input: Burr-Brown from TI               USB Audio CODEC  as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.3/0003:08BB:2900.0006/input/input25
[3895402.613217] hid-generic 0003:08BB:2900.0006: input,hidraw1: USB HID v1.00 Device [Burr-Brown from TI               USB Audio CODEC ] on usb-0000:00:14.0-8/input3

lsusb:

Bus 001 Device 009: ID 08bb:2900 Texas Instruments PCM2900 Audio Codec

To map this device to the ALSA card, we can examine /proc/asound/card*/usbid. In my case I have:

% cat /proc/asound/card2/usbid  
08bb:2900

So, this mixer is "card 2".
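
The same lookup can be scripted. What follows is a minimal sketch, not a polished tool: the function name alsa_card_for_usbid and its optional second argument (an alternate root directory, handy for trying it against test data) are my own inventions. It assumes the usbid files look exactly like the one shown above, and it reports only the first match if two identical devices are plugged in.

```shell
# Print the ALSA card number whose usbid file matches a vendor:product pair.
# $1: USB ID as shown by lsusb, e.g. 08bb:2900
# $2: optional root directory, defaults to /proc/asound (hypothetical
#     parameter, only there so the function can be tried on test data)
alsa_card_for_usbid() {
    usbid=$1
    root=${2:-/proc/asound}
    for f in "$root"/card*/usbid; do
        [ -r "$f" ] || continue          # skip an unmatched glob
        if [ "$(cat "$f")" = "$usbid" ]; then
            dir=${f%/usbid}              # e.g. /proc/asound/card2
            echo "${dir##*/card}"        # e.g. 2
            return 0
        fi
    done
    return 1                             # no card with this USB ID
}

# Usage:
#   arecord -D "hw:$(alsa_card_for_usbid 08bb:2900)" -f s16_le -r 48000 -c 2 /tmp/out.wav
```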

Stream information is very extensive compared to other interfaces:

% cat /proc/asound/card2/stream0 
Burr-Brown from TI USB Audio CODEC at usb-0000:00:14.0-8, full speed : USB Audio

Playback:
  Status: Stop
  Interface 1
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x02 (2 OUT) (ADAPTIVE)
    Rates: 32000, 44100, 48000
    Bits: 16
    Channel map: FL FR
  Interface 1
    Altset 2
    Format: S16_LE
    Channels: 1
    Endpoint: 0x02 (2 OUT) (ADAPTIVE)
    Rates: 32000, 44100, 48000
    Bits: 16
    Channel map: MONO
  Interface 1
    Altset 3
    Format: S8
    Channels: 2
    Endpoint: 0x02 (2 OUT) (ADAPTIVE)
    Rates: 32000, 44100, 48000
    Bits: 8
    Channel map: FL FR
  Interface 1
    Altset 4
    Format: S8
    Channels: 1
    Endpoint: 0x02 (2 OUT) (ADAPTIVE)
    Rates: 32000, 44100, 48000
    Bits: 8
    Channel map: MONO
  Interface 1
    Altset 5
    Format: U8
    Channels: 2
    Endpoint: 0x02 (2 OUT) (ADAPTIVE)
    Rates: 32000, 44100, 48000
    Bits: 8
    Channel map: FL FR
  Interface 1
    Altset 6
    Format: U8
    Channels: 1
    Endpoint: 0x02 (2 OUT) (ADAPTIVE)
    Rates: 32000, 44100, 48000
    Bits: 8
    Channel map: MONO

Capture:
  Status: Stop
  Interface 2
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 48000
    Bits: 16
    Channel map: FL FR
  Interface 2
    Altset 2
    Format: S16_LE
    Channels: 1
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 48000
    Bits: 16
    Channel map: MONO
  Interface 2
    Altset 3
    Format: S16_LE
    Channels: 2
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 44100
    Bits: 16
    Channel map: FL FR
  Interface 2
    Altset 4
    Format: S16_LE
    Channels: 1
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 44100
    Bits: 16
    Channel map: MONO
  Interface 2
    Altset 5
    Format: S16_LE
    Channels: 2
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 32000
    Bits: 16
    Channel map: FL FR
  Interface 2
    Altset 6
    Format: S16_LE
    Channels: 1
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 32000
    Bits: 16
    Channel map: MONO
  Interface 2
    Altset 7
    Format: S16_LE
    Channels: 2
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 22050
    Bits: 16
    Channel map: FL FR
  Interface 2
    Altset 8
    Format: S16_LE
    Channels: 1
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 22050
    Bits: 16
    Channel map: MONO
  Interface 2
    Altset 9
    Format: S16_LE
    Channels: 2
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 16000
    Bits: 16
    Channel map: FL FR
  Interface 2
    Altset 10
    Format: S16_LE
    Channels: 1
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 16000
    Bits: 16
    Channel map: MONO
  Interface 2
    Altset 11
    Format: S8
    Channels: 2
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 16000
    Bits: 8
    Channel map: FL FR
  Interface 2
    Altset 12
    Format: S8
    Channels: 1
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 16000
    Bits: 8
    Channel map: MONO
  Interface 2
    Altset 13
    Format: S8
    Channels: 2
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 8000
    Bits: 8
    Channel map: FL FR
  Interface 2
    Altset 14
    Format: S8
    Channels: 1
    Endpoint: 0x84 (4 IN) (ASYNC)
    Rates: 8000
    Bits: 8
    Channel map: MONO
  Interface 2
    Altset 15
    Format: S16_LE
    Channels: 2
    Endpoint: 0x84 (4 IN) (SYNC)
    Rates: 11025
    Bits: 16
    Channel map: FL FR
  Interface 2
    Altset 16
    Format: S16_LE
    Channels: 1
    Endpoint: 0x84 (4 IN) (SYNC)
    Rates: 11025
    Bits: 16
    Channel map: MONO
  Interface 2
    Altset 17
    Format: S8
    Channels: 2
    Endpoint: 0x84 (4 IN) (SYNC)
    Rates: 11025
    Bits: 8
    Channel map: FL FR
  Interface 2
    Altset 18
    Format: S8
    Channels: 1
    Endpoint: 0x84 (4 IN) (SYNC)
    Rates: 11025
    Bits: 8
    Channel map: MONO

I wonder if the mono capture involves the mixer mixing left and right main outputs solely for the capture over USB? Impressive, if so.
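
Since the listing above is long, it can help to condense the Capture section to one line per altset. This is a rough awk sketch under the assumption that every stream file follows the layout shown above (Altset, Format and Channels lines appearing before the Rates line); the function name capture_altsets is made up for illustration.

```shell
# Condense the Capture section of a streamN file to one line per altset:
#   <altset> <format> <channels> <rates>
# $1: path to the stream file, e.g. /proc/asound/card2/stream0
capture_altsets() {
    awk '
        /^Capture:/  { in_capture = 1; next }
        /^Playback:/ { in_capture = 0; next }
        !in_capture  { next }
        $1 == "Altset"    { altset = $2 }
        $1 == "Format:"   { format = $2 }
        $1 == "Channels:" { channels = $2 }
        $1 == "Rates:"    {
            # Keep everything after the label, since rates may be a list
            sub(/^[ \t]*Rates:[ \t]*/, "")
            print altset, format, channels, $0
        }
    ' "$1"
}

# Usage:
#   capture_altsets /proc/asound/card2/stream0
```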

ALSA hardware parameters are as follows:

% arecord --dump-hw-params -s 1 /dev/null -D hw:2
Warning: Some sources (like microphones) may produce inaudible results
         with 8-bit sampling. Use '-f' argument to increase resolution
         e.g. '-f S16_LE'.
HW Params of device "hw:2":
--------------------
ACCESS:  MMAP_INTERLEAVED RW_INTERLEAVED
FORMAT:  S8 S16_LE
SUBFORMAT:  STD
SAMPLE_BITS: [8 16]
FRAME_BITS: [8 32]
CHANNELS: [1 2]
RATE: [8000 48000]
PERIOD_TIME: [1000 1000000]
PERIOD_SIZE: [16 48000]
PERIOD_BYTES: [64 192000]
PERIODS: [2 1024]
BUFFER_TIME: (666 2000000]
BUFFER_SIZE: [32 96000]
BUFFER_BYTES: [64 384000]
TICK_TIME: ALL
--------------------
arecord: set_params:1371: Sample format non available
Available formats:
- S8
- S16_LE

Recording from this mixer via arecord requires specifying the format explicitly. Since arecord defaults to an 8000 Hz sample rate, in practice you will also want to specify a higher rate. You should specify the number of channels as well, especially given that the mixer can apparently output either mono or stereo:

arecord -D hw:2 -f s16_le -r 48000 -c 2 /tmp/out.wav
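
To see which values are valid for -f without reading the whole dump, the FORMAT line can be pulled out of it. This is only a sketch: hw_formats is a hypothetical helper name, and the usage line assumes the dump is written to stderr (hence the 2>&1), which I have not verified across arecord versions.

```shell
# Print the sample formats listed on the FORMAT: line of an
# `arecord --dump-hw-params` dump read from stdin, one per line.
# The SUBFORMAT: line is not matched because the first field differs.
hw_formats() {
    awk '$1 == "FORMAT:" { for (i = 2; i <= NF; i++) print $i }'
}

# Usage:
#   arecord --dump-hw-params -D hw:2 -s 1 /dev/null 2>&1 | hw_formats
```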

ffmpeg works all of this out by itself and only needs to be told which card to record from:

% ffmpeg -hide_banner -f alsa -i hw:2 /tmp/out.wav
[aist#0:0/pcm_s16le @ 0x560f97a1a2c0] Guessed Channel Layout: stereo
Input #0, alsa, from 'hw:2':
  Duration: N/A, start: 1700935015.064600, bitrate: 1536 kb/s
  Stream #0:0: Audio: pcm_s16le, 48000 Hz, 2 channels, s16, 1536 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '/tmp/out.wav':
  Metadata:
    ISFT            : Lavf60.16.100
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
    Metadata:
      encoder         : Lavc60.31.102 pcm_s16le
...
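
The 1536 kb/s in the ffmpeg output is simply the raw PCM data rate: sample rate times bits per sample times channels. A tiny sketch of the arithmetic, with pcm_kbps being a made-up helper name:

```shell
# Raw PCM bitrate in kb/s: rate (Hz) x bits per sample x channels / 1000.
# Integer division, so fractional kb/s are truncated.
pcm_kbps() {
    echo $(( $1 * $2 * $3 / 1000 ))
}

pcm_kbps 48000 16 2   # prints 1536, the figure ffmpeg reports above
```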

PreSonus AudioBox USB

This device identifies itself thusly:

[3907114.631345] usb 1-8: new full-speed USB device number 10 using xhci_hcd
[3907115.962601] usb 1-8: New USB device found, idVendor=194f, idProduct=0302, bcdDevice= 2.70
[3907115.962616] usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[3907115.962622] usb 1-8: Product: AudioBox USB
[3907115.962626] usb 1-8: Manufacturer: PreSonus Audio

To find which ALSA card corresponds to this device, we can search for its USB vendor ID:

% grep -r  194f /proc/asound
/proc/asound/card2/usbid:194f:0302
/proc/asound/card2/usbmixer:USB Mixer: usb_id=0x194f0302, ctrlif=1, ctlerr=0

Hardware info:

% cat /proc/asound/card2/stream0 
PreSonus Audio AudioBox USB at usb-0000:00:14.0-8, full speed : USB Audio

Playback:
  Status: Stop
  Interface 2
    Altset 1
    Format: S24_3LE
    Channels: 2
    Endpoint: 0x01 (1 OUT) (ADAPTIVE)
    Rates: 44100, 48000
    Bits: 24
    Channel map: FL FR

Capture:
  Status: Stop
  Interface 3
    Altset 1
    Format: S24_3LE
    Channels: 2
    Endpoint: 0x82 (2 IN) (SYNC)
    Rates: 44100, 48000
    Bits: 24
    Channel map: FL FR

We can see from this output that the AudioBox USB only supports 24-bit samples, in the S24_3LE sample format.

% arecord --dump-hw-params -s 1 -D hw:2
Warning: Some sources (like microphones) may produce inaudible results
         with 8-bit sampling. Use '-f' argument to increase resolution
         e.g. '-f S16_LE'.
HW Params of device "hw:2":
--------------------
ACCESS:  MMAP_INTERLEAVED RW_INTERLEAVED
FORMAT:  S24_3LE
SUBFORMAT:  STD
SAMPLE_BITS: 24
FRAME_BITS: 48
CHANNELS: 2
RATE: [44100 48000]
PERIOD_TIME: [1000 1000000]
PERIOD_SIZE: [45 48000]
PERIOD_BYTES: [270 288000]
PERIODS: [2 1024]
BUFFER_TIME: [1875 2000000]
BUFFER_SIZE: [90 96000]
BUFFER_BYTES: [540 576000]
TICK_TIME: ALL
--------------------
arecord: set_params:1371: Sample format non available
Available formats:
- S24_3LE

This device weighs 25.7 oz (728 g). It has chunky aluminum side panels that give it a hefty feel, which unfortunately also makes it heavier to carry around.

Capturing audio with arecord is straightforward in that all of the parameters have to be specified, but their values are obvious and the command actually works:

arecord -D hw:2 -f s24_3le -r 48000 -c 2 /tmp/out.wav

To capture audio via ffmpeg you have to know the right incantation, which is:

% ffmpeg -hide_banner -c:a pcm_s24le -f alsa -i hw:2 -y /tmp/out2.wav
[aist#0:0/pcm_s24le @ 0x55ea681da440] Guessed Channel Layout: stereo
Input #0, alsa, from 'hw:2':
  Duration: N/A, start: 1700964473.539720, bitrate: 2304 kb/s
  Stream #0:0: Audio: pcm_s24le, 48000 Hz, 2 channels, s32 (24 bit), 2304 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s24le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '/tmp/out2.wav':
  Metadata:
    ISFT            : Lavf60.16.100
  Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
    Metadata:
      encoder         : Lavc60.31.102 pcm_s16le
...

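Note that -c:a pcm_s24le placed before -i applies to the input only: as the stream mapping above shows, the WAV file is still written as pcm_s16le unless a second -c:a pcm_s24le is given after the input. If you don't provide the input-side -c:a pcm_s24le argument, ffmpeg simply fails to open the raw hardware device. It can capture from the plughw device instead, which converts formats, but only in 16-bit. For example, the following command, while functional, is not ideal: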

% ffmpeg -hide_banner -f alsa -i plughw:CARD=USB,DEV=0 -c:a pcm_s24le /tmp/out.wav
[aist#0:0/pcm_s16le @ 0x55fc73a90940] Guessed Channel Layout: stereo
Input #0, alsa, from 'plughw:CARD=USB,DEV=0':
  Duration: N/A, start: 1700950609.808469, bitrate: 1536 kb/s
  Stream #0:0: Audio: pcm_s16le, 48000 Hz, 2 channels, s16, 1536 kb/s
Stream mapping:
  Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '/tmp/out.wav':
  Metadata:
    ISFT            : Lavf60.16.100
  Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 2304 kb/s
    Metadata:
      encoder         : Lavc60.31.102 pcm_s24le
...

You might think that this is working fine, but careful reading of the above output shows that the input is in pcm_s16le format, meaning ALSA dropped 8 bits out of the 24-bit stream to feed the remaining 16 bits to ffmpeg, and ffmpeg subsequently upconverted that to 24 bits again. Adding -sample_fmt s32 in various places has no effect. For example, the following command, based on suggestions I found on the internet, produces exactly the same output:

ffmpeg -hide_banner -f alsa -i plughw:CARD=USB,DEV=0 -sample_fmt s32 -c:a pcm_s24le /tmp/out.wav
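
Rather than reading the whole log by eye, the codec pair can be pulled out of the "Stream mapping" section. This is a sketch that assumes the log lines look like the ones shown above; ffmpeg_stream_codecs is a hypothetical helper name, and the -t 1 in the usage line is just there to stop the capture after one second.

```shell
# Read an ffmpeg log on stdin and print "<input codec> <output codec>"
# for each "Stream #a:b -> #c:d (in (...) -> out (...))" mapping line.
ffmpeg_stream_codecs() {
    sed -n 's/^ *Stream #[0-9:]* -> #[0-9:]* (\([^ ]*\) ([^)]*) -> \([^ ]*\) ([^)]*)).*$/\1 \2/p'
}

# Usage (ffmpeg writes its log to stderr):
#   ffmpeg -hide_banner -f alsa -i plughw:CARD=USB,DEV=0 -t 1 -y /tmp/out.wav 2>&1 | ffmpeg_stream_codecs
```

If the first codec printed is pcm_s16le, the capture side is already down to 16 bits regardless of what ends up in the file.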

To get the card name to use with plughw, look at /proc/asound/card*/id using the card number we determined previously from the USB product and vendor IDs:

% cat /proc/asound/card2/id 
USB

Yes, this device is in fact called "USB".
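
Building the plughw name from the card number can be wrapped in a small function. This is a sketch with invented names: plughw_for_card and its optional root-directory argument are hypothetical, and DEV=0 is assumed because that is the device used in the commands above.

```shell
# Print a plughw device name for a card number, based on the card's id file.
# $1: card number, e.g. 2
# $2: optional root directory, defaults to /proc/asound (hypothetical
#     parameter, only there so the function can be tried on test data)
plughw_for_card() {
    root=${2:-/proc/asound}
    id=$(cat "$root/card$1/id") || return 1   # fail if the card is absent
    echo "plughw:CARD=$id,DEV=0"              # DEV=0 is an assumption
}

# Usage:
#   ffmpeg -hide_banner -f alsa -i "$(plughw_for_card 2)" /tmp/out.wav
```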

M-Audio Fast Track Pro

This device identifies itself thusly:

[3952710.929815] usb 1-8: new full-speed USB device number 11 using xhci_hcd
[3952711.153439] usb 1-8: New USB device found, idVendor=0763, idProduct=2012, bcdDevice= 1.02
[3952711.153454] usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[3952711.153461] usb 1-8: Product: FastTrack Pro
[3952711.153465] usb 1-8: Manufacturer: M-Audio
[3952711.157343] usb 1-8: Fast Track Pro switching to config #2
[3952711.159298] usb 1-8: Fast Track Pro switching to config #2
[3952711.164490] usb 1-8: Fast Track Pro config OK

lsusb output:

% lsusb 
...
Bus 001 Device 011: ID 0763:2012 M-Audio M-Audio Fast Track Pro
...

ALSA presents this device as follows:

% cat /proc/asound/cards
...
 2 [Pro            ]: USB-Audio - FastTrack Pro
                      M-Audio FastTrack Pro at usb-0000:00:14.0-8, full speed
...

Hardware parameters:

% arecord --dump-hw-params -s 1 -D hw:2
Warning: Some sources (like microphones) may produce inaudible results
         with 8-bit sampling. Use '-f' argument to increase resolution
         e.g. '-f S16_LE'.
HW Params of device "hw:2":
--------------------
ACCESS:  MMAP_INTERLEAVED RW_INTERLEAVED
FORMAT:  S16_LE
SUBFORMAT:  STD
SAMPLE_BITS: 16
FRAME_BITS: 32
CHANNELS: 2
RATE: [8000 48000]
PERIOD_TIME: [1000 1000000]
PERIOD_SIZE: [16 48000]
PERIOD_BYTES: [64 192000]
PERIODS: [2 1024]
BUFFER_TIME: (666 2000000]
BUFFER_SIZE: [32 96000]
BUFFER_BYTES: [128 384000]
TICK_TIME: ALL
--------------------
arecord: set_params:1371: Sample format non available
Available formats:
- S16_LE

This card presents two streams, stream0 and stream1:

% cat /proc/asound/card2/stream0
M-Audio FastTrack Pro at usb-0000:00:14.0-8, full speed : USB Audio

Playback:
  Status: Stop
  Interface 2
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x03 (3 OUT) (ADAPTIVE)
    Rates: 44100, 48000
    Bits: 16
    Channel map: FL FR

Capture:
  Status: Stop
  Interface 4
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x85 (5 IN) (SYNC)
    Rates: 8000 - 48000 (continuous)
    Bits: 16
    Channel map: FL FR

% cat /proc/asound/card2/stream1
M-Audio FastTrack Pro at usb-0000:00:14.0-8, full speed : USB Audio #1

Playback:
  Status: Stop
  Interface 3
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x04 (4 OUT) (ADAPTIVE)
    Rates: 44100, 48000
    Bits: 16
    Channel map: FL FR

Capture:
  Status: Stop
  Interface 5
    Altset 1
    Format: S16_LE
    Channels: 2
    Endpoint: 0x86 (6 IN) (SYNC)
    Rates: 8000 - 48000 (continuous)
    Bits: 16
    Channel map: FL FR

I don't know what the difference is between the streams or what each would be used for.

Amusingly, the card cannot decide whether it should call itself "Fast Track" or "FastTrack" Pro.

This device weighs 20.4 oz (578 g).

Since the card supports S16_LE, capturing audio from it requires no extra arguments to ffmpeg:

ffmpeg -f alsa -i hw:2 out.wav

... or the usual full set of parameters to arecord:

arecord -D hw:2 -f S16_LE -c 2 -r 48000 out.wav