Capturing Audio via ALSA from USB Interfaces on Linux
Alesis iMultiMix 8
[3895402.372731] usb 1-8: new full-speed USB device number 9 using xhci_hcd
[3895402.524261] usb 1-8: New USB device found, idVendor=08bb, idProduct=2900, bcdDevice= 1.00
[3895402.524276] usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[3895402.524283] usb 1-8: Product: USB Audio CODEC
[3895402.524288] usb 1-8: Manufacturer: Burr-Brown from TI
[3895402.554962] input: Burr-Brown from TI USB Audio CODEC as /devices/pci0000:00/0000:00:14.0/usb1/1-8/1-8:1.3/0003:08BB:2900.0006/input/input25
[3895402.613217] hid-generic 0003:08BB:2900.0006: input,hidraw1: USB HID v1.00 Device [Burr-Brown from TI USB Audio CODEC ] on usb-0000:00:14.0-8/input3
lsusb output:
Bus 001 Device 009: ID 08bb:2900 Texas Instruments PCM2900 Audio Codec
To map this device to the ALSA card, we can examine /proc/asound/card*/usbid. In my case I have:
% cat /proc/asound/card2/usbid
08bb:2900
So, this mixer is "card 2".
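The lookup above can be scripted. This is a minimal sketch for a POSIX shell; the function name find_card_by_usbid and its optional second argument (an alternate root directory, useful for testing) are made up for illustration and are not part of ALSA:

```shell
# Print the ALSA card number whose usbid file matches VENDOR:PRODUCT.
# Usage: find_card_by_usbid 08bb:2900 [ASOUND_ROOT]
# ASOUND_ROOT is a hypothetical override; it defaults to /proc/asound.
find_card_by_usbid() {
    usbid=$1
    root=${2:-/proc/asound}
    for f in "$root"/card*/usbid; do
        [ -r "$f" ] || continue
        if [ "$(cat "$f")" = "$usbid" ]; then
            dir=${f%/usbid}          # e.g. /proc/asound/card2
            echo "${dir##*/card}"    # e.g. 2
            return 0
        fi
    done
    return 1                         # no card with that USB ID
}
```

With the mixer plugged in as shown above, `find_card_by_usbid 08bb:2900` should print 2. Only USB audio cards have a usbid file, so other cards are skipped.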
Stream information is very extensive compared to other interfaces:
% cat /proc/asound/card2/stream0
Burr-Brown from TI USB Audio CODEC at usb-0000:00:14.0-8, full speed : USB Audio
Playback:
Status: Stop
Interface 1
Altset 1
Format: S16_LE
Channels: 2
Endpoint: 0x02 (2 OUT) (ADAPTIVE)
Rates: 32000, 44100, 48000
Bits: 16
Channel map: FL FR
Interface 1
Altset 2
Format: S16_LE
Channels: 1
Endpoint: 0x02 (2 OUT) (ADAPTIVE)
Rates: 32000, 44100, 48000
Bits: 16
Channel map: MONO
Interface 1
Altset 3
Format: S8
Channels: 2
Endpoint: 0x02 (2 OUT) (ADAPTIVE)
Rates: 32000, 44100, 48000
Bits: 8
Channel map: FL FR
Interface 1
Altset 4
Format: S8
Channels: 1
Endpoint: 0x02 (2 OUT) (ADAPTIVE)
Rates: 32000, 44100, 48000
Bits: 8
Channel map: MONO
Interface 1
Altset 5
Format: U8
Channels: 2
Endpoint: 0x02 (2 OUT) (ADAPTIVE)
Rates: 32000, 44100, 48000
Bits: 8
Channel map: FL FR
Interface 1
Altset 6
Format: U8
Channels: 1
Endpoint: 0x02 (2 OUT) (ADAPTIVE)
Rates: 32000, 44100, 48000
Bits: 8
Channel map: MONO
Capture:
Status: Stop
Interface 2
Altset 1
Format: S16_LE
Channels: 2
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 48000
Bits: 16
Channel map: FL FR
Interface 2
Altset 2
Format: S16_LE
Channels: 1
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 48000
Bits: 16
Channel map: MONO
Interface 2
Altset 3
Format: S16_LE
Channels: 2
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 44100
Bits: 16
Channel map: FL FR
Interface 2
Altset 4
Format: S16_LE
Channels: 1
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 44100
Bits: 16
Channel map: MONO
Interface 2
Altset 5
Format: S16_LE
Channels: 2
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 32000
Bits: 16
Channel map: FL FR
Interface 2
Altset 6
Format: S16_LE
Channels: 1
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 32000
Bits: 16
Channel map: MONO
Interface 2
Altset 7
Format: S16_LE
Channels: 2
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 22050
Bits: 16
Channel map: FL FR
Interface 2
Altset 8
Format: S16_LE
Channels: 1
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 22050
Bits: 16
Channel map: MONO
Interface 2
Altset 9
Format: S16_LE
Channels: 2
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 16000
Bits: 16
Channel map: FL FR
Interface 2
Altset 10
Format: S16_LE
Channels: 1
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 16000
Bits: 16
Channel map: MONO
Interface 2
Altset 11
Format: S8
Channels: 2
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 16000
Bits: 8
Channel map: FL FR
Interface 2
Altset 12
Format: S8
Channels: 1
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 16000
Bits: 8
Channel map: MONO
Interface 2
Altset 13
Format: S8
Channels: 2
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 8000
Bits: 8
Channel map: FL FR
Interface 2
Altset 14
Format: S8
Channels: 1
Endpoint: 0x84 (4 IN) (ASYNC)
Rates: 8000
Bits: 8
Channel map: MONO
Interface 2
Altset 15
Format: S16_LE
Channels: 2
Endpoint: 0x84 (4 IN) (SYNC)
Rates: 11025
Bits: 16
Channel map: FL FR
Interface 2
Altset 16
Format: S16_LE
Channels: 1
Endpoint: 0x84 (4 IN) (SYNC)
Rates: 11025
Bits: 16
Channel map: MONO
Interface 2
Altset 17
Format: S8
Channels: 2
Endpoint: 0x84 (4 IN) (SYNC)
Rates: 11025
Bits: 8
Channel map: FL FR
Interface 2
Altset 18
Format: S8
Channels: 1
Endpoint: 0x84 (4 IN) (SYNC)
Rates: 11025
Bits: 8
Channel map: MONO
I wonder if the mono capture involves the mixer mixing left and right main outputs solely for the capture over USB? Impressive, if so.
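With this many altsettings, a condensed view of just the capture side is easier to read. The following is a rough sketch that assumes the stream file keeps the field order shown above (Altset, Format, Channels, then Rates); the function name summarize_capture is invented for this example:

```shell
# Print "ALTSET FORMAT CHANNELS RATES" for every capture altsetting
# found in a /proc/asound/cardN/streamM file.
# Usage: summarize_capture /proc/asound/card2/stream0
summarize_capture() {
    awk '
        $1 == "Playback:" { in_capture = 0 }
        $1 == "Capture:"  { in_capture = 1 }
        !in_capture       { next }
        $1 == "Altset"    { altset = $2 }
        $1 == "Format:"   { format = $2 }
        $1 == "Channels:" { channels = $2 }
        # Rates is the last field of interest in each altsetting block,
        # so the summary line is printed when it is seen.
        $1 == "Rates:"    { $1 = ""; sub(/^ +/, ""); print altset, format, channels, $0 }
    ' "$1"
}
```

For the dump above this should print one line per capture altsetting, beginning with `1 S16_LE 2 48000`.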
ALSA hardware parameters are as follows:
% arecord --dump-hw-params -s 1 /dev/null -D hw:2
Warning: Some sources (like microphones) may produce inaudible results
with 8-bit sampling. Use '-f' argument to increase resolution
e.g. '-f S16_LE'.
HW Params of device "hw:2":
--------------------
ACCESS: MMAP_INTERLEAVED RW_INTERLEAVED
FORMAT: S8 S16_LE
SUBFORMAT: STD
SAMPLE_BITS: [8 16]
FRAME_BITS: [8 32]
CHANNELS: [1 2]
RATE: [8000 48000]
PERIOD_TIME: [1000 1000000]
PERIOD_SIZE: [16 48000]
PERIOD_BYTES: [64 192000]
PERIODS: [2 1024]
BUFFER_TIME: (666 2000000]
BUFFER_SIZE: [32 96000]
BUFFER_BYTES: [64 384000]
TICK_TIME: ALL
--------------------
arecord: set_params:1371: Sample format non available
Available formats:
- S8
- S16_LE
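If you want to pick a sample format from a script, the FORMAT line of this dump is the part to look at. Here is a small sketch; hw_formats is a hypothetical helper name, and it assumes the dump layout shown above:

```shell
# Read an `arecord --dump-hw-params` dump on stdin and print the
# supported sample formats, one per line.
hw_formats() {
    # Match the first field exactly so that SUBFORMAT: is not picked up.
    awk '$1 == "FORMAT:" { for (i = 2; i <= NF; i++) print $i }'
}
```

A possible invocation is `arecord --dump-hw-params -s 1 -D hw:2 /dev/null 2>&1 | hw_formats`. The `2>&1` is there because arecord appears to write the dump to standard error; treat that as an assumption and drop it if your version behaves differently.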
Recording from this mixer via arecord requires specifying the format
explicitly and, since arecord defaults to an 8000 Hz sample rate, a higher
sample rate should be specified as well. You should also specify the
number of channels you want, especially given that the mixer apparently can
output either mono or stereo:
arecord -D hw:2 -f s16_le -r 48000 -c 2 /tmp/out.wav
ffmpeg is able to work this all out by itself and just needs to be told
the device to record from:
% ffmpeg -hide_banner -f alsa -i hw:2 /tmp/out.wav
[aist#0:0/pcm_s16le @ 0x560f97a1a2c0] Guessed Channel Layout: stereo
Input #0, alsa, from 'hw:2':
Duration: N/A, start: 1700935015.064600, bitrate: 1536 kb/s
Stream #0:0: Audio: pcm_s16le, 48000 Hz, 2 channels, s16, 1536 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '/tmp/out.wav':
Metadata:
ISFT : Lavf60.16.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
Metadata:
encoder : Lavc60.31.102 pcm_s16le
...
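The 1536 kb/s figure in the ffmpeg output is simply the uncompressed PCM data rate: sample rate times bits per sample times channels. A quick sanity check in shell arithmetic (the function name pcm_bitrate_kbps is made up for this example):

```shell
# Uncompressed PCM bitrate in kb/s:
#   sample rate (Hz) x bits per sample x channels / 1000
pcm_bitrate_kbps() {
    echo $(( $1 * $2 * $3 / 1000 ))
}

pcm_bitrate_kbps 48000 16 2   # prints 1536
```

The same formula gives 2304 kb/s for a 24-bit stereo stream at 48000 Hz.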
PreSonus AudioBox USB
This device identifies itself thusly:
[3907114.631345] usb 1-8: new full-speed USB device number 10 using xhci_hcd
[3907115.962601] usb 1-8: New USB device found, idVendor=194f, idProduct=0302, bcdDevice= 2.70
[3907115.962616] usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[3907115.962622] usb 1-8: Product: AudioBox USB
[3907115.962626] usb 1-8: Manufacturer: PreSonus Audio
To find which ALSA card corresponds to this, we can look up its USB ids thusly:
% grep -r 194f /proc/asound
/proc/asound/card2/usbid:194f:0302
/proc/asound/card2/usbmixer:USB Mixer: usb_id=0x194f0302, ctrlif=1, ctlerr=0
Hardware info:
% cat /proc/asound/card2/stream0
PreSonus Audio AudioBox USB at usb-0000:00:14.0-8, full speed : USB Audio
Playback:
Status: Stop
Interface 2
Altset 1
Format: S24_3LE
Channels: 2
Endpoint: 0x01 (1 OUT) (ADAPTIVE)
Rates: 44100, 48000
Bits: 24
Channel map: FL FR
Capture:
Status: Stop
Interface 3
Altset 1
Format: S24_3LE
Channels: 2
Endpoint: 0x82 (2 IN) (SYNC)
Rates: 44100, 48000
Bits: 24
Channel map: FL FR
We can see from this output that the AudioBox USB only supports 24-bit samples, in the S24_3LE sample format.
% arecord --dump-hw-params -s 1 -D hw:2
Warning: Some sources (like microphones) may produce inaudible results
with 8-bit sampling. Use '-f' argument to increase resolution
e.g. '-f S16_LE'.
HW Params of device "hw:2":
--------------------
ACCESS: MMAP_INTERLEAVED RW_INTERLEAVED
FORMAT: S24_3LE
SUBFORMAT: STD
SAMPLE_BITS: 24
FRAME_BITS: 48
CHANNELS: 2
RATE: [44100 48000]
PERIOD_TIME: [1000 1000000]
PERIOD_SIZE: [45 48000]
PERIOD_BYTES: [270 288000]
PERIODS: [2 1024]
BUFFER_TIME: [1875 2000000]
BUFFER_SIZE: [90 96000]
BUFFER_BYTES: [540 576000]
TICK_TIME: ALL
--------------------
arecord: set_params:1371: Sample format non available
Available formats:
- S24_3LE
This device weighs 25.7 oz (728 g). It has chunky aluminum side panels that give it a hefty feel, which unfortunately also makes it heavier to carry around.
Capturing audio with arecord is straightforward in that, although all of
the parameters have to be specified, what to set them to is obvious and
the command actually works:
arecord -D hw:2 -f s24_3le -r 48000 -c 2 /tmp/out.wav
To capture audio via ffmpeg you have to know the right incantation, which is:
% ffmpeg -hide_banner -c:a pcm_s24le -f alsa -i hw:2 -y /tmp/out2.wav
[aist#0:0/pcm_s24le @ 0x55ea681da440] Guessed Channel Layout: stereo
Input #0, alsa, from 'hw:2':
Duration: N/A, start: 1700964473.539720, bitrate: 2304 kb/s
Stream #0:0: Audio: pcm_s24le, 48000 Hz, 2 channels, s32 (24 bit), 2304 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s24le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '/tmp/out2.wav':
Metadata:
ISFT : Lavf60.16.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s16, 1536 kb/s
Metadata:
encoder : Lavc60.31.102 pcm_s16le
...
If you don't provide the -c:a pcm_s24le argument, ffmpeg will simply not
work with the raw hardware device. It can capture from the plughw device,
which performs conversions, but only in 16-bit. For example, the following
command, while functional, is not ideal:
% ffmpeg -hide_banner -f alsa -i plughw:CARD=USB,DEV=0 /tmp/out.wav
[aist#0:0/pcm_s16le @ 0x55fc73a90940] Guessed Channel Layout: stereo
Input #0, alsa, from 'plughw:CARD=USB,DEV=0':
Duration: N/A, start: 1700950609.808469, bitrate: 1536 kb/s
Stream #0:0: Audio: pcm_s16le, 48000 Hz, 2 channels, s16, 1536 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s24le (native))
Press [q] to stop, [?] for help
Output #0, wav, to '/tmp/out.wav':
Metadata:
ISFT : Lavf60.16.100
Stream #0:0: Audio: pcm_s24le ([1][0][0][0] / 0x0001), 48000 Hz, stereo, s32, 2304 kb/s
Metadata:
encoder : Lavc60.31.102 pcm_s24le
...
You might think that this is working fine, but careful reading of the above
output shows that the input is in pcm_s16le format, meaning ALSA dropped
8 bits out of the 24-bit stream to feed the remaining 16 bits to ffmpeg,
and ffmpeg subsequently upconverted that to 24 bits again. Adding
-sample_fmt s32 in various places has no effect; for example, the
following command, based on suggestions I've found on the internet,
produces exactly the same output:
ffmpeg -hide_banner -f alsa -i plughw:CARD=USB,DEV=0 -sample_fmt s32 -c:a pcm_s24le /tmp/out.wav
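To see why the 24 to 16 to 24 bit round trip matters, consider what happens to a single sample. This sketch assumes the down-conversion simply truncates the low byte; whether ALSA's plug layer truncates or rounds is not something established here. The function name roundtrip_24_16 is made up for the illustration:

```shell
# Truncate a (positive) 24-bit sample to 16 bits, then pad it back out
# to 24 bits. ASSUMPTION: the conversion truncates rather than rounds.
roundtrip_24_16() {
    printf '0x%06x\n' $(( ($1 >> 8) << 8 ))
}

roundtrip_24_16 0x123456   # prints 0x123400
```

The low 8 bits come back as zeros, so the resulting file is 24-bit in name only.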
To get the card name to use with plughw, look at /proc/asound/card*/id
using the card number we determined previously from the USB vendor and
product IDs:
% cat /proc/asound/card2/id
USB
Yes, this device is in fact called "USB".
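Building the plughw device string from a card number can be wrapped up as follows. This is only a sketch: plughw_name is a hypothetical helper, the optional second argument is an alternate root directory for testing, and device 0 is assumed:

```shell
# Print the plughw device string for a given ALSA card number.
# Usage: plughw_name 2 [ASOUND_ROOT]
# ASSUMPTIONS: PCM device 0 is the one wanted; ASOUND_ROOT is a
# made-up override that defaults to /proc/asound.
plughw_name() {
    root=${2:-/proc/asound}
    id=$(cat "$root/card$1/id") || return 1
    echo "plughw:CARD=$id,DEV=0"
}
```

For the AudioBox above, `plughw_name 2` should print plughw:CARD=USB,DEV=0.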
M-Audio Fast Track Pro
This device identifies itself thusly:
[3952710.929815] usb 1-8: new full-speed USB device number 11 using xhci_hcd
[3952711.153439] usb 1-8: New USB device found, idVendor=0763, idProduct=2012, bcdDevice= 1.02
[3952711.153454] usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[3952711.153461] usb 1-8: Product: FastTrack Pro
[3952711.153465] usb 1-8: Manufacturer: M-Audio
[3952711.157343] usb 1-8: Fast Track Pro switching to config #2
[3952711.159298] usb 1-8: Fast Track Pro switching to config #2
[3952711.164490] usb 1-8: Fast Track Pro config OK
lsusb output:
% lsusb
...
Bus 001 Device 011: ID 0763:2012 M-Audio M-Audio Fast Track Pro
...
ALSA presents this device as follows:
% cat /proc/asound/cards
...
2 [Pro ]: USB-Audio - FastTrack Pro
M-Audio FastTrack Pro at usb-0000:00:14.0-8, full speed
...
Hardware parameters:
% arecord --dump-hw-params -s 1 -D hw:2
Warning: Some sources (like microphones) may produce inaudible results
with 8-bit sampling. Use '-f' argument to increase resolution
e.g. '-f S16_LE'.
HW Params of device "hw:2":
--------------------
ACCESS: MMAP_INTERLEAVED RW_INTERLEAVED
FORMAT: S16_LE
SUBFORMAT: STD
SAMPLE_BITS: 16
FRAME_BITS: 32
CHANNELS: 2
RATE: [8000 48000]
PERIOD_TIME: [1000 1000000]
PERIOD_SIZE: [16 48000]
PERIOD_BYTES: [64 192000]
PERIODS: [2 1024]
BUFFER_TIME: (666 2000000]
BUFFER_SIZE: [32 96000]
BUFFER_BYTES: [128 384000]
TICK_TIME: ALL
--------------------
arecord: set_params:1371: Sample format non available
Available formats:
- S16_LE
This card presents two streams, stream0 and stream1:
% cat /proc/asound/card2/stream0
M-Audio FastTrack Pro at usb-0000:00:14.0-8, full speed : USB Audio
Playback:
Status: Stop
Interface 2
Altset 1
Format: S16_LE
Channels: 2
Endpoint: 0x03 (3 OUT) (ADAPTIVE)
Rates: 44100, 48000
Bits: 16
Channel map: FL FR
Capture:
Status: Stop
Interface 4
Altset 1
Format: S16_LE
Channels: 2
Endpoint: 0x85 (5 IN) (SYNC)
Rates: 8000 - 48000 (continuous)
Bits: 16
Channel map: FL FR
% cat /proc/asound/card2/stream1
M-Audio FastTrack Pro at usb-0000:00:14.0-8, full speed : USB Audio #1
Playback:
Status: Stop
Interface 3
Altset 1
Format: S16_LE
Channels: 2
Endpoint: 0x04 (4 OUT) (ADAPTIVE)
Rates: 44100, 48000
Bits: 16
Channel map: FL FR
Capture:
Status: Stop
Interface 5
Altset 1
Format: S16_LE
Channels: 2
Endpoint: 0x86 (6 IN) (SYNC)
Rates: 8000 - 48000 (continuous)
Bits: 16
Channel map: FL FR
I don't know what the difference is between the streams or what each would be used for.
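As far as I can tell, each streamN file corresponds to PCM device N on the card, so the second stream should be reachable as hw:2,1; treat that mapping as an assumption rather than something verified here. The sketch below lists capture-capable devices by parsing /proc/asound/pcm, assuming its lines look like "02-01: ... : capture 1". The function name list_capture_devices and its optional file argument are made up for this example:

```shell
# List capture-capable PCM devices as hw:CARD,DEVICE, one per line.
# Usage: list_capture_devices [PCM_FILE]
# ASSUMPTION: each line of the file starts with "CC-DD:" (card and
# device number) and mentions "capture" when capture is supported.
list_capture_devices() {
    awk -F: '/capture/ {
        split($1, a, "-")
        printf "hw:%d,%d\n", a[1], a[2]
    }' "${1:-/proc/asound/pcm}"
}
```

If hw:2,1 shows up in the list, something like `arecord -D hw:2,1 -f S16_LE -c 2 -r 48000 out1.wav` would be the way to try recording from the second stream.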
Amusingly, the card cannot decide whether it should call itself "Fast Track" or "FastTrack" Pro.
This device weighs 20.4 oz (578 g).
Since the card supports S16_LE, capturing audio from it requires no extra
arguments to ffmpeg:
ffmpeg -f alsa -i hw:2 out.wav
... or the usual full set of arguments to arecord:
arecord -D hw:2 -f S16_LE -c 2 -r 48000 out.wav