Extract Audio from AVI to DSS — Free Online Tool
Extract and convert audio from AVI video files into DSS format, the proprietary Digital Speech Standard used by Olympus, Philips, and Grundig dictation devices. The conversion strips the video stream entirely and re-encodes the audio using the ADPCM IMA OKI codec optimized for low-bitrate speech recordings.
FFmpeg Command
To run the same conversion locally with FFmpeg on your desktop, use: ffmpeg -i input.avi -vn -c:a adpcm_ima_oki output.dss
How It Works
During this conversion, FFmpeg discards the AVI file's video stream completely, whether it was encoded with H.264, MJPEG, or PNG, and processes only the audio track. The audio inside the AVI container, typically MP3 or AAC, is decoded and then re-encoded from scratch with ADPCM IMA OKI, the codec this tool uses for DSS output. ADPCM IMA OKI is an Adaptive Differential Pulse-Code Modulation variant tuned for narrow-bandwidth speech, operating at a fixed sample rate of 8000 Hz on a single mono channel. Any stereo audio in the source AVI is therefore downmixed to mono, and because an 8000 Hz sample rate can represent frequencies only up to 4 kHz, content above the speech range (music, ambient sound, high-frequency detail) is lost or heavily degraded. The resulting DSS file is compact and intended for playback on dedicated digital dictation hardware or in transcription software.
What Each Flag Does
| Flag | What it does |
|---|---|
| ffmpeg | Invokes the FFmpeg tool. In the browser-based version of this tool, FFmpeg runs locally via WebAssembly (FFmpeg.wasm), so your AVI file never leaves your device. |
| -i input.avi | Specifies the input AVI file. FFmpeg reads the interleaved audio and video streams from the AVI container, which may contain H.264, MJPEG, or PNG video alongside MP3 or AAC audio. |
| -vn | Disables video output entirely ("no video"). This is essential here because DSS is a pure audio format and cannot store a video stream; the flag ensures FFmpeg does not try to encode or pass through the AVI's video data. |
| -c:a adpcm_ima_oki | Sets the audio codec to ADPCM IMA OKI, the codec this conversion uses for DSS output. The decoded AVI audio (whether it was MP3 or AAC) is re-encoded into the fixed-rate, mono, 8 kHz ADPCM stream that DSS dictation devices and transcription software expect. |
| output.dss | Specifies the output filename. The .dss extension tells FFmpeg to use its DSS muxer, so the resulting file follows the Digital Speech Standard container structure required by Olympus, Philips, and Grundig dictation hardware and software. |
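The flags in the table can also be assembled programmatically. The following is a minimal Python sketch, not part of the original tool: the function names `build_command` and `convert` are hypothetical, and it assumes that an `ffmpeg` binary on your PATH was built with the adpcm_ima_oki encoder and DSS output support, which is worth confirming for your build.

```python
import shutil
import subprocess
from typing import List


def build_command(src: str, dst: str) -> List[str]:
    """Return the argv list for the AVI-to-DSS command from the flag table."""
    return [
        "ffmpeg",
        "-i", src,                 # input AVI file
        "-vn",                     # drop the video stream
        "-c:a", "adpcm_ima_oki",   # audio codec named in the table
        dst,                       # output path; the .dss extension selects the format
    ]


def convert(src: str, dst: str) -> None:
    """Run the conversion (hypothetical helper).

    Raises FileNotFoundError if ffmpeg is not on PATH, and
    subprocess.CalledProcessError if ffmpeg exits with an error,
    for example because the build lacks the encoder or muxer.
    """
    if shutil.which("ffmpeg") is None:
        raise FileNotFoundError("ffmpeg not found on PATH")
    subprocess.run(build_command(src, dst), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with filenames that contain spaces.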
Common Use Cases
- Transferring a voice memo or spoken dictation that was accidentally recorded as an AVI video on a webcam or screen-capture tool into a DSS file compatible with Olympus or Philips transcription foot pedals and software.
- Archiving recorded interview audio from legacy AVI camcorder footage into the DSS format required by a legal or medical transcription workflow that mandates DSS input files.
- Preparing a spoken narration or verbal notes video clip for import into Olympus DSS Player or Philips SpeechExec, which do not accept AVI input directly.
- Stripping the video from an AVI conference recording to produce a compact DSS audio file for a transcriptionist who works exclusively with dictation-format files.
- Converting AVI field recordings of spoken content — such as oral history interviews — into DSS for long-term storage in dictation archive systems where DSS is the mandated format.
- Extracting a voice track from an AVI training video to feed into a digital dictation review pipeline that accepts only DSS or DS2 files.
Frequently Asked Questions
Will the DSS output sound good if my AVI contains music or other non-speech audio?
Almost certainly not. DSS was engineered for human speech and operates at 8000 Hz mono with the ADPCM IMA OKI codec, so it cannot represent frequencies above 4 kHz. Music, ambient sound, or any audio with stereo width or high-frequency content will sound severely degraded: muffled, flat, and mono. DSS is a suitable output format only if your AVI contains a spoken-word track such as a dictation, interview, or narration.
Can DSS preserve stereo audio from my AVI?
No. DSS is inherently a mono format and supports only a single audio channel. If your AVI file contains stereo or multi-channel audio, FFmpeg downmixes it to mono during encoding. For a voice dictation this rarely matters, but any stereo spatial information or dual-channel content (such as a two-speaker interview recorded on separate channels) is merged into a single mono stream.
Can I adjust the audio quality or bitrate of the DSS output?
No. The DSS format with the ADPCM IMA OKI codec does not expose variable bitrate or quality parameters in FFmpeg; the codec operates at a fixed bitrate and a fixed sample rate of 8000 Hz. Unlike converting to MP3 or AAC, where you can set -b:a or -q:a, there are no quality flags to add to this command. The output quality is determined entirely by the codec's fixed specification.
How do I convert multiple AVI files to DSS at once?
Use a shell loop. On Linux or macOS, run: for f in *.avi; do ffmpeg -i "$f" -vn -c:a adpcm_ima_oki "${f%.avi}.dss"; done. On Windows Command Prompt, use: for %f in (*.avi) do ffmpeg -i "%f" -vn -c:a adpcm_ima_oki "%~nf.dss". Inside a .bat file, double the percent signs (%%f and %%~nf). Either loop applies the same conversion, stripping video and encoding audio with ADPCM IMA OKI, to every AVI file in the current directory.
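For a cross-platform alternative to the shell loops, here is a rough Python sketch of the same batch. The helper names `dss_name`, `plan_batch`, and `convert_all` are hypothetical, and the sketch assumes `ffmpeg` is on your PATH and that your build accepts the command shown above.

```python
import subprocess
from pathlib import Path
from typing import List, Tuple


def dss_name(src: Path) -> Path:
    """Map an input path to an output path with the same stem and a .dss suffix."""
    return src.with_suffix(".dss")


def plan_batch(directory: str) -> List[Tuple[Path, Path]]:
    """Pair every .avi file in `directory` with its planned .dss output path."""
    return [(src, dss_name(src)) for src in sorted(Path(directory).glob("*.avi"))]


def convert_all(directory: str) -> List[Path]:
    """Convert each planned pair and return the sources whose conversion failed."""
    failed = []
    for src, dst in plan_batch(directory):
        cmd = ["ffmpeg", "-i", str(src), "-vn", "-c:a", "adpcm_ima_oki", str(dst)]
        # check=False so that one bad file does not stop the rest of the batch.
        result = subprocess.run(cmd, check=False)
        if result.returncode != 0:
            failed.append(src)
    return failed
```

Calling `convert_all(".")` processes the current directory. Unlike the shell loops, it returns the list of inputs that FFmpeg rejected, which makes it easier to retry or inspect only the failures.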
Which audio track is used if my AVI has more than one?
When no explicit stream mapping is given, FFmpeg picks one audio stream automatically: the one with the most channels, or the first one if several tie. To choose a specific track, add -map 0:a:N before the output filename, where N is the zero-based position among the audio streams. For example, to use the second audio track: ffmpeg -i input.avi -vn -map 0:a:1 -c:a adpcm_ima_oki output.dss. AVI can hold multiple audio tracks, so this is worth checking if your source file has more than one.
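To find out which audio tracks a file actually contains before choosing a -map value, ffprobe (shipped with FFmpeg) can describe them as JSON. The sketch below is an illustration under assumptions rather than part of the tool: `FFPROBE_ARGS` and `list_audio_streams` are hypothetical names, and the parser expects the JSON layout that ffprobe produces for `-show_entries stream=index,codec_name,channels -of json`.

```python
import json
from typing import Dict, List

# Arguments that ask ffprobe for the audio streams only, as JSON.
# Append the input path, e.g. FFPROBE_ARGS + ["input.avi"].
FFPROBE_ARGS = [
    "ffprobe", "-v", "error",
    "-select_streams", "a",
    "-show_entries", "stream=index,codec_name,channels",
    "-of", "json",
]


def list_audio_streams(ffprobe_json: str) -> List[Dict]:
    """Return one dict per audio stream, including the matching -map specifier."""
    streams = json.loads(ffprobe_json).get("streams", [])
    result = []
    for position, stream in enumerate(streams):
        result.append({
            "map": f"0:a:{position}",       # zero-based position among audio streams
            "index": stream.get("index"),   # absolute stream index within the file
            "codec": stream.get("codec_name"),
            "channels": stream.get("channels"),
        })
    return result
```

One way to use it is to capture the output of `subprocess.run(FFPROBE_ARGS + ["input.avi"], capture_output=True, text=True).stdout` and pass that string to `list_audio_streams`. The `map` value of the track you want then goes after -map in the conversion command.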
Is metadata from the AVI carried over to the DSS file?
No. DSS is a tightly constrained proprietary format with no general-purpose metadata container comparable to ID3 tags or Matroska tags. This conversion does not carry AVI metadata fields into the DSS output, so any title, author, date, or comment tags in the source AVI are dropped silently. If metadata matters, record the original AVI file's metadata separately before converting.
Technical Notes
DSS (Digital Speech Standard) is a proprietary format jointly developed by Olympus, Philips, and Grundig for portable digital dictation devices, and its technical constraints reflect that narrow purpose. The codec this conversion uses for DSS output is adpcm_ima_oki, an ADPCM variant derived from the OKI MSM6585 chip used in early digital speech hardware. It encodes at a fixed 8000 Hz sample rate in mono, making it unsuitable for audio content other than spoken voice.

The AVI source format introduces its own complexity: AVI interleaves audio and video and relies on legacy index structures, and the audio in many AVI files is MP3. FFmpeg must fully decode the MP3 or AAC audio from the AVI container before re-encoding it to ADPCM IMA OKI. This is a full transcode, not a remux, and it stacks two lossy codec generations (the original AVI audio encoding plus the DSS encoding), which further reduces audio fidelity.

The resulting DSS files are small: at 8000 Hz mono and 4 bits per sample, the audio stream comes to roughly 240 KB per minute. The format's proprietary structure means playback requires Olympus DSS Player, Philips SpeechExec, or a compatible transcription application; most general-purpose media players do not support DSS natively. DSS support in FFmpeg is community-maintained and can differ between builds, so run ffmpeg -encoders and ffmpeg -muxers to confirm that your build lists adpcm_ima_oki and a DSS muxer. Edge cases in the DSS header structure may also affect compatibility with specific firmware versions of dictation hardware.
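As a back-of-envelope check on the size and bandwidth figures, the arithmetic can be sketched in Python. The 8000 Hz mono parameters come from the text; the 4 bits per sample is an assumption based on how OKI-style ADPCM usually stores samples, and the estimate ignores container headers.

```python
SAMPLE_RATE_HZ = 8000   # fixed rate stated in the text
CHANNELS = 1            # mono, as stated in the text
BITS_PER_SAMPLE = 4     # assumption: 4-bit ADPCM codes


def nyquist_hz(sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Highest frequency a given sample rate can represent."""
    return sample_rate_hz / 2


def audio_bytes(duration_s: int,
                sample_rate_hz: int = SAMPLE_RATE_HZ,
                channels: int = CHANNELS,
                bits_per_sample: int = BITS_PER_SAMPLE) -> int:
    """Raw audio payload size in bytes, excluding any container overhead."""
    bits = duration_s * sample_rate_hz * channels * bits_per_sample
    return bits // 8


print(nyquist_hz())                     # 4000.0
print(audio_bytes(60))                  # 240000
print(audio_bytes(60, 44100, 2, 16))    # 10584000 (CD-quality PCM, for comparison)
```

Under these assumptions one minute of audio takes about 240 KB, around 2% of the size of the same minute as CD-quality PCM, and nothing above 4 kHz survives the 8000 Hz sample rate.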