Extract Audio from MP4 to DSS — Free Online Tool
Extract audio from an MP4 video and convert it to DSS (Digital Speech Standard), the proprietary compressed audio format used by Olympus, Philips, and Grundig digital dictation devices. The audio is encoded with the ADPCM IMA OKI codec. This tool is intended for preparing speech recordings for professional dictation hardware and transcription workflows.
FFmpeg Command
To run the same conversion locally with FFmpeg on your desktop, use this command: ffmpeg -i input.mp4 -vn -c:a adpcm_ima_oki output.dss
How It Works
During this conversion, FFmpeg discards the MP4's video stream entirely and re-encodes the audio stream using the ADPCM IMA OKI codec, which is the sole audio codec supported by the DSS container. DSS was designed specifically for low-bitrate speech capture on digital dictation recorders, so the codec applies aggressive compression optimized for the narrow frequency range of human voice. Unlike a simple remux, this is a full audio transcode — the source audio (typically AAC or MP3 inside the MP4) is decoded and then re-encoded from scratch into ADPCM IMA OKI. No video data is written to the output file, and DSS does not support video, chapters, subtitles, or multiple audio tracks.
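Because the source audio is decoded and re-encoded, it can help to check what the MP4 actually contains before converting. The following is a minimal sketch using ffprobe, which ships with FFmpeg; the helper name audio_summary is hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Print codec, sample rate, and channel count of the first audio stream.
# audio_summary is an illustrative helper name, not part of FFmpeg.
audio_summary() {
    ffprobe -v error -select_streams a:0 \
        -show_entries stream=codec_name,sample_rate,channels \
        -of default=noprint_wrappers=1 "$1"
}
```

Calling audio_summary input.mp4 prints key=value lines (for example a codec_name entry) describing the stream that will be transcoded. If it prints nothing, the file most likely has no audio stream to extract.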
What Each Flag Does
| Flag | What it does |
|---|---|
| ffmpeg | Invokes the FFmpeg program, which handles all decoding, filtering, and encoding operations. In this conversion it will decode the MP4 container, strip the video, and re-encode the audio into DSS format. |
| -i input.mp4 | Specifies the input file: an MP4 container that typically carries a video stream (e.g., H.264) and an audio stream (e.g., AAC). FFmpeg will parse both streams from this container, but only the audio will be used for the DSS output. |
| -vn | Disables video output entirely. Since DSS is an audio-only format with no video support, this flag ensures FFmpeg does not attempt to write any video data and avoids errors from the DSS muxer encountering a video stream. |
| -c:a adpcm_ima_oki | Selects the ADPCM IMA OKI encoder for the audio stream, the only codec the DSS container supports. FFmpeg will decode the source MP4 audio (typically AAC) and re-encode it using this codec at the fixed 8 kHz mono specification required by the DSS format. |
| output.dss | Defines the output filename and, through the .dss extension, tells FFmpeg to use the DSS muxer for writing the container. The resulting file will be a valid DSS audio file suitable for use with Olympus, Philips, or Grundig dictation hardware and compatible transcription software. |
Common Use Cases
- Transferring a speech recording or voice memo captured as an MP4 (e.g., from a smartphone) into a DSS file for playback or archival on an Olympus or Philips digital dictation device
- Preparing recorded interviews or dictated notes from a video source for ingestion into professional transcription software that requires DSS input, such as Philips SpeechExec
- Converting a recorded lecture or meeting saved as MP4 into a compact DSS file optimized for speech, reducing file size for storage on dictation hardware with limited memory
- Supplying DSS deliverables to a legacy voice workflow in which a client or legal professional requires DSS but the source recordings were made on a camera or screen recorder that produces MP4 output
- Stripping the video from an MP4 interview recording to produce a DSS audio file for a medical or legal transcriptionist whose workstation software only accepts DSS format
Frequently Asked Questions
Yes, converting to DSS reduces audio quality. DSS is a lossy, speech-optimized format using ADPCM IMA OKI compression, which is far more aggressive than the AAC or MP3 audio typically found inside an MP4. The format was engineered for intelligible voice reproduction at very low bitrates, not for music or broadband audio. If your MP4 contains music, sound effects, or high-fidelity audio, those will sound noticeably degraded in the DSS output. For spoken-word content the result is generally acceptable, matching the quality expected from a physical dictation recorder.
ADPCM IMA OKI is the only codec offered because the DSS container is a proprietary, closed format jointly developed by Olympus, Philips, and Grundig specifically for their dictation hardware ecosystem. Its specification mandates ADPCM IMA OKI as the audio codec, with no provision for alternative codecs. General-purpose containers such as MP4 or MKV, by contrast, support a range of codecs. FFmpeg's DSS muxer reflects this constraint, so adpcm_ima_oki is the only valid codec choice.
No, the bitrate and quality of the DSS output cannot be adjusted. Unlike the AAC or MP3 audio in the source MP4, DSS has no user-configurable quality or bitrate parameter exposed through FFmpeg. The ADPCM IMA OKI codec operates at a fixed rate determined by the format specification itself, which is why the FFmpeg command contains no -b:a or -q:a flag. The resulting file size and audio quality are fixed properties of the DSS format, not something you can tune per conversion.
The video, subtitles, and chapters in the MP4 are all discarded. The -vn flag explicitly instructs FFmpeg to drop the video stream, and since DSS supports only a single mono audio track with no subtitle, chapter, or metadata extensions, none of that information can be carried into the output. If your MP4 contains embedded subtitles or chapter markers you want to preserve, extract them separately before running this conversion.
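One way to keep subtitles and chapter markers is to write them to sidecar files first. The sketch below assumes the MP4 has at least one subtitle stream; the helper names extract_subtitles and extract_chapters are hypothetical, while -map, -f ffmetadata, and -nostdin are standard FFmpeg options.

```shell
#!/bin/sh
# Write the first subtitle stream of $1 to $2.
# The output format follows the extension of $2 (for example .srt).
extract_subtitles() {
    ffmpeg -nostdin -i "$1" -map 0:s:0 "$2"
}

# Write global metadata and chapter markers of $1 to $2
# in FFmpeg's plain-text ffmetadata format.
extract_chapters() {
    ffmpeg -nostdin -i "$1" -f ffmetadata "$2"
}
```

For example, extract_subtitles input.mp4 input.srt followed by extract_chapters input.mp4 chapters.txt leaves both sidecar files next to the source. The first call fails if the MP4 has no subtitle stream.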
To convert many files at once, run the command in a loop from your terminal. On Linux or macOS: for f in *.mp4; do ffmpeg -i "$f" -vn -c:a adpcm_ima_oki "${f%.mp4}.dss"; done. On Windows Command Prompt: for %f in (*.mp4) do ffmpeg -i "%f" -vn -c:a adpcm_ima_oki "%~nf.dss" (inside a .bat file, write %%f and %%~nf instead). Each file is processed sequentially, producing a matching .dss file. This is particularly useful when you have a folder of recorded interviews or dictations that all need to be prepared for a DSS-based transcription system.
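For larger folders, a slightly more defensive version of the Linux/macOS loop can skip files that were already converted and report failures. This is a sketch rather than a definitive script: the function names dss_name and convert_dir and the DRY_RUN variable are hypothetical, and the FFmpeg arguments are the same ones used in the command above, plus -nostdin so FFmpeg does not read from the terminal inside the loop.

```shell
#!/bin/sh
# Map an input path such as "memo.mp4" to the matching output "memo.dss".
dss_name() {
    printf '%s\n' "${1%.mp4}.dss"
}

# Convert every .mp4 in a directory (default: current directory).
# Set DRY_RUN=1 to print the commands instead of running them.
convert_dir() {
    dir="${1:-.}"
    for f in "$dir"/*.mp4; do
        [ -e "$f" ] || continue      # the glob matched nothing
        out=$(dss_name "$f")
        [ -e "$out" ] && continue    # skip files that already have a .dss
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'ffmpeg -nostdin -i %s -vn -c:a adpcm_ima_oki %s\n' "$f" "$out"
        else
            ffmpeg -nostdin -i "$f" -vn -c:a adpcm_ima_oki "$out" ||
                printf 'failed: %s\n' "$f" >&2
        fi
    done
}
```

Running DRY_RUN=1 convert_dir ./recordings lists what would be converted without touching any file. The dry-run output is for reading only, since it does not quote paths that contain spaces.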
Whether the converted file works with your dictation software or device depends on the specific software version and device firmware. The command is intended to produce a DSS file using the ADPCM IMA OKI codec per the format specification, which should be recognized by major DSS-compatible applications such as Olympus Dictation Management System (ODMS) and Philips SpeechExec. However, some hardware devices or older software versions may enforce additional proprietary header requirements beyond what FFmpeg writes. FFmpeg builds also differ in which formats they can write: FFmpeg is known to read DSS through its dss demuxer, so confirm that ffmpeg -muxers lists dss and ffmpeg -encoders lists adpcm_ima_oki on your system. Test playback with your specific device or application before relying on batch-converted files in a production workflow.
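Part of that testing can be automated by asking the local FFmpeg build what it supports before converting anything. The sketch below relies on the standard -encoders, -decoders, -muxers, and -demuxers listing options; the helper name ffmpeg_has is hypothetical, and it assumes each listing line consists of a flags column followed by the component name.

```shell
#!/bin/sh
# Succeed if the local FFmpeg build lists NAME in the given LIST.
# LIST is one of: encoders, decoders, muxers, demuxers.
ffmpeg_has() {
    list="$1"
    name="$2"
    # Field 1 is the flags column, field 2 is the component name.
    ffmpeg -hide_banner "-$list" 2>/dev/null |
        awk -v n="$name" '$2 == n { found = 1 } END { exit !found }'
}
```

For example, ffmpeg_has muxers dss && ffmpeg_has encoders adpcm_ima_oki || echo "this build cannot write DSS" reports a problem up front instead of failing partway through a batch.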
Technical Notes
DSS (Digital Speech Standard) is a highly specialized format with virtually no flexibility at the codec level. The ADPCM IMA OKI codec is hardcoded into the container specification and operates at a fixed sample rate of 8000 Hz in mono. An 8000 Hz sample rate can represent frequencies only up to 4000 Hz, which deliberately confines the signal to the voice band (roughly 300–3400 Hz). Any stereo audio in the source MP4 is downmixed to mono during the transcode.
The source MP4's AAC or MP3 audio is fully decoded and then re-encoded, so this is not a lossless extraction: each transcoding generation adds lossy compression artifacts. There is no mechanism to embed ID3-style metadata (artist, title, album) in a DSS file through FFmpeg, so tags present in the MP4 will not appear in the output.
File sizes are typically very small because of the low fixed bitrate, which makes DSS efficient for storing large volumes of voice recordings. Because DSS is a proprietary format not designed for general media consumption, playback support outside dedicated dictation software is limited, and most consumer media players do not support it natively.
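The effect of a fixed bitrate on file size can be worked out with simple arithmetic. The sketch below takes the 8000 Hz mono figure stated above and assumes 4 bits per sample, the usual sample width for OKI/Dialogic-style ADPCM; container headers are ignored, so treat the result as a rough lower bound rather than an exact size. The function name dss_payload_bytes is hypothetical.

```shell
#!/bin/sh
# Estimate the audio payload size in bytes for a recording of N seconds.
# Assumptions: 8000 samples/s, mono, 4 bits per sample; headers are ignored.
dss_payload_bytes() {
    seconds="$1"
    echo $(( seconds * 8000 * 4 / 8 ))
}
```

Under these assumptions the payload grows by 4000 bytes per second, so dss_payload_bytes 600 gives 2400000 bytes, or about 2.4 MB for a ten-minute dictation.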