Extract Audio from MKV to DSS — Free Online Tool
Extract audio from MKV video files and convert it to DSS (Digital Speech Standard) format using the ADPCM IMA OKI codec — the same compressed audio encoding used in Olympus, Philips, and Grundig digital dictation devices. This tool runs entirely in your browser with no file uploads required.
FFmpeg Command
Copy this command to run the same conversion locally with FFmpeg on your desktop:
ffmpeg -i input.mkv -vn -c:a adpcm_ima_oki output.dss
How It Works
MKV files commonly carry audio encoded as AAC, MP3, Opus, Vorbis, or FLAC. During this conversion, FFmpeg strips the video stream entirely and re-encodes the audio track using the ADPCM IMA OKI codec, which is the proprietary compression algorithm at the heart of the DSS format. ADPCM IMA OKI is a form of Adaptive Differential Pulse-Code Modulation optimized for speech at very low bitrates and a fixed sample rate of 8000 Hz. This means any audio in the MKV — regardless of its original sample rate or channel configuration — will be downsampled to 8 kHz mono, which is sufficient for voice intelligibility but will discard stereo separation and high-frequency content entirely. The result is a compact DSS file designed to be played back on digital dictation hardware or compatible transcription software.
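As a rough back-of-the-envelope check on the "very low bitrate" point above, the sketch below estimates the raw audio payload of such a file. It assumes 4 bits per sample (the usual width for OKI/Dialogic-style ADPCM), 8000 Hz, mono, and no header overhead; the function name is made up for illustration and the figures are estimates, not measurements of real DSS files.

```shell
#!/bin/sh
# Hypothetical helper: estimate raw ADPCM payload size in bytes.
# Assumptions (not stated on this page): 4 bits per sample, no header overhead.
adpcm_payload_bytes() {
    seconds=$1
    sample_rate=8000      # Hz, fixed per the description above
    bits_per_sample=4     # assumed ADPCM sample width
    channels=1            # mono
    echo $(( seconds * sample_rate * bits_per_sample * channels / 8 ))
}

# One minute of speech:
adpcm_payload_bytes 60    # prints 240000
```

Under these assumptions the stream runs at 32 kbit/s, so an hour of dictation comes to roughly 14.4 MB of audio data.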
What Each Flag Does
| Flag | What it does |
|---|---|
| ffmpeg | Invokes the FFmpeg command-line tool, which handles the demuxing, decoding, re-encoding, and muxing required to convert the MKV audio stream into a DSS file. |
| -i input.mkv | Specifies the input Matroska file, which may contain video, one or more audio streams (AAC, MP3, Opus, Vorbis, or FLAC), subtitles, chapters, and metadata. FFmpeg parses all of these before selecting a single audio stream to encode. |
| -vn | Disables video output explicitly. DSS is a pure audio format with no video support, so the video stream has no place in the output. FFmpeg's automatic stream selection normally skips video for audio-only outputs anyway, but -vn makes the intent unambiguous. |
| -c:a adpcm_ima_oki | Instructs FFmpeg to encode the audio using the ADPCM IMA OKI codec, the compression algorithm this page associates with the Digital Speech Standard format, optimized for low-bitrate mono speech at 8000 Hz as required by DSS dictation devices. |
| output.dss | Defines the output filename with the .dss extension, which tells FFmpeg to mux the encoded audio into the DSS container format used by Olympus, Philips, and Grundig digital dictation hardware and transcription software. |
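Codec and format support varies between FFmpeg builds, so it can be worth confirming that the local build actually lists the adpcm_ima_oki encoder before running the command. The sketch below is one way to do that; encoder_listed and convert_if_supported are hypothetical helper names, while -hide_banner and -encoders are standard FFmpeg options.

```shell
#!/bin/sh
# Reads an `ffmpeg -encoders` style listing on stdin and succeeds if the
# named encoder appears as a whole word.
encoder_listed() {
    grep -qw -- "$1"
}

# Hypothetical wrapper: only attempt the conversion if the local build
# lists the encoder; otherwise report the problem and return non-zero.
convert_if_supported() {
    if ffmpeg -hide_banner -encoders 2>/dev/null | encoder_listed adpcm_ima_oki; then
        ffmpeg -i "$1" -vn -c:a adpcm_ima_oki "$2"
    else
        echo "adpcm_ima_oki encoder not listed by this FFmpeg build" >&2
        return 1
    fi
}
```

Usage would look like convert_if_supported input.mkv output.dss. A similar check against the output of ffmpeg -hide_banner -muxers can show whether a given build is able to write .dss files at all.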
Common Use Cases
- Converting a recorded MKV interview or meeting video into DSS format for import into professional transcription software like Philips SpeechExec or Olympus DSS Player
- Preparing voice recordings captured as MKV screencasts or lecture videos for playback on a digital dictation device that only accepts DSS input
- Archiving spoken-word MKV content — such as oral history recordings or dictated notes — into the DSS format used by legal or medical transcription workflows
- Stripping video from an MKV dictation session recorded via webcam and converting the audio to DSS for distribution to a remote typist using dictation hardware
- Reducing file size of a long MKV voice memo or conference recording to the minimal DSS footprint for storage on older dictation device memory cards
Frequently Asked Questions
Will the audio quality drop when I convert MKV to DSS?
Yes, and this is expected by design. The DSS format uses ADPCM IMA OKI encoding at a fixed 8000 Hz sample rate in mono, which is far below CD-quality audio. Your MKV source may have audio at 44.1 kHz or 48 kHz stereo; everything above roughly 4 kHz, along with the stereo image, is lost in the conversion. DSS was engineered specifically for speech intelligibility on dictation devices, not music or high-fidelity audio. If your MKV contains spoken voice, the result will still be clearly understandable; if it contains music or rich soundscapes, the quality degradation will be severe.
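To hear roughly what the 8 kHz mono ceiling does to a particular recording before committing to DSS, one option is to render a plain WAV preview at the same sample rate and channel count. This is only a sketch: the function names and the .8k-preview.wav suffix are made up for illustration, and the preview uses uncompressed PCM, so it reflects the bandwidth and stereo loss but not any ADPCM artifacts.

```shell
#!/bin/sh
# Hypothetical naming helper: talk.mkv -> talk.8k-preview.wav
preview_path() {
    printf '%s.8k-preview.wav' "${1%.*}"
}

# Render an 8000 Hz mono 16-bit PCM preview of the MKV's audio.
# -ar sets the output sample rate, -ac the output channel count.
make_preview() {
    ffmpeg -nostdin -i "$1" -vn -ar 8000 -ac 1 -c:a pcm_s16le \
           "$(preview_path "$1")"
}
```

Running make_preview interview.mkv would write interview.8k-preview.wav next to the source file.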
Will the converted DSS file open in Olympus DSS Player or Philips SpeechExec?
Compatibility depends on the specific version of the dictation software and whether it validates DSS file headers strictly. FFmpeg produces DSS files using the ADPCM IMA OKI codec, which is the correct underlying codec for DSS, but some proprietary dictation applications like Olympus DSS Player or Philips SpeechExec may expect additional metadata or header fields specific to their hardware-generated files. Basic DSS-compatible players and transcription tools will generally open these files without issue, but hardware-specific features found in device-generated DSS files, such as voice-activation markers or priority flags, will be absent.
Does the audio codec inside the MKV affect the DSS output quality?
The source codec in the MKV, whether AAC, FLAC, Opus, or MP3, does not meaningfully affect the final DSS output quality, because the output is constrained entirely by the DSS format's own ceiling of 8 kHz mono speech-optimized audio. A lossless FLAC source and a 128k MP3 source will produce nearly identical DSS output, since both are downsampled and re-encoded to the same narrow specification. The only practical difference is that a very low-quality source (such as a heavily compressed MP3) may already contain audible artifacts, which carry through into the DSS re-encode.
What happens to subtitles, chapters, and extra audio tracks in the MKV?
All of them are discarded. The DSS format supports only a single mono audio stream: it has no container-level support for subtitles, chapters, metadata tags, or secondary audio tracks. FFmpeg's -vn flag drops the video stream, and FFmpeg's automatic stream selection picks a single audio track for encoding (the one with the most channels, or the first track if several tie). If your MKV has multiple audio tracks, for example a commentary track alongside the main audio, and you need a specific one, modify the FFmpeg command to select it explicitly with the -map flag.
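As an illustration of the -map approach mentioned above, the sketch below selects an audio track by position. FFmpeg counts audio streams from zero (0:a:0 is the first audio stream of the first input), so the helper converts a human-friendly 1-based track number; both function names are hypothetical.

```shell
#!/bin/sh
# Convert a 1-based audio track number (as a person would count them)
# into FFmpeg's zero-based stream specifier for the first input.
audio_map_spec() {
    printf '0:a:%d' "$(( $1 - 1 ))"
}

# Hypothetical wrapper: extract the Nth audio track of an MKV.
extract_audio_track() {
    input=$1; track=$2; output=$3
    ffmpeg -i "$input" -map "$(audio_map_spec "$track")" -vn \
           -c:a adpcm_ima_oki "$output"
}

# To see which track is which, list the audio streams first:
#   ffprobe -v error -select_streams a \
#           -show_entries stream=index,codec_name:stream_tags=language,title \
#           -of csv=p=0 input.mkv
```

For example, extract_audio_track input.mkv 2 commentary.dss would pass -map 0:a:1 and encode the second audio track.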
Can I adjust the bitrate or quality of the DSS output?
No. The DSS format as implemented through the ADPCM IMA OKI codec in FFmpeg does not expose adjustable bitrate or quality parameters. Unlike converting to MP3 or AAC, where you can set -b:a to a specific bitrate, DSS output is determined entirely by the fixed codec specification: 8000 Hz sample rate, mono channel, ADPCM IMA OKI encoding. There are no flags you can add to the command to increase or decrease the output quality, because the format does not support it.
How do I convert multiple MKV files to DSS at once?
On Linux or macOS, you can use a shell loop: for f in *.mkv; do ffmpeg -i "$f" -vn -c:a adpcm_ima_oki "${f%.mkv}.dss"; done. On Windows Command Prompt, use: for %f in (*.mkv) do ffmpeg -i "%f" -vn -c:a adpcm_ima_oki "%~nf.dss" (inside a .bat file, double the percent signs: %%f and %%~nf). Running FFmpeg locally is also the way to process MKV recordings larger than 1 GB, which exceed the browser tool's file size limit. Make sure FFmpeg is installed and available in your system PATH before running these commands.
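For larger collections, a slightly more defensive variant of the Linux/macOS loop can help. The sketch below, with hypothetical function names, skips files that already have a .dss output, copes with a directory that contains no .mkv files, and reports failures without stopping the batch.

```shell
#!/bin/sh
# Derive the output name by swapping the final extension for .dss.
dss_name() {
    printf '%s.dss' "${1%.*}"
}

# Hypothetical batch helper: convert every .mkv in a directory
# (default: the current one).
convert_dir() {
    dir=${1:-.}
    for f in "$dir"/*.mkv; do
        [ -e "$f" ] || continue          # glob matched nothing
        out=$(dss_name "$f")
        [ -e "$out" ] && continue        # already converted, skip
        # -nostdin stops ffmpeg from reading the terminal during the loop
        ffmpeg -nostdin -i "$f" -vn -c:a adpcm_ima_oki "$out" \
            || echo "failed: $f" >&2
    done
}
```

Calling convert_dir ~/recordings would process that folder; calling it with no argument processes the current directory.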
Technical Notes
DSS (Digital Speech Standard) is a proprietary format co-developed by Olympus, Philips, and Grundig specifically for portable digital dictation. It is not a general-purpose audio format: its sole design goal is compact storage of intelligible speech. The ADPCM IMA OKI codec used in DSS files encodes audio at an 8000 Hz sample rate in mono, resulting in extremely small file sizes but very limited frequency response (at most 4000 Hz, half the sample rate, per the Nyquist limit). This makes it unsuitable for music, sound effects, or any content requiring stereo or wideband audio. MKV, by contrast, is a highly capable open container that can hold virtually any codec combination with full metadata, chapters, and multiple tracks, which puts it at the opposite end of the flexibility spectrum from DSS.
During conversion, FFmpeg must perform a full re-encode (not a remux) because the audio codecs typically found in MKV files are not ADPCM IMA OKI. The -vn flag keeps the video stream, which DSS cannot contain, out of the output. No metadata from the MKV, including title, artist, or chapter markers, will survive in the DSS output, as the format's header structure does not accommodate general-purpose metadata fields.