Extract Audio from HEVC to DSS — Free Online Tool

Extract audio from HEVC/H.265 video files and convert it to DSS (Digital Speech Standard), a format purpose-built for digital dictation devices from Olympus, Philips, and Grundig. The conversion encodes the audio with FFmpeg's adpcm_ima_oki encoder. This tool is ideal when you need to repurpose speech recorded in H.265 video for playback on professional dictation hardware or for use in transcription workflows.

FFmpeg Command

Copy this command to run the same conversion locally with FFmpeg on your desktop:

ffmpeg -i input.hevc -vn -c:a adpcm_ima_oki output.dss

Free — no uploads, no signups. Your files never leave your browser.

How It Works

During this conversion, FFmpeg discards the H.265 video stream and keeps only the audio track from the source file. HEVC is a video codec rather than a container, so any audio sits alongside the video in whatever wrapper holds it (for example MP4, MKV, or MOV); a raw .hevc elementary stream carries no audio at all. The audio is then re-encoded with the ADPCM IMA OKI codec, which this tool uses for DSS output. ADPCM IMA OKI is a lossy, low-bitrate codec aimed at voice: it preserves speech intelligibility while discarding high-frequency content and musical detail. Because the source audio may be in any codec, or absent altogether, re-encoding to ADPCM IMA OKI is always required; a simple stream copy is not an option.
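Since the source may carry no audio at all, it can help to check for an audio stream before converting. The following is a minimal sketch, not part of the tool itself: has_audio is a hypothetical helper name, and it assumes ffprobe (shipped with most FFmpeg builds) is on your PATH.

```shell
# has_audio FILE
# Succeeds (exit status 0) when ffprobe lists at least one audio stream.
# "has_audio" is a hypothetical helper name used only for illustration.
has_audio() {
  local streams
  # Print one codec name per audio stream; the output is empty if there are none
  # or if ffprobe cannot read the file.
  streams=$(ffprobe -v error -select_streams a \
    -show_entries stream=codec_name -of csv=p=0 "$1")
  [ -n "$streams" ]
}
```

Usage, with a placeholder filename: 'has_audio input.mp4 && echo "audio found"'.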

What Each Flag Does

Flag | What it does
ffmpeg | Invokes the FFmpeg tool, which handles all demuxing, decoding, encoding, and muxing in this conversion pipeline.
-i input.hevc | Specifies the input file. FFmpeg parses the H.265 video stream and any accompanying audio stream; both are read, but only the audio is used in the output.
-vn | Stands for 'video none'. It tells FFmpeg to exclude the H.265 video stream from the output. Since DSS is an audio-only format, this prevents FFmpeg from attempting to write video into a container that cannot hold it.
-c:a adpcm_ima_oki | Instructs FFmpeg to encode the extracted audio with the ADPCM IMA OKI codec, a low-bitrate codec designed for voice recording, which this tool uses for DSS output aimed at Olympus, Philips, and Grundig dictation systems.
output.dss | Defines the output filename. FFmpeg picks the output container from the file extension, so .dss asks it to write the encoded audio as a DSS file for use with digital dictation devices and transcription software.
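FFmpeg builds differ in which encoders and output formats they include, so it may be worth confirming that yours lists both pieces the command relies on before running it. The sketch below is an assumption-laden convenience, not something the tool provides: build_supports is a hypothetical helper, and it relies on the second column of 'ffmpeg -encoders' and 'ffmpeg -muxers' being the component name.

```shell
# build_supports KIND NAME
# KIND is "encoders" or "muxers"; NAME is the component to look for.
# Succeeds only if the local FFmpeg build lists NAME under that KIND.
# "build_supports" is a hypothetical helper name.
build_supports() {
  # The listing has a flags column first and the component name second.
  ffmpeg -hide_banner "-$1" 2>/dev/null | awk '{print $2}' | grep -qx "$2"
}
```

Usage: 'build_supports encoders adpcm_ima_oki && build_supports muxers dss || echo "this build cannot write the requested output"'. If either check fails, the conversion command will stop with an error instead of producing a file.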

Common Use Cases

  • Converting a dictation session accidentally recorded as an H.265 video on a smartphone into a DSS file compatible with Olympus or Philips dictation transcription software
  • Extracting spoken notes or voice memos from HEVC footage captured on a modern mirrorless camera for upload to a digital dictation management system
  • Repurposing interview recordings shot in H.265 4K video into compact DSS files for legal or medical transcription workflows that require DSS input
  • Archiving spoken-word content from HEVC conference recordings into DSS format for compatibility with legacy dictation playback devices
  • Reducing file size of speech-only content captured in high-efficiency HEVC video by converting to the ultra-low-bitrate DSS format for storage on dictation device memory cards

Frequently Asked Questions

Will converting to DSS reduce audio quality?
Yes. The ADPCM IMA OKI codec used here is a very low-bitrate lossy format designed for voice intelligibility, not audio fidelity. Music, ambient sound, or complex audio in your source file will sound noticeably degraded or distorted in the DSS output. For speech and dictation content, intelligibility is generally preserved well, which is what DSS was engineered for.
Will stereo or surround audio be preserved?
No. DSS is a mono format, so stereo or multichannel audio from your source will be downmixed to a single channel during conversion. This matches the intended use of DSS for single-speaker dictation, where stereo separation provides no practical benefit and wastes storage on dictation devices.
Can I convert an HEVC file that has no audio track?
No. If the source contains no audio stream, FFmpeg stops with an error because there is nothing to extract and encode. This is always the case for a raw .hevc elementary stream, which holds video only; audio is present only when the H.265 video is wrapped in a container such as MP4, MKV, or MOV. You can check whether your file has audio by running 'ffmpeg -i input.hevc' and examining the stream list in the output.
How do I convert multiple files at once?
You can wrap the FFmpeg command in a shell loop. On Linux or macOS, use: 'for f in *.hevc; do ffmpeg -i "$f" -vn -c:a adpcm_ima_oki "${f%.hevc}.dss"; done'. On Windows Command Prompt, use: 'for %f in (*.hevc) do ffmpeg -i "%f" -vn -c:a adpcm_ima_oki "%~nf.dss"'. Inside a .bat or .cmd file, double the percent signs (%%f and %%~nf). Each loop processes every .hevc file in the current directory and writes a DSS file with the same base name.
Why is the DSS file so much smaller than the original?
Two factors combine to make the DSS file dramatically smaller. First, the entire video stream is discarded, which is often 90–99% of the source file's data. Second, ADPCM IMA OKI is a low-bitrate, speech-oriented codec, so the audio itself is typically smaller than AAC or MP3 encoded at their usual music-oriented bitrates. A long H.265 video recording might yield a DSS file that is only a fraction of a percent of the original size.
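The audio side of that size reduction can be estimated with simple arithmetic. This is a rough sketch under stated assumptions: 4 bits per sample (the usual figure for OKI/Dialogic-style ADPCM), mono, the 8000 Hz sample rate mentioned in the Technical Notes, and no allowance for container headers. estimate_dss_bytes is a hypothetical helper name.

```shell
# estimate_dss_bytes SECONDS [SAMPLE_RATE]
# Approximates the encoded audio payload in bytes.
# Assumptions: mono, 4 bits per sample, 8000 Hz unless a rate is given,
# and no container overhead. The function name is hypothetical.
estimate_dss_bytes() {
  local seconds=$1 rate=${2:-8000} bits_per_sample=4
  echo $(( seconds * rate * bits_per_sample / 8 ))
}
```

Under these assumptions, 'estimate_dss_bytes 600' prints 2400000, so a 10-minute recording comes to roughly 2.4 MB of audio data.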
Which software and devices can open the resulting DSS files?
The DSS files are intended for compatibility with Olympus Dictation Management System (ODMS), Philips SpeechExec, and Grundig DigtaSoft, as well as hardware dictation devices from those manufacturers. However, compatibility can vary between DSS variants (DSS vs. DSS Pro), and some software may expect specific sampling rates or header metadata. Test the output in your target application before processing large batches.

Technical Notes

The DSS format exists in two variants, original DSS and DSS Pro; this conversion targets the original variant. The ADPCM IMA OKI codec operates at a low sample rate (typically 8000 Hz), so any higher-quality audio in the source, whether encoded as AAC, AC-3, Opus, or another codec, will be downsampled significantly. The codec has a fixed bitrate for a given sample rate, so -q:a and -b:a have no meaningful effect on the output.

Chapter markers, subtitles, and additional audio tracks present in the source are all discarded, because DSS supports none of these features. Video-side attributes such as HDR or high frame rates have no bearing on the result either: the conversion is determined entirely by whatever audio stream exists in the source file.
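If you would rather state the sample rate and channel count explicitly than rely on defaults, the command can be extended as sketched below. The values 8000 Hz and one channel come from the notes above; whether your FFmpeg build would otherwise choose them on its own is not confirmed here, so treat this as an assumption. -ar and -ac are standard FFmpeg audio options, and extract_dss_explicit is a hypothetical wrapper name.

```shell
# extract_dss_explicit INPUT OUTPUT
# Same conversion as the main command, with sample rate and channel count
# set explicitly. The wrapper name is hypothetical.
extract_dss_explicit() {
  # -ar 8000: resample to 8000 Hz; -ac 1: downmix to mono.
  ffmpeg -i "$1" -vn -ar 8000 -ac 1 -c:a adpcm_ima_oki "$2"
}
```

Usage, with placeholder filenames: 'extract_dss_explicit input.mp4 output.dss'.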
