Extract Audio from M4V to DSS — Free Online Tool

Extract audio from M4V video files and convert it to DSS (Digital Speech Standard) format using the ADPCM IMA OKI codec — optimized for digital dictation devices from Olympus, Philips, and Grundig. This tool strips the AAC or MP3 audio track from your iTunes-compatible M4V container and re-encodes it into the compressed, speech-focused DSS format entirely in your browser.

FFmpeg Command

Copy this command to run the same conversion locally with FFmpeg on your desktop:

ffmpeg -i input.m4v -vn -c:a adpcm_ima_oki output.dss

Free — no uploads, no signups. Your files never leave your browser.

How It Works

M4V files typically carry AAC audio (sometimes MP3) inside an MPEG-4 container alongside an H.264 or H.265 video stream. This conversion first discards the video stream entirely, then decodes the audio and re-encodes it with the ADPCM IMA OKI codec into a DSS container. ADPCM IMA OKI is a lossy adaptive differential pulse-code modulation codec built for low-bitrate speech recording: it operates at a fixed, narrow sample rate and, unlike AAC, has no configurable bitrate or quality parameters. The result is a highly compressed mono audio file sized for digital dictation hardware, not for music or general media playback.

What Each Flag Does

Flag | What it does
ffmpeg | Invokes the FFmpeg tool, which handles all demuxing, decoding, encoding, and muxing for this conversion. In the browser version, this runs via FFmpeg.wasm compiled to WebAssembly, so no data leaves your machine.
-i input.m4v | Specifies the input M4V file. FFmpeg reads the MPEG-4 container and identifies the video stream (typically H.264) and audio stream (typically AAC) inside the M4V wrapper.
-vn | Disables video output entirely, instructing FFmpeg to ignore the H.264 or H.265 video stream from the M4V. DSS is a pure audio format with no video support, so this flag makes the intent explicit and ensures no video data is processed or written.
-c:a adpcm_ima_oki | Selects the ADPCM IMA OKI audio encoder, the codec this conversion uses for the Digital Speech Standard (DSS) format. It re-encodes the decoded M4V audio (typically AAC) into the narrow-band, speech-optimized compressed format that DSS dictation devices and software expect.
output.dss | Defines the output filename and triggers FFmpeg to use the DSS container muxer. The .dss extension tells FFmpeg to wrap the ADPCM IMA OKI encoded audio in the DSS format, making it compatible with Olympus, Philips, and Grundig dictation software.
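
The flags above can be wrapped in a small shell function for local use. This is a minimal sketch, not the tool's own implementation: the function name m4v_to_dss is hypothetical, and it assumes your local FFmpeg build actually includes the adpcm_ima_oki encoder named on this page, which it checks before converting.

```shell
# Sketch: convert one M4V file to DSS with the flags described above.
# m4v_to_dss is a hypothetical helper name, not part of FFmpeg.
m4v_to_dss() {
    src="$1"
    dst="${2:-${src%.*}.dss}"   # default output: same name, .dss extension

    # "ffmpeg -encoders" lists the encoders compiled into this build.
    # Assumption: the encoder is listed under the name used on this page.
    if ! ffmpeg -hide_banner -encoders 2>/dev/null | grep -q 'adpcm_ima_oki'; then
        echo "adpcm_ima_oki encoder not available in this FFmpeg build" >&2
        return 1
    fi

    ffmpeg -i "$src" -vn -c:a adpcm_ima_oki "$dst"
}
```

Calling m4v_to_dss interview.m4v would write interview.dss next to the input. A similar check against the output of ffmpeg -muxers would confirm that the build can write the DSS container at all.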

Common Use Cases

  • Convert a recorded M4V video interview or lecture into DSS format for transcription using Olympus or Philips dictation software that only accepts DSS files.
  • Prepare spoken-word content recorded on an iPhone (saved as M4V via iTunes) for playback and review on a legacy digital dictation device that does not support AAC or MP4.
  • Archive a video deposition or legal proceeding's audio track in DSS format to integrate with a legal transcription workflow built around dictation hardware.
  • Strip the audio from an iTunes-downloaded M4V podcast video and deliver it in DSS format to a transcriptionist whose software requires the DSS or DS2 standard.
  • Extract a recorded voice memo or spoken narration embedded in an M4V screencast and convert it to DSS for review on a handheld Grundig digital dictaphone.
  • Batch-prepare audio tracks from a series of M4V training videos for upload into a DSS-based transcription management system used in medical or legal offices.
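
For the batch scenario in the last item, the same command can be run in a loop on the desktop. The sketch below is one possible approach: batch_m4v_to_dss is a hypothetical function name, and it assumes all inputs sit in one directory and use the .m4v extension.

```shell
# Sketch: convert every .m4v file in a directory to .dss.
# batch_m4v_to_dss is a hypothetical helper name.
batch_m4v_to_dss() {
    dir="${1:-.}"
    for f in "$dir"/*.m4v; do
        [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
        # -nostdin keeps ffmpeg from reading the terminal inside the loop;
        # -n refuses to overwrite an existing output file.
        ffmpeg -nostdin -n -i "$f" -vn -c:a adpcm_ima_oki "${f%.m4v}.dss" \
            || echo "failed: $f" >&2
    done
}
```

Running batch_m4v_to_dss ./training-videos leaves each .dss file beside its source video and reports any file that failed without stopping the rest of the batch.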

Frequently Asked Questions

Expect a significant reduction in audio quality. M4V files typically carry AAC audio at 128 kbps or higher with a full 44.1 kHz sample rate, while the DSS format using ADPCM IMA OKI is engineered for narrow-band speech at a much lower sample rate (typically 8 kHz). Music, sound effects, or any high-frequency content in the original M4V audio will be lost. DSS is purpose-built for the intelligibility of human speech in dictation contexts, not for fidelity to the original recording.
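
If you prefer to make the downmix and resampling explicit rather than rely on encoder defaults, the standard -ac (channel count) and -ar (sample rate) options can be added. This is a sketch under the assumption that 8 kHz mono is the target, as described on this page; m4v_to_dss_speech is a hypothetical function name.

```shell
# Sketch: force mono, 8 kHz audio before encoding to DSS.
# m4v_to_dss_speech is a hypothetical helper name.
m4v_to_dss_speech() {
    src="$1"
    dst="${2:-${src%.*}.dss}"
    # -ac 1 downmixes to one channel; -ar 8000 resamples to 8 kHz.
    ffmpeg -i "$src" -vn -ac 1 -ar 8000 -c:a adpcm_ima_oki "$dst"
}
```

Stating both values in the command documents the quality loss up front and keeps the result predictable across FFmpeg versions.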
Output quality cannot be adjusted. DSS with the ADPCM IMA OKI codec does not support configurable audio quality or bitrate parameters the way AAC in M4V does. The codec operates at a fixed compression rate determined by the format specification. Unlike converting M4V to MP3, where you can set -b:a to 192k or 320k, the DSS output quality is fixed and cannot be changed in the FFmpeg command.
All video data is discarded via the -vn flag and never written to the output. DSS does not support chapters, subtitles, or multiple audio tracks, all of which M4V can carry. Only one audio stream from the M4V is decoded and converted. If your M4V has multiple audio tracks (such as a director's commentary or a foreign-language dub), FFmpeg picks one automatically (by default, the audio stream with the most channels); to choose a different track, specify -map in the FFmpeg command.
DRM-protected M4V files purchased from the iTunes Store cannot be processed by FFmpeg or any browser-based tool. FairPlay DRM encryption prevents the audio stream from being decoded without Apple's authorization. This tool works only with DRM-free M4V files, such as those you created yourself, exported from iMovie or Final Cut Pro, or obtained from DRM-free sources.
Add a -map flag before the output filename to select a specific audio stream. For example, to select the second audio track (index 1), use: ffmpeg -i input.m4v -vn -map 0:a:1 -c:a adpcm_ima_oki output.dss. You can identify available audio streams by running ffmpeg -i input.m4v and reading the stream listing in the output. M4V files from iTunes or professional productions sometimes include multiple language tracks that are not selected by default.
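
As an alternative to reading the ffmpeg -i listing, ffprobe can print just the audio streams. The sketch below assumes ffprobe is installed alongside FFmpeg; list_audio_streams and m4v_track_to_dss are hypothetical function names.

```shell
# Sketch: list audio streams, then convert a chosen one.
# Both function names are hypothetical helpers.
list_audio_streams() {
    # Prints one line per audio stream: container index, codec, language tag.
    ffprobe -v error -select_streams a \
        -show_entries stream=index,codec_name:stream_tags=language \
        -of csv=p=0 "$1"
}

m4v_track_to_dss() {
    # $1 = input file, $2 = zero-based audio track number, $3 = output file
    ffmpeg -i "$1" -vn -map "0:a:$2" -c:a adpcm_ima_oki "$3"
}
```

Note that the index ffprobe prints is the position in the whole container (video included), while 0:a:N counts audio streams only, starting at zero. The first line of the listing therefore corresponds to 0:a:0, the second to 0:a:1, and so on.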
DSS files are primarily designed for Olympus DSS Player, Philips SpeechExec, and Grundig Digta software, as well as professional transcription platforms like Express Scribe. General media players such as VLC have limited or no support for ADPCM IMA OKI DSS files. If you need audio from an M4V for general playback rather than dictation hardware or transcription software, a format like MP3 or WAV would be a more practical target.
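
For the general-playback case, the same extraction works with a different encoder and extension. This is a sketch only: the function names are hypothetical, the 192k bitrate is just the example value mentioned earlier, and the MP3 variant assumes your FFmpeg build includes libmp3lame.

```shell
# Sketch: extract M4V audio to general-purpose formats instead of DSS.
# Function names are hypothetical helpers.
m4v_to_mp3() {
    # Assumes FFmpeg was built with libmp3lame.
    ffmpeg -i "$1" -vn -c:a libmp3lame -b:a 192k "${1%.*}.mp3"
}

m4v_to_wav() {
    # 16-bit PCM: larger files, but no second lossy encoding step.
    ffmpeg -i "$1" -vn -c:a pcm_s16le "${1%.*}.wav"
}
```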

Technical Notes

The DSS format is a proprietary standard developed jointly by Olympus, Philips, and Grundig and is closely tied to the digital dictation industry; it is not a general-purpose audio container. The conversion encodes audio with the ADPCM IMA OKI codec, a variant of Adaptive Differential Pulse Code Modulation optimized for low-bitrate speech, and the DSS muxer then wraps the encoded stream. The codec operates at an 8 kHz sample rate with a single mono channel, so any stereo content in the M4V's AAC audio is downmixed. There are no audio quality options (-b:a has no effect here), no metadata fields in the DSS output container, and no way to embed artwork, track titles, or chapter markers.

Because M4V's AAC audio is a full-bandwidth lossy format and DSS/ADPCM IMA OKI is a narrow-band lossy format, this conversion applies two stages of lossy compression with a dramatic downsampling step. The output is appropriate only for speech-intelligibility use cases such as dictation review and transcription, not for music or broadcast audio. Files over 1 GB can be handled by copying the FFmpeg command and running it locally, which is particularly relevant if you are batch-converting long M4V recordings.
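
A rough size estimate follows from the fixed 8 kHz mono figure. The sketch below additionally assumes 4 bits per sample, which is typical for IMA/OKI-style ADPCM, and ignores any container overhead, so treat the result as an approximation. estimate_dss_bytes is a hypothetical function name.

```shell
# Sketch: approximate output size from duration in whole seconds.
# Assumptions: 8000 samples/s, mono, 4 bits per sample, no header overhead.
estimate_dss_bytes() {
    seconds="$1"
    echo $(( seconds * 8000 * 4 / 8 ))
}

# Example: a one-hour recording
# estimate_dss_bytes 3600   # prints 14400000 (about 14.4 MB)
```

Under these assumptions the audio payload is about 4,000 bytes per second, regardless of how large the source M4V is.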
