Convert Y4M to DSS — Free Online Tool

Convert Y4M (YUV4MPEG2) video files to DSS (Digital Speech Standard) audio. The tool extracts the audio stream and encodes it with the ADPCM IMA OKI codec, which is optimized for speech dictation. Y4M is a lossless, uncompressed intermediate video format, while DSS is a low-bitrate proprietary audio format designed for Olympus, Philips, and Grundig dictation devices; this tool bridges the two entirely in your browser.

FFmpeg Command

Copy this command to run the same conversion locally with FFmpeg on your desktop: ffmpeg -i input.y4m -c:a adpcm_ima_oki output.dss

Free — no uploads, no signups. Your files never leave your browser.

How It Works

Y4M files carry raw, uncompressed YUV video frames. The YUV4MPEG2 format itself defines no audio track, so most Y4M files have nothing to extract; the conversion only produces output when FFmpeg finds an audio stream in the input. During this conversion, FFmpeg discards the raw video stream entirely, extracts the audio, and encodes it with the ADPCM IMA OKI codec, which this tool pairs with the DSS (Digital Speech Standard) container. ADPCM IMA OKI is a lossy adaptive differential pulse-code modulation (ADPCM) algorithm tuned for narrow-band speech, so audio content outside the typical voice range is significantly degraded. The output DSS file is far smaller than the source Y4M file and is intended for playback on dedicated digital dictation hardware or compatible transcription software.
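
To give a feel for the size difference described above, here is a rough back-of-the-envelope sketch in shell arithmetic. It assumes 4:2:0 chroma subsampling (1.5 bytes per pixel), a 1920x1080 source at 30 frames per second, and 4-bit audio at 8000 Hz mono. The resolution and frame rate are illustrative assumptions, the function names are hypothetical, and container overhead is ignored.

```shell
#!/bin/sh
# Rough data-rate comparison: raw 4:2:0 video versus 4-bit, 8 kHz mono ADPCM.
# All input values are illustrative assumptions, not measurements.

# Bytes per second of raw 4:2:0 video: width * height * 1.5 bytes per frame.
y4m_bytes_per_sec() {
    # $1 = width, $2 = height, $3 = frames per second
    echo $(( $1 * $2 * 3 / 2 * $3 ))
}

# Bytes per second of ADPCM audio: sample rate * bits per sample / 8.
adpcm_bytes_per_sec() {
    # $1 = sample rate in Hz, $2 = bits per sample
    echo $(( $1 * $2 / 8 ))
}

video=$(y4m_bytes_per_sec 1920 1080 30)
audio=$(adpcm_bytes_per_sec 8000 4)
echo "raw video:   $video bytes/s"
echo "ADPCM audio: $audio bytes/s"
echo "ratio:       about $(( video / audio )) to 1"
```

Under these assumptions the raw video runs at 93,312,000 bytes per second and the audio at 4,000 bytes per second, a ratio of roughly 23,000 to 1.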

What Each Flag Does

Flag | What it does
ffmpeg | Invokes the FFmpeg media processing engine. In this browser tool it runs as a WebAssembly (FFmpeg.wasm) build, which shares its core logic with the native desktop binary, although the set of compiled-in codecs can vary between builds.
-i input.y4m | Specifies the input file, a Y4M (YUV4MPEG2) file. FFmpeg reads the raw uncompressed video frames and looks for an audio stream. If the input has no audio stream, the conversion cannot produce a valid DSS output.
-c:a adpcm_ima_oki | Sets the audio codec to ADPCM IMA OKI, the lossy speech codec this conversion uses for DSS (Digital Speech Standard) output. It encodes audio at 4 bits per sample using adaptive differential PCM, reducing it to narrow-band dictation quality suitable for Olympus and Philips dictation devices.
output.dss | Names the output file. FFmpeg infers the DSS container format from the .dss extension and wraps the ADPCM IMA OKI audio in it, discarding the Y4M raw video stream in the process.

Common Use Cases

  • Archiving voice-over narration or spoken commentary recorded into a Y4M intermediate file for use with Olympus or Philips digital dictation workflows
  • Extracting a spoken audio track from a lossless Y4M production file to send to a transcription service that accepts DSS dictation files
  • Converting a Y4M capture from a screen recording of a video conference into a DSS file for dictation device playback or note review
  • Preparing spoken-word audio from a Y4M file for import into dictation software such as Philips SpeechExec or Olympus Dictation Management System
  • Reducing the storage footprint of a Y4M file's audio content for long-term archival of speech-only material in a format natively readable by dictation hardware

Frequently Asked Questions

The output contains no video because DSS is a pure audio container: it was designed exclusively for digital speech dictation and has no provision for storing video streams. The ADPCM IMA OKI codec used here encodes only audio. FFmpeg automatically drops the raw video stream from your Y4M file and encodes only the audio into the DSS output. If your Y4M file contains no audio track, the output DSS file will be empty or invalid.
The drop in audio quality can be significant. DSS output here uses the ADPCM IMA OKI codec, which is optimized for narrow-band speech (typically sampled at 8000 Hz) and applies aggressive lossy compression. Music, ambient sound, or wide-frequency audio in your Y4M file will sound muffled and degraded. The format is designed to keep spoken voice intelligible, not to preserve audio fidelity. If your Y4M audio contains anything other than speech, DSS is a poor choice of output format.
To run the conversion locally, the exact command is: ffmpeg -i input.y4m -c:a adpcm_ima_oki output.dss. Install FFmpeg from ffmpeg.org, replace 'input.y4m' with the full path to your file and 'output.dss' with your desired output path, then run it in your terminal or command prompt. This is the same command the browser tool runs via WebAssembly, so results should match, provided your local FFmpeg build includes the same encoder and muxer.
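
Because FFmpeg builds differ in what they include, it can help to check for the encoder and muxer before running the command. The following is a minimal sketch that parses the output of ffmpeg -encoders and ffmpeg -muxers; has_encoder and has_muxer are hypothetical helper names, and the parsing assumes the usual column layout in which the name is the second whitespace-separated field.

```shell
#!/bin/sh
# Check whether the local FFmpeg build lists a given encoder or muxer.
# has_encoder and has_muxer are hypothetical helpers, not FFmpeg commands.

has_encoder() {
    # Encoder rows look like " A....D adpcm_ms  ADPCM Microsoft";
    # succeed only if the second column equals the requested name.
    ffmpeg -hide_banner -encoders 2>/dev/null |
        awk -v n="$1" '$2 == n { found = 1 } END { exit !found }'
}

has_muxer() {
    # Muxer rows look like "  E wav  WAV / WAVE Audio"; same column test.
    ffmpeg -hide_banner -muxers 2>/dev/null |
        awk -v n="$1" '$2 == n { found = 1 } END { exit !found }'
}

# Example use:
#   has_encoder adpcm_ima_oki && has_muxer dss && echo "build looks usable"
```

If either check fails, the command from this page will not work with that build.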
To convert many files at once, use a shell loop. On Linux or macOS: for f in *.y4m; do ffmpeg -i "$f" -c:a adpcm_ima_oki "${f%.y4m}.dss"; done. On Windows Command Prompt: for %f in (*.y4m) do ffmpeg -i "%f" -c:a adpcm_ima_oki "%~nf.dss" (inside a .bat file, double the percent signs: %%f and %%~nf). The browser-based tool processes one file at a time, so the command line is the better choice for batch jobs.
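
The one-line loop above can be made a little more defensive. This sketch, written for a POSIX shell, skips the case where no .y4m files match, avoids overwriting existing outputs, and reports failures without stopping the batch. The names dss_name and batch_convert are hypothetical; -nostdin is a standard FFmpeg option that keeps ffmpeg from reading the terminal while it runs inside a loop.

```shell
#!/bin/sh
# Batch-convert .y4m files in a directory to .dss, one ffmpeg run per file.
# dss_name and batch_convert are hypothetical helper names.

# Derive the output name by swapping the .y4m suffix for .dss.
dss_name() {
    printf '%s\n' "${1%.y4m}.dss"
}

batch_convert() {
    dir=${1:-.}                      # default to the current directory
    for f in "$dir"/*.y4m; do
        [ -e "$f" ] || continue      # the glob matched nothing
        out=$(dss_name "$f")
        if [ -e "$out" ]; then
            echo "skip: $out already exists" >&2
            continue
        fi
        ffmpeg -nostdin -i "$f" -c:a adpcm_ima_oki "$out" ||
            echo "failed: $f" >&2
    done
}

# Example use:
#   batch_convert /path/to/clips
```
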
Y4M (YUV4MPEG2) is a video format, and audio in Y4M files is uncommon: many contain no audio track at all. If your Y4M file was produced by piping raw video frames between tools (for example, from ffmpeg into x264), there is no audio to extract, and the conversion will either fail or produce an empty DSS file. To check, run ffprobe input.y4m and look for an audio stream in the output.
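
The ffprobe check can also be scripted. The sketch below asks ffprobe to list only audio streams and treats empty output as "no audio"; has_audio is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only if ffprobe reports at least one audio stream in the file.
# has_audio is a hypothetical helper name.

has_audio() {
    # -select_streams a restricts output to audio streams;
    # csv=p=0 prints one bare stream index per line, or nothing at all.
    streams=$(ffprobe -v error -select_streams a \
        -show_entries stream=index -of csv=p=0 "$1")
    [ -n "$streams" ]
}

# Example use:
#   has_audio input.y4m || echo "no audio stream, nothing to convert"
```
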
DSS with the ADPCM IMA OKI codec typically operates at 8000 Hz mono, the standard for digital dictation. If your source audio uses a higher sample rate or more channels (for example, 48000 Hz stereo), it must be resampled and downmixed to match. FFmpeg does this automatically when the encoder advertises its supported rates and channel layouts; otherwise you can request it explicitly with -ar 8000 -ac 1. This resampling accounts for part of the audible quality loss: you are going from potentially high-fidelity uncompressed audio down to narrow-band, dictation-grade audio.
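
If you prefer not to rely on automatic conversion, the target format can be spelled out. This is a sketch rather than the command the browser tool runs: it adds the standard FFmpeg options -vn (drop video), -ar (sample rate), and -ac (channel count) to the command from this page. convert_speech is a hypothetical wrapper name.

```shell
#!/bin/sh
# Convert with the dictation-style audio format requested explicitly.
# convert_speech is a hypothetical wrapper, not part of FFmpeg.

convert_speech() {
    # $1 = input file, $2 = output file
    # -vn drops the video stream, -ar 8000 resamples to 8 kHz,
    # -ac 1 downmixes to mono before encoding.
    ffmpeg -nostdin -i "$1" -vn -ar 8000 -ac 1 -c:a adpcm_ima_oki "$2"
}

# Example use:
#   convert_speech input.y4m output.dss
```
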

Technical Notes

Y4M (YUV4MPEG2) is a minimal, uncompressed video format with a short plain-text header, widely used as a lossless intermediate in video processing pipelines; tools such as x264, vpxenc, and various film grain synthesis utilities read or write it. Its audio support is vestigial at best: most Y4M files encountered in the wild contain no audio stream at all, which makes this conversion a niche use case, relevant only when Y4M serves as a capture or intermediate format in a production environment that incidentally records audio. Metadata from the Y4M source (essentially nothing beyond frame size, frame rate, and color space) does not carry over to the DSS file.

The output format, DSS (Digital Speech Standard), is a proprietary container co-developed by Olympus, Philips, and Grundig in the 1990s for digital pocket dictation recorders. This conversion encodes its audio with ADPCM IMA OKI, a Dialogic/OKI ADPCM variant that stores 4 bits per sample and targets 8 kHz mono audio. The codec exposes no adjustable quality parameters in FFmpeg; the bit depth is fixed, so the bitrate follows directly from the sample rate. FFmpeg builds differ in which encoders and muxers they include, so confirm support with ffmpeg -encoders and ffmpeg -muxers before relying on the command. DSS playback outside dedicated dictation hardware and software (Philips SpeechExec, Olympus DSS Player, Dragon NaturallySpeaking) is limited, and most general-purpose media players do not support it natively.
