Extract Audio from Y4M to VOC — Free Online Tool
Extract audio from a Y4M (YUV4MPEG2) video file and save it as a VOC file encoded as 8-bit unsigned PCM, the classic audio format developed by Creative Labs for Sound Blaster hardware. Y4M is an uncompressed intermediate video format whose layout defines no audio stream, so the conversion only produces output when FFmpeg detects an audio stream in the input. It is aimed at piped video workflows where an accompanying audio stream needs to be isolated and preserved in a retro-compatible format.
FFmpeg Command
Run the same conversion locally with FFmpeg on your desktop: 'ffmpeg -i input.y4m -vn -c:a pcm_u8 output.voc'
How It Works
Y4M is a raw, uncompressed video format used mainly as an intermediary between video processing tools. It stores video as rawvideo frames, and the format itself defines no audio stream, so the conversion depends on FFmpeg finding an audio stream in the input. During this conversion, FFmpeg discards the video stream entirely, takes only the audio stream, and encodes it as unsigned 8-bit PCM (pcm_u8) in the VOC container. VOC is a simple block-based format with a short file header and minimal framing overhead, originally designed for Creative Labs Sound Blaster cards. No codec-based compression is applied, but the step is not lossless: when the source audio has more than 8 bits per sample, mapping it to 8 bits discards the low-order bits and adds quantization noise.
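The bit-depth mapping described above can be illustrated with a minimal Python sketch. It assumes signed 16-bit source samples and plain truncation without dithering; the function name s16_to_u8 is hypothetical, and FFmpeg's own sample conversion may round or dither differently.

```python
def s16_to_u8(samples):
    """Map signed 16-bit samples (-32768..32767) to unsigned 8-bit (0..255).

    Assumption: simple truncation of the low byte, no dithering.
    """
    out = []
    for s in samples:
        if not -32768 <= s <= 32767:
            raise ValueError(f"sample out of 16-bit range: {s}")
        # Drop the 8 least significant bits, then move the zero point to 128,
        # because unsigned 8-bit PCM represents silence as 128 rather than 0.
        out.append((s >> 8) + 128)
    return bytes(out)


print(list(s16_to_u8([-32768, -1, 0, 255, 256, 32767])))
# [0, 127, 128, 128, 129, 255]
```

Note that 0 and 255 map to the same output value: that discarded low byte is the detail lost in the 16-bit to 8-bit step.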
What Each Flag Does
| Flag | What it does |
|---|---|
| ffmpeg | Invokes the FFmpeg command-line tool, which handles demuxing, stream selection, codec processing, and muxing for this conversion entirely on your local machine. |
| -i input.y4m | Specifies the input file, a Y4M (YUV4MPEG2) file. FFmpeg reads the raw uncompressed video frames, plus any audio stream it detects in the input, for processing. |
| -vn | Disables video output, telling FFmpeg to ignore the rawvideo stream in the Y4M file. Only the audio stream is processed and written to the VOC output; no video frames are extracted or encoded. |
| -c:a pcm_u8 | Encodes the output audio as unsigned 8-bit PCM, the sample format of the original Sound Blaster hardware. The audio stays uncompressed, but source audio with a higher bit depth is reduced to 8 bits per sample. |
| output.voc | Defines the output file name and makes FFmpeg select the VOC muxer based on the .voc extension. The result is a Creative Labs VOC audio file containing the extracted PCM audio stream, ready for playback in DOS applications, DOSBox, or retro audio hardware. |
Common Use Cases
- Extracting audio from a Y4M intermediate file produced by a lossless video pipeline (e.g., a rawvideo ffmpeg pipe) to use as a sound effect or sample in a DOS-era game or retro application.
- Converting uncompressed audio carried in a Y4M file into VOC format for playback on original Sound Blaster hardware or DOSBox-based emulation environments.
- Archiving audio content from a Y4M lossless video master into VOC for use in a retro game modding project that requires Sound Blaster-compatible audio assets.
- Isolating the audio track from a Y4M file generated by a video processing tool (such as VapourSynth or AviSynth) to verify or audit the audio content independently of the video frames.
- Preparing audio assets from a raw video source for integration into a vintage multimedia CD-ROM or demoscene production that relies on VOC as its native audio format.
- Stripping the video component from a Y4M test signal or benchmark clip to produce a standalone PCM audio file in VOC format for acoustic testing on legacy sound hardware.
Frequently Asked Questions
Y4M is a video-only format. It is designed for lossless video piping between tools like FFmpeg, VapourSynth, and x264, where audio is handled separately, and Y4M files in the wild carry no audio track. If FFmpeg finds no audio stream in your input, the conversion fails with an error, because there is nothing to extract and the output would contain no streams. You can check by running 'ffmpeg -i input.y4m' and inspecting the stream list before attempting the conversion.
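The stream check can also be scripted. The sketch below is one possible approach, assuming ffprobe (shipped with FFmpeg) is on the PATH; the helper names count_audio_streams and probe are hypothetical. It asks ffprobe for a JSON stream listing and counts the entries whose codec_type is audio.

```python
import json
import subprocess


def count_audio_streams(ffprobe_json: str) -> int:
    """Count audio streams in the JSON text printed by `ffprobe -show_streams -of json`."""
    streams = json.loads(ffprobe_json).get("streams", [])
    return sum(1 for s in streams if s.get("codec_type") == "audio")


def probe(path: str) -> str:
    """Run ffprobe on `path` and return its JSON stream listing as text.

    Assumption: ffprobe is installed and reachable on PATH.
    """
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Canned example of what a video-only input looks like to ffprobe:
sample = '{"streams": [{"index": 0, "codec_type": "video", "codec_name": "rawvideo"}]}'
print(count_audio_streams(sample))  # 0
```

In a real script you would call count_audio_streams(probe("input.y4m")) and skip the conversion when the result is 0.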
This command writes pcm_u8 (8-bit unsigned PCM), which matches the native audio depth of the original Sound Blaster hardware. Eight-bit PCM has a dynamic range of roughly 48 dB, compared to about 96 dB for 16-bit audio, so quiet passages lose detail and quantization noise becomes audible as a background hiss. If your source audio has a wide dynamic range, you will notice the reduction in fidelity. VOC also supports pcm_s16le (signed 16-bit little-endian PCM), which you can select with '-c:a pcm_s16le' for better quality at the cost of reduced compatibility with the oldest Sound Blaster cards.
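The 48 dB and 96 dB figures follow from the usual rule of thumb that each bit of linear PCM adds about 6.02 dB of dynamic range. A short sketch of that arithmetic (the function name dynamic_range_db is illustrative only):

```python
import math


def dynamic_range_db(bits: int) -> float:
    """Theoretical dynamic range of linear PCM: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)


for bits in (8, 16):
    print(bits, round(dynamic_range_db(bits), 1))
# 8 48.2
# 16 96.3
```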
To get 16-bit output, replace '-c:a pcm_u8' with '-c:a pcm_s16le' in the command; signed 16-bit little-endian PCM is another codec that FFmpeg can write into a VOC container. The full command becomes: 'ffmpeg -i input.y4m -vn -c:a pcm_s16le output.voc'. This doubles the bit depth per sample (and the file size) and substantially improves dynamic range, but some very old DOS applications and original Sound Blaster drivers only support 8-bit VOC playback, so compatibility with legacy software may be reduced.
The Y4M file is almost certainly vastly larger than the resulting VOC file, because Y4M stores every video frame as raw uncompressed pixel data: an 8-bit 4:2:0 1080p frame takes about 3.1 MB, so at 30 fps one minute of video is roughly 5.6 GB. The VOC output contains only the audio stream as uncompressed 8-bit PCM, so its size is determined by duration, sample rate, and channel count: for example, mono audio at 44100 Hz and 8-bit depth produces about 2.6 MB (2.5 MiB) per minute. Expect the VOC file to be a tiny fraction of the size of the original Y4M source.
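A back-of-the-envelope sketch of the two sizes, under stated assumptions: the video is 8-bit 4:2:0 at 30 fps, and the small Y4M and VOC headers are ignored. The function names are hypothetical.

```python
def pcm_bytes(sample_rate: int, channels: int, bits: int, seconds: int) -> int:
    """Payload size of uncompressed PCM audio, ignoring container headers."""
    return sample_rate * channels * (bits // 8) * seconds


def yuv420_frame_bytes(width: int, height: int) -> int:
    """Size of one 8-bit 4:2:0 frame: full-size luma plus two quarter-size chroma planes."""
    return width * height * 3 // 2


audio = pcm_bytes(44100, channels=1, bits=8, seconds=60)
frame = yuv420_frame_bytes(1920, 1080)
video = frame * 30 * 60  # assumption: 30 frames per second for one minute

print(audio)  # 2646000
print(frame)  # 3110400
print(video)  # 5598720000
```

Under these assumptions, one minute of video is about 2,000 times the size of the matching minute of 8-bit mono audio.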
The VOC format stores basic audio parameters, namely sample rate and channel count, in the headers of its data blocks, and FFmpeg carries these values over from the source audio stream during conversion. VOC does not support rich metadata like ID3 tags, track titles, artist names, or chapter markers, so any such metadata in the source is lost. If the source audio is, for example, stereo at 48000 Hz, the VOC file will describe stereo audio at that rate, although the legacy 8-bit block format can only store the rate approximately.
To convert many files at once on Linux or macOS, loop over them in a shell with a command like: 'for f in *.y4m; do ffmpeg -i "$f" -vn -c:a pcm_u8 "${f%.y4m}.voc"; done'. On Windows Command Prompt you can use: 'for %f in (*.y4m) do ffmpeg -i "%f" -vn -c:a pcm_u8 "%~nf.voc"' (inside a .bat file, write %%f and %%~nf instead). This applies the same extraction and encoding settings to every Y4M file in the current directory and produces a corresponding VOC file for each. The browser-based tool on this page processes one file at a time, so the FFmpeg command is the recommended approach for bulk workflows.
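For a platform-independent alternative to the shell loops, the same batch job can be sketched in Python. This assumes ffmpeg is on the PATH; build_command and convert_all are hypothetical helper names, and the script records failures instead of stopping at the first one.

```python
import subprocess
from pathlib import Path


def build_command(src: Path, codec: str = "pcm_u8") -> list:
    """Build the ffmpeg argument list for one Y4M input (same flags as the command above)."""
    dst = src.with_suffix(".voc")
    return ["ffmpeg", "-i", str(src), "-vn", "-c:a", codec, str(dst)]


def convert_all(directory: str = ".", codec: str = "pcm_u8") -> list:
    """Convert every *.y4m file in `directory`; return the inputs that failed.

    Assumption: ffmpeg is installed and reachable on PATH.
    """
    failed = []
    for src in sorted(Path(directory).glob("*.y4m")):
        result = subprocess.run(build_command(src, codec))
        if result.returncode != 0:  # e.g. the input had no audio stream
            failed.append(src)
    return failed


print(build_command(Path("clip.y4m")))
# ['ffmpeg', '-i', 'clip.y4m', '-vn', '-c:a', 'pcm_u8', 'clip.voc']
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.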
Technical Notes
Y4M (YUV4MPEG2) stores video as raw planar YUV frames, with one ASCII stream header at the start of the file and a short ASCII marker before each frame. It is strictly an uncompressed intermediate video format and defines no audio stream; any audio FFmpeg does find in the input is decoded to PCM before being mapped to pcm_u8. The VOC container itself is very simple: a file signature ('Creative Voice File'), a header giving the data offset and version, and then one or more blocks, each starting with a block type byte and a length field. Sound data blocks additionally carry a sample rate divisor and the raw PCM samples. FFmpeg's VOC muxer handles this block structure automatically. One known limitation is that the legacy 8-bit sound data block encodes the sample rate as an 8-bit divisor (rate = 1,000,000 / (256 - divisor)), so only rates of the form 1,000,000 / n are stored exactly: 8000 Hz is exact, while 11025, 22050, and 44100 Hz come out as roughly 10989, 22222, and 43478 Hz when rounded to the nearest divisor. Stereo support in VOC depends on the block type used; FFmpeg targets the extended block type for stereo compatibility. No subtitle, chapter, or video transparency data is relevant to either format in this conversion.
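The divisor formula can be explored with a short sketch that shows which sample rates survive the legacy 8-bit encoding unchanged. Rounding to the nearest divisor is an assumption here (a given writer may round differently), and the function names are hypothetical.

```python
def voc_divisor(sample_rate: int) -> int:
    """8-bit divisor for the legacy VOC sound data block.

    Assumption: the writer rounds 1,000,000 / rate to the nearest integer.
    """
    divisor = 256 - (1_000_000 + sample_rate // 2) // sample_rate
    if not 0 <= divisor <= 255:
        raise ValueError(f"rate {sample_rate} Hz does not fit an 8-bit divisor")
    return divisor


def voc_stored_rate(divisor: int) -> float:
    """Sample rate a reader recovers: rate = 1,000,000 / (256 - divisor)."""
    return 1_000_000 / (256 - divisor)


for rate in (8000, 11025, 22050, 44100):
    d = voc_divisor(rate)
    print(rate, d, round(voc_stored_rate(d), 1))
# 8000 131 8000.0
# 11025 165 10989.0
# 22050 211 22222.2
# 44100 233 43478.3
```

Only 8000 Hz round-trips exactly in this list; the other rates are off by up to about 1.4 percent, which shifts pitch and duration slightly on playback.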