Extract Audio from Y4M to CAF — Free Online Tool
Extract audio from a Y4M (YUV4MPEG2) video file and save it as a CAF (Core Audio Format) file with 16-bit PCM audio, ideal for bringing lossless intermediate video content into Apple-native audio workflows. Y4M itself defines no audio track, so this tool targets any audio stream FFmpeg detects in the file and packages it into Apple's extensible CAF container.
FFmpeg Command
Copy this command to run the same conversion locally with FFmpeg on your desktop:
ffmpeg -i input.y4m -vn -c:a pcm_s16le -b:a 128k output.caf
How It Works
Y4M is a raw, uncompressed video format used primarily as an intermediate when piping video between tools like FFmpeg, x264, or VapourSynth. Because Y4M is not designed as a delivery format, any audio stream accompanying it is typically uncompressed PCM. This tool drops the raw video stream with the -vn flag and writes the audio into a CAF file using the PCM signed 16-bit little-endian codec (pcm_s16le). CAF is Apple's container format, built to avoid the 4GB file size ceiling of AIFF and WAV, which makes it well suited to high-resolution or long-duration audio. The result is an uncompressed audio file ready for use in macOS and iOS audio applications.
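Because many Y4M files carry no audio at all, it can help to check for an audio stream before attempting the extraction. The following is a minimal sketch, assuming ffprobe is installed alongside ffmpeg; `has_audio` and `extract_caf` are hypothetical helper names, not part of FFmpeg.

```shell
# has_audio FILE: succeeds when ffprobe reports at least one audio stream.
# (Hypothetical helper; only the ffprobe call itself comes from FFmpeg.)
has_audio() {
    streams=$(ffprobe -v error -select_streams a \
        -show_entries stream=codec_name -of csv=p=0 "$1" 2>/dev/null)
    [ -n "$streams" ]
}

# extract_caf IN OUT: drop the video and write 16-bit PCM into CAF,
# but only when an audio stream was found.
# -b:a is left out here because it does not affect PCM output.
extract_caf() {
    if has_audio "$1"; then
        ffmpeg -i "$1" -vn -c:a pcm_s16le "$2"
    else
        echo "no audio stream found in $1" >&2
        return 1
    fi
}

# Usage (file names are placeholders):
#   extract_caf input.y4m output.caf
```

Checking first turns FFmpeg's generic failure into a clear message about the missing audio stream.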
What Each Flag Does
| Flag | What it does |
|---|---|
| `ffmpeg` | Invokes the FFmpeg binary, the open-source multimedia processing engine that powers this conversion both in the browser via WebAssembly and on your local desktop installation. |
| `-i input.y4m` | Specifies the input file in Y4M (YUV4MPEG2) format, which contains an uncompressed rawvideo stream and optionally an uncompressed PCM audio stream written by tools like VapourSynth, AviSynth, or FFmpeg itself. |
| `-vn` | Disables video output entirely, telling FFmpeg to ignore the rawvideo stream in the Y4M file. This is essential here because Y4M's raw uncompressed video cannot be meaningfully included in an audio-only CAF container, and skipping it avoids unnecessary decoding overhead. |
| `-c:a pcm_s16le` | Sets the audio codec to PCM signed 16-bit little-endian, a widely compatible uncompressed audio format for the CAF container. The output is uncompressed audio at the same bit depth as CD audio, and Core Audio-based macOS and iOS applications read it natively. |
| `-b:a 128k` | Sets the target audio bitrate to 128 kilobits per second. For the pcm_s16le codec used here, this flag has no practical effect, because the bitrate of PCM is fixed by sample rate, bit depth, and channel count. It becomes relevant only if you switch to a lossy, bitrate-controlled codec such as AAC or libopus. |
| `output.caf` | Specifies the output filename. The .caf extension tells FFmpeg to write a CAF (Core Audio Format) container, Apple's extensible audio format that supports files beyond the 4GB limit of WAV and AIFF and integrates natively with Logic Pro, GarageBand, and Core Audio on macOS and iOS. |
Common Use Cases
- Extracting audio from a Y4M file produced by a VapourSynth or AviSynth pipeline to feed into a separate audio mastering workflow on macOS
- Pulling the PCM audio track out of a Y4M intermediate used during video encoding to archive it as a standalone CAF file for Apple Logic Pro or GarageBand
- Isolating audio from a Y4M file generated by a screen capture or test signal tool for quality analysis in macOS audio software
- Converting a Y4M file's audio track to CAF for use in an iOS or macOS app that relies on Core Audio's native CAF container support
- Extracting audio from a Y4M intermediate to create a reference audio file for A/B comparison against a compressed output in an audio post-production pipeline
- Archiving the audio component of a lossless Y4M master as a CAF file before discarding the large raw video frames to save storage
Frequently Asked Questions
Y4M (YUV4MPEG2) is a raw video container, and its format definition includes no audio track. Some tools and pipelines that write Y4M files may nevertheless embed a PCM audio stream alongside the raw video frames. This tool extracts whatever audio stream FFmpeg detects in the file. If FFmpeg finds none, which is common for pure Y4M intermediates, the conversion fails with an error, since there is nothing to extract.
If the source audio in the Y4M file is already 16-bit PCM, the conversion to CAF using pcm_s16le is completely lossless: the sample values are copied unchanged. If the source audio has a higher bit depth, such as 24-bit or 32-bit float, reducing it to 16-bit lowers the available dynamic range, a change that is minor and largely inaudible. To preserve higher bit depths, modify the FFmpeg command to use pcm_s24le or pcm_f32le, both of which the CAF container supports.
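One way to avoid an accidental bit-depth reduction is to pick the output encoder from the source codec name. This is a sketch under stated assumptions: `pick_caf_codec` is a hypothetical helper, and it only covers the two higher-depth encoders named above. Everything else falls back to 16-bit.

```shell
# pick_caf_codec SRC_CODEC: echo a PCM encoder matching the source bit depth.
# Only the 24-bit and 32-bit float cases are handled; any other input falls
# back to the 16-bit default.
pick_caf_codec() {
    case "$1" in
        pcm_s24le|pcm_s24be) echo pcm_s24le ;;
        pcm_f32le|pcm_f32be) echo pcm_f32le ;;
        *)                   echo pcm_s16le ;;
    esac
}

# Usage (file names are placeholders):
#   src=$(ffprobe -v error -select_streams a:0 \
#         -show_entries stream=codec_name -of csv=p=0 input.y4m)
#   ffmpeg -i input.y4m -vn -c:a "$(pick_caf_codec "$src")" output.caf
```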
WAV and AIFF both have a 4GB maximum file size limit, which can be a real constraint when extracting audio from large Y4M files that represent long recordings or high-sample-rate audio. CAF was designed by Apple specifically to remove this ceiling, supporting files of virtually unlimited size. CAF is also the native container for Core Audio on macOS and iOS, meaning it integrates seamlessly with applications like Logic Pro, Final Cut Pro, and AVFoundation-based iOS apps.
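To put the 4GB ceiling in concrete terms, the arithmetic below estimates how long an uncompressed PCM stream can run before reaching 4 GiB. It is a rough sketch that ignores container header overhead, and `seconds_until_4gib` is a hypothetical helper name.

```shell
# seconds_until_4gib RATE CHANNELS BITS: whole seconds of PCM that fit in 4 GiB.
seconds_until_4gib() {
    bytes_per_second=$(( $1 * $2 * ($3 / 8) ))
    echo $(( 4294967296 / bytes_per_second ))
}

seconds_until_4gib 48000 2 16    # prints 22369 (a little over 6 hours)
seconds_until_4gib 192000 2 24   # prints 3728 (about 62 minutes)
```

At high sample rates and bit depths the limit arrives within about an hour, which is the situation CAF is meant to handle.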
You are not limited to 16-bit PCM. The CAF container supports multiple codecs including AAC, FLAC, Opus (libopus), Vorbis (libvorbis), and various PCM formats, although which of these FFmpeg can actually write into CAF depends on your FFmpeg version and build, so test with a short clip first. To use FLAC for lossless compression, change -c:a pcm_s16le to -c:a flac in the FFmpeg command. For compressed output suitable for smaller file sizes, use -c:a aac along with a bitrate like -b:a 256k. Note that AAC and FLAC in CAF may have more limited compatibility with third-party tools compared to standalone .flac or .aac files.
The -b:a 128k flag in the command sets the target audio bitrate. For PCM codecs like pcm_s16le, the bitrate is determined by sample rate, bit depth, and channel count rather than this flag, so changing -b:a has no meaningful effect on pcm_s16le output. The bitrate flag becomes relevant if you switch to a lossy, bitrate-controlled codec such as AAC or libopus. In that case, you can change 128k to values like 64k, 192k, or 320k depending on your quality needs.
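The rule above can be sketched as a small helper that emits -b:a only when the encoder uses it. `audio_args` is a hypothetical name. Treating FLAC like PCM here is an assumption, based on FLAC being lossless and not bitrate-targeted.

```shell
# audio_args CODEC [BITRATE]: echo the audio options for the given encoder.
# BITRATE defaults to 128k, the value used in the command on this page.
audio_args() {
    codec=$1
    bitrate=${2:-128k}
    case "$codec" in
        pcm_*|flac) echo "-c:a $codec" ;;                 # bitrate not used
        *)          echo "-c:a $codec -b:a $bitrate" ;;   # lossy encoders
    esac
}

# Usage (file names are placeholders; the unquoted substitution is
# intentional so the options split into separate arguments):
#   ffmpeg -i input.y4m -vn $(audio_args pcm_s16le) output.caf
#   ffmpeg -i input.y4m -vn $(audio_args aac 256k) output.caf
```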
Batch conversion is possible with a shell loop. On macOS or Linux, wrap the command like this: for f in *.y4m; do ffmpeg -i "$f" -vn -c:a pcm_s16le -b:a 128k "${f%.y4m}.caf"; done. On Windows Command Prompt, use a for loop: for %f in (*.y4m) do ffmpeg -i "%f" -vn -c:a pcm_s16le -b:a 128k "%~nf.caf". Inside a .bat file, double the percent signs (%%f and %%~nf). This is particularly useful when processing a batch of Y4M intermediates from a video encoding pipeline where you want to archive all audio tracks separately.
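For larger batches, a script form of the macOS/Linux loop is easier to reuse. The sketch below assumes a POSIX shell; `caf_name` and `batch_extract` are hypothetical helper names. It reports files that fail (for example, those with no audio) instead of stopping.

```shell
# caf_name FILE: echo the output name, swapping a trailing .y4m for .caf.
caf_name() {
    echo "${1%.y4m}.caf"
}

# batch_extract DIR: convert every .y4m file in DIR, reporting failures
# rather than aborting the whole run.
batch_extract() {
    for f in "$1"/*.y4m; do
        [ -e "$f" ] || continue        # no matches: the glob stays literal
        out=$(caf_name "$f")
        # -nostdin keeps ffmpeg from reading the terminal inside the loop
        ffmpeg -nostdin -i "$f" -vn -c:a pcm_s16le "$out" \
            || echo "skipped: $f" >&2
    done
}

# Usage:
#   batch_extract /path/to/intermediates
```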
Technical Notes
Y4M files store rawvideo frames at full uncompressed resolution, so even short clips can occupy gigabytes of disk space; audio, if present, is similarly uncompressed. When FFmpeg reads a Y4M file, it identifies the rawvideo stream and any accompanying PCM audio. The -vn flag excludes the video stream from the output, so no video frames are decoded or encoded, although FFmpeg still has to read through the file to reach the audio data. The output CAF file using pcm_s16le is uncompressed, so its size is proportional to duration, sample rate, channel count, and bit depth: a 48kHz stereo 16-bit stream takes 48,000 × 2 × 2 = 192,000 bytes per second, or approximately 11.5MB per minute. CAF supports channel layouts beyond stereo, so multichannel audio, if present in the Y4M source, will be preserved. One known limitation is that Y4M files generated by some piping tools may not write a proper audio stream header, which can cause FFmpeg to fail to detect audio even if raw PCM bytes are present; in such cases, you may need to read the audio as headerless PCM using -f s16le (with matching -ar and -ac values) or similar raw format flags before re-muxing it into CAF.
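The size estimate follows directly from sample rate, channel count, and bit depth, and can be reproduced with shell arithmetic. The second helper sketches the raw-PCM fallback. Both function names are hypothetical, and the 48000 Hz stereo layout in `wrap_raw_pcm` is an assumption that must match how the bytes were actually written.

```shell
# pcm_bytes_per_minute RATE CHANNELS BITS: uncompressed PCM payload per minute.
pcm_bytes_per_minute() {
    echo $(( $1 * $2 * ($3 / 8) * 60 ))
}

pcm_bytes_per_minute 48000 2 16   # prints 11520000 (about 11.5 MB)
pcm_bytes_per_minute 48000 1 16   # prints 5760000 (mono is half of that)

# wrap_raw_pcm IN OUT: treat IN as headerless 16-bit little-endian PCM and
# wrap it in CAF. Headerless data carries no metadata, so the rate and
# channel count below are assumed values, not detected ones.
wrap_raw_pcm() {
    ffmpeg -f s16le -ar 48000 -ac 2 -i "$1" -c:a pcm_s16le "$2"
}

# Usage (file names are placeholders):
#   wrap_raw_pcm audio.raw output.caf
```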