Extract Audio from Y4M to M4B — Free Online Tool

Extract audio from a Y4M (YUV4MPEG2) video source and save it as an M4B audiobook file encoded in AAC at 128 kbps. Y4M is a raw, uncompressed intermediate format that defines no audio stream of its own, so this tool targets workflows where audio has been piped alongside the raw video and needs to be isolated into a bookmarkable, chapter-capable M4B container.

FFmpeg Command

Copy this command to run the same conversion locally with FFmpeg on your desktop: ffmpeg -i input.y4m -vn -c:a aac -b:a 128k -movflags +faststart output.m4b

Free — no uploads, no signups. Your files never leave your browser.

How It Works

Y4M is an uncompressed raw video format commonly used as an intermediary in lossless video pipelines. A file consists of a short plain-text header (beginning with the signature YUV4MPEG2) followed by raw frames, each introduced by a FRAME marker. The format defines no audio stream, so audio reaches this conversion only through piped workflows or custom toolchains, typically as raw PCM or similar uncompressed data. This tool drops the raw video stream with the -vn flag and re-encodes any detected audio into AAC, the usual codec for the MPEG-4 container family. The resulting M4B file is an MPEG-4 audiobook container: it supports chapter markers and bookmarking (so listeners can resume where they left off), and the +faststart flag optimizes it for streaming by moving the MOOV atom to the front of the file.

What Each Flag Does

Flag | What it does
ffmpeg | Invokes the FFmpeg binary, the open-source multimedia processing engine that powers this conversion. In the browser-based version of this tool, FFmpeg runs entirely via WebAssembly (ffmpeg.wasm) with no server involvement.
-i input.y4m | Specifies the input Y4M file. FFmpeg reads the YUV4MPEG2 header and the raw video frames, plus an audio stream in the unusual case that the input carries one. Because the frames are uncompressed, no decompression is needed to read the video portion.
-vn | Disables video output entirely, telling FFmpeg to ignore the raw YUV video stream from the Y4M file. Since the M4B output here is audio-only, this flag is essential to prevent FFmpeg from attempting to encode the uncompressed video frames.
-c:a aac | Encodes the audio stream using FFmpeg's built-in AAC encoder. AAC is the usual codec for the M4B/MPEG-4 container family and the safest choice for compatibility with Apple Books, iOS, and most podcast players that handle M4B files.
-b:a 128k | Sets the AAC audio bitrate to 128 kilobits per second. This is a standard quality level for audiobooks and podcasts: sufficient for clear, intelligible speech with a manageable file size. You can raise this to 192k or 256k for music or higher-fidelity requirements.
-movflags +faststart | Relocates the MOOV atom (the file's structural metadata and index) to the beginning of the M4B file. This lets a player start progressive playback over the web or from a podcast feed as soon as the first bytes arrive. Without it, the player must either download the whole file or make extra byte-range requests to fetch the index from the end before playback can begin.
output.m4b | Defines the output filename. FFmpeg picks an MPEG-4 family muxer from the .m4b extension. The extension is recognized by Apple Books and compatible podcast apps as an audiobook file, enabling chapter navigation and resume-from-bookmark functionality.
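
The flags in the table above can be wrapped in a small shell function so the bitrate becomes a parameter. This is a minimal sketch, not part of the tool itself: the function name y4m_to_m4b and its argument order are hypothetical, and it assumes ffmpeg is on your PATH.

```shell
# y4m_to_m4b INPUT OUTPUT [BITRATE]
# Hypothetical helper that runs the command described in the table above.
y4m_to_m4b() {
    in=$1
    out=$2
    bitrate=${3:-128k}    # falls back to the page's default of 128k

    if [ -z "$in" ] || [ -z "$out" ]; then
        echo "usage: y4m_to_m4b INPUT OUTPUT [BITRATE]" >&2
        return 2
    fi

    # -vn drops the video stream; -c:a aac and -b:a pick the encoder and bitrate;
    # -movflags +faststart moves the MOOV atom to the front of the output.
    ffmpeg -i "$in" -vn -c:a aac -b:a "$bitrate" -movflags +faststart "$out"
}
```

Calling y4m_to_m4b input.y4m output.m4b reproduces the default command, and a third argument such as 192k overrides the bitrate.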

Common Use Cases

  • Extracting narration or voice-over audio recorded alongside raw Y4M video in a professional video production pipeline and packaging it as a resumable M4B audiobook for distribution
  • Converting a lossless Y4M video master (produced by tools like FFmpeg or an AviSynth script piped through avs2yuv) into an M4B file to archive just the audio commentary track in a podcast-compatible format
  • Isolating a spoken-word audio track from a Y4M intermediate file generated during a video processing chain and delivering it to Apple Books or podcast platforms that prefer M4B
  • Pulling audio out of a Y4M file produced by a Linux-based video synthesis or demuxing tool (such as mjpegtools) and packaging it with fast-start streaming metadata for web playback
  • Creating an audiobook-ready M4B from the audio component of a Y4M research or archival video, leveraging AAC compression to significantly reduce file size from the raw original

Frequently Asked Questions

Standard Y4M (YUV4MPEG2) files are a video-only format, and most Y4M files you encounter will have no audio stream at all. Some custom piping workflows or tools deliver audio alongside the Y4M video frames, but that is nonstandard. If your input has no audio stream, FFmpeg does not produce a usable M4B: with -vn there is nothing left to write, and it stops with an error reporting that the output file does not contain any stream. You can check for audio before converting by running 'ffprobe input.y4m' (or 'ffmpeg -i input.y4m') and reading the stream list.
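
The audio check can also be scripted instead of read by eye. A minimal sketch, assuming ffprobe (shipped with FFmpeg) is installed; the function name has_audio is hypothetical.

```shell
# has_audio FILE: exits 0 if ffprobe lists at least one audio stream, 1 otherwise.
has_audio() {
    # -select_streams a limits the report to audio streams; with csv=p=0 the
    # output is one line ("audio") per stream, or nothing if there are none.
    streams=$(ffprobe -v error -select_streams a \
        -show_entries stream=codec_type -of csv=p=0 "$1")
    [ -n "$streams" ]
}

# Example:
#   if has_audio input.y4m; then
#       ffmpeg -i input.y4m -vn -c:a aac -b:a 128k -movflags +faststart output.m4b
#   else
#       echo "input.y4m has no audio stream; nothing to extract" >&2
#   fi
```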
Yes, the conversion from any raw/uncompressed audio in a Y4M pipeline to M4B involves lossy AAC encoding at 128k bitrate by default. AAC at 128k is generally considered transparent for speech content like audiobooks and podcasts, but it is not lossless. If your original audio was uncompressed PCM, the encoder will introduce some compression artifacts, though at 128k these are typically inaudible for voice recordings. Raising the bitrate to 192k or 256k reduces artifacts further, but the result is still lossy. If exact audio fidelity is critical, use a lossless codec that the MPEG-4 container can carry, such as ALAC ('-c:a alac'), and check that your target player supports it in M4B files.
Replace the '128k' value in the '-b:a 128k' flag with your desired bitrate. For example, use '-b:a 64k' for smaller files suitable for speech-only audiobooks, '-b:a 192k' for higher fidelity, or '-b:a 256k' for near-transparent AAC quality. The full command with a higher bitrate would look like: ffmpeg -i input.y4m -vn -c:a aac -b:a 192k -movflags +faststart output.m4b. Bitrates above 192k offer diminishing returns for AAC audio, especially for spoken-word content.
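
As a rough worked example of how the bitrate choice affects file size: FFmpeg treats the k suffix as 1000, so the audio payload is approximately kbps × 1000 × seconds / 8 bytes. The sketch below assumes the encoder hits its target bitrate exactly and ignores container overhead; the function name estimate_audio_bytes is hypothetical.

```shell
# estimate_audio_bytes KBPS SECONDS: approximate audio payload size in bytes.
estimate_audio_bytes() {
    echo $(( $1 * 1000 * $2 / 8 ))
}

# One hour (3600 s) of audio at three of the bitrates mentioned above:
estimate_audio_bytes 64 3600     # prints 28800000  (about 28.8 MB)
estimate_audio_bytes 128 3600    # prints 57600000  (about 57.6 MB)
estimate_audio_bytes 192 3600    # prints 86400000  (about 86.4 MB)
```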
The M4B container itself fully supports chapters and bookmarking, which are core features of the format as used by Apple Books, VLC, and podcast apps. However, this FFmpeg command does not inject chapter metadata automatically; it only extracts the audio and places it in a chapter-capable container. To add chapter markers, supply a chapter file in FFmpeg's ffmetadata format as a second input (for example '-f ffmetadata -i chapters.txt') and map it into the output with '-map_metadata 1 -map_chapters 1', or add the chapters after conversion using a tool like mp4chaps.
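
A chapter file in the ffmetadata format is plain text and can be generated from a script. The following is a minimal sketch under stated assumptions: the helper names init_chapters and add_chapter, the file name chapters.txt, and the chapter timings are all made up for illustration.

```shell
# init_chapters FILE: start a new ffmetadata file with the required signature line.
init_chapters() {
    printf ';FFMETADATA1\n' > "$1"
}

# add_chapter FILE START_MS END_MS TITLE: append one chapter entry.
# TIMEBASE=1/1000 means START and END are expressed in milliseconds.
# Titles containing '=', ';', '#', '\' or newlines need backslash escaping,
# which this sketch does not attempt.
add_chapter() {
    printf '[CHAPTER]\nTIMEBASE=1/1000\nSTART=%s\nEND=%s\ntitle=%s\n' \
        "$2" "$3" "$4" >> "$1"
}

# Example (two ten-minute chapters):
#   init_chapters chapters.txt
#   add_chapter chapters.txt 0 600000 "Chapter 1"
#   add_chapter chapters.txt 600000 1200000 "Chapter 2"
#   ffmpeg -i input.y4m -f ffmetadata -i chapters.txt -map 0:a \
#          -map_metadata 1 -map_chapters 1 -c:a aac -b:a 128k \
#          -movflags +faststart output.m4b
```

In the example command, -map 0:a selects only the audio from the first input, so the metadata file contributes chapters and tags but no streams.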
The -movflags +faststart flag moves the MOOV atom (the file's index and metadata block) from the end of the MP4/M4B file to the beginning; FFmpeg does this in a second pass after encoding finishes. This matters for web streaming and podcast delivery because a player needs the MOOV atom before it can decode anything: with the atom at the front, playback can begin as soon as the first bytes arrive. Without it, a player must either download the whole file first or issue extra byte-range requests to fetch the index from the end. If you only play the M4B file locally, the flag has no practical effect, but including it is good practice.
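
One quick way to see whether a file was written with fast start is to compare where the moov and mdat atom names first appear. This is only a heuristic sketch: it assumes GNU grep (for -a -b -o), the function names are hypothetical, and it can be fooled if the bytes "moov" or "mdat" happen to occur inside other data.

```shell
# first_offset FILE TAG: byte offset of the first occurrence of TAG (empty if absent).
first_offset() {
    LC_ALL=C grep -a -b -o "$2" "$1" | head -n 1 | cut -d: -f1
}

# moov_is_first FILE: exits 0 if "moov" appears before "mdat" in the file.
moov_is_first() {
    moov=$(first_offset "$1" moov)
    mdat=$(first_offset "$1" mdat)
    [ -n "$moov" ] && [ -n "$mdat" ] && [ "$moov" -lt "$mdat" ]
}

# Example:
#   moov_is_first output.m4b && echo "index is at the front (fast start)"
```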
Yes. On Linux or macOS, you can use a shell loop: 'for f in *.y4m; do ffmpeg -i "$f" -vn -c:a aac -b:a 128k -movflags +faststart "${f%.y4m}.m4b"; done'. On Windows Command Prompt, use: 'for %f in (*.y4m) do ffmpeg -i "%f" -vn -c:a aac -b:a 128k -movflags +faststart "%~nf.m4b"'. If you put the Windows command in a .bat script, double the percent signs (%%f and %%~nf). Keep in mind that Y4M files are typically very large due to their uncompressed nature, so batch processing may be I/O intensive and slow on spinning-disk drives.

Technical Notes

Y4M's default_audio_codec is null, meaning the format specification does not define an audio codec, and most Y4M files produced by standard tools like ffmpeg, rawvideo pipelines, or mjpegtools contain no audio stream whatsoever. The rawvideo stream in the Y4M input is not carried into the audio-only M4B output; it is discarded entirely via the -vn flag. Any audio present is re-encoded using FFmpeg's native AAC encoder. AAC itself is an ISO/IEC standard (MPEG-2 Part 7 and MPEG-4 Part 3) and is broadly compatible with Apple devices, Android, and web browsers.

M4B is structurally identical to M4A and MP4 at the container level: all are MPEG-4 Part 14 files built on the ISO base media file format (MPEG-4 Part 12). The .m4b extension signals audiobook intent to platforms like Apple Books, enabling resume-on-close bookmarking behavior. Metadata tags (title, artist, album) will generally not carry over from the source, since Y4M has extremely minimal metadata support. You may want to add MP4 metadata tags to the M4B using FFmpeg's -metadata flag (for example '-metadata title="My Book"') or a post-processing tool.

File size will drop dramatically: a Y4M file is raw uncompressed video (potentially gigabytes for even short clips), while the output M4B contains only AAC-compressed audio, often a fraction of a percent of the original size.
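
To put numbers on that size drop, here is a rough worked estimate of the raw video payload. It assumes 8-bit 4:2:0 frames (1.5 bytes per pixel), ignores the small Y4M text headers, and needs a shell with 64-bit arithmetic; the function name raw_420_bytes is hypothetical.

```shell
# raw_420_bytes WIDTH HEIGHT FPS SECONDS: approximate size of 8-bit 4:2:0 raw video.
# Each frame holds WIDTH*HEIGHT luma bytes plus half as many chroma bytes.
raw_420_bytes() {
    echo $(( $1 * $2 * 3 / 2 * $3 * $4 ))
}

# One minute of 1920x1080 video at 30 frames per second:
raw_420_bytes 1920 1080 30 60    # prints 5598720000  (about 5.6 GB)
```

For comparison, 60 seconds of audio at 128k is roughly 128000 × 60 / 8 = 960,000 bytes, which is about 0.017% of that raw video figure.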
