Extract Audio from Y4M to AMR — Free Online Tool

Extract audio from Y4M (YUV4MPEG2) video files and convert it to AMR format using the libopencore_amrnb codec — optimized for speech and voice recordings at ultra-low bitrates. Ideal for pulling dialogue or narration from lossless intermediate video files into a mobile-ready, telephony-compatible audio format.

FFmpeg Command

Copy this command to run the same conversion locally with FFmpeg on your desktop:

ffmpeg -i input.y4m -vn -c:a libopencore_amrnb -b:a 12200 output.amr

Free — no uploads, no signups. Your files never leave your browser.

How It Works

Y4M (YUV4MPEG2) is a raw, uncompressed video format commonly used as an intermediate between video processing applications. The format defines only video frames and no audio stream, so a standard Y4M file contains no audio, and this conversion first checks whether FFmpeg detects an audio track in your input. If one is found, FFmpeg discards the video stream with the -vn flag and encodes the audio with libopencore_amrnb, the Adaptive Multi-Rate Narrowband (AMR-NB) encoder, at 12,200 bps, the highest AMR-NB bitrate mode. The result is a compact .amr file tuned for speech and suitable for mobile devices and telephony systems. The video is dropped rather than re-encoded, so the only lossy step is the single audio encode to AMR-NB.
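
The check-then-encode flow described above can be sketched as a pair of shell functions. This is a minimal sketch, not the tool's actual implementation: the function names has_audio and extract_amr are hypothetical, it assumes ffprobe is installed alongside ffmpeg, and the explicit -ar 8000 -ac 1 options are added here as a precaution because AMR-NB only accepts 8 kHz mono input.

```shell
# has_audio FILE: succeed if ffprobe reports at least one audio stream.
has_audio() {
    streams=$(ffprobe -v error -select_streams a \
        -show_entries stream=index -of csv=p=0 "$1")
    [ -n "$streams" ]
}

# extract_amr INPUT OUTPUT: encode the audio track to AMR-NB at 12,200 bps.
extract_amr() {
    if ! has_audio "$1"; then
        echo "no audio stream found in $1" >&2
        return 1
    fi
    # -vn drops the video; -ar/-ac force the 8 kHz mono input AMR-NB needs.
    ffmpeg -i "$1" -vn -ar 8000 -ac 1 \
        -c:a libopencore_amrnb -b:a 12200 "$2"
}
```

After sourcing these definitions, a call such as 'extract_amr input.y4m output.amr' either runs the encode or exits early with a message when no audio stream is found.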

What Each Flag Does

Flag | What it does
ffmpeg | Invokes FFmpeg, the open-source multimedia tool that reads the Y4M input and writes the AMR output.
-i input.y4m | Specifies the input Y4M (YUV4MPEG2) file. FFmpeg reads the raw uncompressed video stream and any audio stream it detects in the input.
-vn | Disables video output. Because the goal is audio only, the raw video stream from the Y4M file is discarded and not written to the AMR output.
-c:a libopencore_amrnb | Encodes the audio with libopencore_amrnb, which implements the Adaptive Multi-Rate Narrowband (AMR-NB) standard, a speech-optimized codec used in mobile telephony and the usual encoder for .amr files.
-b:a 12200 | Sets the AMR-NB bitrate to 12,200 bps, the highest mode in the AMR-NB standard, which gives the best speech intelligibility available within the format's 8 kHz sampling rate.
output.amr | Names the output file. The .amr extension tells FFmpeg to write the encoded AMR-NB stream into an AMR file ready for playback on mobile devices and telephony systems.

Common Use Cases

  • Extracting voice-over narration recorded alongside a Y4M intermediate file during a video production pipeline, for archiving as a lightweight AMR file on a mobile device
  • Pulling dialogue or spoken commentary from a lossless Y4M file produced by a tool such as an FFmpeg pipe or AviSynth, to share as an AMR voice message on a mobile platform
  • Converting interview audio captured in a Y4M intermediate format into AMR for integration into a mobile telephony or IVR system
  • Archiving spoken audio from a Y4M test or reference file in a low-bitrate AMR format to save storage while preserving intelligibility for quality assurance review
  • Extracting audio from a Y4M file generated by a broadcast or research tool and converting it to AMR for playback on legacy mobile handsets that natively support the format
  • Preparing voice content from a Y4M video in an AMR file for upload to platforms or apps that require AMR-encoded audio for speech recognition or telephony workflows

Frequently Asked Questions

Y4M (YUV4MPEG2) is a raw video format designed for lossless video transport and piping between applications. Its specification defines no audio stream, so Y4M files normally contain no audio at all. If FFmpeg finds no audio track in your file, the conversion stops with an error (typically along the lines of 'Output file does not contain any stream') instead of producing a usable AMR file. You can check for audio by running 'ffmpeg -i input.y4m' and looking for an 'Audio:' entry in the stream list before attempting the conversion.
AMR-NB (Narrowband, encoded here with libopencore_amrnb) samples audio at 8 kHz and reproduces roughly the 300 to 3,400 Hz telephone band, which is why it suits telephone-quality speech. AMR-WB (Wideband) samples at 16 kHz, extends the audio band to about 7 kHz, and delivers noticeably better speech clarity. This tool defaults to AMR-NB because it is the most widely supported variant across legacy and modern mobile devices. If your source audio contains higher-quality speech and your target platform supports AMR-WB, modify the FFmpeg command to use an AMR-WB encoder. In FFmpeg that encoder is libvo_amrwbenc (libopencore_amrwb only decodes), so the options become '-ar 16000 -ac 1 -c:a libvo_amrwbenc' with a compatible bitrate such as 23850.
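
As a rough illustration of the narrowband versus wideband choice, the hypothetical helper amr_args below prints the encoder options for each variant. It assumes an FFmpeg build that includes libvo_amrwbenc for AMR-WB encoding, which many builds omit; 'ffmpeg -encoders' lists what a given build provides.

```shell
# amr_args VARIANT: print encoder options for "nb" (8 kHz) or "wb" (16 kHz).
amr_args() {
    case "$1" in
        nb) echo "-ar 8000 -ac 1 -c:a libopencore_amrnb -b:a 12200" ;;
        wb) echo "-ar 16000 -ac 1 -c:a libvo_amrwbenc -b:a 23850" ;;
        *)  echo "unknown AMR variant: $1" >&2; return 1 ;;
    esac
}

# Usage (word splitting of the unquoted substitution is intentional):
#   ffmpeg -i input.y4m -vn $(amr_args wb) output.amr
```
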
AMR-NB at 12,200 bps, its highest bitrate, is still very aggressive compression designed specifically for speech. If your source audio is music or other wideband material, expect significant quality loss: the 8 kHz sampling rate removes everything above 4 kHz, and the codec's speech-production model (ACELP) performs poorly on non-speech content. For voice and spoken-word audio, intelligibility is generally good at 12,200 bps, though the result will sound narrowband and telephone-like compared to the original.
AMR-NB supports only eight fixed bitrate modes — 4,750, 5,150, 5,900, 6,700, 7,400, 7,950, 10,200, and 12,200 bps — defined by the 3GPP telephony standard. These are not arbitrary limits but rather the complete set of encoding modes built into the codec. There is no way to exceed 12,200 bps within the AMR-NB format; 12,200 bps is already the highest quality mode available. If you need higher-quality audio output, you would need to choose a different output format such as AAC, MP3, or FLAC.
To use a lower AMR-NB bitrate, replace the '12200' value in the '-b:a 12200' flag with one of the supported modes: 4750, 5150, 5900, 6700, 7400, 7950, or 10200. For example, to encode at the lowest bitrate for maximum file size reduction, use: 'ffmpeg -i input.y4m -vn -c:a libopencore_amrnb -b:a 4750 output.amr'. Lower bitrates reduce file size but increase speech degradation, so 12,200 bps is recommended unless storage is extremely constrained.
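
Because only the eight listed modes are valid, a script that accepts an arbitrary target bitrate may want to map it onto one of them first. The helper below is a sketch of one way to do that; the name amrnb_mode is hypothetical, and rounding down to the nearest mode is a choice made for this example, not something the tool is known to do.

```shell
# amrnb_mode BITRATE: print the highest AMR-NB mode (in bps) that does not
# exceed BITRATE, or the lowest mode if BITRATE is below all of them.
amrnb_mode() {
    chosen=4750
    for mode in 4750 5150 5900 6700 7400 7950 10200 12200; do
        if [ "$1" -ge "$mode" ]; then
            chosen=$mode
        fi
    done
    echo "$chosen"
}

# Usage:
#   ffmpeg -i input.y4m -vn -c:a libopencore_amrnb \
#       -b:a "$(amrnb_mode 8000)" output.amr
```

With this mapping, a request for 8000 bps selects the 7,950 bps mode, and anything at or above 12200 selects 12,200 bps.
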
The single-file command shown on this page processes one Y4M file at a time. To batch convert multiple Y4M files on Linux or macOS, you can use a shell loop: 'for f in *.y4m; do ffmpeg -i "$f" -vn -c:a libopencore_amrnb -b:a 12200 "${f%.y4m}.amr"; done'. On Windows Command Prompt, use: 'for %f in (*.y4m) do ffmpeg -i "%f" -vn -c:a libopencore_amrnb -b:a 12200 "%~nf.amr"'. This is particularly useful when processing Y4M files generated in bulk by a video encoding pipeline.
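
For larger batches it can help to keep going when one file fails, which is likely when some Y4M inputs have no audio. The sketch below wraps the loop above in a function that counts failures; the names amr_name and batch_amr are hypothetical, and the -nostdin option is added so FFmpeg does not read from the terminal while the loop runs.

```shell
# amr_name FILE: print the output name for FILE, swapping .y4m for .amr.
amr_name() {
    echo "${1%.y4m}.amr"
}

# batch_amr DIR: convert every .y4m file in DIR, reporting failures
# without stopping the loop.
batch_amr() {
    failed=0
    for f in "$1"/*.y4m; do
        [ -e "$f" ] || continue   # glob matched nothing
        # -nostdin keeps ffmpeg from consuming the loop's standard input.
        if ! ffmpeg -nostdin -i "$f" -vn -c:a libopencore_amrnb \
                -b:a 12200 "$(amr_name "$f")"; then
            echo "failed: $f" >&2
            failed=$((failed + 1))
        fi
    done
    echo "$failed file(s) failed"
}
```

Calling 'batch_amr .' processes the current directory and prints a one-line summary at the end.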

Technical Notes

Standard Y4M files carry no audio track: the format exists for raw, lossless video transport, and the YUV4MPEG2 header and frame layout define no audio stream. Before using this tool, confirm that FFmpeg actually reports an audio stream in your input. If it does, the conversion is a single encode step to AMR-NB, preserving as much source quality as the codec's constraints allow. AMR-NB accepts only mono audio at an 8,000 Hz sample rate. Stereo or higher-rate audio must be downmixed and resampled first, and because some FFmpeg builds reject such input instead of converting it automatically, adding '-ar 8000 -ac 1' to the command is the safer choice. Resampling to 8 kHz limits the audio to frequencies below 4 kHz. The output .amr file carries no title, artist, or chapter information, because the AMR file format has no standardized metadata container equivalent to ID3 or Vorbis comments. File sizes are very small: at 12,200 bps, one minute of AMR-NB audio is roughly 92 KB of encoded speech, or about 96 KB on disk once per-frame headers are counted. That makes AMR practical for storage-constrained mobile environments but unsuitable for any use that needs more than speech intelligibility.
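
The file-size figure can be checked with a little arithmetic. The sketch below uses two hypothetical helpers: amrnb_payload_bytes applies the bitrate alone, while amrnb_file_bytes assumes the standard AMR storage layout (a 6-byte magic header followed by one 32-byte frame per 20 ms at the 12,200 bps mode, with no silence compression). Real files may differ slightly if the encoder is configured otherwise.

```shell
# amrnb_payload_bytes SECONDS: bytes implied by the 12,200 bps rate alone.
amrnb_payload_bytes() {
    echo $(( $1 * 12200 / 8 ))
}

# amrnb_file_bytes SECONDS: estimated size of the .amr file on disk.
# Assumes a 6-byte "#!AMR\n" magic header, then one 32-byte frame
# (1 header byte + 31 payload bytes) for every 20 ms of audio.
amrnb_file_bytes() {
    frames=$(( $1 * 50 ))        # 50 frames per second
    echo $(( 6 + frames * 32 ))
}
```

For one minute of audio, amrnb_payload_bytes 60 gives 91500 and amrnb_file_bytes 60 gives 96006, so the on-disk size lands a little above the figure implied by the bitrate.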
