Extract Audio from WTV to AMR — Free Online Tool

Extract audio from Windows Media Center TV recordings (.wtv) and convert it to AMR format using the libopencore_amrnb codec — ideal for isolating speech from broadcast recordings for mobile or telephony use. AMR's narrow-band compression is purpose-built for voice, making it a compact choice when video and music content from the WTV file is irrelevant.

FFmpeg Command

Copy this command to run the same conversion locally with FFmpeg on your desktop. Download FFmpeg

Free — no uploads, no signups. Your files never leave your browser.

Estimated output:

Conversion Complete!

Download

How It Works

WTV files recorded by Windows Vista/7/8 Media Center typically contain an H.264 video stream and an AAC or MP3 audio stream, along with broadcast metadata such as program title, channel, and air time. During this conversion, FFmpeg discards the video stream entirely using the -vn flag, then re-encodes the audio stream — regardless of whether it was AAC or MP3 — into AMR Narrow-Band using the libopencore_amrnb codec at 12,200 bps (the highest AMR-NB bitrate, corresponding to the 12.2 kbps mode). Because AMR-NB is optimized for speech frequencies and uses a fixed 8 kHz sample rate, FFmpeg will also downsample the audio if the source was recorded at a higher rate (e.g., 44.1 kHz or 48 kHz). The result is a small, speech-optimized audio file stripped of all video, chapters, and most metadata.

What Each Flag Does

Flag What it does
ffmpeg Invokes the FFmpeg binary. In this browser-based tool, this runs via FFmpeg.wasm compiled to WebAssembly, processing your WTV file entirely within your browser without any server upload.
-i input.wtv Specifies the input Windows Television file recorded by Windows Media Center. FFmpeg will parse the WTV container to identify all streams, including the H.264 video, AAC or MP3 audio, and any embedded broadcast metadata.
-vn Disables video output entirely, instructing FFmpeg to ignore the H.264 video stream from the WTV recording. Since AMR is an audio-only format and we only want the speech content, the video track is dropped rather than transcoded.
-c:a libopencore_amrnb Selects the libopencore_amrnb encoder to produce AMR Narrow-Band audio output. This codec implements the 3GPP AMR-NB standard used in mobile telephony and re-encodes the broadcast audio (originally AAC or MP3) into AMR's speech-optimized compression scheme at 8 kHz mono.
-b:a 12200 Sets the AMR-NB encoding bitrate to 12,200 bits per second (Mode 7), the highest quality mode available in the AMR-NB standard. Unlike most codecs, AMR-NB bitrates are specified as raw integer bps values — 12200 here means 12.2 kbps and produces the clearest speech reproduction possible within the AMR-NB format.
output.amr Defines the output filename and signals to FFmpeg that the result should be written in the AMR single-channel file format. The .amr extension causes FFmpeg to use the AMR muxer, which writes the standard '#!AMR' file header recognised by mobile devices and telephony systems.

Common Use Cases

  • Extracting spoken-word content from a recorded news broadcast or talk show for playback on a basic mobile phone that only supports AMR audio
  • Archiving the audio commentary track from a recorded sports broadcast in a compact format for review on low-storage devices
  • Preparing interview segments recorded via Windows Media Center for upload to a telephony or IVR system that requires AMR-NB input
  • Isolating a recorded radio programme (captured via a TV tuner card to WTV) as an AMR file for offline listening on older Nokia or feature phones
  • Generating lightweight voice reference clips from DVR recordings for transcription workflows that accept AMR input
  • Stripping the audio from a recorded documentary to create a compact speech file for accessibility or dubbing review purposes

Frequently Asked Questions

AMR Narrow-Band operates at a fixed 8 kHz sample rate, which means it captures only frequencies up to 4 kHz — adequate for speech intelligibility but noticeably degraded for music, sound effects, or anything with a wide frequency range. WTV recordings from broadcast TV are typically encoded at 44.1 kHz or 48 kHz AAC or MP3, so the downsampling to 8 kHz will significantly reduce audio fidelity. For recorded news, interviews, or talk shows the result is perfectly usable; for music or drama with a full sound mix, the quality loss will be very apparent.
AMR-NB uses non-standard bitrate values expressed in raw bits per second rather than kilobits per second, so the correct value is 12200 (not 12200k or 12.2k). The available AMR-NB modes range from 4750 bps to 12200 bps, and FFmpeg's libopencore_amrnb codec accepts these exact integer values with the -b:a flag. Using 12200 selects the highest-quality AMR-NB mode (Mode 7), which provides the best speech reproduction while remaining within the AMR standard's ceiling.
Replace the value after -b:a with any of the eight valid AMR-NB bitrates: 4750, 5150, 5900, 6700, 7400, 7950, 10200, or 12200. For example, use '-b:a 7400' for a smaller file with acceptable speech quality, or '-b:a 4750' for the smallest possible file (useful for very long recordings). Note that all eight modes decode to the same 8 kHz audio — lower bitrates simply apply more aggressive compression to the speech signal, which can introduce warbling or robotic artefacts at the lower end.
No. AMR files have no support for rich metadata containers, so programme title, channel, description, air date, and any EPG (Electronic Programme Guide) data embedded in the WTV file are lost during conversion. If preserving this metadata matters, you should record it manually before converting, or first remux the WTV to another format like MKV that supports metadata, and keep that as your archive copy.
By default, FFmpeg selects the first audio stream in the WTV file, which is typically the primary language track. If you want to extract a different audio track — for example, a secondary language or audio description track common in broadcast recordings — you need to specify it explicitly using the -map flag, such as adding '-map 0:a:1' to select the second audio stream. Without this flag, only the default stream will be encoded into the AMR output.
Yes. On Linux or macOS you can use a shell loop: 'for f in *.wtv; do ffmpeg -i "$f" -vn -c:a libopencore_amrnb -b:a 12200 "${f%.wtv}.amr"; done'. On Windows Command Prompt, use: 'for %f in (*.wtv) do ffmpeg -i "%f" -vn -c:a libopencore_amrnb -b:a 12200 "%~nf.amr"'. This is especially useful if you have a library of Media Center recordings and want to extract spoken-word audio from all of them in one pass.

Technical Notes

WTV (Windows Television) is a proprietary Microsoft container used by Windows Media Center, typically wrapping H.264 video with AAC audio at broadcast-standard sample rates of 44.1 kHz or 48 kHz. The libopencore_amrnb encoder used in this conversion implements the 3GPP AMR Narrow-Band standard, which is hard-limited to 8,000 Hz sample rate and mono audio — FFmpeg will automatically handle the sample rate conversion and downmix stereo broadcast audio to mono as part of the encoding process. The AMR-NB format does not support stereo, chapters, subtitles, or embedded metadata, so all of these are silently dropped. Because both the source AAC/MP3 audio and the output AMR are lossy formats, this conversion involves a generation loss: the audio is decoded from its broadcast codec and then re-encoded into AMR, which compounds compression artefacts. The libopencore_amrnb library must be present in the FFmpeg build for this command to work; many default Linux package manager builds of FFmpeg omit it due to patent licensing considerations, so you may need a build from ffmpeg.org/download or a third-party package that includes non-free codecs. The output .amr file uses the single-channel AMR storage format (magic number '#!AMR
'), which is widely supported by mobile players and telephony systems.

Related Tools