Extract Audio from WTV to CAF — Free Online Tool

Extract audio from Windows Media Center WTV recordings and save it as a Core Audio Format (CAF) file with uncompressed PCM audio. This tool strips the video stream and re-encodes the AAC audio from your DVR recording into 16-bit PCM inside Apple's extensible CAF container — ideal for high-fidelity audio archiving within the Apple ecosystem.

FFmpeg Command

Copy this command to run the same conversion locally with FFmpeg on your desktop: 'ffmpeg -i input.wtv -vn -c:a pcm_s16le -b:a 128k output.caf'

Free — no uploads, no signups. Your files never leave your browser.

How It Works

WTV files recorded by Windows Media Center contain video and lossy-compressed audio captured from digital broadcast sources. This page assumes H.264 video and AAC audio, but FFmpeg detects the actual codecs, and the same command applies if a recording carries something else, such as MPEG-2 video or AC-3 audio. The -vn flag discards the video stream entirely; FFmpeg then decodes the audio track and re-encodes it as 16-bit signed little-endian PCM (pcm_s16le) inside a CAF container. Unlike a simple remux, the audio is fully decoded from its lossy source and written as uncompressed PCM. The output preserves whatever audio quality existed in the original broadcast recording, with no further lossy compression applied, but it cannot restore detail the broadcast encoder already discarded. CAF's design removes the 4GB file size ceiling found in WAV and AIFF, making it suitable for long TV recordings that could otherwise overflow those containers.

What Each Flag Does

Flag | What it does
ffmpeg | Invokes the FFmpeg tool, which runs here as a WebAssembly (FFmpeg.wasm) instance entirely within your browser, so no file data leaves your device during processing.
-i input.wtv | Specifies the input WTV file recorded by Windows Media Center, which typically contains an H.264 video stream and an AAC audio stream captured from a digital broadcast source.
-vn | Disables video output entirely, telling FFmpeg to ignore the H.264 video stream from the WTV recording so that only the audio is processed and written to the CAF file.
-c:a pcm_s16le | Decodes the AAC audio from the WTV file and re-encodes it as 16-bit signed little-endian uncompressed PCM, a standard uncompressed audio format compatible with Core Audio and Apple's CAF container.
-b:a 128k | Specifies a target audio bitrate. This flag has no practical effect on uncompressed pcm_s16le output, because PCM bitrate is determined entirely by sample rate, bit depth, and channel count, not by a compression target. It is included for consistency with other conversion presets.
output.caf | Defines the output file as a CAF (Core Audio Format) container, Apple's extensible audio format that supports file sizes beyond the 4GB limit of WAV and AIFF, making it suitable for long WTV recordings converted to uncompressed audio.

Common Use Cases

  • Extracting the audio commentary track from a recorded sports broadcast to archive or share independently of the video
  • Pulling speech or dialogue from a recorded TV documentary to use as a high-quality audio source for transcription or captioning workflows on macOS
  • Converting a Windows Media Center DVR recording of a live concert or music performance into an uncompressed CAF file for editing in Logic Pro or GarageBand
  • Archiving the audio from long-form recorded TV content (multi-hour events) into CAF, which handles file sizes beyond the 4GB limit that would break WAV or AIFF
  • Migrating DVR audio content from a Windows-centric WTV archive into a format natively supported by Core Audio on macOS and iOS for use in Apple-platform audio tools
  • Extracting a clean uncompressed audio master from a WTV news recording before further editing or re-encoding into other delivery formats

Frequently Asked Questions

The conversion from WTV to CAF involves decoding the original AAC audio track and re-encoding it as uncompressed PCM — so no additional lossy compression is introduced. However, because the WTV's source audio was already encoded as lossy AAC during the original broadcast recording, that prior quality loss cannot be recovered. The PCM output is a lossless snapshot of whatever fidelity the AAC encoding preserved, which is typically very good for broadcast-quality content at the bitrates used by digital TV.
By default, FFmpeg selects only the best single audio stream from the WTV file — typically the primary audio track. WTV files from Windows Media Center can include multiple audio tracks (for example, secondary language streams or secondary audio program tracks from digital broadcasts), but CAF does not support multiple audio tracks in a single file. If you need a non-default track, you would need to modify the FFmpeg command to add a flag like '-map 0:a:1' to select the second audio stream specifically.
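
To find out which '-map 0:a:N' value corresponds to which track, the audio streams can be listed with ffprobe first. The sketch below assumes ffprobe was run locally as 'ffprobe -v error -select_streams a -show_entries stream=index,codec_name,channels:stream_tags=language -of json input.wtv' and parses the JSON it prints. The function name audio_map_options and the sample JSON are made up for illustration and do not come from a real recording.

```python
import json


def audio_map_options(ffprobe_json):
    """Turn ffprobe's JSON stream list into '-map' specifiers per audio track."""
    streams = json.loads(ffprobe_json).get("streams", [])
    options = []
    # With -select_streams a, list position equals the N in '0:a:N'.
    for position, stream in enumerate(streams):
        options.append({
            "map": f"0:a:{position}",
            "codec": stream.get("codec_name"),
            "channels": stream.get("channels"),
            "language": stream.get("tags", {}).get("language"),
        })
    return options


# Hypothetical ffprobe output for a recording with two audio tracks.
SAMPLE = """
{"streams": [
  {"index": 1, "codec_name": "aac", "channels": 2, "tags": {"language": "eng"}},
  {"index": 2, "codec_name": "aac", "channels": 2, "tags": {"language": "spa"}}
]}
"""

if __name__ == "__main__":
    for option in audio_map_options(SAMPLE):
        print(option["map"], option["codec"], option["channels"], option["language"])
```

The 'map' value of the wanted track is then placed in the command before the output filename, for example '-map 0:a:1' for the second track.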
The original audio in a WTV file is stored as AAC, a lossy compressed format typically encoded at around 128–256 kbps. The CAF output uses pcm_s16le, which is fully uncompressed at CD bit depth (16-bit) and keeps whatever sample rate the source used, commonly 48 kHz for broadcast. Uncompressed 16-bit PCM for a 48 kHz stereo signal requires roughly 11.5 MB per minute (48,000 samples × 2 channels × 2 bytes × 60 seconds), compared to about 1.4 MB per minute for AAC at 192 kbps. The CAF file will therefore be substantially larger: about 8 times the size of the compressed audio at 192 kbps, and roughly 6 to 12 times across the 128–256 kbps range.
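
The size difference follows directly from the data rates, and a short Python sketch makes the arithmetic explicit. The helper names are hypothetical, 1 MB is taken as 1,000,000 bytes, and container overhead is ignored.

```python
def pcm_mb_per_minute(sample_rate, channels, bits_per_sample):
    """Uncompressed PCM size per minute, in MB (1 MB = 1,000,000 bytes)."""
    bytes_per_second = sample_rate * channels * bits_per_sample / 8
    return bytes_per_second * 60 / 1_000_000


def compressed_mb_per_minute(bitrate_kbps):
    """Size per minute of a constant-bitrate compressed stream, in MB."""
    bytes_per_second = bitrate_kbps * 1000 / 8
    return bytes_per_second * 60 / 1_000_000


if __name__ == "__main__":
    pcm = pcm_mb_per_minute(48_000, 2, 16)
    for kbps in (128, 192, 256):
        aac = compressed_mb_per_minute(kbps)
        print(f"{kbps} kbps: {aac:.2f} MB/min compressed, "
              f"{pcm:.2f} MB/min PCM, ratio {pcm / aac:.1f}x")
```

Because the PCM rate is fixed by sample rate, channel count, and bit depth, the ratio depends only on how heavily the source was compressed.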
CAF is an Apple-native format and is not natively supported by Windows applications. While some third-party tools on Windows (such as VLC or specific audio editors) can open CAF files, the format is primarily designed for macOS, iOS, and other Apple platforms where it integrates with Core Audio. If you need broad cross-platform compatibility, converting the WTV to a WAV or FLAC file instead would be more practical for Windows workflows.
Since the output codec is pcm_s16le (uncompressed PCM), the '-b:a' bitrate flag has no meaningful effect — uncompressed audio quality is determined by bit depth and sample rate, not bitrate. To change the bit depth you can swap pcm_s16le for pcm_s24le or pcm_s32le in the command. To resample the audio, add '-ar 44100' (for CD sample rate) or '-ar 48000' (standard broadcast rate) to the command before the output filename. For example: 'ffmpeg -i input.wtv -vn -c:a pcm_s24le -ar 48000 output.caf'.
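
For batch work, the bit depth and sample rate choices described above can be wrapped in a small helper that assembles the FFmpeg arguments. This is a minimal sketch assuming a locally installed ffmpeg on PATH; the names PCM_CODECS, build_pcm_command, and convert are hypothetical, and the no-op '-b:a' flag is left out.

```python
import subprocess

# Bit depth -> FFmpeg PCM encoder name (little-endian signed variants).
PCM_CODECS = {16: "pcm_s16le", 24: "pcm_s24le", 32: "pcm_s32le"}


def build_pcm_command(src, dst, bit_depth=16, sample_rate=None):
    """Build the FFmpeg argument list for a WTV to CAF audio extraction."""
    if bit_depth not in PCM_CODECS:
        raise ValueError(f"unsupported bit depth: {bit_depth}")
    cmd = ["ffmpeg", "-i", src, "-vn", "-c:a", PCM_CODECS[bit_depth]]
    if sample_rate is not None:
        # Resample only when asked; otherwise the source rate is kept.
        cmd += ["-ar", str(sample_rate)]
    cmd.append(dst)
    return cmd


def convert(src, dst, **options):
    """Run the conversion; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_pcm_command(src, dst, **options), check=True)


if __name__ == "__main__":
    print(" ".join(build_pcm_command("input.wtv", "output.caf", 24, 48_000)))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with filenames that contain spaces.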
WTV files can embed rich broadcast metadata such as program titles, channel information, and episode descriptions captured from the digital TV guide. CAF has limited metadata support compared to WTV, and FFmpeg does not map WTV-specific DVR metadata fields to CAF tags during this conversion. Standard audio tags like title or artist may transfer partially, but broadcast-specific metadata such as recording timestamps and program guide data will generally be lost in the output file.

Technical Notes

WTV is a DVR container developed by Microsoft for Windows Vista Media Center and later, designed specifically for storing digital broadcast recordings including over-the-air and cable content. Its audio streams are lossy-compressed at broadcast bitrates. This page assumes AAC, although recordings may carry other broadcast codecs such as MPEG-1 Layer II or AC-3, which FFmpeg decodes the same way. CAF (Core Audio Format) was designed by Apple to address structural limitations of AIFF and WAV, chiefly the 4GB file size ceiling, while supporting a wide range of audio codecs from uncompressed PCM to AAC, FLAC, and Opus.

In this conversion, FFmpeg decodes the WTV's audio stream to raw PCM and writes it into CAF using the pcm_s16le codec, which is 16-bit signed little-endian PCM matching CD audio bit depth. The sample rate is inherited from the source WTV file and is typically 48 kHz for broadcast content. Because pcm_s16le is uncompressed, output files will be significantly larger than the audio portion of the source WTV. CAF files are not playable in most non-Apple software without third-party support, so the format is best suited to macOS and iOS audio production environments. Multiple audio tracks present in the WTV file (a common feature of digital broadcast recordings with secondary audio programs) cannot be preserved in a single CAF output file due to format limitations.
