April 23rd, 2012
Starting a primer into video encoding
For many, even many video professionals, encoding and modern video compression are basically just a black box full of voodoo computer magic. Most just pick a preset and hope for the best, and sadly many of the presets are really not all that great. I get asked constantly for advice in this area, and over the years I have learned a lot. This needs to be a more complete article, and eventually I will create one - but for now I'm going to address a few of the more salient points. Which means the voodoo will have to remain, for now, at the deepest levels...
Attention Scan Readers - you can scroll to the bottom now to blindly follow recipes without understanding them :) Go Team Black-Box Voodoo!
Some background... My University education was in Electrical Engineering, specializing in communications. Two course topics really stood out for me from that period: Fibre Optic Communications (which I used during my 'other life' in the telecom engineering world) and Computer Compression Technology. At that point it focused more on image compression, but we looked at HDTV, MPEG and other associated video compression technology as well. Much has changed since then, but the fundamentals, as is so frequently the case, remain the same.
When it comes to video, you need to understand that any compressed video file is built from 3 "layers" (I'm going to simplify like crazy here, so bear with me if you know this from a more technical side):
The Source Video Data at some resolution, which gets compressed by the codec into a Compressed Video Stream, which in turn gets stored in a file based on the File Format.
To understand what a video file is or is going to look like, you at least need to know what all of those are. To further complicate matters, there is the concept of Pixel Aspect Ratio, and basically the same progression applies on the audio side of the mix.
I am not going to go into all the theoretical explanations of these things (voodoo) - instead I'll just list a few examples and give an analogy.
Typical Video Data resolutions (source and output) are things like 360p, 480p, 720p and 1080p. The number is the vertical resolution of a single frame, and the "p" means progressive, as opposed to "i" for interlaced (again, not going into this here - ack, more voodoo). Most computer video will be progressive, and very typically 480p (standard definition TV = 720x480 pixels), 720p (1280x720 pixels, sometimes called HD as opposed to "Full" HD) or 1080p (Full-HD at 1920x1080 pixels). Of course, the resolution can be almost anything.
Typical codecs are H.264, MPEG-2, VP6 (and hundreds more)
Typical file formats are mp4, wmv, avi, mov (and hundreds more)
This is where it gets interesting, because you can put the encoded video stream from many different codecs into many different file formats, e.g.:
AVI can contain H.264, MPEG-2, or many other codecs. MOV can too, etc. etc. etc. So the next time someone tells you "Oh just send me an AVI" you'll begin to realize how relatively useless this request is. Do they mean an H.264 encoded AVI, or an MPEG-2 one, or an MJPEG one, or, or, or... Here is an analogy:
Let's compare Raw Video Data to fruit - Apples, Pears, Plums etc...
Codecs are how you prepare the fruit, and file formats in this analogy can be some kind of tasty dessert like a cake or a pie.
I can slice the apples (codec) and then make apple pie or apple cake (same source, same codec, two different output formats)
or
I can dice the apples (codec) and then make pie or cake (same source, different codec, same two output formats as above)
The combinations and permutations here really become endless (especially when you add audio to the mix). What this all means is: if you actually want to know what is contained in a compressed video file, you need to know at least all of the following:
the file format
the video resolution (including pixel aspect ratio)
the video frame rate
the video bitrate
the video codec
the audio bit-depth
the audio bitrate
the audio codec
I could write a separate in-depth article on each of these, but I simply don't have time just yet (yet more voodoo)
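If you would rather ask the file itself than guess, the properties listed above can usually be read with a tool such as ffprobe (part of FFmpeg). What follows is only a minimal sketch, assuming ffprobe is installed and on your path; the function names `probe` and `summarize` are my own for illustration. Note that ffprobe reports most numbers as text, and some fields (bit depth on compressed audio, for example) may be missing or zero.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe (assumed to be installed) and return its JSON report as a dict."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json",
         "-show_format", "-show_streams", path],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def summarize(report):
    """Pick the properties from the list above out of an ffprobe report."""
    summary = {"file_format": report.get("format", {}).get("format_name")}
    for stream in report.get("streams", []):
        kind = stream.get("codec_type")
        # Only the first video stream and the first audio stream are reported.
        if kind == "video" and "video_codec" not in summary:
            summary.update(
                video_codec=stream.get("codec_name"),
                resolution=(stream.get("width"), stream.get("height")),
                pixel_aspect_ratio=stream.get("sample_aspect_ratio"),
                frame_rate=stream.get("avg_frame_rate"),
                video_bitrate=stream.get("bit_rate"),
            )
        elif kind == "audio" and "audio_codec" not in summary:
            summary.update(
                audio_codec=stream.get("codec_name"),
                audio_sample_rate=stream.get("sample_rate"),
                audio_bit_depth=stream.get("bits_per_sample"),
                audio_bitrate=stream.get("bit_rate"),
            )
    return summary
```

Calling `summarize(probe("my_video.mp4"))` (the file name is just a placeholder) gives you one small dictionary to send along the next time someone asks for "an AVI".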
So now on to the question that prompted all this: "How should I output my video so it will look good on YouTube?" First and foremost, it depends on what the source video is, but I'll answer this with a group of examples.
A note about resolution and modern codecs: most modern codecs are designed to work at the 'defined' resolutions (it's more complicated than this - voodoo - but let's keep it simple). If you're editing a video in something like Adobe Premiere, Vegas, or Final Cut, it's best to choose your output resolution first and then work on the project in that resolution, even if your source is slightly off. For example, if I had older video footage from a web cam that was 1024x768, I would edit at either 480p (720x480) or 720p (1280x720) because those are standard sizes. I recommend you work in 1080p or 720p unless you specifically and intentionally only ever want something smaller.
YouTube has several internal formats it always converts videos into:
240p, 360p, 480p, 720p, 1080p
First - YouTube will never "Up-scale" your video and neither should you. Just like with images, one can't increase the quality when there is no data there to use, so try to start with high resolution source video and either scale down or let YouTube scale it down for you.
Second - 1080p (Full-HD) is great, but it's large and slow to work with, and unless you really need it, it is probably overkill for many applications. If you're filming a kid's birthday party, 1080p is most likely overkill - go 720p in that case.
Third - don't neglect the audio. Take care with it, and help keep my dad, a veteran audio engineer of 40 years in the TV industry, from pulling out his hair when he sees another great video with audio so bad it ruins the experience.
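The "never up-scale" rule is easy to turn into a tiny helper. This is only a sketch of how I would pick an output size: the list of heights is the one given above, the function name is made up for illustration, and it looks at height only (it ignores width and pixel aspect ratio, so odd-shaped sources still need a human decision).

```python
# Heights of the formats YouTube is described above as converting into.
YOUTUBE_HEIGHTS = [240, 360, 480, 720, 1080]


def pick_output_height(source_height, ladder=YOUTUBE_HEIGHTS):
    """Return the tallest standard height that does not exceed the source.

    If the source is smaller than every standard height, the source height
    is returned unchanged, so nothing is ever scaled up.
    """
    candidates = [h for h in ladder if h <= source_height]
    return max(candidates) if candidates else source_height


print(pick_output_height(768))   # 720
print(pick_output_height(1080))  # 1080
print(pick_output_height(200))   # 200
```

For the 1024x768 web cam example this lands on 720p, which is one of the two choices mentioned earlier; dropping to 480p is still a perfectly reasonable call if you want smaller files.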
The YouTube Recipes
(and a big hello to the scan readers that skipped down to this point)
Source Video: 720p = 1280x720 pixels at 30 (29.97 to be more precise) frames per second
Usually this is the best all-around format to upload to YouTube, given the fact that most modern video is an HD source
Compression settings:
Codec: H.264
Resolution: 720p (1280x720)
Frame Rate: 29.97
File Format: MP4
Encoding: VBR 2 pass
Target Bitrate: 3 Mbps
Max Bitrate: 7 Mbps
Audio Bit-depth: 16-bit / 48 kHz
Audio Codec: AAC
Audio Bitrate: 160 kbps
Source Video: 1080p = 1920x1080 pixels at 30 frames per second
Full HD and basically best possible YouTube Quality
Compression settings:
Codec: H.264
Resolution: 1080p (1920x1080)
Frame Rate: 30
File Format: MP4
Encoding: VBR 2 pass
Target Bitrate: 10 Mbps
Max Bitrate: 16 Mbps
Audio Bit-depth: 16-bit / 48 kHz
Audio Codec: AAC
Audio Bitrate: 196 kbps
Source Video: 480p = 720x480 pixels at 30 frames per second
This is typical NTSC TV / DVD Quality
Compression settings:
Codec: H.264
Resolution: 480p (720x480)
Frame Rate: 30
File Format: MP4
Encoding: VBR 2 pass
Target Bitrate: 1 Mbps
Max Bitrate: 3 Mbps
Audio Bit-depth: 16-bit / 44.1 kHz
Audio Codec: AAC
Audio Bitrate: 96 kbps
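Since bitrate is simply bits per second, you can get a rough idea of how big each recipe's output will be before you encode anything. A back-of-the-envelope sketch (the function name is my own, and it ignores file format overhead and the fact that VBR only hits its target on average):

```python
def estimated_size_mb(video_bps, audio_bps, duration_seconds):
    """Approximate file size in megabytes (10^6 bytes) from average bitrates."""
    total_bits = (video_bps + audio_bps) * duration_seconds
    return total_bits / 8 / 1_000_000


# 720p recipe: 3 Mbps video + 160 kbps audio, for a 10 minute video.
print(estimated_size_mb(3_000_000, 160_000, 600))  # 237.0
```

So ten minutes at the 720p settings comes out at roughly 237 MB, and the 1080p recipe is a bit more than three times that.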
Finally, one more thing, the Encoding Method: VBR 1-pass (Variable Bit Rate), VBR 2-pass, and CBR (Constant Bit Rate)
Because of how codecs actually compress the video, they need to take the source data and do something to it (voodoo again) to make it smaller. The resulting quality is controlled to some extent by bitrate.
In Constant Bit Rate compression, the entire video is encoded from beginning to end at the same rate throughout. It's "technically" easier to process video like this, but since codecs actually work off the movement/change in the video, parts without much motion waste space and parts with a lot of motion don't get very good quality.
Variable Bit Rate compression allows the rate to change (within some threshold) to give the compression more flexibility. In almost every case this is better than CBR when outputting video for file-based playback.
Variable Bit Rate 2-pass compression is designed to improve upon VBR by allowing the codec to first 'watch' the video and learn where the fast and slow bits are, then tune the compression plan to make the best possible choices and theoretically produce the best trade-off between output size and quality. Because it needs to go through the video 2 times to do this, it takes about twice as long as single-pass VBR.
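To make the difference a little less voodoo, here is a toy model of the idea. It is not how any real codec works: the "complexity" scores are made up, there is no maximum bitrate cap, and a real encoder makes these decisions in far finer detail. It only shows the principle of spending the same total budget evenly (CBR) versus where the motion is (2-pass VBR).

```python
def cbr_plan(complexity, target_kbps):
    """Constant bit rate: every second gets the same budget."""
    return [target_kbps] * len(complexity)


def two_pass_vbr_plan(complexity, target_kbps):
    """Toy two-pass VBR.

    Pass 1: 'watch' the whole clip and total up its complexity.
    Pass 2: hand out the same overall budget in proportion to complexity.
    """
    total_budget = target_kbps * len(complexity)
    total_complexity = sum(complexity)  # pass 1
    if total_complexity == 0:
        return cbr_plan(complexity, target_kbps)
    return [total_budget * c / total_complexity for c in complexity]  # pass 2


# Made-up per-second scores: a busy moment in the middle, a calm ending.
complexity = [2, 2, 4, 1, 1]
print(cbr_plan(complexity, 3000))           # [3000, 3000, 3000, 3000, 3000]
print(two_pass_vbr_plan(complexity, 3000))  # [3000.0, 3000.0, 6000.0, 1500.0, 1500.0]
```

Both plans spend the same total, so the file ends up about the same size, but the second one moves bits from the calm ending to the busy middle where they are actually needed.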
Being someone who is happy to wait for the best possible quality, I always recommend 2-pass VBR unless you're in a rush. But you can use single-pass VBR in all the recipes above instead of 2-pass without changing any of the settings, and the result will be almost as good (only probably slightly larger).
I hope that helps in starting to scratch the surface of video compression. I will do more on this topic because it's been a thorn in too many people's sides for too long, but I need to think through the best way to demystify the voodoo before going into those details.
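If you encode from the command line rather than from an editing program, the recipes map fairly directly onto FFmpeg options. The sketch below only builds the commands, it does not run them. It assumes an ffmpeg build with the libx264 encoder; the function name and file names are placeholders of mine; and the `-bufsize` value is my own assumption (it is not part of the recipes, but ffmpeg needs a buffer size for `-maxrate` to take effect). Audio bit depth is left to the encoder here.

```python
import os


def ffmpeg_commands(src, dst, width, height, fps, target, maxrate,
                    audio_rate, audio_bitrate, two_pass=True):
    """Build ffmpeg argument lists for one recipe (nothing is executed here)."""
    video = [
        "-c:v", "libx264",
        "-vf", f"scale={width}:{height}",
        "-r", str(fps),
        "-b:v", target,
        "-maxrate", maxrate,
        "-bufsize", maxrate,  # assumption: buffer size is not given in the recipes
    ]
    audio = ["-c:a", "aac", "-ar", str(audio_rate), "-b:a", audio_bitrate]
    if not two_pass:
        return [["ffmpeg", "-y", "-i", src, *video, *audio, dst]]
    # Pass 1 only analyses the video: no audio, and the output is thrown away.
    first = ["ffmpeg", "-y", "-i", src, *video,
             "-pass", "1", "-an", "-f", "null", os.devnull]
    # Pass 2 uses the statistics from pass 1 and writes the real file.
    second = ["ffmpeg", "-y", "-i", src, *video, "-pass", "2", *audio, dst]
    return [first, second]


# The 720p recipe from above.
for cmd in ffmpeg_commands("input.mov", "output.mp4", 1280, 720, 29.97,
                           "3M", "7M", 48000, "160k"):
    print(" ".join(cmd))
```

To actually encode, run each command in order (for example with `subprocess.run(cmd, check=True)`). Passing `two_pass=False` gives the single-pass version with the same settings.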