Original Link: https://www.anandtech.com/show/2586



For the past couple of years NVIDIA has been telling us about how amazing its GPUs are at non-gaming applications. We kept seeing slides like this one that showed exactly how fast NVIDIA's GPUs were compared to dual and quad-core CPUs:


The performance advancements were incredible; NVIDIA was promising upwards of 100x gains over the fastest workstation CPUs. Unfortunately we couldn't get too terribly excited, as most of these applications were far beyond the reach of the typical desktop user. Medical imaging and scientific analysis benefited tremendously from GPU acceleration, but it's rare that a gamer with a $400 GPU is going to be searching for oil deposits in his/her spare time on the same machine.

What NVIDIA needed was that killer application that actually had relevance to the desktop user, and it was that sort of application that NVIDIA lacked. Until recently.

Enter Elemental Technologies, the makers of an unfortunately named application called Badaboom (cue cheesy mobster promo videos). Elemental took one of the most time consuming tasks on the desktop, and offloaded a great deal of it to the GPU.

We've seen the benefits of GPU accelerated video decoding, especially with the recent transition to high definition video formats encoded in MPEG-2/VC-1/H.264. Blu-ray movies went from completely unplayable on many machines to a total non-issue with the advent of hardware HD video decode acceleration in GPUs. Both ATI and NVIDIA resorted to specialized hardware to provide support for the full decode pipeline of codecs like H.264, but the small addition of die space was more than worth it. It would take at least 8 of Intel's cores to have the same video decoding power of a single GPU outfitted with either ATI's UVD or NVIDIA's PureVideo HD decode engine - the GPU approach is simply more sensible.

As soon as we got GPU video decode acceleration we wanted to know if/when it would be possible to accelerate video encoding on the GPU. For years ATI and NVIDIA have been telling us that video encoding could be accelerated on the GPU, but for years we were given nothing more than that statement. With the latest round of GPU releases things seemed different. Alongside its GT200 GPU, NVIDIA sent out very early copies of Elemental's Badaboom GPU accelerated video transcoder. Badaboom promised the unimaginable - GPU accelerated H.264 video transcoding at many times the performance of the fastest Intel CPUs.

The initial beta showed promise: the performance gains were significant, but we couldn't really measure quality. Today we have a near-final build of Elemental's Badaboom and we're able to look at the full picture. Today it's about more than just performance: we're looking at the feasibility of the first mainstream GPU-accelerated video transcoding application.



The Application

Badaboom relies on its interface to be one of its biggest strengths, and admittedly it does look very good. On the left you have your sources: optical drive(s), a VIDEO_TS folder from a ripped DVD or a file. In the middle you’ve got a preview of the video itself and on the right you have your output formats with presets for the iPhone, iPod Touch, iPod Classic, Apple TV, Xbox 360 and PS3. Choose your source, choose your output and hit start - that’s all you really need to do.

DVD support is a bit more elegant than your run of the mill video files. You can choose to transcode individual titles or chapters from the DVD, but do keep in mind that Badaboom won't perform any decryption for you - you'll have to break any security on your own.

The standard version of Badaboom will let you use any of these presets but you can’t adjust things like resolution; the pro version gives you an advanced button that lets you configure a bit more. The configurable options are limited to resolution, bitrate, audio, 3:2 detect and deinterlacing. You can’t even specify the name or location of the output file, although thankfully you can cancel a transcode in the middle of it.

During a transcode you get a small preview of the video in the center of the application and an instantaneous frame rate as well as estimated time. There’s no summary window after the transcode has completed indicating average frame rate, total completion time or other vitals about the process.

In a nutshell, that’s the application - it transcodes things and doesn’t let you adjust much. Which brings us to its limitations...



Source Limitations

We've finally got a GPU accelerated video transcoding application, let's transcode some video then, shall we? Not so fast.

Badaboom lacks full Blu-ray support, despite Elemental listing .m2ts files on its supported list. Badaboom won’t decrypt a Blu-ray disc for you so you’ll have to rely on AnyDVD HD to strip out the content protection; unfortunately once you have, you’ll either be met with a crash upon trying to convert the content or an unusable output file.


This happens a lot if you're trying to transcode a DivX file or Blu-ray m2ts

With Blu-ray support out of the question for the initial release, I turned to plain old DVDs; after all, that’s what most people have these days. Thankfully DVD support is much better with Badaboom, albeit far from flawless.

While I could transcode my copy of Bad Boys just fine (and ended up using it for most of the benchmarks), attempting to transcode Star Wars - Episode VI: Return of the Jedi left me with an unusable output file. The source movie was recorded at 24 fps but the transcoded file was a 22 fps movie, resulting in the movie playing back smoothly, but slowly.

DivX support was pretty much hit or miss. While some videos would transcode just fine, others would crash the program. Elemental told us that DivX support is spotty at this point, so the behavior wasn’t unexpected.

Input audio formats are also very limited: only MPEG-1 Layer II and PCM are supported. There’s no support for AAC, MP3, DD/DTS or anything else.

And that’s just the list of issues with various formats we’re trying to transcode...

Functional Limitations

When I first spoke to Elemental about the limitations in the early beta of Badaboom I looked at a couple of months ago, I was told that the professional version would answer a lot of my complaints - offering customizable resolutions, bit rates and more.

In playing around with the review copy I found myself frustrated, once more, by the lack of customization options offered by the program, but I figured the pro version would fix everything. Until it turned out that what I was reviewing was the professional version.

This table should help explain the differences between the standard and professional versions:

| | Badaboom | Badaboom Pro |
|---|---|---|
| Price | $29.99 | $99.99 |
| Maximum Input Resolution | 720 x 576 | 1920 x 1080 |
| Maximum Output Resolution | 720 x 576 | 1920 x 1080 |
| AVCHD Support | Not Supported | Supported |
| HDV Support | Not Supported | Supported |

 

You can’t set custom resolutions in either version; you’re left with the predefined resolutions that Elemental ships with the program. The standard version is limited to 720 x 576 while the pro version will go up to 1920 x 1080. I’ve also had problems where Badaboom will insert a thin black border around the video and slightly squish the aspect ratio when upscaling video.


Those are all of the resolution options you get

The maximum bitrate supported by Badaboom is 5Mbps, and only if you select the Apple TV, Xbox 360 or PS3 profile. There’s no way to define a custom profile - you have to modify an existing one. The lack of full Blu-ray support at this point means that the 5Mbps cap isn’t a huge deal, but the combination of the two severely limits the usefulness of the application.

| Profile | Maximum Bitrate |
|---|---|
| iPhone | 2.5 Mbps |
| iPod Touch | 2.5 Mbps |
| iPod Classic | 1.5 Mbps |
| iPod Nano | 1.5 Mbps |
| Apple TV | 5 Mbps |
| Xbox 360 | 5 Mbps |
| PlayStation 3 | 5 Mbps |

 

The only output format is .mp4, encoded using the Baseline H.264 profile - there’s no support for the Main or High profiles of the codec. Combined with the 5Mbps bitrate cap this isn’t too bad, but again it limits the usefulness of the application.


You don't get a full implementation of the H.264 codec, only the Baseline profile with hardware levels up to 3.1

Transcoding a movie? There’s no way to keep Dolby Digital or DTS audio tracks; the only audio output format supported by Badaboom is AAC. Thankfully you can get multi-channel AAC but that’s it. Elemental is working on getting a DD license.


Ten points to the first person to apply a Bad Boys quote to this limited list of supported inputs/outputs

If you look at the laundry list of options you can set when encoding a video using x264 you’ll see that Badaboom comes quite ill-equipped. While I appreciate the simplicity of the interface, the “advanced” button should allow for much more customization than it actually does.



Image Quality

Given that one of the best H.264 codecs is actually open source (x264), the image quality target is clear, and free. There are a number of front ends that use x264; I chose Handbrake as it was the most Badaboom-like in its interface, but the x264 codec itself is doing all of the work.

The Handbrake interface is very much like Badaboom’s, except nowhere near as polished. I compared encode quality on a single pass of the x264 codec to the output from Badaboom using a couple of settings (5Mbps Xbox 360 profile and 1.5Mbps iPhone profile). The image below is taken from a single-pass encode with the x264 codec; hold your mouse over the image to see what Badaboom's encoder can output. Forgive the lack of a pixel-perfect comparison; as I mentioned before, Badaboom always seemed to muck with my aspect ratio whenever I was up or downscaling.



Hold mouse over image to see Badaboom's Image Quality



What about Performance?

If you can find a source file that Badaboom will accept and transcode fine, the process is pretty quick.

The first test I ran was to convert Chapter 3 of Bad Boys on DVD to a 5Mbps VBR .mp4 file using the Xbox 360 profile. I upscaled the video to 1280 x 720.

The Core 2 Quad Q6600 completed the test in 245 seconds using the x264 codec, outputting a file that was similar in size and quality to what Badaboom managed (the file was a bit smaller, 109MB vs. 116MB, and the quality a bit better).

 

The entry level and midrange 8/9 series GPUs couldn’t actually do much better. The GeForce 9500 GT was actually slower, as were the 8500 GT and the 8600 GTS. The GeForce 8800 GT changed things though: at 103 seconds it encoded the test in less than half the time. NVIDIA’s fastest, the GeForce GTX 280, managed it in just over 60 seconds.
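To put those times on a common scale, here is a minimal Python sketch that expresses them as a speedup factor (baseline time divided by accelerated time). It uses only the 245-second Q6600 result and the 103-second 8800 GT result quoted above; the helper name `speedup` and the variable names are illustrative choices, not anything from Badaboom or x264.

```python
def speedup(baseline_seconds: float, accelerated_seconds: float) -> float:
    """Return how many times faster the accelerated run is than the baseline."""
    return baseline_seconds / accelerated_seconds


# Times quoted in the text for the 5Mbps Xbox 360 profile test.
q6600_x264_seconds = 245.0   # Core 2 Quad Q6600, x264
gf8800gt_seconds = 103.0     # GeForce 8800 GT, Badaboom

factor = speedup(q6600_x264_seconds, gf8800gt_seconds)
time_saved = 1 - gf8800gt_seconds / q6600_x264_seconds

print(f"8800 GT vs. Q6600: {factor:.2f}x faster")
print(f"Encode time reduced by {time_saved:.0%}")
```

That works out to roughly 2.4x for the 8800 GT in this test, which is what "less than half the time" looks like as a ratio.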

Next I tried outputting a lower resolution file for use on an iPhone, encoded at 1.5Mbps. Despite the default resolution being 480 x 320 the actual output resolution was 480 x 272:

 

The output file was obviously smaller at 35MB and the time to transcode went down significantly. Now our Q6600 took 36.5 seconds and the 8800 GT’s advantage was cut down; it ended up being only about 7 seconds faster (or about 28%). The GTX 280 still pulled ahead, processing the encode in just under 19 seconds.

What this chart shows is that the load on the GPU varies, much as it does in 3D games, depending on what we're doing. Just as higher resolutions tend to be more GPU bound than CPU bound, it would seem that smaller, simpler content at lower transcoding bitrates doesn't show as big of an advantage. The benefit of GPU accelerated transcoding is clear, but the performance gains will vary depending on the load.

For the final test I repeated the iPhone conversion but instead of only converting Chapter 3 of the DVD I selected the first ten chapters:

 

In the 5Mbps Xbox 360 test the GeForce GTX 280 ended up being around 3.5x faster than the Core 2 Quad Q9450. In the single-chapter iPhone test the advantage was reduced to 2.1x, and here we find that the gap grows slightly to 2.2x, but it's still not quite as high as in the original test. It's looking like a range of 2 - 4x the speed of a reasonably fast quad-core CPU is what we can expect from Badaboom if you use NVIDIA's fastest GPU.

If you look at a more reasonably priced GPU, the 9800 GTX ends up being around 2 - 3x faster than the same quad-core CPU. The value of the entry level GPUs isn't that great unless you've got a dual-core CPU; otherwise quad-core chips will be able to encode faster and with better quality.

Next up I wanted to see how fast of a CPU we needed to keep the GeForce GTX 280 fed in its most CPU-bound test, the single-chapter iPhone conversion:

 

This graph should make NVIDIA pretty happy: you only really need a Core 2 Duo E4500 to keep the GTX 280 fed, resulting in performance better than any quad-core Intel CPU can offer. The upside to GPU accelerated video transcode is huge; we just need a better app to deliver it.



Energy Efficiency

The Badaboom transcoding process actually taxes the CPU as well as the GPU as we’ve already seen, so it’s not too surprising that power consumption when you’re using the GPU as well as the CPU is actually greater than when it’s all CPU based.

The factor you have to take into account is not only how much power is consumed by the system, but for what duration. While the Core 2 Quad Q9450 system only drew about 160W, the entire encode task took 211 seconds. The same system with a GeForce GTX 280 performing the transcode finished the task in 61.9 seconds despite drawing 210W at the wall. Multiply the two out and you get total energy consumed in Joules:

| | Instantaneous Power Consumption | Energy Use over Benchmark Duration |
|---|---|---|
| Intel Core 2 Quad Q9450 | 160W | 33760J |
| NVIDIA GeForce GTX 280 | 210W | 12999J |

 

While offloading the transcode task to the GeForce GTX 280 takes more power, it uses less than 40% of the energy since it can complete the transcode so much faster. GPU accelerated video transcode appears to be, as we first suspected, the more efficient way of doing things.
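The energy arithmetic is easy to reproduce. The short Python sketch below multiplies the wall power and run time quoted in this section. The function and variable names are illustrative only, and the calculation assumes the power draw stayed roughly constant for the whole run, which a single wattage reading can only approximate.

```python
def energy_joules(power_watts: float, duration_seconds: float) -> float:
    """Energy = power x time, assuming a constant draw over the whole run."""
    return power_watts * duration_seconds


# Figures quoted in the text: system power at the wall and encode time.
cpu_energy = energy_joules(160, 211)    # Core 2 Quad Q9450, CPU-only encode
gpu_energy = energy_joules(210, 61.9)   # same system with the GeForce GTX 280

ratio = gpu_energy / cpu_energy

print(f"CPU encode: {cpu_energy:.0f} J")
print(f"GPU encode: {gpu_energy:.0f} J")
print(f"GPU run uses {ratio:.1%} of the CPU run's energy")
```

The ratio comes out to roughly 38.5%, in line with the "less than 40%" figure: the GTX 280 system draws about 31% more power but finishes in well under a third of the time.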



Final Words

I hate to be so negative about a product like Badaboom because it holds so much potential, but unfortunately it just left me disappointed. There’s no way to set custom resolutions, there’s a 5Mbps bitrate cap, there’s no support for Main/High H.264 profiles, there’s no support for Dolby Digital/DTS audio, you can't convert Blu-ray movies, DivX support is flaky at best and there are output issues with some DVD titles.

For the first GPU-accelerated video transcoding application written in CUDA I expected much more from Badaboom. A simple user interface is great, but it lacks the power and customization behind it. NVIDIA has a year before Larrabee hits and you can be sure that Intel will leverage its relationships with the major codec developers (DivX anyone?) to ensure that there’s full Larrabee support right away.

The performance expectations are also interesting. Just as the 8800 GT is pretty much the minimum requirement for decent, speedy gaming in the latest titles, it ends up being the minimum requirement for solid transcoding performance. The GeForce 9500 GT and slower are only really upgrades if you have a slow dual-core CPU; the quad-core offerings are faster than any of NVIDIA's lower end GPUs. The 8800 GT, 9800 GTX and GTX 200 class of products all offer somewhere in the 2 - 4x range of a performance improvement over Intel's quad-core CPUs. While eight-core Nehalems will help close that gap, it's clear that GPUs are much more energy efficient for video transcoding.

NVIDIA needs to do more with companies like Elemental to make sure that launches like this don’t happen. Badaboom held so much promise but disappoints as it is nothing more than a quick way of getting some videos onto your MP3 player or game console without terrible concern for quality or features.

It takes users 10 - 30 hours to transcode an entire Blu-ray movie at the best quality settings on some of the fastest Intel CPUs; that’s where we need GPU acceleration. Target the top and trickle down to address the rest of the market: it’s the NVIDIA approach and it’s one that Elemental doesn’t embrace with Badaboom. This application is reasonable, at best, for the mainstream and does nothing for those serious about transcoding video.

Fix the compatibility problems, fix the crashes, fix the frame rate output issues and then we’ll have a decent app for the mainstream user just looking to put content on their iPhone/iPod. For an app that promises to fix the issue of video codec compatibility, it sure does a poor job of making sure that it itself is compatible with even the codecs it is supposed to support.

AMD has its own response to Badaboom coming before the end of the year. Cyberlink's PowerDirector is supposed to enable GPU accelerated video transcode, but it's a sad day when a video enthusiast has to look to Cyberlink to save the day. What both AMD and NVIDIA need to do is help the open source community and existing codec developers include GPU acceleration in their software today.

I want a CUDA enabled version of x264 or of the MainConcept H.264 encoder. While it's admirable that companies like Elemental would attempt their own codec and front end, there are better alternatives out there today.

There's clearly potential for GPU-accelerated H.264 video encoding, but the first attempt was honestly a bust. Let's hope Elemental or someone else gets it right for round two...
