Converting video with GPU acceleration tested

Matti Robinson
20 Jan 2009 9:46

Introduction


Over the past couple of years, video cards have become useful for far more than 3D modelling and video games. Nowadays they can be used for tasks such as breaking password protections, medical research and calculations, and video processing.
Generic software cannot use the processing power of a video card automatically. To use the additional power provided by the GPU, a program needs to include code and support for the appropriate interface. The most popular of these interfaces is NVIDIA's CUDA, which is officially supported by the company's video cards. Alternatives include ATI's Stream, OpenCL, which recently reached version 1.0, and the Compute Shader provided by Microsoft's DirectX 11.

Software


We decided to get familiar with CUDA and what it brings to video processing because CUDA is now supported out of the box by TMPGEnc 4.0 XPress and PowerDirector.
The goal of the test was to see how much CUDA speeds up video compression and whether the quality of the final video is worse than with CPU-only compression. That could be the case, at least with the default software provided by ATI and NVIDIA.
TMPGEnc 4.0 XPress uses CUDA for MPEG-1 and MPEG-2 decoding and for processing certain video filters. CUDA is therefore not used in every possible situation, but according to the TMPGEnc developers, further CUDA support is in development. In some cases the processing speed can increase by 800 percent when the system includes a CUDA-capable video card.
CyberLink has introduced a new version of PowerDirector that includes support for both CUDA and Stream. This version, released just before CES 2009, takes advantage of video card muscle not only in filter processing but also in H.264/AVC encoding. This is a major improvement over TMPGEnc's limited CUDA support and should show some definite results.
Neither program uses CUDA by default. In both, enabling CUDA is simply a matter of checking a couple of check boxes (pictured below).

Enabling CUDA support in TMPGEnc 4.0 XPress

Enabling CUDA support for video effects in PowerDirector

Enabling CUDA support for H.264/AVC encoding in PowerDirector
To compare the results of CUDA processing we decided to use the open source AVIdemux as a control. It has no CUDA support, and its x264-encoded output should provide a clear reference point for video quality.

Tests


To test the video encoding we used Sony HDW-F900 footage found at W6RZ.net. The video is MPEG-2 in a TS container at 1080p resolution. The bitrate of the video data peaks at up to 30 Mbps and averages around 18 Mbps, so the video should put enough load on the setup.
The test video was encoded to H.264/AVC with all three programs. The first goal was to produce a 720p video with 2-pass variable bitrate settings for Internet distribution. We set the average bitrate to 3,000 kbps and the audio bitrate to 64 kbps; other settings were left at their defaults.
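To get a feel for what these bitrate settings mean in practice, the sketch below estimates the output size from the average and audio bitrates. It assumes the 3,000 kbps figure applies to the video stream alone and ignores container overhead. The article does not state the length of the clip, so the 120-second duration is purely a placeholder.

```python
def estimated_size_mb(video_kbps, audio_kbps, duration_s):
    """Estimate file size in megabytes (10**6 bytes), ignoring container overhead."""
    total_bits = (video_kbps + audio_kbps) * 1000 * duration_s
    return total_bits / 8 / 1_000_000


# Hypothetical clip length; the real duration is not given in the article.
ASSUMED_DURATION_S = 120

size_720p = estimated_size_mb(3000, 64, ASSUMED_DURATION_S)
print(f"720p target: about {size_720p:.1f} MB for {ASSUMED_DURATION_S} s of video")
```

With a 2-pass variable bitrate encode the encoder aims for this average over the whole clip, so the estimate is a target rather than an exact prediction.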

TMPGEnc 4.0 XPress -- 720p encoding setup

PowerDirector -- 720p encoding setup

AVIdemux -- 720p encoding setup
For the second test we converted the video to 640x360 resolution (for iPhone and iPod Touch) with 1-pass constant bitrate at 1 Mbps and a 64 kbps audio bitrate.
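Both target sizes keep the 16:9 shape of the source. As a small illustration, the sketch below reduces a resolution to its simplest ratio and derives a matching height for a given width. It assumes the 1080p source is 1920x1080 and the 720p target is 1280x720, which the article implies but does not spell out.

```python
from math import gcd


def aspect_ratio(width, height):
    """Reduce a resolution to its simplest width:height ratio."""
    divisor = gcd(width, height)
    return (width // divisor, height // divisor)


def height_for_width(src_width, src_height, dst_width):
    """Height that preserves the source aspect ratio, rounded to an even number."""
    exact = dst_width * src_height / src_width
    return int(round(exact / 2)) * 2


# Assumed frame sizes: 1920x1080 source, 1280x720 and 640x360 targets.
for width, height in [(1920, 1080), (1280, 720), (640, 360)]:
    print(width, height, aspect_ratio(width, height))

print(height_for_width(1920, 1080, 640))
```

Rounding to an even height is a common precaution because many encoders expect even frame dimensions.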

TMPGEnc 4.0 XPress -- iPhone encoding setup

AVIdemux -- iPhone encoding setup
Unfortunately PowerDirector cannot write H.264/AVC video to an MP4 file, so its output was converted afterwards: we used mencoder to separate the video stream from the m2ts file produced by PowerDirector and then packaged it into MP4 format with MP4box. PowerDirector also has quite restricted resolution support, so the iPod Touch/iPhone video was not converted with it at all. It doesn't support AAC audio in m2ts files either, so there is no audio in the final versions produced by PowerDirector. For some reason PowerDirector had problems with the aspect ratio of the video even though it was forced to 16:9 mode.
Each conversion was run without filters and with the Color Correction filter in every program that supported it. The conversion times are not comparable between programs because they use different libraries or methods for H.264/AVC encoding and filter processing. You can, however, compare the times with filters on and off within a single program.
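The exact commands used for this step are not given, so the following is only a sketch of how such a two-step remux is commonly done. The file names are hypothetical, and the options shown (mencoder's stream copy to raw video, MP4Box's -add and -fps) are typical usage rather than the settings used in the test. The script only builds and prints the commands; it does not run them.

```python
import shlex


def remux_commands(m2ts_path, mp4_path, raw_path="video.h264", fps=None):
    """Build (but do not run) the two commands for an m2ts -> MP4 remux."""
    # Step 1: copy the video stream out untouched and drop the audio.
    extract = ["mencoder", m2ts_path, "-ovc", "copy", "-nosound",
               "-of", "rawvideo", "-o", raw_path]
    # Step 2: wrap the raw H.264 stream in an MP4 container.
    package = ["MP4Box"]
    if fps is not None:
        # Optionally tell MP4Box the frame rate of the raw stream.
        package += ["-fps", str(fps)]
    package += ["-add", raw_path, mp4_path]
    return [extract, package]


# Hypothetical file names, for illustration only.
for command in remux_commands("powerdirector_720p.m2ts", "powerdirector_720p.mp4"):
    print(shlex.join(command))
```

To actually execute the steps, each list could be passed to subprocess.run(command, check=True), provided both tools are installed and the frame rate of the source is known.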
The test setup was as follows:
PC
- 1.6 GHz Intel Pentium Dual E2140
- 2 GB DDR2
- Club 3D GeForce 9600GT (NVIDIA's ForceWare 181.20 drivers)
- Windows Vista SP1
Software
- TMPGEnc 4.0 XPress v4.6.3.268
- PowerDirector v7.0.2416a
- AVIdemux v2.4.3 (r4494) with x264 library r1080
The times were measured with a stopwatch.

Test results


720p without filters

Software | Time without CUDA | Time with CUDA | Time improvement with CUDA
TMPGEnc 4.0 XPress | 16 min 16 sec | 15 min 17 sec | 59 sec
PowerDirector | 4 min 54 sec | 3 min 51 sec | 1 min 3 sec
AVIdemux | 11 min 10 sec | - | -

720p with filters
Software | Time without CUDA | Time with CUDA | Time improvement with CUDA
TMPGEnc 4.0 XPress | 25 min 22 sec | 16 min 15 sec | 9 min 7 sec
PowerDirector | 6 min 2 sec | 4 min 59 sec | 1 min 3 sec
AVIdemux | 11 min 18 sec | - | -

iPhone without filters
Software | Time without CUDA | Time with CUDA | Time improvement with CUDA
TMPGEnc 4.0 XPress | 3 min 44 sec | 3 min 54 sec | -10 sec
AVIdemux | 2 min 34 sec | - | -

iPhone with filters
Software | Time without CUDA | Time with CUDA | Time improvement with CUDA
TMPGEnc 4.0 XPress | 8 min 26 sec | 5 min 56 sec | 2 min 30 sec
AVIdemux | 2 min 36 sec | - | -
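
The absolute savings above are easier to compare as relative figures. The sketch below converts the measured times to seconds and derives the time saved, the percentage saved and the speed-up factor for each run with and without CUDA. The times are copied from the tables; the function names are only illustrative.

```python
import re


def to_seconds(text):
    """Convert a string such as '16 min 16 sec' to a number of seconds."""
    match = re.fullmatch(r"\s*(\d+)\s*min\s*(\d+)\s*sec\s*", text)
    if match is None:
        raise ValueError(f"unrecognised time: {text!r}")
    return int(match.group(1)) * 60 + int(match.group(2))


def summarize(without_cuda, with_cuda):
    """Return (seconds saved, percent of time saved, speed-up factor)."""
    cpu_s = to_seconds(without_cuda)
    gpu_s = to_seconds(with_cuda)
    saved = cpu_s - gpu_s
    return saved, 100 * saved / cpu_s, cpu_s / gpu_s


# (run, time without CUDA, time with CUDA), taken from the result tables.
RESULTS = [
    ("TMPGEnc 720p, no filters", "16 min 16 sec", "15 min 17 sec"),
    ("TMPGEnc 720p, filters", "25 min 22 sec", "16 min 15 sec"),
    ("PowerDirector 720p, no filters", "4 min 54 sec", "3 min 51 sec"),
    ("PowerDirector 720p, filters", "6 min 2 sec", "4 min 59 sec"),
    ("TMPGEnc iPhone, no filters", "3 min 44 sec", "3 min 54 sec"),
    ("TMPGEnc iPhone, filters", "8 min 26 sec", "5 min 56 sec"),
]

for name, cpu, gpu in RESULTS:
    saved, percent, speedup = summarize(cpu, gpu)
    print(f"{name}: {saved:+d} s, {percent:+.1f} %, {speedup:.2f}x")
```

On these figures the largest gain, TMPGEnc 720p with filters, works out to roughly 36 percent less encoding time, while the iPhone run without filters comes out slightly slower with CUDA enabled.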

Video files (MP4)
TMPGEnc 4.0 XPress 720p without CUDA
TMPGEnc 4.0 XPress 720p with CUDA
PowerDirector 720p without CUDA
PowerDirector 720p with CUDA
AVIdemux 720p
TMPGEnc 4.0 XPress iPhone without CUDA
TMPGEnc 4.0 XPress iPhone with CUDA
AVIdemux iPhone
Screen captures (720p)

TMPGEnc 4.0 XPress -- 720p @ 33 sec (without CUDA)

TMPGEnc 4.0 XPress -- 720p @ 33 sec (with CUDA)

PowerDirector -- 720p @ 33 sec (without CUDA)

PowerDirector -- 720p @ 33 sec (with CUDA)

AVIdemux -- 720p @ 33 sec


TMPGEnc 4.0 XPress -- 720p @ 60 sec (without CUDA)

TMPGEnc 4.0 XPress -- 720p @ 60 sec (with CUDA)

PowerDirector -- 720p @ 60 sec (without CUDA)

PowerDirector -- 720p @ 60 sec (with CUDA)

AVIdemux -- 720p @ 60 sec


TMPGEnc 4.0 XPress -- 720p @ 90 sec (without CUDA)

TMPGEnc 4.0 XPress -- 720p @ 90 sec (with CUDA)

PowerDirector -- 720p @ 90 sec (without CUDA)

PowerDirector -- 720p @ 90 sec (with CUDA)

AVIdemux -- 720p @ 90 sec
Screen captures (iPhone)

TMPGEnc 4.0 XPress -- iPhone @ 33 sec (without CUDA)

TMPGEnc 4.0 XPress -- iPhone @ 33 sec (with CUDA)

AVIdemux -- iPhone @ 33 sec

TMPGEnc 4.0 XPress -- iPhone @ 60 sec (without CUDA)

TMPGEnc 4.0 XPress -- iPhone @ 60 sec (with CUDA)

AVIdemux -- iPhone @ 60 sec

TMPGEnc 4.0 XPress -- iPhone @ 90 sec (without CUDA)

TMPGEnc 4.0 XPress -- iPhone @ 90 sec (with CUDA)

AVIdemux -- iPhone @ 90 sec

Conclusion


TMPGEnc 4.0 XPress showed improved processing times with CUDA, especially when filters were used, just as TMPGEnc's own tests anticipated. Using CUDA in TMPGEnc 4.0 XPress had little effect on the quality of the video, so it is safe to recommend it to anyone with a CUDA-capable setup. In some cases CUDA does slow the process down a bit (the iPhone conversion without filters took 10 seconds longer), so you might want to try it out on a couple of videos before committing to it, especially if you often use TMPGEnc for videos in the same format, produced for example by your digital video camera.
PowerDirector used a different encoding profile when CUDA was enabled. This resulted in better quality and a file approximately three megabytes larger because of the higher bitrate. Even though the video card did not itself improve the quality, it did improve the speed of the conversion, so we recommend enabling CUDA in PowerDirector as well.
Of the three programs, PowerDirector's video clearly had the lowest quality, although it does compress quickly. The problems with aspect ratios and the limited settings don't paint a rosy picture of PowerDirector either. The x264 file produced by AVIdemux takes the crown for video quality with flying colors, and we can only hope that it will get help from GPU processing in the future.
