@gedoensmax
Contributor

The speedup we see on the GPU compared to the CPU is significant (roughly 303 FPS versus 205 FPS at 1080p in the runs below), and it scales well to higher resolutions.
When used with FFmpeg this is especially important, as it also avoids the PCIe copy that is otherwise needed when using the hardware decoders. When I find more time I will do the same for SSIM, but that is a little more work.

./libvmaf/build/tools/vmaf --reference ../data/reference_1080p_yuv420p.yuv --distorted ../data/distorted_1080p_yuv420p.yuv --width 1920 --height 1080 --pixel_format 420 --bitdepth 8 -o res/test_gpu.json --json --feature psnr_cuda
>>> VMAF version f52a8d72
>>> 128 frames ⠋⠉ 303.32 FPS
>>>  vmaf_v0.6.1: 99.867883

./libvmaf/build/tools/vmaf --reference ../data/reference_1080p_yuv420p.yuv --distorted ../data/distorted_1080p_yuv420p.yuv --width 1920 --height 1080 --pixel_format 420 --bitdepth 8 -o res/test.json --json --feature psnr
>>> VMAF version f52a8d72
>>> 128 frames ⠋⠉ 204.50 FPS
>>> vmaf_v0.6.1: 99.867883
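
For readers who want the metric spelled out, here is a minimal CPU reference sketch of per-plane PSNR using the standard definition 10·log10(peak² / MSE). It is not the libvmaf or `psnr_cuda` implementation; the function name `psnr_plane` is hypothetical, and returning infinity for identical planes is an assumption of this sketch (real tools commonly clamp to a finite maximum instead).

```python
import math


def psnr_plane(ref, dist, bitdepth=8):
    """PSNR in dB between two equally sized sample sequences (one plane).

    Hypothetical helper for illustration only, not libvmaf's API.
    """
    if len(ref) != len(dist) or len(ref) == 0:
        raise ValueError("planes must be non-empty and the same size")

    peak = (1 << bitdepth) - 1  # 255 for 8-bit samples

    # Sum of squared errors. On a GPU this step would typically be a
    # parallel reduction over the plane.
    sse = sum((r - d) ** 2 for r, d in zip(ref, dist))

    if sse == 0:
        # Assumption of this sketch: identical planes give infinite PSNR.
        return math.inf

    mse = sse / len(ref)
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    reference = [10, 20, 30, 40]
    distorted = [11, 21, 31, 41]  # every sample off by one, so MSE = 1
    print(f"{psnr_plane(reference, distorted):.2f} dB")
```

The only heavy step is the squared-error sum over every sample, which is why the cost of moving frames between GPU and CPU memory can matter more than the arithmetic itself.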

@gedoensmax
Contributor Author

Oh, this will also benefit FFmpeg: a colleague of mine has been experimenting with 8K footage and saw that there is currently no GPU-accelerated PSNR in FFmpeg (at least not to our knowledge).

@gedoensmax
Contributor Author

Based on #1174

@BlueSwordM

This looks interesting, but this doesn't have a lot of value considering it's still PSNR at the end of the day.

Instead, I believe the focus should be on GPU-accelerating much more powerful metrics like butteraugli and ssimulacra2:
https://github.com/cloudinary/ssimulacra2

@gedoensmax
Contributor Author

The motivation behind this is to not hold CUDA VMAF back because of PSNR. If the video is decoded with hardware acceleration, it is already in GPU memory and would have to be downloaded to the CPU just to calculate PSNR.
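
To put a rough size on that download, here is a back-of-the-envelope sketch of how many bytes one planar 4:2:0 frame occupies. The helper names are hypothetical, 7680x4320 is only an assumed reading of "8K", and the rate estimate assumes both the reference and the distorted frame would have to be copied back for every compared frame.

```python
def yuv420p_frame_bytes(width, height, bitdepth=8):
    """Bytes in one planar YUV 4:2:0 frame (hypothetical helper)."""
    bytes_per_sample = 1 if bitdepth <= 8 else 2
    luma = width * height
    # Two chroma planes, each subsampled by 2 horizontally and vertically.
    chroma = 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return (luma + chroma) * bytes_per_sample


def download_bytes_per_second(width, height, fps, streams=2, bitdepth=8):
    """Bytes/s copied GPU -> CPU if every frame of every stream is downloaded.

    streams=2 assumes both reference and distorted frames live on the GPU.
    """
    return yuv420p_frame_bytes(width, height, bitdepth) * streams * fps


if __name__ == "__main__":
    print(yuv420p_frame_bytes(1920, 1080))  # 3110400 bytes per 1080p frame
    print(yuv420p_frame_bytes(7680, 4320))  # 49766400 bytes per assumed 8K frame
    rate = download_bytes_per_second(1920, 1080, 303.32)
    print(f"{rate / 1e9:.2f} GB/s at 303.32 FPS")
```

At the 1080p throughput reported above this comes to a little under 2 GB/s of copies that exist only to feed a CPU-side PSNR, and an 8K frame at the assumed resolution is 16 times larger.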

@gedoensmax
Contributor Author

@kylophone could you give this a review/test?

@kylophone
Collaborator

I tested this and there was a speed regression for vmaf only with raw inputs, likely due to the chroma copy.

@gedoensmax
Contributor Author

Yes, that can be true; in FFmpeg that should not happen. Can you put any numbers behind that speed regression?

@gedoensmax
Contributor Author

@kylophone any update on this? As I said, the big benefit comes from using this with FFmpeg: GPU decode + GPU filter. If PSNR has to be calculated on the CPU, the GPU data has to be downloaded first, which blocks processing a lot.

@gedoensmax
Contributor Author

@kylophone Do you see the speed regression in the standalone tool as a blocker? In FFmpeg this would not lead to a regression, because it either uses HW decode or overlaps the copy with the kernels, which the standalone tool cannot do (blocking fread in the main thread).
