Profile-guided optimizations (PGO)

I was wondering whether you have considered using PGO to optimize `rav1d`, both for the straightforward "let's generate more optimized artifacts" approach and for the less obvious "let's do PGO, see what it has optimized, and try to backport those optimizations to the original source code" approach.

`rav1d` contains a lot of assembly that most likely won't be touched by LLVM's PGO pipeline, but out of curiosity I applied https://github.com/Kobzol/cargo-pgo to `rav1d`, and it still seems to produce a nice ~2% speedup (measured with a single run and no warmup, so take it with a grain of salt):

```
hyperfine --warmup 0 --runs 1 "./rav1d -q -i Chimera-AV1-8bit-1280x720-3363kbps.ivf -o /dev/null --threads=1" "./rav1d-pgo -q -i Chimera-AV1-8bit-1280x720-3363kbps.ivf -o /dev/null --threads=1"
Benchmark 1: ./rav1d -q -i Chimera-AV1-8bit-1280x720-3363kbps.ivf -o /dev/null --threads=1
  Time (abs ≡):        50.679 s               [User: 50.599 s, System: 0.077 s]
 
Benchmark 2: ./rav1d-pgo -q -i Chimera-AV1-8bit-1280x720-3363kbps.ivf -o /dev/null --threads=1
  Time (abs ≡):        49.915 s               [User: 49.840 s, System: 0.074 s]
 
Summary
  ./rav1d-pgo -q -i Chimera-AV1-8bit-1280x720-3363kbps.ivf -o /dev/null --threads=1 ran
    1.02 times faster than ./rav1d -q -i Chimera-AV1-8bit-1280x720-3363kbps.ivf -o /dev/null --threads=1
```
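
For reference, the ratio hyperfine prints in its summary can be recomputed from the two wall-clock times above. This is just a minimal sketch of the arithmetic; the variable names are mine, and the numbers are copied from the single run shown, so there is no variance estimate behind them:

```python
# Wall-clock times copied from the hyperfine output above (seconds).
baseline_s = 50.679  # ./rav1d
pgo_s = 49.915       # ./rav1d-pgo

# Ratio that hyperfine reports as "times faster" (it rounds to two decimals).
speedup = baseline_s / pgo_s

# The same result expressed as the share of the baseline run time saved.
time_saved_pct = (baseline_s - pgo_s) / baseline_s * 100

print(f"speedup: {speedup:.3f}x")
print(f"time saved: {time_saved_pct:.2f}%")
```

Unrounded, the gain is closer to 1.5% than to 2%; more runs would be needed to say how much of that is noise.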

Anyway, I don't have more than that; I was just curious what you think of this approach.
