Fix bugs in convolutional decoder #736
Signed-off-by: Clayton Smith <[email protected]>
I also removed the unused |
I really can't claim I understand the SPIRAL code, but your forcing of renormalization seems to make sense to me, unless this leads to rounding of many different errors to 0 and thus reduced performance in high-SNR?
Anyway, if this actually does correctly work now, we're one big step ahead. And we should totally merge this PR.
There's no rounding; the renormalization is just an integer subtraction to keep the path metrics from overflowing an |
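To illustrate the point about renormalization being a plain integer subtraction, here is a minimal sketch. It is not the code from this PR: the function name `renormalize`, the choice of `uint8_t` for the metric type, and the strategy of subtracting the minimum over all states are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: subtract the smallest path metric from every state.
 * Only differences between metrics influence decisions, so this keeps the
 * values small without changing which path wins. No rounding is involved.
 * The uint8_t metric type is an assumption for this sketch.
 */
static inline void renormalize(uint8_t *metrics, size_t num_states)
{
    assert(num_states > 0);

    /* Find the minimum metric across all states. */
    uint8_t min = metrics[0];
    for (size_t i = 1; i < num_states; i++) {
        if (metrics[i] < min) {
            min = metrics[i];
        }
    }

    /* Shift every metric down by that minimum. */
    for (size_t i = 0; i < num_states; i++) {
        metrics[i] = (uint8_t)(metrics[i] - min);
    }
}
```

With metrics `{200, 205, 250, 203}` the call leaves `{0, 5, 50, 3}`: every pairwise difference is preserved, and the largest value is now far from the top of the 8-bit range.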
Now that I've read and understood it, I'm not sure it's doing much good compared to a human-readable version. It's not that different, except it unrolls the main loop (so that two butterfly operations are done in each iteration) and it splits out every single operation onto its own line, e.g.:
A human would write |
@argilo as mentioned on Matrix, I did all my tests with this PR. Both using |
@Aang23 I've pushed the AVX2 fixes to this branch: https://github.com/argilo/volk/tree/k7-r2-avx2 I'm not sure whether to include those in this PR. Maybe they should wait until later. Unfortunately the AVX2 code is slower than the spiral implementation. The renormalization code is getting to be quite costly since it happens on every iteration, and isn't very SIMD-friendly.
LGTM. FEC code is hard.
Fix bugs in convolutional decoder
Fixes #733.
Fixes #473.
Reverts #475.
VOLK's convolutional decoder has many bugs:
- The `spiral` and `neonspiral` protokernels only renormalize the branch metrics if the first branch metric is greater than 210. This threshold is much too high, and results in constant overflows and incorrect decisions. The threshold was previously removed from the generic protokernel, so I've done the same for the others. This causes a ~20% performance penalty, but it's necessary for correct output.
- A change to `METRICSHIFT` made in #475 (Fixup underperforming GENERIC kernel for volk_8u_x4_conv_k7_r2_8u) changed the scaling of metrics for the generic protokernel but not the SIMD protokernels. This causes inaccuracy, because the SIMD protokernels use the generic calculation when processing the tail end of the input.
- […] `(metric1 >> 1) + (metric2 >> 1) = (metric1 + metric2) >> 1`.
- […] `_mm_avg_epu8`, which calculates `(metric1 + metric2 + 1) >> 1`.
- The generic protokernel uses `>` for metric comparison, while the SIMD protokernels use `>=`.

After all these bugs are fixed, the convolutional decoder is tested with random input and produces identical output between the three protokernels.
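The averaging and comparison expressions named in the list can disagree by one unit or on ties, which is enough to make two implementations diverge. The following sketch shows the arithmetic in scalar C. The function names are hypothetical and this is not code from the VOLK kernels; `avg_round_up` mirrors the per-lane formula the description gives for `_mm_avg_epu8`.

```c
#include <assert.h>
#include <stdint.h>

/* Shift each operand, then add: loses one unit when both inputs are odd. */
static inline uint8_t avg_shift_each(uint8_t a, uint8_t b)
{
    return (uint8_t)((a >> 1) + (b >> 1));
}

/* Add, then shift: the floor of the true average. */
static inline uint8_t avg_floor(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + b) >> 1);
}

/* Add one before shifting: rounds halves up, like one _mm_avg_epu8 lane. */
static inline uint8_t avg_round_up(uint8_t a, uint8_t b)
{
    return (uint8_t)(((unsigned)a + b + 1u) >> 1);
}

/* Decision bit with strict and non-strict comparison. */
static inline int decide_gt(unsigned m0, unsigned m1) { return m0 > m1; }
static inline int decide_ge(unsigned m0, unsigned m1) { return m0 >= m1; }

/* Concrete cases where the variants disagree. */
static inline void check_mismatch_examples(void)
{
    /* Both operands odd: shifting first drops a unit. */
    assert(avg_shift_each(3, 5) == 3);
    assert(avg_floor(3, 5) == 4);

    /* Odd sum: floor and round-up differ by one. */
    assert(avg_floor(3, 4) == 3);
    assert(avg_round_up(3, 4) == 4);

    /* Equal metrics: the two comparisons give opposite decisions. */
    assert(decide_gt(7, 7) == 0);
    assert(decide_ge(7, 7) == 1);
}
```

Each discrepancy is small, but a decoder accumulates metrics over many steps and breaks ties on every butterfly, so implementations that differ in any of these details are not guaranteed to produce bit-identical output.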
I left the `avx2` protokernel commented out for now because I haven't yet investigated why it's broken. But it at least now fails if it is uncommented. (Previously it passed when it received zeroes as input.)