Fix bugs in convolutional decoder #736

argilo · 2023-12-17T23:28:57Z

Fixes #733.
Fixes #473.
Reverts #475.

VOLK's convolutional decoder has many bugs:

- The test puppet has the input and output reversed, so it always tests the kernel with zeroes as input. Once this is corrected, many more bugs become apparent.
- The spiral and neonspiral protokernels only renormalize the branch metrics if the first branch metric is greater than 210. This threshold is much too high, and results in constant overflows and incorrect decisions. The threshold was previously removed from the generic protokernel, so I've done the same for the others. This causes a ~20% performance penalty, but it's necessary for correct output.
- The adjustment to METRICSHIFT made in #475 ("Fixup underperforming GENERIC kernel for volk_8u_x4_conv_k7_r2_8u") changed the scaling of metrics for the generic protokernel but not the SIMD protokernels. This causes inaccuracy, because the SIMD protokernels use the generic calculation when processing the tail end of the input.
- The protokernels differ in how they add metrics for the two parity bits:
  - The generic protokernel uses (metric1 >> 1) + (metric2 >> 1) = (metric1 + metric2) >> 1.
  - The SIMD protokernels use _mm_avg_epu8, which calculates (metric1 + metric2 + 1) >> 1.
- The generic protokernel uses > for metric comparison, while the SIMD protokernels use >=.
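To illustrate the last two items, here is a minimal scalar sketch of how the two averaging formulas and the two comparison operators can disagree. The function names are hypothetical and do not appear in VOLK; only the formulas are taken from the list above.

```c
#include <assert.h>
#include <stdint.h>

/* Truncating average: (metric1 + metric2) >> 1, the generic formula above. */
static inline uint8_t avg_truncate(uint8_t m1, uint8_t m2)
{
    return (uint8_t)(((unsigned)m1 + m2) >> 1);
}

/* Rounding average: (metric1 + metric2 + 1) >> 1, which is what
 * _mm_avg_epu8 computes for each unsigned byte lane. */
static inline uint8_t avg_round(uint8_t m1, uint8_t m2)
{
    return (uint8_t)(((unsigned)m1 + m2 + 1) >> 1);
}

/* Strict and non-strict comparisons only differ when the metrics tie. */
static inline int decide_gt(uint8_t a, uint8_t b) { return a > b; }
static inline int decide_ge(uint8_t a, uint8_t b) { return a >= b; }

/* Hypothetical sanity check showing where the variants diverge. */
static inline void check_differences(void)
{
    assert(avg_truncate(3, 4) != avg_round(3, 4));   /* odd sum: off by one */
    assert(avg_truncate(4, 4) == avg_round(4, 4));   /* even sum: identical */
    assert(decide_gt(7, 7) != decide_ge(7, 7));      /* tie: opposite decisions */
}
```

The averages differ by one whenever the sum is odd, and the comparisons differ only on ties, so either mismatch alone is enough to make protokernels disagree on some decision bits.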

After all these bugs are fixed, the convolutional decoder is tested with random input and produces identical output between the three protokernels.

I left the avx2 protokernel commented out for now because I haven't yet investigated why it's broken. But it at least now fails if it is uncommented. (Previously it passed when it received zeroes as input.)

Signed-off-by: Clayton Smith <[email protected]>

argilo · 2023-12-17T23:29:51Z

I also removed the unused threshold argument from the renormalize function, which cleans up one of the few remaining compiler warnings.

marcusmueller

I really can't claim I understand the SPIRAL code, but your forcing of renormalization seems to make sense to me, unless this leads to rounding of many different errors to 0 and thus reduced performance in high-SNR?

Anyway, if this actually does correctly work now, we're one big step ahead. And we should totally merge this PR.

argilo · 2023-12-18T00:26:24Z

> your forcing of renormalization seems to make sense to me, unless this leads to rounding of many different errors to 0

There's no rounding; the renormalization is just an integer subtraction to keep the path metrics from overflowing an unsigned char.
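As a rough sketch of that idea (not the actual VOLK renormalize function, whose details may differ), subtracting the smallest path metric from every state keeps the values small while preserving all differences between them. The function and macro names below are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_STATES 64 /* a K=7 code has 2^(7-1) = 64 trellis states */

/* Hypothetical renormalization: subtract the minimum path metric from all
 * states. Differences between metrics are unchanged, so no information is
 * lost and nothing is rounded. */
static inline void renormalize_sketch(uint8_t metrics[NUM_STATES])
{
    uint8_t min = metrics[0];
    for (size_t i = 1; i < NUM_STATES; i++) {
        if (metrics[i] < min) {
            min = metrics[i];
        }
    }
    for (size_t i = 0; i < NUM_STATES; i++) {
        metrics[i] = (uint8_t)(metrics[i] - min);
    }
}

/* Without renormalization an unsigned char addition wraps modulo 256,
 * e.g. 250 + 10 yields 4, which makes a poor path look like a good one. */
static inline uint8_t add_wrapping(uint8_t path_metric, uint8_t branch_metric)
{
    return (uint8_t)(path_metric + branch_metric);
}
```

After the subtraction the best state has metric 0, which leaves the most headroom before the next additions.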

argilo · 2023-12-18T00:32:20Z

> I really can't claim I understand the SPIRAL code

Now that I've read and understood it, I'm not sure it's doing much good compared to a human-readable version. It's not that different, except it unrolls the main loop (so that two butterfly operations are done in each iteration) and it splits out every single operation onto its own line, e.g.:

        a73 = (4 * i9);
        b6 = (syms + a73);
        a75 = *(b6);

A human would write a75 = syms[4 * i9] and I'm sure the compiler would do just as well (if not better) with that.
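For comparison, the two styles can be put side by side. The variable types are assumptions (the snippet above does not show the declarations), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Generated style: every operation on its own line, as in the excerpt. */
static inline uint8_t load_spiral_style(const uint8_t* syms, int i9)
{
    int a73;
    const uint8_t* b6;
    uint8_t a75;

    a73 = (4 * i9);
    b6 = (syms + a73);
    a75 = *(b6);
    return a75;
}

/* Hand-written style: the same load as a single indexing expression. */
static inline uint8_t load_readable(const uint8_t* syms, int i9)
{
    return syms[4 * i9];
}
```

Both functions read the same element, so any difference in generated machine code would come down to the compiler rather than the source form.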

Aang23 · 2023-12-20T21:22:07Z

@argilo as mentioned on Matrix, I did all my tests with this PR. Both using generic and spiral. Everything behaves as expected on my end :-)

argilo · 2023-12-23T22:48:28Z

@Aang23 I've pushed the AVX2 fixes to this branch: https://github.com/argilo/volk/tree/k7-r2-avx2

I'm not sure whether to include those in this PR. Maybe they should wait until later. Unfortunately the AVX2 code is slower than the spiral implementation. The renormalization code is getting to be quite costly since it happens on every iteration, and isn't very SIMD-friendly.

jdemel

LGTM. FEC code is hard.

argilo added 2 commits December 17, 2023 17:53

Fix conv_k7_r2 kernel and puppet

6f71d88

Signed-off-by: Clayton Smith <[email protected]>

Remove unused argument from renormalize

87cb93d

Signed-off-by: Clayton Smith <[email protected]>

marcusmueller approved these changes Dec 18, 2023

argilo mentioned this pull request Jan 3, 2024

fec: cc enc/dec in ber_curve_gen example yields different results in 3.8 gnuradio/gnuradio#2642

Closed

jdemel approved these changes Jan 7, 2024

jdemel merged commit a84c7e3 into gnuradio:main Jan 7, 2024
33 checks passed

argilo mentioned this pull request Jan 8, 2024

Re-enable AVX2 convolutional decoder #741

Merged

Alesha72003 pushed a commit to Alesha72003/volk that referenced this pull request May 15, 2024

Merge pull request gnuradio#736 from argilo/fix-k7-r2

498f369

Fix bugs in convolutional decoder
