Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bugs in convolutional decoder #736

Merged
merged 2 commits into from
Jan 7, 2024
Merged

Fix bugs in convolutional decoder #736

merged 2 commits into from
Jan 7, 2024

Conversation

argilo
Copy link
Member

@argilo argilo commented Dec 17, 2023

Fixes #733.
Fixes #473.
Reverts #475.

VOLK's convolutional decoder has many bugs:

  1. The test puppet has the input & output reversed, so it always tests the kernel with zeroes as input. Once this is corrected, many more bugs become apparent.
  2. The spiral and neonspiral protokernels only renormalize the branch metrics if the first branch metric is greater than 210. This threshold is much too high, and results in constant overflows and incorrect decisions. The threshold was previously removed from the generic protokernel, so I've done the same for the others. This causes a ~20% performance penalty, but it's necessary for correct output.
  3. The adjustment to METRICSHIFT made in Fixup underperforming GENERIC kernel for volk_8u_x4_conv_k7_r2_8u #475 changed the scaling of metrics for the generic protokernel but not the SIMD protokernels. This causes inaccuracy, because the SIMD protokernels use the generic calculation to when processing the tail end of the input.
  4. The protokernels differ in how they add metrics for the two parity bits:
    • The generic protokernel uses (metric1 >> 1) + (metric2 >> 1) = (metric1 + metric2) >> 1.
    • The SIMD protokernels use _mm_avg_epu8, which calculates (metric1 + metric2 + 1) >> 1.
  5. The generic protokernel uses > for metric comparison, while the SIMD protokernels use >=.

After all these bugs are fixed, the convolutional decoder is tested with random input and produces identical output between the three protokernels.

I left the avx2 protokernel commented out for now because I haven't yet investigated why it's broken. But it at least now fails if it is uncommented. (Previously it passed when it received zeroes as input.)

@argilo
Copy link
Member Author

argilo commented Dec 17, 2023

I also removed the unused threshold argument from the renormalize function, which cleans up one of the few remaining compiler errors.

Copy link
Member

@marcusmueller marcusmueller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really can't claim I understand the SPIRAL code, but your forcing of renormalization seems to make sense to me, unless this leads to rounding of many different errors to 0 and thus reduced performance in high-SNR?

Anyway, if this actually does correctly work now, we're one big step ahead. And we should totally merge this PR.

@argilo
Copy link
Member Author

argilo commented Dec 18, 2023

your forcing of renormalization seems to make sense to me, unless this leads to rounding of many different errors to 0

There's no rounding; the renormalization is just an integer subtraction to keep the path metrics from overflowing an unsigned char.

@argilo
Copy link
Member Author

argilo commented Dec 18, 2023

I really can't claim I understand the SPIRAL code

Now that I've read and understood it, I'm not sure it's doing much good compared to a human-readable version. It's not that different, except it unrolls the main loop (so that two butterfly operations are done in each iteration) and it splits out every single operation onto its own line, e.g.:

        a73 = (4 * i9);
        b6 = (syms + a73);
        a75 = *(b6);

A human would write a75 = syms[4 * i9] and I'm sure the compiler would do just as well (if not better) with that.

@Aang23
Copy link
Contributor

Aang23 commented Dec 20, 2023

@argilo as mentioned on Matrix, I did all my tests with this PR. Both using generic and spiral. Everything behaves as expected on my end :-)

@argilo
Copy link
Member Author

argilo commented Dec 23, 2023

@Aang23 I've pushed the AVX2 to fixes to this branch: https://github.com/argilo/volk/tree/k7-r2-avx2

I'm not sure whether to include those in this PR. Maybe they should wait until later. Unfortunately the AVX2 code is slower than the spiral implementation. The renormalization code is getting to be quite costly since it happens on every iteration, and isn't very SIMD-friendly.

Copy link
Contributor

@jdemel jdemel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. FEC code is hard.

@jdemel jdemel merged commit a84c7e3 into gnuradio:main Jan 7, 2024
33 checks passed
Alesha72003 pushed a commit to Alesha72003/volk that referenced this pull request May 15, 2024
Fix bugs in convolutional decoder
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants