
Conversation

@rabihchrabieh commented Jun 27, 2025

Description

Added HARQ support for 5G LDPC codes.
Created a Jupyter notebook that describes and tests the functionality.

Currently, the HARQ functionality does not support pruning (TODO for RV0, etc).
Graph mode is working but should be revisited.

All (or most) existing tests related to 5G LDPC pass.
No new unit test was created (TODO).

The additional API does not impact the existing API.

Fixed a few unrelated tests that were previously broken (cn_type changed to cn_update).

Checklist

  • Detailed description
  • Added references to issues and discussions
  • Added / modified documentation as needed
  • Added / modified unit tests as needed
  • Passes all tests
  • Lint the code
  • Performed a self review
  • Ensure you Signed-off the commits. Required to accept contributions!
  • Co-authored with someone? Add Co-authored-by: user@domain and ensure they signed off their commits too.

@SebastianCa
Collaborator

Hi @rabihchrabieh,

Thank you for your contribution to Sionna. We will review the code shortly and get back to you.

@SebastianCa
Collaborator

Hi @rabihchrabieh,

I started reviewing the PR, and overall, it looks good.

However, my main concern at this stage is the API design and its compatibility with graph/XLA mode.
The current implementation likely introduces unwanted side effects in graph mode: setting rv outside the graph after compilation will have no effect unless tracing is re-triggered. Additionally, dynamic batch sizes are problematic in this context.
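
To illustrate the re-tracing pitfall with a minimal, self-contained sketch (hypothetical layer and attribute names, not the actual Sionna API):

import tensorflow as tf

class TinyDecoder(tf.Module):
    def __init__(self):
        self.rv = 0  # plain Python attribute, not a tf.Variable

    @tf.function
    def __call__(self, llr):
        # self.rv is read once at trace time and baked into the graph
        # as a constant; later assignments are silently ignored unless
        # a new input signature forces a re-trace
        return llr + float(self.rv)

dec = TinyDecoder()
dec(tf.zeros([4]))  # traces with rv=0
dec.rv = 2          # no effect on the already-compiled graph
dec(tf.zeros([4]))  # still computes with rv=0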

I think we should clarify the API before proceeding further.

Would it be an option to pass rv directly during the call?

  • Encoder: This should be straightforward since it only changes the slicing position. For example: encoder(u, rv=2)
  • Decoder: This is trickier because it should remain stateless (no internal memory in graph mode).
    One option could be to stack all received RV versions, i.e., the input shape is [bs, num_rvs, n].

Alternatively, the encoder could return a stacked version of all RVs (only if this mode is explicitly enabled), which could then be fed directly into the decoder using the same API.

What do you think?

A few additional remarks:

  • Do you plan to include the notebook as a Sionna tutorial? If yes, it should provide more background—e.g., explaining HARQ in 5G and the concept of RV. That would be a great contribution.
  • Since the effective rates differ, does it make sense to plot results in terms of Eb/N0? The current SNR offset feels somewhat ad hoc. Perhaps Es/N0 would be a better choice?
  • We need unittests before merging. Would it be possible to test against the “classical” decoder (rv=0) as a reference for the equivalent lower rate?
  • Why is there a modification in test_mimo_flat_fading.py?

@rabihchrabieh
Author

Thank you, @SebastianCa, for the helpful feedback.

We can definitely move the "rv" to the function parameters — that makes sense and ensures compatibility with graph/XLA mode. We might also allow passing "n" optionally, in case the user wants to test combinations like (rv0, n0), (rv1, n1).

Regarding stacking all RVs inside the encoder: I think that adds unnecessary complexity. In most scenarios, we want to test RVs through time-varying channels, so the stacked output would need to be unstacked anyway before transmission — which somewhat defeats the purpose.

For the decoder, I agree it should remain stateless. A simple and flexible approach is to accumulate the received RVs outside the decoder. This keeps the decoder logic clean and allows users to apply any desired accumulation method or weighting strategy.

That said, we could optionally provide a utility method for accumulation within the class (but not tied to the decoder itself), to support convenience without introducing state.
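
For illustration, a minimal sketch of such an external accumulation helper (hypothetical name; it assumes the received LLRs were already mapped to a common full-length domain):

import tensorflow as tf

def accumulate_llrs(llr_acc, llr_new, weight=1.0):
    # external chase combining: the decoder never sees this state,
    # so it stays stateless and graph-compatible; the weight is one
    # possible strategy (e.g., derived from an SNR estimate)
    return llr_acc + weight * llr_new

# usage sketch:
# llr = accumulate_llrs(llr_rv0, llr_rv0_retx, weight=0.8)
# u_hat = decoder(llr)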

Let me know if this approach works for you or if you'd prefer something more integrated.

I will extend the notebook into a tutorial with background on HARQ and RVs, and switch to Es/N₀ for plotting to better reflect varying code rates. I will also add unittests comparing the HARQ decoder to the classical (rv=0) case under equivalent redundancy.

Good catch: the change to test_mimo_flat_fading.py wasn’t intentional and must have slipped in during testing. I’ll remove it from the PR.

@rabihchrabieh
Author

As an alternative, I’d like to propose a simpler structure when HARQ mode is activated:

  • encode(u) returns the full codeword (n_cb); no rate matching inside the encoder.
  • decode(llr) takes the full-length soft input (n_cb); no rate matching or accumulation inside the decoder.
  • Rate matching (RV slicing) and accumulation are handled externally in the simulation.

This keeps encode and decode clean, stateless, and graph-compatible, and avoids repeated calls to encode(u) when testing multiple RVs — the full n_cb output is stored once and sliced as needed for retransmissions.

A harq_mode flag is provided as a function parameter to both encode and decode. This supports backward compatibility:

  • When harq_mode=False, we follow the standard RV0 behavior: the first 2*Z bits are skipped, and pruning may be applied.
  • When harq_mode=True, we start from bit 0, and pruning is disabled — suitable for HARQ retransmissions.

A first transmission can use harq_mode=False, while retransmissions typically use harq_mode=True.
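
To make this concrete, a rough sketch of the proposed external rate matching with toy sizes (the start positions are placeholders, not the 38.212 values; filler-bit handling is omitted):

import tensorflow as tf

bs, n, n_cb = 4, 100, 150       # toy dimensions
starts = {"rv0": 0, "rv2": 50}  # placeholder RV start positions

# stands in for c = encode(u, harq_mode=True), shape [bs, n_cb]
c = tf.cast(tf.random.uniform((bs, n_cb)) < 0.5, tf.float32)

def rv_slice(c, start):
    # external rate matching: read n bits from the circular buffer,
    # wrapping around its end
    idx = tf.math.floormod(start + tf.range(n), n_cb)
    return tf.gather(c, idx, axis=-1)

x_rv0 = rv_slice(c, starts["rv0"])  # first transmission, [bs, n]
x_rv2 = rv_slice(c, starts["rv2"])  # retransmission, [bs, n]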

Let me know if this direction would be acceptable.

@SebastianCa
Collaborator

SebastianCa commented Jul 28, 2025

Hi @rabihchrabieh,
my feeling is that an additional HARQ Block makes the API usage more complicated (from a user perspective); I would suggest keeping it inside the encoder/decoder.

Here are my thoughts:
If we pass a list of rvs to the encoder, the user still has the flexibility to directly generate all rv-versions during the first encoding.
Simple example:

  1.) x = encode(u) returns [bs, n_ldpc] bits
  2.) x = encode(u, rv=["rv3",]) returns [bs, n_ldpc] bits or [bs, 1, n_ldpc] bits (to be discussed)
  3.) x = encode(u, rv=["rv0", "rv1"]) returns [bs, 2, n_ldpc] bits

As the decoder has no internal state, we must provide all rv versions anyhow. Here I would suggest simply stacking the inputs. In case a user wants to do advanced experiments, e.g., only decode "rv3", one could stack with zero LLRs:

x_rv3 = encoder(u, rv=["rv3",])
llr_rv3 = channel(x_rv3, ...)
llr = tf.concat([tf.zeros((bs, 2, n_ldpc)), llr_rv3], axis=1)
x_hat = decoder(llr)

Otherwise, one could then just stack the different rv versions (e.g., from transmission over different instances of time-varying channels) and run the decoder once. For example

# initial transmission
x_rv1 = encoder(u, rv=["rv1",])
llr_1 = channel(x_rv1, ...)
x_hat1 = decoder(llr_1)
# 1st HARQ retransmission
x_rv2 = encoder(u, rv=["rv2",])
llr_2 = channel(x_rv2, ...)
llr = tf.concat((llr_1, llr_2), axis=1)
x_hat2 = decoder(llr)
...

Or alternatively, do it as a one-shot experiment (assuming the channel model supports this):

x = encoder(u, rv=["rv1", "rv2"])
# possibly reshape for compatibility with advanced channel models
# x = tf.reshape(x, (-1, n_ldpc))
llr = channel(x, ...)
# undo reshaping
# llr = tf.reshape(llr, (-1, 2, n_ldpc))
x_hat = decoder(llr)

The overhead in the decoder should be fairly small, as it boils down to gathering and adding LLRs before decoding.
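
For illustration, a sketch of what that gather-and-add step could look like (hypothetical helper; placeholder start positions, filler bits ignored):

import tensorflow as tf

def combine_rvs(llr, rv_starts, n_cb):
    # llr: [bs, num_rvs, n]; scatter each RV's LLRs back into the
    # circular buffer of length n_cb and add overlapping positions
    n = llr.shape[-1]
    acc = tf.zeros((tf.shape(llr)[0], n_cb), dtype=llr.dtype)
    for i, start in enumerate(rv_starts):
        idx = tf.math.floormod(start + tf.range(n), n_cb)
        # unsorted_segment_sum adds entries with duplicate indices,
        # which also covers wrap-around repetitions
        acc += tf.transpose(tf.math.unsorted_segment_sum(
            tf.transpose(llr[:, i, :]), idx, num_segments=n_cb))
    return acc

llr = tf.random.normal((4, 2, 100))                     # 2 stacked RVs
llr_cb = combine_rvs(llr, rv_starts=[0, 50], n_cb=150)  # [4, 150]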

I would always disable pruning if harq_mode is active.

What do you think, does that make sense?

@rabihchrabieh
Author

Hi @SebastianCa,

Thanks again for the detailed input.

Before moving forward, I wanted to clarify a few points to make sure I understand the proposed design fully — especially in the context of stacked RVs.

Suppose a user provides [rv0, rv0, rv2, rv1] to the encoder. Then:

  1. Encoder behavior: I assume the encoder runs once, producing the full n_ldpc block, and then applies RV-specific zeroing to generate 4 stacked versions, each masking out the bits not selected by its corresponding RV.

  2. Channel interface: The encoder’s output cannot be transmitted directly, since only n (not n_ldpc) bits are sent per RV. This implies a step is needed to reduce each RV-specific version to its n active bits. Is the plan to perform this reduction:

    • Inside the encoder, embedding rate matching directly into encode()?
    • Or via a separate method (e.g., get_rv_bits()), to keep encode() clean while keeping rate matching within the same LDPC class?

  3. Decoder input realignment: On the decoder side, if the user stacks received vectors (each of length n), they need to be realigned or zero-padded back to n_ldpc before decoding. This seems like another form of rate matching. Should this happen:

    • Inside decode() itself, using an RV list?
    • Or externally, through a method like accumulate_rv_bits() (of the LDPC class) that maps the n bits back into the full n_ldpc domain?

    Also, just to confirm: the decoder should not assume a fixed RV order (e.g., always [rv0, rv1, rv2, ...]). Users may transmit any RVs in any order, or repeat RVs if needed.

Alternatively, is the idea to transmit all n_ldpc bits through the channel, even though only n are meaningful per RV, and then mask the untransmitted bits at the decoder (setting them to 0)? That would deviate from standard behavior and may complicate things if the user is constrained to transmit only n bits through the various blocks (interleaving, IFFT, etc.) and the channel.

Just trying to get a clearer picture of how you see these components fitting together.

@SebastianCa
Collaborator

Hi @rabihchrabieh,

ah, my bad - sorry for the confusion; when saying n_ldpc I was actually referring to n, i.e., the codeword length after rate matching.
I forgot that n_ldpc is also an attribute of the encoder/decoder.
Short example to clarify:

enc = LDPC5GEncoder(k=64, n=128, harq=True)
dec = LDPC5GDecoder(enc)

u = source([bs, 64])
x = enc(u, rv=["rv1", "rv2"])
# shape of x is [bs, 2, 128]
y = channel(x, ...)
# shape of y is [bs, 2, 128]
x_hat = dec(y)
# shape of x_hat is [bs, 64]

Comment: Mapping/demapping is not shown here

Regarding your questions:
1.) Encoder behavior: yes, the encoder should internally generate all n_ldpc bits and produce the requested "rv versions" by gathering from the right positions (or slicing & stacking).
2.) Channel interface: see above, the encoder should produce rv versions of length n that can be directly transmitted.
3.) Decoder re-alignment: see above, it should be n LLRs per rv; rate matching must then happen inside the decoder (as currently done).

In my eyes, that's the best version to give users the flexibility to simulate realistic HARQ scenarios (i.e., re-transmission of full code blocks) without adding a lot of complexity to the API.

@rabihchrabieh
Author

Sounds good — static RV lists.

One suggestion: should the decoder also take the same rv list as input, like the encoder? This would make it easier to test cases like ["rv0", "rv0", "rv2"], and also allows passing an optional weight vector per RV (e.g., based on estimated SNR) for soft combining before decoding.

Let me know.

@SebastianCa
Collaborator

Yes, we could add rv as an input to the decoder. Would it be an option to make it optional, i.e., if None is provided, the standard order of rvs is used? I think it's a fairly exotic use case to have ["rv0", "rv0", "rv2"], right?

Why do we need weighting? Shouldn't the LLRs already be scaled by the reliability/SNR? I would try to keep the API streamlined; applying weights could also be done manually before feeding the LLRs to the decoder.

@rabihchrabieh
Author

OK, making "rv" optional in the decoder sounds good. We can default to the standard RV order (["rv0", "rv2", "rv3", "rv1"]), which aligns with typical HARQ scheduling.

Just to clarify: sequences like ["rv0", "rv2"] or ["rv0", "rv2", "rv3"] are quite common. Repeating an RV, such as ["rv0", "rv0"], can also occur — for example, if the initial transmission was interfered with, it's sometimes better to retransmit RV0 rather than move to RV2. So having the option to explicitly specify the RV list is useful — especially for comparing cases like ["rv0", "rv2"] vs. ["rv0", "rv0"].

Regarding weighting: you're right that LLR scaling is ideally handled by the user. In some practical setups, LLRs are quantized to 8 bits after each RV reception. In such cases, to combine with a new RV, the stored LLRs must be dequantized, rescaled (e.g., based on SNR), accumulated and re-quantized. But I agree — it's best to leave this under user control.
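
Purely for illustration, a minimal sketch of that dequantize/rescale/accumulate/re-quantize cycle (hypothetical scaling scheme; real receivers would track the scale per reception):

import tensorflow as tf

def quantize(llr, scale, num_bits=8):
    # symmetric rounding to signed num_bits integers
    q_max = 2.0 ** (num_bits - 1) - 1
    return tf.clip_by_value(tf.round(llr / scale), -q_max, q_max)

def combine_quantized(q_old, scale_old, llr_new, scale_new):
    llr_old = q_old * scale_old          # dequantize stored LLRs
    llr_sum = llr_old + llr_new          # accumulate with the new RV
    return quantize(llr_sum, scale_new)  # re-quantize for storage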

I will proceed with this implementation.

@SebastianCa
Collaborator

Sounds good!

@rabihchrabieh
Author

Hi @SebastianCa,

To display Es/N0: it seems it's not currently exposed in the PlotBER class. I believe I need to add a new ebno argument to the call and pass it through to plot_ber.

Do you have a different suggestion or a preferred way to handle this?

Thanks for your help!

@SebastianCa
Collaborator

The straightforward way is to not call the ebnodb2no function and instead calculate the noise variance no manually (or just set the rate to 1); see the FEC tutorial for an example.

@rabihchrabieh
Author

The issue is with displaying and printing the label "Es/N0" on graphs and in results tables. I had to make a few small changes.

@rabihchrabieh
Author

I've pushed the updated, squashed commit with the refactored HARQ implementation per feedback. Ready for re-review.

  • Squashed to one commit rebased on latest main
  • New encoder/decoder RV API per feedback
  • Removed old approach
  • Updated tutorial with introduction to 5G HARQ
  • Added unit test

- Implemented HARQ mode for LDPC encoder/decoder with RV selection
- Added HARQ tutorial with AWGN BLER performance example
- Added unit tests

Signed-off-by: Rabih Chrabieh <[email protected]>
@SebastianCa
Collaborator

Great - we'll review the code soon.

modified docstrings
removed (unneeded) public functions
minor code changes
raise ValueError("Last dimension must be of length k.")

def call(self, bits):
def call(self, bits, rv=None):
Collaborator


The problem with rv as a (Python) list is that it will re-trace the graph whenever the input signature changes, i.e., TF will trace one graph for each (used) combination of rv.
This can be OK, as we (typically) have a small number of possibilities.

An alternative would be to provide a list of tf.ints and gather the rv_starts from another tensor. This could work in a dynamic graph and avoids re-tracing. However, the code becomes less readable (and the wrap-arounds may become a bit ugly).
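
A minimal sketch of that alternative (the table values are placeholders, not the actual start positions from 38.212):

import tensorflow as tf

# placeholder start positions, one per rv; real values depend on Z and n_cb
rv_start_table = tf.constant([0, 17, 33, 50], dtype=tf.int32)

@tf.function
def rv_starts(rv_ids):
    # rv_ids is a tensor, so changing its values reuses the traced
    # graph; only a change of its shape (number of rvs) re-traces
    return tf.gather(rv_start_table, rv_ids)

rv_starts(tf.constant([0, 2]))  # traces once for shape (2,)
rv_starts(tf.constant([3, 1]))  # no re-trace: same shape and dtype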

Author

@rabihchrabieh Nov 8, 2025


This likely complicates the code and only partially solves the problem: if we have a different number of rvs, it still needs to re-trace. If we always use the max number of rvs (how many retransmissions would that be?), there is a lot of overhead and the masked rvs become complex to manage. Let me know if you have a particular solution in mind.

Collaborator

@SebastianCa left a comment


I reviewed all files except the ipynb; this can be done once the code is ready.

def encode_harq():
    bits = source([batch_size, k])
    x_ref = encoder_ref(bits)  # Shape: [batch_size, n_cb]
    x_ref = tf.roll(x_ref, shift=starts["rv0"], axis=-1)  # Undo shift of RV0
Collaborator


Why do we need to shift the reference here?

Author


encoder_ref outputs the full n_cb buffer but shifted by 2Z, as this is the original non-HARQ encoder. Here we undo this 2Z shift to recover the correct, unshifted buffer of n_cb bits; this makes it easier to compare against the HARQ mode using rv_list.

Author


I changed the logic a bit.

@SebastianCa
Collaborator

SebastianCa commented Aug 21, 2025

Hi @rabihchrabieh,
I've reviewed the code and added a lot of comments. Please have a look.

Keep in mind that the LDPC Block is one of Sionna's core components, and we should be very careful with changes that are not strictly related to HARQ. It is also important that we maintain backwards compatibility (ideally no breaking changes).

A few big picture comments:
1.) The current implementation does not support dynamic reconfiguration of the rv version; this means that for each new rv configuration (i.e., each new encoder/decoder input signature) TensorFlow will trace a new graph. For example, decoder(llr, "rv0") and decoder(llr, "rv1") will result in two different graphs (2x compilation time). One could alternatively feed tf.ints and use tf.gather to get the start_pos indices; however, if the input shape changes, re-tracing is still required. Not sure what you think?
2.) We want to keep the API doc and the user experience as clean as possible. We should only expose what is really useful for users.

Besides that, the PR looks very promising.

btw: I'm using the GitHub comment function and haven't executed the code locally, so my suggested changes may have introduced a few minor incompatibilities.

@rabihchrabieh
Author

Hi @SebastianCa, thanks for your thorough review and good feedback.

I agree with most of your comments and suggestions, but as I am currently traveling, it will take a bit of time to respond.

@SebastianCa
Collaborator

No worries - there is no deadline. We'll wait for your response.

@SebastianCa
Collaborator

Hi @rabihchrabieh,

thanks for updating the PR; please expect some delays with the review due to traveling.

@rabihchrabieh
Author

Just pushed new commits — sorry for the delay! I've aligned with all your suggestions except one, which I've explained in the thread.
