[SCTP] Optimize sctp packet marshalling #364

Merged 8 commits on Jan 2, 2023
3 changes: 2 additions & 1 deletion sctp/CHANGELOG.md
@@ -3,7 +3,8 @@
 ## Unreleased

 * Performance improvements
-* The lock for the internal association was contended badly because marshaling was done while still in a critical section and also tokio was scheduling tasks badly[#363](https://github.com/webrtc-rs/webrtc/pull/363)
+* reuse as many allocations as possible when marshaling [#364](https://github.com/webrtc-rs/webrtc/pull/364)
+* The lock for the internal association was contended badly because marshaling was done while still in a critical section and also tokio was scheduling tasks badly [#363](https://github.com/webrtc-rs/webrtc/pull/363)

 ## v0.7.0

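The "reuse as many allocations as possible" entry can be illustrated with a small sketch. It uses a plain `Vec<u8>` instead of the crate's `BytesMut`, and `marshal_into` and `send_all` are hypothetical names, not functions from this PR. The idea is that the caller owns one buffer and clears it between packets, so the capacity allocated for the first packet is reused for the later ones.

```rust
/// Hypothetical stand-in for a `marshal_to`-style method: appends the
/// encoded form of `payload` to a caller-provided buffer instead of
/// returning a freshly allocated one.
fn marshal_into(payload: &[u8], out: &mut Vec<u8>) -> usize {
    // Toy wire format (an assumption): 2-byte big-endian length, then the payload.
    out.extend_from_slice(&(payload.len() as u16).to_be_bytes());
    out.extend_from_slice(payload);
    out.len()
}

/// Encodes every payload through one reusable buffer and hands each
/// encoded packet to `sink`. Returns the buffer so it can be reused again.
fn send_all(payloads: &[&[u8]], mut sink: impl FnMut(&[u8])) -> Vec<u8> {
    let mut buf = Vec::with_capacity(1500);
    for p in payloads {
        buf.clear(); // drops the contents but keeps the allocation
        marshal_into(p, &mut buf);
        sink(&buf);
    }
    buf
}
```

Calling `send_all(&[&b"ab"[..], &b"xyz"[..]], |pkt| { /* write pkt to the socket */ })` encodes both payloads into the same allocation.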
2 changes: 1 addition & 1 deletion sctp/src/chunk/chunk_payload_data.rs
@@ -233,7 +233,7 @@ impl Chunk for ChunkPayloadData {
         writer.put_u16(self.stream_identifier);
         writer.put_u16(self.stream_sequence_number);
         writer.put_u32(self.payload_type as u32);
-        writer.extend(self.user_data.clone());
+        writer.extend_from_slice(&self.user_data);

         Ok(writer.len())
     }
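The change above swaps an owned clone for a borrowed slice. A minimal sketch of the difference, using `Vec<u8>` as a stand-in for the crate's buffer types (both function names are hypothetical):

```rust
/// Appends `user_data` by making an owned copy first and then consuming it.
/// With `Vec<u8>` the copy is a full temporary allocation.
fn append_with_clone(out: &mut Vec<u8>, user_data: &[u8]) {
    out.extend(user_data.to_vec());
}

/// Appends `user_data` by borrowing it: one bulk copy, no temporary value.
fn append_from_slice(out: &mut Vec<u8>, user_data: &[u8]) {
    out.extend_from_slice(user_data);
}
```

Both functions produce the same bytes. Note that cloning a `bytes::Bytes` only bumps a reference count rather than copying the data, so the saving in the real code is likely smaller than this `Vec` analogy suggests.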
24 changes: 11 additions & 13 deletions sctp/src/packet.rs
@@ -20,7 +20,6 @@ use crate::util::*;

 use crate::chunk::chunk_unknown::ChunkUnknown;
 use bytes::{Buf, BufMut, Bytes, BytesMut};
-use crc::{Crc, CRC_32_ISCSI};
 use std::fmt;

 ///Packet represents an SCTP packet, defined in https://tools.ietf.org/html/rfc4960#section-3
@@ -155,30 +154,29 @@ impl Packet {
         writer.put_u16(self.destination_port);
         writer.put_u32(self.verification_tag);

+        // This is where the checksum will be written
+        let checksum_pos = writer.len();
+        writer.extend_from_slice(&[0, 0, 0, 0]);
+
         // Populate chunks
-        let mut raw = BytesMut::new();
         for c in &self.chunks {
-            let chunk_raw = c.marshal()?;
-            raw.extend(chunk_raw);
+            c.marshal_to(writer)?;

-            let padding_needed = get_padding_size(raw.len());
+            let padding_needed = get_padding_size(writer.len());
             if padding_needed != 0 {
-                raw.extend(vec![0u8; padding_needed]);
+                // padding needed if < 4 because we pad to 4
+                writer.extend_from_slice(&[0u8; 16][..padding_needed]);
Contributor:

Why do we need 16 bytes, then, if the max is 4?

Contributor:

Actually, it's a bit dangerous in the sense that PADDING_MULTIPLE may change, and then this slice would panic at runtime. Why can't we do [0u8; padding_needed]? Because it's slower?

Contributor Author:

> why can't we do [0u8; padding_needed]

Because Rust doesn't allow dynamically sized arrays on the stack, right?

> actually it's a bit dangerous in a sense that PADDING_MULTIPLE may change and this will throw a runtime error

This is a good point, if unlikely, since the SCTP RFC will probably never change in this regard. I can just use PADDING_MULTIPLE as the array size, though, since padding_needed is guaranteed to be < PADDING_MULTIPLE.
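
The suggestion at the end of this thread could look roughly like the sketch below. It is not the code that was merged: `Vec<u8>` stands in for the crate's writer, `pad_to_multiple` is a hypothetical name, and the body of `get_padding_size` is an assumption, since the crate's own implementation is not shown on this page.

```rust
/// SCTP chunks are padded to a multiple of 4 bytes (RFC 4960, section 3.2).
const PADDING_MULTIPLE: usize = 4;

/// Assumed implementation: number of zero bytes needed to reach the next
/// multiple of `PADDING_MULTIPLE`. The result is always < PADDING_MULTIPLE.
fn get_padding_size(len: usize) -> usize {
    (PADDING_MULTIPLE - (len % PADDING_MULTIPLE)) % PADDING_MULTIPLE
}

/// Pads `out` with zero bytes up to the next multiple of `PADDING_MULTIPLE`.
fn pad_to_multiple(out: &mut Vec<u8>) {
    let padding_needed = get_padding_size(out.len());
    // `[0u8; padding_needed]` would not compile, because array lengths must
    // be compile-time constants. A constant-size array sliced at runtime
    // works, and the slice stays in bounds because
    // padding_needed < PADDING_MULTIPLE.
    out.extend_from_slice(&[0u8; PADDING_MULTIPLE][..padding_needed]);
}
```

Sizing the array with `PADDING_MULTIPLE` keeps the array length and the bound on `padding_needed` tied to the same constant, which addresses the concern about a hard-coded 16.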

             }
         }
-        let raw = raw.freeze();

-        let hasher = Crc::<u32>::new(&CRC_32_ISCSI);
-        let mut digest = hasher.digest();
+        let mut digest = ISCSI_CRC.digest();
         digest.update(writer);
-        digest.update(&FOUR_ZEROES);
-        digest.update(&raw[..]);
         let checksum = digest.finalize();

         // Checksum is already in BigEndian
         // Using LittleEndian stops it from being flipped
-        writer.put_u32_le(checksum);
-        writer.extend(raw);
+        let checksum_place = &mut writer[checksum_pos..checksum_pos + 4];
+        checksum_place.copy_from_slice(&checksum.to_le_bytes());

         Ok(writer.len())
     }
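The new `marshal_to` reserves four zero bytes for the checksum, writes the chunks straight into the same buffer, and then patches the checksum in place. A self-contained sketch of that pattern follows. It is a simplification under stated assumptions: `Vec<u8>` replaces the crate's writer, `marshal_packet` is a hypothetical function taking an already-encoded `body`, and a plain bitwise CRC-32C replaces the `crc` crate's digest.

```rust
/// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            let mask = (crc & 1).wrapping_neg(); // all ones if the low bit is set
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc
}

/// Appends a simplified SCTP common header followed by `body` to `out`
/// and returns the resulting buffer length.
fn marshal_packet(src_port: u16, dst_port: u16, tag: u32, body: &[u8], out: &mut Vec<u8>) -> usize {
    let start = out.len();
    out.extend_from_slice(&src_port.to_be_bytes());
    out.extend_from_slice(&dst_port.to_be_bytes());
    out.extend_from_slice(&tag.to_be_bytes());

    // Reserve the checksum field as zeroes and remember where it is.
    let checksum_pos = out.len();
    out.extend_from_slice(&[0, 0, 0, 0]);

    out.extend_from_slice(body);

    // The checksum covers the packet with the checksum field zeroed,
    // which is exactly what the buffer holds at this point.
    let checksum = crc32c(&out[start..]);
    out[checksum_pos..checksum_pos + 4].copy_from_slice(&checksum.to_le_bytes());
    out.len()
}
```

Unlike the diff, which feeds the whole `writer` to the digest, this sketch checksums only the bytes from `start` onward, so it also works when `out` already holds earlier data.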
5 changes: 3 additions & 2 deletions sctp/src/util.rs
@@ -11,10 +11,11 @@ pub(crate) fn get_padding_size(len: usize) -> usize {
 /// We need to use it for the checksum and don't want to allocate/clear each time.
 pub(crate) static FOUR_ZEROES: Bytes = Bytes::from_static(&[0, 0, 0, 0]);

+pub(crate) const ISCSI_CRC: Crc<u32> = Crc::<u32>::new(&CRC_32_ISCSI);
+
 /// Fastest way to do a crc32 without allocating.
 pub(crate) fn generate_packet_checksum(raw: &Bytes) -> u32 {
-    let hasher = Crc::<u32>::new(&CRC_32_ISCSI);
-    let mut digest = hasher.digest();
+    let mut digest = ISCSI_CRC.digest();
     digest.update(&raw[0..8]);
     digest.update(&FOUR_ZEROES[..]);
     digest.update(&raw[12..]);
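This hunk hoists the CRC setup into a shared item and keeps the three-part digest that skips the stored checksum. The sketch below mirrors both ideas with the standard library only. The names `make_table`, `CRC32C_TABLE`, `update`, and `packet_checksum` are hypothetical, and the table is a `static` built by a `const fn` rather than the `crc` crate's `Crc<u32>`.

```rust
/// Builds the 256-entry CRC-32C lookup table. As a `const fn` it can be
/// evaluated at compile time instead of on every checksum call.
const fn make_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    let mut i = 0;
    while i < 256 {
        let mut crc = i as u32;
        let mut bit = 0;
        while bit < 8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
            bit += 1;
        }
        table[i] = crc;
        i += 1;
    }
    table
}

/// Shared table, computed once by the compiler.
static CRC32C_TABLE: [u32; 256] = make_table();

/// Feeds `data` into a running (pre-inversion) CRC state.
fn update(mut state: u32, data: &[u8]) -> u32 {
    for &b in data {
        state = (state >> 8) ^ CRC32C_TABLE[((state ^ u32::from(b)) & 0xFF) as usize];
    }
    state
}

/// Checksums a received packet as if bytes 8..12 (the checksum field) were
/// zero, without copying or mutating `raw`. Panics if `raw` is shorter
/// than 12 bytes.
fn packet_checksum(raw: &[u8]) -> u32 {
    let mut state = !0u32;
    state = update(state, &raw[0..8]);
    state = update(state, &[0, 0, 0, 0]);
    state = update(state, &raw[12..]);
    !state
}
```

Feeding the digest in three slices gives the same result as zeroing the checksum field in a copy of the packet, which is what lets the verification path avoid an allocation.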