Improve performance when converting python bytes/bytearray to `Vec<u8>`

Hi all, I noticed a performance issue when extracting a `PyBytes` or a `PyByteArray` object into a `Vec<u8>`.

This is an issue one can easily run into without realizing it. Here's a scenario: let's say we'd like to expose a simple checksum function:
```rust
#[pyfunction]
fn checksum(data: &[u8]) -> PyResult<u8> {
    let mut result = 0;
    for x in data {
        result ^= x;
    }
    Ok(result)
}
```
See how it performs against the equivalent Python implementation, processing 1 MB a hundred times:
```
2.65s call     test_checksum.py::test_perf[py-bytes]
0.00s call     test_checksum.py::test_perf[rs-bytes]
```
Looks really fast! However, it won't accept a bytearray as an argument:
```python
TypeError: argument 'data': 'bytearray' object cannot be converted to 'PyBytes'
```
So we update our implementation to take a `Vec<u8>` instead:
```rust
#[pyfunction]
fn checksum(data: Vec<u8>) -> PyResult<u8> {
    let mut result = 0;
    for x in data {
        result ^= x;
    }
    Ok(result)
}
```
And now the results:
```
2.61s call     test_checksum.py::test_perf[py-bytearray]
2.55s call     test_checksum.py::test_perf[py-bytes]
1.92s call     test_checksum.py::test_perf[rs-bytearray]
1.87s call     test_checksum.py::test_perf[rs-bytes]
```
It now performs only slightly better than pure Python (about 1.9 s versus 2.6 s), which makes sense if we look at the `FromPyObject` implementation for `Vec<T>`:
https://github.com/PyO3/pyo3/blob/bed4f9d6ee60793b8582e70b9b06cc25b318628d/src/types/sequence.rs#L314-L318

The `bytes`/`bytearray` object is iterated and each item (i.e. a Python integer) is separately extracted into a `u8`.
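To make the cost difference concrete, here is a minimal plain-Rust model of the two strategies. It does not use PyO3: `PyIntModel`, `extract_per_item` and `extract_bulk` are hypothetical names invented for this illustration, and the real per-item path additionally pays for creating and type-checking a Python integer object on every iteration.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a boxed Python integer (not a PyO3 type).
struct PyIntModel(i64);

/// Models the generic sequence path: every item is visited and converted
/// individually, and the whole extraction fails if one value does not fit
/// in a `u8`.
fn extract_per_item(items: &[PyIntModel]) -> Result<Vec<u8>, String> {
    let mut out = Vec::with_capacity(items.len());
    for item in items {
        let byte = u8::try_from(item.0)
            .map_err(|_| format!("{} is out of range for u8", item.0))?;
        out.push(byte);
    }
    Ok(out)
}

/// Models the fast path: the source is already a contiguous byte buffer,
/// so a single bulk copy is enough and no per-item check is needed.
fn extract_bulk(buffer: &[u8]) -> Vec<u8> {
    buffer.to_vec()
}
```

Both functions produce the same `Vec<u8>` for valid input; the difference is that the first does one fallible conversion per element, while the second is a single copy of the underlying buffer.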

This could be fixed by specializing the extraction logic for `Vec<u8>` and using dedicated methods such as `PyBytes::as_bytes().to_vec()` and `PyByteArray::to_vec()`. Here's a possible patch:
https://gist.github.com/vxgmichel/367e01e8504cb9c9e700a22525e8b68d
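The actual patch is in the gist above. As a rough illustration of the dispatch idea only, the sketch below models it in plain Rust without PyO3: `PyObjectModel` and `extract_vec_u8` are hypothetical names, and the enum variants merely stand in for the runtime type checks that real code would perform on a Python object.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for a Python object (not a PyO3 type).
enum PyObjectModel {
    Bytes(Vec<u8>),
    ByteArray(Vec<u8>),
    List(Vec<i64>),
}

/// Try the byte-buffer fast paths first, then fall back to the generic
/// per-item conversion for any other sequence.
fn extract_vec_u8(obj: &PyObjectModel) -> Result<Vec<u8>, String> {
    match obj {
        // Fast path: the data is already contiguous bytes, copy it in bulk.
        PyObjectModel::Bytes(buf) | PyObjectModel::ByteArray(buf) => Ok(buf.to_vec()),
        // Fallback: convert each item separately, failing on the first
        // value that does not fit in a `u8`.
        PyObjectModel::List(items) => items
            .iter()
            .map(|&v| u8::try_from(v).map_err(|_| format!("{} is out of range for u8", v)))
            .collect(),
    }
}
```

The fallback arm keeps the existing behaviour for lists and other sequences, so only `bytes` and `bytearray` inputs would take the new fast path.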

With this patch applied, the performance is now similar to what we had with the `&[u8]` slice:
```
2.70s call     test_checksum.py::test_perf[py-bytearray]
2.65s call     test_checksum.py::test_perf[py-bytes]
0.00s call     test_checksum.py::test_perf[rs-bytes]
0.00s call     test_checksum.py::test_perf[rs-bytearray]
```

For reference, the lines linked above (`src/types/sequence.rs`, L314-L318):
```rust
let mut v = Vec::with_capacity(seq.len().unwrap_or(0));
for item in seq.iter()? {
    v.push(item?.extract::<T>()?);
}
Ok(v)
```

Improve performance when converting python bytes/bytearray to `Vec<u8>` #2888

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Improve performance when converting python bytes/bytearray to Vec<u8> #2888

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Improve performance when converting python bytes/bytearray to `Vec<u8>` #2888