Commit fbd38f3

SIMD. fix list (+6 squashed commits)
Squashed commits: [c45b871] update for Pillow-SIMD 3.4.0 [bedd83f] no alpha compositing in this release [e8fe730] update results for latest version add Skia results [a16ff97] add SIMD changes [82ffbd6] fix readme (+4 squashed commits) Squashed commits: [85677f9] fix error [f44ebb1] update results for unrolled implementation [83968c3] fix #4 [cd73c51] update link (+11 squashed commits) Squashed commits: [5882178] correct spelling [a0e5956] Why Pillow-SIMD is even faster [108e72e] Why Pillow itself is so fast [e8eeda1] spelling fixes [e816e9c] spelling [d2eefef] methodology, why not contributed [2e55786] installation and conclusion [9f6415e] more info [67e55b7] more benchmarks test files [471d4c5] remove spaces [904d89d] add performance tests [4fe17fe] simple readme SIMD. clarify Following fork SIMD. update readme SIMD. update versions in readme SIMD. Changes
1 parent d209b7c commit fbd38f3

2 files changed: 266 additions, 104 deletions

CHANGES.SIMD.rst

Lines changed: 89 additions & 0 deletions
@@ -0,0 +1,89 @@
1+
Changelog (Pillow-SIMD)
2+
=======================
3+
4+
3.4.1.post1
5+
-----------
6+
7+
- Critical memory error for some combinations of source/destination
8+
sizes is fixed.
9+
10+
3.4.1.post0
11+
-----------
12+
13+
- A lot of optimizations in resampling including 16-bit
14+
intermediate color representation and heavy unrolling.
15+
16+
3.3.2.post0
17+
-----------
18+
19+
- Maintenance release
20+
21+
3.3.0.post2
22+
-----------
23+
24+
- Fixed error in RGBa -> RGBA conversion
25+
26+
3.3.0.post1
27+
-----------
28+
29+
Alpha compositing
30+
~~~~~~~~~~~~~~~~~
31+
32+
- SSE4 and AVX2 fixed-point full loading implementation.
33+
Up to 4.6x faster.
34+
35+
3.3.0.post0
36+
-----------
37+
38+
Resampling
39+
~~~~~~~~~~
40+
41+
- SSE4 and AVX2 fixed-point full loading horizontal pass.
42+
- SSE4 and AVX2 fixed-point full loading vertical pass.
43+
44+
Conversion
45+
~~~~~~~~~~
46+
47+
- RGBA -> RGBa SSE4 and AVX2 fixed-point full loading implementations.
48+
Up to 2.6x faster.
49+
- RGBa -> RGBA AVX2 implementation using gather instructions.
50+
Up to 5x faster.
51+
52+
53+
3.2.0.post3
54+
-----------
55+
56+
Resampling
57+
~~~~~~~~~~
58+
59+
- SSE4 and AVX2 float full loading horizontal pass.
60+
- SSE4 float full loading vertical pass.
61+
62+
63+
3.2.0.post2
64+
-----------
65+
66+
Resampling
67+
~~~~~~~~~~
68+
69+
- SSE4 and AVX2 float full loading horizontal pass.
70+
- SSE4 float per-pixel loading vertical pass.
71+
72+
73+
2.9.0.post1
74+
-----------
75+
76+
Resampling
77+
~~~~~~~~~~
78+
79+
- SSE4 and AVX2 float per-pixel loading horizontal pass.
80+
- SSE4 float per-pixel loading vertical pass.
81+
- SSE4: Up to 2x for downscaling. Up to 3.5x for upscaling.
82+
- AVX2: Up to 2.7x for downscaling. Up to 3.5x for upscaling.
83+
84+
85+
Box blur
86+
~~~~~~~~
87+
88+
- Simple SSE4 fixed-point implementations with per-pixel loading.
89+
- Up to 2.1x faster.

README.md

Lines changed: 177 additions & 104 deletions
@@ -1,104 +1,177 @@
1-
<p align="center">
2-
<img width="248" height="250" src="https://raw.githubusercontent.com/python-pillow/pillow-logo/master/pillow-logo-248x250.png" alt="Pillow logo">
3-
</p>
4-
5-
# Pillow
6-
7-
## Python Imaging Library (Fork)
8-
9-
Pillow is the friendly PIL fork by [Alex Clark and
10-
Contributors](https://github.com/python-pillow/Pillow/graphs/contributors).
11-
PIL is the Python Imaging Library by Fredrik Lundh and Contributors.
12-
As of 2019, Pillow development is
13-
[supported by Tidelift](https://tidelift.com/subscription/pkg/pypi-pillow?utm_source=pypi-pillow&utm_medium=readme&utm_campaign=enterprise).
14-
15-
<table>
16-
<tr>
17-
<th>docs</th>
18-
<td>
19-
<a href="https://pillow.readthedocs.io/?badge=latest"><img
20-
alt="Documentation Status"
21-
src="https://readthedocs.org/projects/pillow/badge/?version=latest"></a>
22-
</td>
23-
</tr>
24-
<tr>
25-
<th>tests</th>
26-
<td>
27-
<a href="https://travis-ci.org/python-pillow/Pillow"><img
28-
alt="Travis CI build status (Linux)"
29-
src="https://img.shields.io/travis/python-pillow/Pillow/master.svg?label=Linux%20build"></a>
30-
<a href="https://travis-ci.org/python-pillow/pillow-wheels"><img
31-
alt="Travis CI build status (macOS)"
32-
src="https://img.shields.io/travis/python-pillow/pillow-wheels/master.svg?label=macOS%20build"></a>
33-
<a href="https://ci.appveyor.com/project/python-pillow/Pillow"><img
34-
alt="AppVeyor CI build status (Windows)"
35-
src="https://img.shields.io/appveyor/build/python-pillow/Pillow/master.svg?label=Windows%20build"></a>
36-
<a href="https://github.com/python-pillow/Pillow/actions?query=workflow%3ALint"><img
37-
alt="GitHub Actions build status (Lint)"
38-
src="https://github.com/python-pillow/Pillow/workflows/Lint/badge.svg"></a>
39-
<a href="https://github.com/python-pillow/Pillow/actions?query=workflow%3ATest"><img
40-
alt="GitHub Actions build status (Test Linux and macOS)"
41-
src="https://github.com/python-pillow/Pillow/workflows/Test/badge.svg"></a>
42-
<a href="https://github.com/python-pillow/Pillow/actions?query=workflow%3A%22Test+Windows%22"><img
43-
alt="GitHub Actions build status (Test Windows)"
44-
src="https://github.com/python-pillow/Pillow/workflows/Test%20Windows/badge.svg"></a>
45-
<a href="https://github.com/python-pillow/Pillow/actions?query=workflow%3A%22Test+Docker%22"><img
46-
alt="GitHub Actions build status (Test Docker)"
47-
src="https://github.com/python-pillow/Pillow/workflows/Test%20Docker/badge.svg"></a>
48-
<a href="https://codecov.io/gh/python-pillow/Pillow"><img
49-
alt="Code coverage"
50-
src="https://codecov.io/gh/python-pillow/Pillow/branch/master/graph/badge.svg"></a>
51-
</td>
52-
</tr>
53-
<tr>
54-
<th>package</th>
55-
<td>
56-
<a href="https://zenodo.org/badge/latestdoi/17549/python-pillow/Pillow"><img
57-
alt="Zenodo"
58-
src="https://zenodo.org/badge/17549/python-pillow/Pillow.svg"></a>
59-
<a href="https://tidelift.com/subscription/pkg/pypi-pillow?utm_source=pypi-pillow&utm_medium=badge"><img
60-
alt="Tidelift"
61-
src="https://tidelift.com/badges/package/pypi/Pillow?style=flat"></a>
62-
<a href="https://pypi.org/project/Pillow/"><img
63-
alt="Newest PyPI version"
64-
src="https://img.shields.io/pypi/v/pillow.svg"></a>
65-
<a href="https://pypi.org/project/Pillow/"><img
66-
alt="Number of PyPI downloads"
67-
src="https://img.shields.io/pypi/dm/pillow.svg"></a>
68-
</td>
69-
</tr>
70-
<tr>
71-
<th>social</th>
72-
<td>
73-
<a href="https://gitter.im/python-pillow/Pillow?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge"><img
74-
alt="Join the chat at https://gitter.im/python-pillow/Pillow"
75-
src="https://badges.gitter.im/python-pillow/Pillow.svg"></a>
76-
<a href="https://twitter.com/PythonPillow"><img
77-
alt="Follow on https://twitter.com/PythonPillow"
78-
src="https://img.shields.io/badge/tweet-on%20Twitter-00aced.svg"></a>
79-
</td>
80-
</tr>
81-
</table>
82-
83-
## Overview
84-
85-
The Python Imaging Library adds image processing capabilities to your Python interpreter.
86-
87-
This library provides extensive file format support, an efficient internal representation, and fairly powerful image processing capabilities.
88-
89-
The core image library is designed for fast access to data stored in a few basic pixel formats. It should provide a solid foundation for a general image processing tool.
90-
91-
## More Information
92-
93-
- [Documentation](https://pillow.readthedocs.io/)
94-
- [Installation](https://pillow.readthedocs.io/en/latest/installation.html)
95-
- [Handbook](https://pillow.readthedocs.io/en/latest/handbook/index.html)
96-
- [Contribute](https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md)
97-
- [Issues](https://github.com/python-pillow/Pillow/issues)
98-
- [Pull requests](https://github.com/python-pillow/Pillow/pulls)
99-
- [Changelog](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst)
100-
- [Pre-fork](https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst#pre-fork)
101-
102-
## Report a Vulnerability
103-
104-
To report a security vulnerability, please follow the procedure described in the [Tidelift security policy](https://tidelift.com/docs/security).
1+
# Pillow-SIMD
2+
3+
Pillow-SIMD is a "following" fork of Pillow (which is itself a fork of PIL).
4+
"Following" means that each Pillow-SIMD version is a 100% compatible
5+
drop-in replacement for the Pillow release with the same version number.
6+
For example, `Pillow-SIMD 3.2.0.post3` is a drop-in replacement for
7+
`Pillow 3.2.0`, and `Pillow-SIMD 3.3.3.post0` for `Pillow 3.3.3`.
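
The version mapping above can be sketched in a few lines of Python. This is only an illustration: `base_pillow_version` is a hypothetical helper, not part of either package, and it assumes the trailing `.postN` component is the only difference between the two version strings.

```python
def base_pillow_version(simd_version: str) -> str:
    """Return the Pillow version that a Pillow-SIMD version string follows.

    Assumption: the Pillow-SIMD version is the Pillow version plus a
    trailing ".postN" component, as in the examples above.
    """
    base, sep, post = simd_version.rpartition(".post")
    if not sep or not post.isdigit():
        raise ValueError(f"not a Pillow-SIMD style version: {simd_version!r}")
    return base


print(base_pillow_version("3.2.0.post3"))  # 3.2.0
print(base_pillow_version("3.3.3.post0"))  # 3.3.3
```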
8+
9+
For more information about the original Pillow, please
10+
[read the documentation][original-docs],
11+
[check the changelog][original-changelog] and
12+
[find out how to contribute][original-contribute].
13+
14+
15+
## Why SIMD
16+
17+
There are many ways to improve the performance of image processing.
18+
You can use a better algorithm for the same task, write a better
19+
implementation of the current algorithm, or use more processing unit
20+
resources. It is ideal when you can simply switch to a more efficient algorithm, as
21+
when Gaussian blur based on convolutions [was replaced][gaussian-blur-changes]
22+
by sequential box filters. But the number of such improvements is very limited.
23+
It is also very tempting to use more processing unit resources
24+
(via parallelization) when they are available. But it is handier just
25+
to make things faster on the same resources, and that is where SIMD works best.
26+
27+
SIMD stands for "single instruction, multiple data". It is a way to perform
28+
the same operation on a large amount of homogeneous data.
29+
Modern CPUs have various SIMD instruction sets, such as
30+
MMX, SSE-SSE4, AVX, AVX2, AVX512, and NEON.
31+
32+
Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default)
33+
and AVX2 support.
34+
35+
36+
## Status
37+
38+
[![Uploadcare][uploadcare.logo]][uploadcare.com]
39+
40+
Pillow-SIMD can be used in production. It has been running on
41+
[Uploadcare][uploadcare.com] servers for more than a year.
42+
Uploadcare is a SaaS for storing and processing images in the cloud
43+
and the main sponsor of the Pillow-SIMD project.
44+
45+
Currently, the following operations are accelerated:
46+
47+
- Resize (convolution-based resampling): SSE4, AVX2
48+
- Gaussian and box blur: SSE4
49+
- Alpha composition: SSE4, AVX2
50+
- RGBA → RGBa (alpha premultiplication): SSE4, AVX2
51+
- RGBa → RGBA (division by alpha): AVX2
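
The changelog describes the RGBA → RGBa conversion as a fixed-point implementation. As a rough scalar illustration of what that means, the sketch below premultiplies one pixel using integer operations only. The rounding trick is a common technique assumed here for illustration; it is not taken from the Pillow-SIMD source, and both function names are hypothetical.

```python
def premultiply_channel(color: int, alpha: int) -> int:
    """Return round(color * alpha / 255) using integer operations only."""
    t = color * alpha + 128
    # Adding t >> 8 before the final shift replaces the division by 255.
    return (t + (t >> 8)) >> 8


def rgba_to_premultiplied(pixel):
    """Convert one (R, G, B, A) pixel to premultiplied (r, g, b, A)."""
    r, g, b, a = pixel
    return (premultiply_channel(r, a),
            premultiply_channel(g, a),
            premultiply_channel(b, a),
            a)


print(rgba_to_premultiplied((255, 128, 0, 128)))  # (128, 64, 0, 128)
```

A SIMD version would apply the same arithmetic to several channels in one instruction instead of one channel per call.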
52+
53+
See [CHANGES](CHANGES.SIMD.rst).
54+
55+
56+
## Benchmarks
57+
58+
The numbers in the table are megapixels of the source RGB 2560x1600
59+
image processed per second. For example, if a 2560x1600 image is resized
60+
in 0.5 seconds, the result is 8.2 Mpx/s.
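
The arithmetic behind that figure can be written out as a short Python sketch. `megapixels_per_second` is a hypothetical helper name used only for this illustration; it is not part of the benchmark script.

```python
def megapixels_per_second(width: int, height: int, seconds: float) -> float:
    """Source megapixels processed per second for one operation."""
    return width * height / 1_000_000 / seconds


# The example from the text: a 2560x1600 source processed in 0.5 s.
print(f"{megapixels_per_second(2560, 1600, 0.5):.1f} Mpx/s")  # 8.2 Mpx/s
```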
61+
62+
- Skia 53
63+
- ImageMagick 6.9.3-8 Q8 x86_64
64+
- Pillow 3.4.1
65+
- Pillow-SIMD 3.4.1.post1
66+
67+
Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53
68+
------------------------|---------|------|-------|----------|----------|--------
69+
**Resize to 16x16** | Bilinear| 41.37| 317.28| 1282.85| 1601.85| 809.49
70+
| Bicubic | 20.58| 174.85| 712.95| 900.65| 453.10
71+
| Lanczos | 14.17| 117.58| 438.60| 544.89| 292.57
72+
**Resize to 320x180** | Bilinear| 29.46| 195.21| 863.40| 1057.81| 592.76
73+
| Bicubic | 15.75| 118.79| 503.75| 504.76| 327.68
74+
| Lanczos | 10.80| 79.59| 312.05| 384.92| 196.92
75+
**Resize to 1920x1200** | Bilinear| 17.80| 68.39| 215.15| 268.29| 192.30
76+
| Bicubic | 9.99| 49.23| 170.41| 210.62| 112.84
77+
| Lanczos | 6.95| 37.71| 130.00| 162.57| 104.76
78+
**Resize to 7712x4352** | Bilinear| 2.54| 8.38| 22.81| 29.17| 20.58
79+
| Bicubic | 1.60| 6.57| 18.23| 23.94| 16.52
80+
| Lanczos | 1.09| 5.20| 14.90| 20.40| 12.05
81+
**Blur** | 1px | 6.60| 16.94| 35.16| |
82+
| 10px | 2.28| 16.94| 35.47| |
83+
| 100px | 0.34| 16.93| 35.53| |
84+
85+
86+
### Some conclusions
87+
88+
Pillow is always faster than ImageMagick, and Pillow-SIMD is
89+
4-5 times faster than Pillow. In general, Pillow-SIMD with AVX2 is always
90+
**16-40 times faster** than ImageMagick and outperforms Skia,
91+
the high-speed graphics library used in Chromium, by up to 2 times.
92+
93+
### Methodology
94+
95+
All tests were performed on Ubuntu 14.04 64-bit running on
96+
an Intel Core i5 4258U CPU with AVX2 support, in a single thread.
97+
98+
ImageMagick performance was measured with the command-line tool `convert` with the
99+
`-verbose` and `-bench` arguments. I use the command line because
100+
I need to test the latest version and this is the easiest way to do that.
101+
102+
All operations produce exactly the same results.
103+
Correspondence between resizing filters:
104+
105+
- PIL.Image.BILINEAR == Triangle
106+
- PIL.Image.BICUBIC == Catrom
107+
- PIL.Image.LANCZOS == Lanczos
108+
109+
In ImageMagick, the Gaussian blur radius is called sigma, and a second
110+
parameter is called radius. In fact, *Gaussian blur* should not need an additional
111+
parameter: if the radius is too small, the result is *not* a
112+
Gaussian blur anymore, and if the radius is large, it gives no
113+
advantage but makes the operation slower. For the test, I set the radius
114+
to sigma × 2.5.
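
As a small sketch of that rule: ImageMagick's blur options take a `{radius}x{sigma}` geometry, so a Pillow blur radius can be turned into the corresponding argument as follows. `imagemagick_blur_geometry` is a hypothetical helper written for this illustration and is not taken from the benchmark script.

```python
def imagemagick_blur_geometry(pillow_radius: float) -> str:
    """Build a "{radius}x{sigma}" geometry for ImageMagick's blur options.

    Pillow's blur radius plays the role of ImageMagick's sigma, and the
    ImageMagick kernel radius is fixed at sigma * 2.5, as in the test.
    """
    sigma = pillow_radius
    radius = sigma * 2.5
    return f"{radius:g}x{sigma:g}"


for px in (1, 10, 100):  # the blur sizes from the benchmark table
    print(imagemagick_blur_geometry(px))
# 2.5x1
# 25x10
# 250x100
```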
115+
116+
The following script was used for testing:
117+
https://gist.github.com/homm/f9b8d8a84a57a7e51f9c2a5828e40e63
118+
119+
120+
## Why Pillow itself is so fast
121+
122+
There are no cheats. High-quality resize and blur methods are used for all
123+
benchmarks. Results are almost pixel-perfect. The only difference is more efficient
124+
algorithms. Resampling in Pillow was rewritten in version 2.7 with
125+
minimal use of floating-point numbers, precomputed coefficients, and
126+
cache-aware transposition. This result was improved in 3.3 and 3.4 with
127+
integer-only arithmetic and other optimizations.
128+
129+
130+
## Why Pillow-SIMD is even faster
131+
132+
Because of SIMD, of course. But that is not all: heavy loop unrolling and
133+
specific instructions that are not available in scalar code help too.
134+
135+
136+
## Why not contribute SIMD to the original Pillow
137+
138+
Well, that's not simple. First of all, Pillow supports a large number
139+
of architectures, not only x86. But even for x86 platforms, Pillow is often
140+
distributed via precompiled binaries. To integrate SIMD into precompiled binaries,
141+
we need to do runtime checks of CPU capabilities.
142+
To compile the code with runtime checks, we need to pass the `-mavx2` option
143+
to the compiler. But with that option, the compiler will inject AVX instructions
144+
even into SSE functions, because every SSE instruction has an AVX equivalent.
145+
So there is no easy way to compile such a library, especially with setuptools.
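
To make the idea of a runtime capability check concrete, here is a minimal Python sketch that picks a code path from the `flags` line of Linux's `/proc/cpuinfo`. It is an assumption-laden illustration only: `best_simd_level` is a hypothetical name, the approach is Linux/x86-specific, and a compiled library would query the CPU directly rather than parse text.

```python
def best_simd_level(cpuinfo_text: str) -> str:
    """Pick the widest supported instruction set from /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            flags.update(value.split())
    if "avx2" in flags:
        return "avx2"
    if "sse4_1" in flags:
        return "sse4"
    return "scalar"


# A shortened, made-up sample of the relevant cpuinfo lines.
sample = "processor\t: 0\nflags\t\t: fpu sse sse2 ssse3 sse4_1 sse4_2 avx avx2\n"
print(best_simd_level(sample))  # avx2
```

The difficulty described above is not the check itself but compiling the SSE4 and AVX2 code paths into one binary without the AVX2 flag affecting the SSE4 functions.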
146+
147+
148+
## Installation
149+
150+
In general, you just need to run `pip install pillow-simd` as usual, and if you
151+
are using an SSE4-capable CPU, everything should run smoothly.
152+
Do not forget to remove the original Pillow package first.
153+

154+
If you want the AVX2-enabled version, you need to pass an additional flag to the C
155+
compiler. The easiest way to do that is to define the `CC` variable during compilation.
156+
157+
```bash
158+
$ pip uninstall pillow
159+
$ CC="cc -mavx2" pip install -U --force-reinstall pillow-simd
160+
```
161+
162+
163+
## Contributing to Pillow-SIMD
164+
165+
Pillow-SIMD and Pillow are two separate projects.
166+
Please submit bugs and improvements not related to SIMD to
167+
the [original Pillow][original-issues]. All bugs and fixes in Pillow
168+
will appear in the next Pillow-SIMD version automatically.
169+
170+
171+
[original-docs]: http://pillow.readthedocs.io/
172+
[original-issues]: https://github.com/python-pillow/Pillow/issues/new
173+
[original-changelog]: https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst
174+
[original-contribute]: https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md
175+
[gaussian-blur-changes]: http://pillow.readthedocs.io/en/3.2.x/releasenotes/2.7.0.html#gaussian-blur-and-unsharp-mask
176+
[uploadcare.com]: https://uploadcare.com/?utm_source=github&utm_medium=description&utm_campaign=pillow-simd
177+
[uploadcare.logo]: https://ucarecdn.com/dc4b8363-e89f-402f-8ea8-ce606664069c/-/preview/
