|
| 1 | +# Pillow-SIMD |
| 2 | + |
| 3 | +Pillow-SIMD is a "following" fork of Pillow (which is itself a fork of PIL). |
| 4 | +"Following" means that each Pillow-SIMD version is a 100% compatible |
| 5 | +drop-in replacement for the Pillow release with the same version number. |
| 6 | +For example, `Pillow-SIMD 3.2.0.post3` is a drop-in replacement for |
| 7 | +`Pillow 3.2.0`, and `Pillow-SIMD 3.3.3.post0` for `Pillow 3.3.3`. |
| 8 | + |
| 9 | +For more information about the original Pillow, please |
| 10 | +[read the documentation][original-docs], |
| 11 | +[check the changelog][original-changelog] and |
| 12 | +[find out how to contribute][original-contribute]. |
| 13 | + |
| 14 | + |
| 15 | +## Why SIMD |
| 16 | + |
| 17 | +There are many ways to improve the performance of image processing. |
| 18 | +You can use better algorithms for the same task, write a better |
| 19 | +implementation of the current algorithms, or use more processing unit |
| 20 | +resources. It is ideal when you can simply switch to a more efficient algorithm, as |
| 21 | +when Gaussian blur based on convolutions [was replaced][gaussian-blur-changes] |
| 22 | +by sequential box filters. But the number of such improvements is very limited. |
| 23 | +It is also very tempting to use more processing unit resources |
| 24 | +(via parallelization) when they are available. But it is handier just |
| 25 | +to make things faster on the same resources. And that is where SIMD works better. |
| 26 | + |
| 27 | +SIMD stands for "single instruction, multiple data". It is a way to perform |
| 28 | +the same operation on a large amount of homogeneous data. |
| 29 | +Modern CPUs have various SIMD instruction sets, such as |
| 30 | +MMX, SSE-SSE4, AVX, AVX2, AVX512, and NEON. |
| 31 | + |
| 32 | +Currently, Pillow-SIMD can be [compiled](#installation) with SSE4 (default) |
| 33 | +and AVX2 support. |
| 34 | + |
| 35 | + |
| 36 | +## Status |
| 37 | + |
| 38 | +[![Uploadcare][uploadcare.logo]][uploadcare.com] |
| 39 | + |
| 40 | +Pillow-SIMD can be used in production. Pillow-SIMD has been operating on |
| 41 | +[Uploadcare][uploadcare.com] servers for more than 1 year. |
| 42 | +Uploadcare is a SaaS for storing and processing images in the cloud |
| 43 | +and the main sponsor of the Pillow-SIMD project. |
| 44 | + |
| 45 | +Currently, the following operations are accelerated: |
| 46 | + |
| 47 | +- Resize (convolution-based resampling): SSE4, AVX2 |
| 48 | +- Gaussian and box blur: SSE4 |
| 49 | +- Alpha composition: SSE4, AVX2 |
| 50 | +- RGBA → RGBa (alpha premultiplication): SSE4, AVX2 |
| 51 | +- RGBa → RGBA (division by alpha): AVX2 |
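For readers unfamiliar with the two alpha conversions in the list above, here is a minimal scalar sketch of what they compute for a single 8-bit pixel. The function names and the rounding rule are assumptions made for illustration; this is not Pillow's or Pillow-SIMD's actual code, which may round differently.

```python
def premultiply(pixel):
    """RGBA -> RGBa: scale each color channel by alpha / 255."""
    r, g, b, a = pixel
    # Integer arithmetic with round-to-nearest (an illustrative choice).
    return tuple((c * a + 127) // 255 for c in (r, g, b)) + (a,)


def unpremultiply(pixel):
    """RGBa -> RGBA: divide each color channel by alpha / 255."""
    r, g, b, a = pixel
    if a == 0:
        # A fully transparent pixel carries no recoverable color.
        return (0, 0, 0, 0)
    return tuple(min(255, (c * 255 + a // 2) // a) for c in (r, g, b)) + (a,)


half_transparent_orange = (255, 128, 0, 128)
premultiplied = premultiply(half_transparent_orange)
restored = unpremultiply(premultiplied)
```

The SIMD versions apply the same kind of arithmetic to several pixels per instruction instead of one channel at a time.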
| 52 | + |
| 53 | +See [CHANGES](CHANGES.SIMD.rst). |
| 54 | + |
| 55 | + |
| 56 | +## Benchmarks |
| 57 | + |
| 58 | +The numbers in the table are megapixels of the source RGB 2560x1600 |
| 59 | +image processed per second. For example, if resizing a 2560x1600 image takes |
| 60 | +0.5 seconds, the result is 8.2 Mpx/s. |
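The throughput figure is simply the source pixel count divided by the elapsed time. A small sketch of that arithmetic, with a helper name chosen here for illustration:

```python
def megapixels_per_second(width, height, seconds):
    """Throughput in Mpx/s for processing one width x height source image."""
    return width * height / 1_000_000 / seconds


# 2560 * 1600 = 4,096,000 pixels = 4.096 Mpx; 4.096 / 0.5 s = 8.192 Mpx/s
throughput = megapixels_per_second(2560, 1600, 0.5)
```

Rounded to one decimal place, this gives the 8.2 Mpx/s quoted above.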
| 61 | + |
| 62 | +- Skia 53 |
| 63 | +- ImageMagick 6.9.3-8 Q8 x86_64 |
| 64 | +- Pillow 3.4.1 |
| 65 | +- Pillow-SIMD 3.4.1.post1 |
| 66 | + |
| 67 | +Operation | Filter | IM | Pillow| SIMD SSE4| SIMD AVX2| Skia 53 |
| 68 | +------------------------|---------|------|-------|----------|----------|-------- |
| 69 | +**Resize to 16x16** | Bilinear| 41.37| 317.28| 1282.85| 1601.85| 809.49 |
| 70 | + | Bicubic | 20.58| 174.85| 712.95| 900.65| 453.10 |
| 71 | + | Lanczos | 14.17| 117.58| 438.60| 544.89| 292.57 |
| 72 | +**Resize to 320x180** | Bilinear| 29.46| 195.21| 863.40| 1057.81| 592.76 |
| 73 | + | Bicubic | 15.75| 118.79| 503.75| 504.76| 327.68 |
| 74 | + | Lanczos | 10.80| 79.59| 312.05| 384.92| 196.92 |
| 75 | +**Resize to 1920x1200** | Bilinear| 17.80| 68.39| 215.15| 268.29| 192.30 |
| 76 | + | Bicubic | 9.99| 49.23| 170.41| 210.62| 112.84 |
| 77 | + | Lanczos | 6.95| 37.71| 130.00| 162.57| 104.76 |
| 78 | +**Resize to 7712x4352** | Bilinear| 2.54| 8.38| 22.81| 29.17| 20.58 |
| 79 | + | Bicubic | 1.60| 6.57| 18.23| 23.94| 16.52 |
| 80 | + | Lanczos | 1.09| 5.20| 14.90| 20.40| 12.05 |
| 81 | +**Blur** | 1px | 6.60| 16.94| 35.16| | |
| 82 | + | 10px | 2.28| 16.94| 35.47| | |
| 83 | + | 100px | 0.34| 16.93| 35.53| | |
| 84 | + |
| 85 | + |
| 86 | +### Some conclusions |
| 87 | + |
| 88 | +Pillow is always faster than ImageMagick. And Pillow-SIMD is up to |
| 89 | +4-5 times faster than Pillow. In general, Pillow-SIMD with AVX2 is |
| 90 | +**16-40 times faster** than ImageMagick and outperforms Skia, |
| 91 | +a high-speed graphics library used in Chromium, by up to 2 times. |
| 92 | + |
| 93 | +### Methodology |
| 94 | + |
| 95 | +All tests were performed on Ubuntu 14.04 64-bit running on an |
| 96 | +Intel Core i5 4258U CPU with AVX2 support, in a single thread. |
| 97 | + |
| 98 | +ImageMagick performance was measured with the command-line tool `convert` with |
| 99 | +the `-verbose` and `-bench` arguments. I use the command line because |
| 100 | +I need to test the latest version and this is the easiest way to do that. |
| 101 | + |
| 102 | +All operations produce exactly the same results. |
| 103 | +Resizing filter correspondence (Pillow == ImageMagick): |
| 104 | + |
| 105 | +- PIL.Image.BILINEAR == Triangle |
| 106 | +- PIL.Image.BICUBIC == Catrom |
| 107 | +- PIL.Image.LANCZOS == Lanczos |
| 108 | + |
| 109 | +In ImageMagick, the radius of Gaussian blur is called sigma, and the second |
| 110 | +parameter is called radius. In fact, there should be no additional parameters |
| 111 | +for *Gaussian blur*, because if the radius is too small, this is *not* |
| 112 | +Gaussian blur anymore. And if the radius is big, this does not give any |
| 113 | +advantage but makes the operation slower. For the test, I set the radius |
| 114 | +to sigma × 2.5. |
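ImageMagick's blur options take a geometry argument of the form `{radius}x{sigma}`. The sketch below shows how such an argument could be derived from the rule above (radius = sigma × 2.5). The helper name is hypothetical, and the exact option used in the benchmark is not stated here.

```python
def imagemagick_blur_geometry(sigma, radius_factor=2.5):
    """Build a '{radius}x{sigma}' geometry string with radius = sigma * factor."""
    radius = sigma * radius_factor
    # ':g' drops a trailing '.0', so 25.0 is rendered as '25'.
    return f"{radius:g}x{sigma:g}"


# The blur sizes from the benchmark table: 1px, 10px and 100px.
geometries = [imagemagick_blur_geometry(s) for s in (1, 10, 100)]
```

The resulting string would be passed to a blur option, for example `-gaussian-blur 25x10`.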
| 115 | + |
| 116 | +The following script was used for testing: |
| 117 | +https://gist.github.com/homm/f9b8d8a84a57a7e51f9c2a5828e40e63 |
| 118 | + |
| 119 | + |
| 120 | +## Why Pillow itself is so fast |
| 121 | + |
| 122 | +There are no cheats. High-quality resize and blur methods are used for all |
| 123 | +benchmarks. Results are almost pixel-perfect. The only difference is efficient |
| 124 | +algorithms. Resampling in Pillow was rewritten in version 2.7 with |
| 125 | +minimal usage of floating point numbers, precomputed coefficients and |
| 126 | +cache-aware transposition. This result was improved in 3.3 & 3.4 with |
| 127 | +integer-only arithmetic and other optimizations. |
| 128 | + |
| 129 | + |
| 130 | +## Why Pillow-SIMD is even faster |
| 131 | + |
| 132 | +Because of SIMD, of course. But that is not all: there is also heavy loop unrolling |
| 133 | +and the use of specific instructions that are not available in scalar code. |
| 134 | + |
| 135 | + |
| 136 | +## Why not contribute SIMD to the original Pillow |
| 137 | + |
| 138 | +Well, that's not simple. First of all, Pillow supports a large number |
| 139 | +of architectures, not only x86. But even for x86 platforms, Pillow is often |
| 140 | +distributed via precompiled binaries. To integrate SIMD into precompiled binaries |
| 141 | +we need to do runtime checks of CPU capabilities. |
| 142 | +To compile the code with runtime checks we need to pass the `-mavx2` option |
| 143 | +to the compiler. But with that option the compiler will inject AVX instructions |
| 144 | +even into SSE functions, because every SSE instruction has an AVX equivalent. |
| 145 | +So there is no easy way to compile such a library, especially with setuptools. |
| 146 | + |
| 147 | + |
| 148 | +## Installation |
| 149 | + |
| 150 | +In general, you need to run `pip install pillow-simd` as usual, and if you |
| 151 | +are using an SSE4-capable CPU everything should run smoothly. |
| 152 | +Do not forget to remove the original Pillow package first. |
| 153 | + |
| 154 | +If you want the AVX2-enabled version, you need to pass an additional flag to the C |
| 155 | +compiler. The easiest way to do that is to define the `CC` variable during compilation. |
| 156 | + |
| 157 | +```bash |
| 158 | +$ pip uninstall pillow |
| 159 | +$ CC="cc -mavx2" pip install -U --force-reinstall pillow-simd |
| 160 | +``` |
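Before choosing between the default SSE4 build and the AVX2 build, it can help to check what the CPU supports. The following is a minimal sketch that parses `/proc/cpuinfo`-style text, so it is Linux-specific. The helper names are hypothetical, and treating "SSE4" as both `sse4_1` and `sse4_2` is an assumption.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def suggested_build(flags):
    """Pick a build variant; the labels are illustrative, not real options."""
    if "avx2" in flags:
        return "avx2"
    # Assumption: the default SSE4 build needs both SSE4.1 and SSE4.2.
    if {"sse4_1", "sse4_2"} <= flags:
        return "sse4"
    return "unsupported"


# On Linux: suggested_build(cpu_flags(open("/proc/cpuinfo").read()))
sample = "model name : Example CPU\nflags : fpu sse sse2 sse4_1 sse4_2 avx avx2\n"
variant = suggested_build(cpu_flags(sample))
```

If the result is `"avx2"`, the `CC="cc -mavx2"` installation shown above applies; otherwise the plain `pip install pillow-simd` is the safer choice.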
| 161 | + |
| 162 | + |
| 163 | +## Contributing to Pillow-SIMD |
| 164 | + |
| 165 | +Pillow-SIMD and Pillow are two separate projects. |
| 166 | +Please submit bugs and improvements not related to SIMD to the |
| 167 | +[original Pillow][original-issues]. All bugs and fixes in Pillow |
| 168 | +will appear in the next Pillow-SIMD version automatically. |
| 169 | + |
| 170 | + |
| 171 | + [original-docs]: http://pillow.readthedocs.io/ |
| 172 | + [original-issues]: https://github.com/python-pillow/Pillow/issues/new |
| 173 | + [original-changelog]: https://github.com/python-pillow/Pillow/blob/master/CHANGES.rst |
| 174 | + [original-contribute]: https://github.com/python-pillow/Pillow/blob/master/.github/CONTRIBUTING.md |
| 175 | + [gaussian-blur-changes]: http://pillow.readthedocs.io/en/3.2.x/releasenotes/2.7.0.html#gaussian-blur-and-unsharp-mask |
| 176 | + [uploadcare.com]: https://uploadcare.com/?utm_source=github&utm_medium=description&utm_campaign=pillow-simd |
| 177 | + [uploadcare.logo]: https://ucarecdn.com/dc4b8363-e89f-402f-8ea8-ce606664069c/-/preview/ |