toBuffer optimization #306
yikes that does seem pretty slow, I can't remember if we have …

You're right, the JPEG stream is almost 10 times faster — I'm getting ~900ms consistently even for 100% quality. I'll see if I can use JPEG instead of PNG, but I'd still be curious to see if we can optimize PNG somehow. Offtopic: there's …

I'll have a look at this to see if I can understand it; just 'cause it's interesting and because we have some new Buffer stuff added into the next NAN release.

oh, and I can see an additional …
Ouch, not what I expected. See the flame graph. From what I can glean, this is a libpng issue where it's writing the image in many small chunks.

By default, the data comes out of libpng in chunks of these sizes: 4 B, 8 B, and 8192 B. There's a …
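To make the mechanics concrete, here is a minimal sketch (not node-canvas's actual closure code; all names are illustrative) of the write callback that `cairo_surface_write_to_png_stream()` drives. libpng hands it those 4-, 8-, and 8192-byte chunks one at a time, so for a large image the callback, and any per-call reallocation inside it, runs thousands of times:

```c
#include <stdlib.h>
#include <string.h>
#include <cairo.h>

/* Accumulates the PNG byte stream as libpng emits it. */
typedef struct {
    unsigned char *data;
    size_t len;   /* bytes used */
    size_t cap;   /* bytes allocated */
} closure_t;

/* Matches cairo_write_func_t; called once per chunk libpng flushes. */
static cairo_status_t
write_chunk(void *user, const unsigned char *chunk, unsigned int chunk_len)
{
    closure_t *c = user;
    if (c->len + chunk_len > c->cap) {
        /* Linear growth: roughly one realloc per KB written. */
        size_t new_cap = c->len + chunk_len + 1024;
        unsigned char *p = realloc(c->data, new_cap);
        if (p == NULL)
            return CAIRO_STATUS_NO_MEMORY;
        c->data = p;
        c->cap = new_cap;
    }
    memcpy(c->data + c->len, chunk, chunk_len);
    c->len += chunk_len;
    return CAIRO_STATUS_SUCCESS;
}
```

Encoding is kicked off with `cairo_surface_write_to_png_stream(surface, write_chunk, &closure)`, after which `closure.data` holds the complete PNG.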
@rvagg Interesting. Thanks for investigating. What about compression level? Is there a way to change it? Would it affect performance at all or is the bottleneck purely in data output?
I don't believe you can control the compression level via Cairo without reimplementing the PNG writing yourself. In fact, just to confirm that this is a libpng/Cairo problem, I did this in Canvas.cc:

```cpp
//status = cairo_surface_write_to_png_stream(canvas->surface(), toBuffer, &closure);
status = cairo_surface_write_to_png(canvas->surface(), "/tmp/tmp.png");
```

which bypasses the whole stream-closure path and writes straight to a file. Timing the execution shows that it takes roughly the same amount of time as the original.
What about rewriting it to use a buffered dynamic-memory implementation instead of just doing a malloc each time? Each buffer would be backed by a dynamic array that doubles in size each time it runs out of space. It would be possible to give a starting size, or it could default to... 10, like Java's ArrayList.
Well, it sort of is at the moment: it starts off at 1024 bytes and then does a `realloc` in 1 KB increments as it grows.
If a normal page size is 4K, it would make slightly more sense to do things in increments of that, rather than in 1K chunks. However, if the required size is large, it will be much faster to grow exponentially than linearly, as that leads to O(log n) mallocs rather than O(n) mallocs: 4 MB in 1 KB chunks is 4096 allocations, and 4 MB in 4 KB chunks is still 1024, while doubling from 1 KB reaches 4 MB in just 12 steps.

Anyway, LodePNG (http://lodev.org/lodepng/) is a highly customizable PNG library. Could libpng be replaced? It might be possible to write a libpng facade on top, if one does not exist already.
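For illustration, the doubling variant of the earlier callback sketch (again hypothetical code, not the actual patch):

```c
#include <stdlib.h>
#include <string.h>
#include <cairo.h>

/* Same closure_t shape as in the previous sketch. */
typedef struct {
    unsigned char *data;
    size_t len;
    size_t cap;
} closure_t;

/* Doubles capacity whenever a chunk doesn't fit, so writing n bytes
 * triggers O(log n) reallocs instead of O(n). */
static cairo_status_t
write_chunk_doubling(void *user, const unsigned char *chunk, unsigned int chunk_len)
{
    closure_t *c = user;
    if (c->len + chunk_len > c->cap) {
        size_t new_cap = c->cap ? c->cap : 1024;
        while (c->len + chunk_len > new_cap)
            new_cap *= 2;
        unsigned char *p = realloc(c->data, new_cap);
        if (p == NULL)
            return CAIRO_STATUS_NO_MEMORY;
        c->data = p;
        c->cap = new_cap;
    }
    memcpy(c->data + c->len, chunk, chunk_len);
    c->len += chunk_len;
    return CAIRO_STATUS_SUCCESS;
}
```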
I did some checking of my own; the solution seems rather straightforward. Starting with a bit from cairo-png.c:

```c
/**
 * SECTION:cairo-png
 * @Title: PNG Support
 * @Short_Description: Reading and writing PNG images
 * @See_Also: #cairo_surface_t
 *
 * The PNG functions allow reading PNG images into image surfaces, and writing
 * any surface to a PNG file.
 *
 * It is a toy API. It only offers very simple support for reading and
 * writing PNG files, which is sufficient for testing and
 * demonstration purposes. Applications which need more control over
 * the generated PNG file should access the pixel data directly, using
 * cairo_image_surface_get_data() or a backend-specific access
 * function, and process it with another library, e.g. gdk-pixbuf or
 * libpng.
 **/
```

The reason it works in this bad fashion is that it does not use any buffering with the write stream. So, the solution is to use cairo_image_surface_get_data() and feed the pixels through libpng directly. By the way, libpng also lets you plug in your own memory-management functions:

```c
png_structp png_ptr = png_create_read_struct_2
    (PNG_LIBPNG_VER_STRING, (png_voidp)user_error_ptr,
     user_error_fn, user_warning_fn, (png_voidp)
     user_mem_ptr, user_malloc_fn, user_free_fn);
```
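For illustration, a hedged sketch of the approach kkoopa describes: pull the pixels out with `cairo_image_surface_get_data()` and drive libpng directly, which exposes the compression-level and filter knobs that Cairo's toy API hides. This is not the extracted node-canvas code; error handling is minimal, and it writes the premultiplied ARGB32 bytes as-is, whereas a real implementation must un-premultiply the alpha channel the way Cairo's own writer does.

```c
#include <stdio.h>
#include <setjmp.h>
#include <cairo.h>
#include <png.h>

static int write_surface_png(cairo_surface_t *surface, FILE *out, int level)
{
    cairo_surface_flush(surface);  /* make pending drawing hit the pixels */
    unsigned char *pixels = cairo_image_surface_get_data(surface);
    int width  = cairo_image_surface_get_width(surface);
    int height = cairo_image_surface_get_height(surface);
    int stride = cairo_image_surface_get_stride(surface);

    png_structp png = png_create_write_struct(PNG_LIBPNG_VER_STRING,
                                              NULL, NULL, NULL);
    if (png == NULL)
        return -1;
    png_infop info = png_create_info_struct(png);
    if (info == NULL || setjmp(png_jmpbuf(png))) {
        png_destroy_write_struct(&png, &info);
        return -1;
    }

    png_init_io(png, out);
    png_set_compression_level(png, level);    /* zlib level, 0-9    */
    png_set_filter(png, 0, PNG_FILTER_NONE);  /* skip row filtering */

    png_set_IHDR(png, info, width, height, 8,
                 PNG_COLOR_TYPE_RGB_ALPHA, PNG_INTERLACE_NONE,
                 PNG_COMPRESSION_TYPE_DEFAULT, PNG_FILTER_TYPE_DEFAULT);
    png_write_info(png, info);
    png_set_bgr(png);  /* Cairo ARGB32 is B,G,R,A in memory (little-endian) */

    for (int y = 0; y < height; y++)
        png_write_row(png, pixels + (size_t)y * stride);

    png_write_end(png, info);
    png_destroy_write_struct(&png, &info);
    return 0;
}
```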
Nice find! It sounds like a lot of work, are you offering?

Shouldn't be too much. I'll cook something up later today..

PNG writing code has been extracted: kkoopa/node-canvas@9981b0f
The memory stuff and flushing and whatnot do not matter much. Neither does zlib's compression buffer size (8192). It is all about PNG compression being slow as hell, especially with filtering. I added a second argument to Canvas::toBuffer, which allows setting the zlib compression level (0-9), the default being 6.

```
$ node canvastest.js --level=6
toBuffer: 9401ms
$ node canvastest.js --level=3
toBuffer: 4416ms
$ node canvastest.js --level=0
toBuffer: 2175ms
```

I can clean it up and make a pull request if everybody's happy.
I added the ability to optionally set the row filters to use. For fastest results, use no filtering and no compression:

```
$ node canvastest.js 0 NO_FILTERS
```

The resulting picture is 84 MB.
are you using @kangax's image? I thought these numbers sounded pretty crazy but check this out: …

haha.. certainly inherent to png I guess
Yes, I am using that image, and yes -- I know for a fact that PNG compression is very heavy. I did some measurements a couple of months ago and switched to JPEG in some of my canvas projects. Part of the reason it is slow is the DEFLATE algorithm used by zlib …

I knew it was worse, but ~1 s per megapixel is pretty harsh
It is only using one thread, and likely thrashing the cache somewhat too. Also remember that there is still overhead compared to a normal bitmap format from computing a CRC-32 for each 8 KB chunk.
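For reference, that buffer is tunable: libpng's `png_set_compression_buffer_size()` resizes the zlib output buffer, and each flushed buffer becomes one IDAT chunk with its own CRC-32, so a bigger buffer means fewer, larger chunks and fewer CRC passes. (As kkoopa noted above, it made little measurable difference here.) A sketch with an arbitrary illustrative size:

```c
#include <png.h>

/* The zlib output buffer defaults to 8192 bytes, i.e. one CRC-32 per
 * 8 KB IDAT chunk; 256 KB here is an arbitrary illustrative value. */
static void use_bigger_idat_chunks(png_structp png)
{
    png_set_compression_buffer_size(png, 256 * 1024);
}
```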
The performance gains look amazing. Thanks @kkoopa! I'm definitely happy with this if @visionmedia is OK with a pull request and such.

@kkoopa @visionmedia how is the compression level passed to `toBuffer`?
`toBuffer` and `streamPNGSync` now accept two optional arguments tagged on at the end, in order not to break backwards compatibility with the old API. This is not the most elegant way of structuring it, but a decent compromise. The arguments are the zlib compression level and the PNG filter flags, which correspond 1:1 to those in libpng. So, in order to make a synchronous `toBuffer` call, you do `canvas.toBuffer(undefined, 3, canvas.PNG_FILTER_NONE)`.
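As a sketch of that correspondence (an assumption about the patch, not its actual code), the two trailing arguments would map onto these libpng calls:

```c
#include <png.h>

/* Hypothetical helper: forwards toBuffer's two optional arguments. */
static void apply_tobuffer_args(png_structp png, int level, int filters)
{
    png_set_compression_level(png, level);  /* zlib level 0-9             */
    png_set_filter(png, 0, filters);        /* e.g. PNG_FILTER_NONE,
                                               PNG_ALL_FILTERS            */
}
```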
Gotcha. FWIW, tested it with the original image and got these fun numbers (time taken + size for each compression level):

```
toBuffer 0:  2034ms   83.58MB
toBuffer 1:  3137ms   33.88MB
toBuffer 2:  3325ms   33.39MB
toBuffer 3:  4185ms   32.49MB
toBuffer 4:  4861ms   33.77MB
toBuffer 5:  6077ms   32.46MB
toBuffer 6:  9464ms   31.39MB
toBuffer 7: 12958ms   31.04MB
toBuffer 8: 29828ms   30.63MB
toBuffer 9: 41531ms   30.53MB
```
Mmm, seems about right, but you left out level 0, resulting in 84-something MB. If you turn off filtering you should gain a couple hundred ms. The PNG docs say zlib levels 3-6 have been determined best for normal PNGs; however, in this case there is very little size increase between 3 and 1, but a 25% time reduction -- so I would go with level 1. I guess this issue can be closed then.

Definitely, 1 seems the best as far as the size/time ratio goes. I edited the comment to include level 0. It's the fastest, but the size is way too big.
Just noticed that the Travis build fails when trying to install 1.1.0 — https://travis-ci.org/kangax/fabric.js/jobs/9902326#L463

It works for me locally (on OS X 10.7.5).
See #312. It would be possible to simply add

```c
#ifndef CAIRO_FORMAT_RGB30
#define CAIRO_FORMAT_RGB30 5
#endif
```

but this might cover up other serious problems (or it might work fine; for what it's worth, I've tried it like that on Windows, as I couldn't find any newer prebuilt libcairo binaries, and it seems to work).
Looks like CAIRO_FORMAT_RGB30 was introduced in 1.12. I have 1.12.14 installed locally. Not sure about the one on Travis' Ubuntu.
It is reasonable to assume it is Ubuntu 12.04, which only has 1.10.something, IIRC. Include a newer Cairo in your project, make it link against that, and use its includes instead.
#315 should rectify this
In terms of PNG compression speed, one way to mitigate the overhead of zlib is to quantize the image first, reducing the total number of colors and effectively reducing the work zlib needs to do. This is what Mapnik does, and these are the results with the above image: …

Note: z1 here means compression=1 passed to zlib, and quantization is done with this code: https://github.com/mapnik/mapnik/blob/master/include/mapnik/hextree.hpp. Test case using the node-mapnik bindings (`npm install mapnik`):

```js
var mapnik = require('mapnik');
var bytes = require('bytes');
var fs = require('fs');
var image = new mapnik.Image.open('b4f4d900-f5f7-11e2-888e-2c2289c6edd0.jpg');
var tests = [
  { name: '64 color png / z1',   format: 'png8:c=64:z=1' },
  { name: '256 color png / z1',  format: 'png8:z=1' },
  { name: '256 color png / z6',  format: 'png8:z=6' },
  { name: 'full color png / z1', format: 'png32:z=1' },
  { name: 'full color png / z6', format: 'png32:z=6' }
];
tests.forEach(function(test) {
  var start = +new Date;
  var buffer = image.encodeSync(test.format);
  var elapsed = new Date - start;
  var size = bytes(buffer.length);
  fs.writeFileSync(test.format + '.png', buffer);
  console.log(test.name, elapsed + 'ms', size);
});
```
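To illustrate why quantization lightens zlib's load: collapsing 32-bit pixels to one palette index per pixel shrinks the encoder's input fourfold and makes long runs of identical bytes far more likely. Mapnik's hextree is much smarter than this, but a crude, hypothetical uniform quantizer shows the idea:

```c
#include <stddef.h>
#include <stdint.h>

/* Map one RGB triple onto a fixed 6x6x6-level cube (216 palette entries). */
static uint8_t quantize_pixel(uint8_t r, uint8_t g, uint8_t b)
{
    uint8_t ri = r * 6 / 256, gi = g * 6 / 256, bi = b * 6 / 256;
    return (uint8_t)(ri * 36 + gi * 6 + bi);
}

/* Convert a width*height RGBA buffer to palette indices, 1 byte/pixel. */
static void quantize_image(const uint8_t *rgba, uint8_t *indexed,
                           size_t width, size_t height)
{
    for (size_t i = 0; i < width * height; i++) {
        const uint8_t *px = rgba + i * 4;
        indexed[i] = quantize_pixel(px[0], px[1], px[2]);
    }
}
```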
We got a Burner on our hands 🔥 :D

Nice results, I must say, if the color loss is acceptable (easily noticeable at full resolution; red houses and such are no longer red). Storing photos as PNGs does not make much sense in itself, but I wonder what the gains are when encoding an image with large swaths of not too many colors. Does normal full-color PNG compression still take a lot more time than the color-reduced version?

@kkoopa - Yes, I agree that PNG is not a good format for photos. But yes: images with a limited set of colors (like tiles for maps) that still need alpha transparency (for compositing) are where I prefer PNG. For example: https://foursquare.com/explore?mode=url&near=San%20Francisco%2C%20CA That is compressed to 64 colors and still looks good.
Looks like webp is the real winner (if your browser supports it, of course): …

```js
var mapnik = require('mapnik');
var bytes = require('bytes');
var fs = require('fs');
var image = new mapnik.Image.open('b4f4d900-f5f7-11e2-888e-2c2289c6edd0.jpg');
var tests = [
  { name: '64 color png / z1',   ext: '.png',  format: 'png8:c=64:z=1' },
  { name: '256 color png / z1',  ext: '.png',  format: 'png8:z=1' },
  { name: '256 color png / z6',  ext: '.png',  format: 'png8:z=6' },
  { name: 'full color png / z1', ext: '.png',  format: 'png32:z=1' },
  { name: 'full color png / z6', ext: '.png',  format: 'png32:z=6' },
  { name: 'jpeg80',  ext: '.jpg',  format: 'jpeg80' },
  { name: 'jpeg90',  ext: '.jpg',  format: 'jpeg90' },
  { name: 'jpeg100', ext: '.jpg',  format: 'jpeg100' },
  { name: 'webp',    ext: '.webp', format: 'webp:image_hint=2:quality=75:method=0' },
  { name: 'webp',    ext: '.webp', format: 'webp:image_hint=2:quality=55:method=0' }
];
tests.forEach(function(test) {
  var start = +new Date;
  var buffer = image.encodeSync(test.format);
  var elapsed = new Date - start;
  var size = bytes(buffer.length);
  fs.writeFileSync('out/' + test.format + test.ext, buffer);
  console.log(test.name, elapsed + 'ms', size);
});
```

Download the webp quality:55 output from here and open it in Chrome.
#562: webp is 13% of the size of a PNG, an 87% space saving! If the user's browser does not support webp, the developer can fall back to PNG. Webp is supported in Chrome, so Chrome users get smaller images and a faster browsing experience. The difference in size is too great to ignore; a faster web experience, when available, is gold! Yet so much effort goes into supporting legacy browsers. I could ask the client if they have Chrome, and if not, in node use …
Isn't the obvious solution to make this library not encode image data? In my dream world, node-canvas would just hand back raw pixel data and leave encoding to other libraries. Side benefits: node-canvas would be easier to maintain and wouldn't depend on libpng or libjpeg. Obviously, that's a breaking change. During the transition, …

@adamhooper …
@zbjornson Ooh, didn't see that method. It's similar, but not the same: the purpose is completely different. (Furthermore, I opine that the method to access raw pixel data should be called …)

Obviously, I think there's room in the Node ecosystem to put pixel manipulation and image encoding in separate libraries. Maybe I'd like to do custom pixel manipulation (like giving pixels …). (As it so happens, I want to do all those things :)) There's no substitute for …
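One concrete reason a raw Cairo buffer is not interchangeable with `getImageData`: Cairo's ARGB32 is premultiplied and native-endian, while ImageData is straight (non-premultiplied) RGBA. A sketch of the conversion, assuming little-endian B,G,R,A byte order (function name is illustrative):

```c
#include <stddef.h>
#include <stdint.h>

/* Un-premultiply Cairo ARGB32 (B,G,R,A bytes on little-endian) into the
 * straight R,G,B,A layout that ImageData exposes. */
static void argb32_to_rgba(const uint8_t *src, uint8_t *dst, size_t npixels)
{
    for (size_t i = 0; i < npixels; i++) {
        uint8_t b = src[i * 4 + 0], g = src[i * 4 + 1];
        uint8_t r = src[i * 4 + 2], a = src[i * 4 + 3];
        if (a != 0) {
            r = (uint8_t)(r * 255 / a);
            g = (uint8_t)(g * 255 / a);
            b = (uint8_t)(b * 255 / a);
        }
        dst[i * 4 + 0] = r;
        dst[i * 4 + 1] = g;
        dst[i * 4 + 2] = b;
        dst[i * 4 + 3] = a;
    }
}
```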
I ran into performance issues with `toBuffer`. Here's an example of drawing a (rather) large image onto a canvas, then calling `toBuffer`. `toBuffer` is what's taking the longest here (`drawImage`, for example, is insignificant). I'm getting ~7-7.5 sec on the `toBuffer` call. This is on node v0.10.13, node-canvas v1.0.3, and Mac OS X (10.7.5).

Is there a way to speed things up? Anything at all? I also tried the async `toBuffer`, but it doesn't really help.

Attaching the image used in the test.