-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Add BMP support #1295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add BMP support #1295
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool 👍🙂. Two minor suggestions.
Did you write the BMPParser itself? I only skimmed it, assuming that it has already been tested, but I also can't find where it came from. It looks like it only supports the most common bitmap variants. I came across a header-only library that supports more pixel formats, but I have no idea how common the different types are.
The feature request asking for BMP support has been open for more than a year, while implementing a simple BMP parser is trivial by following the specification. The pixel formats and compression algorithms it supports are the most common used ones, so I think it covers about 95% BMP files that average user may try to load. If anyone opens an issue with unsupported BMP, we can easily implement a new pixel format/compression method. |
Having our own BMP implementation rubs me the wrong way a bit, but if there is no established external library than I guess it's the way to go 🤔 |
I'm not opposed to having our own if it addresses our needs; I was just curious if it needs to be reviewed closely or if it's already in use elsewhere and how compatible it is. This parser is from-scratch, I take it? https://github.com/ArashPartow/bitmap is the header-only library I came across that supports several color conversions. It does more than we need (bitmap library plus a graphics library), and would need modification to accept a memory source (instead of only a file source -- e.g. make it take an istream or array). |
It's not in use elsewhere. You can review it closely if you want.
That library supports only 24-bit images. The image from issue #953 is 1-bit (monochrome). |
I have used a JS implementation of a BMP parser inside Potentially we could use that implementation, I'm not sure how much faster or slower it is than a C++ implementation though 🤔 Would be very interesting to see some benchmarks edit: the implementation have been used to parse favicons from all over the web, it has seen at least a few hundred thousand different images 🐎 |
const fs = require('fs');
const {Image} = require('canvas');
const decodeBmp = require('decode-bmp');
const source = fs.readFileSync('10000x10000-monochrome.bmp');
const img = new Image;
console.time('BMPParser');
img.src = source; // Synchronous
console.timeEnd('BMPParser');
console.time('decode-bmp');
decodeBmp(source);
console.timeEnd('decode-bmp'); > node test.js
BMPParser: 812.537ms
decode-bmp: 5899.919ms |
Update: added support for another common BMP header.
Feel free to correct me if anything in the table is incorrect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for being slow (and I know I still owe you a review of your other PR). Looking good. Given what's available and the simplicity of this, I support having our own parser. Found a few issues so far.
Might also find some useful bits from the Go parser: https://github.com/golang/image/blob/master/bmp/reader.go and TF parser: https://github.com/tensorflow/tensorflow/blob/3f23f4ddeabbdc0704444d84c158bd6c348a9f10/tensorflow/core/kernels/decode_bmp_op.cc#L101.
I've attached a 30-byte "bmp bomb" I made that claims it encodes a 67 MB image. There's no uninitialized mem exposure or buffer overrun 👍 , but I think the handling of this case could be better (see inline comments). Not sure if this is the best way, but a possible way: since you're supporting no compression and bitfield compression, you should know exactly how many bytes you need the buffer to contain before you start the decoding loop. That might also speed things up by moving the "overflow" (overrun) checks out of the loop.
Thanks for the review.
It's actually
Yes, that is correct. I added check before entering the loop to ensure that image data 1) doesn't overlap with the header and 2) doesn't exceed the buffer size. Also removed checks from the loop.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. Almost there, a few more small suggestions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't looked too closely at the code, but feel free to merge if you think it's good @zbjornson 👍 |
Neither have I but it looks good and I must say it's quite a relief to not have to update prebuilds for this! |
Fix #953