Skip to content

Add BMP support #1295

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Nov 12, 2018
Merged

Add BMP support #1295

merged 4 commits into from
Nov 12, 2018

Conversation

Hakerh400
Copy link
Contributor

Fix #953

  • Update changelog
  • Add test

Copy link
Collaborator

@zbjornson zbjornson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool 👍🙂. Two minor suggestions.

Did you write the BMPParser itself? I only skimmed it, assuming that it has already been tested, but I also can't find where it came from. It looks like it only supports the most common bitmap variants. I came across a header-only library that supports more pixel formats, but I have no idea how common the different types are.

@Hakerh400
Copy link
Contributor Author

The feature request asking for BMP support has been open for more than a year, while implementing a simple BMP parser is trivial by following the specification. The pixel formats and compression algorithms it supports are the most common used ones, so I think it covers about 95% BMP files that average user may try to load. If anyone opens an issue with unsupported BMP, we can easily implement a new pixel format/compression method.

@LinusU
Copy link
Collaborator

LinusU commented Nov 6, 2018

Having our own BMP implementation rubs me the wrong way a bit, but if there is no established external library than I guess it's the way to go 🤔

@zbjornson
Copy link
Collaborator

I'm not opposed to having our own if it addresses our needs; I was just curious if it needs to be reviewed closely or if it's already in use elsewhere and how compatible it is. This parser is from-scratch, I take it?

https://github.com/ArashPartow/bitmap is the header-only library I came across that supports several color conversions. It does more than we need (bitmap library plus a graphics library), and would need modification to accept a memory source (instead of only a file source -- e.g. make it take an istream or array).

@Hakerh400
Copy link
Contributor Author

Hakerh400 commented Nov 6, 2018

if it needs to be reviewed closely or if it's already in use elsewhere

It's not in use elsewhere. You can review it closely if you want.

I came across a header-only library that supports more pixel formats

That library supports only 24-bit images. The image from issue #953 is 1-bit (monochrome).

@LinusU
Copy link
Collaborator

LinusU commented Nov 6, 2018

I have used a JS implementation of a BMP parser inside decode-ico for long over a year in production, I just extracted the BMP parser into it's own package: decode-bmp

Potentially we could use that implementation, I'm not sure how much faster or slower it is than a C++ implementation though 🤔 Would be very interesting to see some benchmarks ☺️

edit: the implementation have been used to parse favicons from all over the web, it has seen at least a few hundred thousand different images 🐎

@Hakerh400
Copy link
Contributor Author

Would be very interesting to see some benchmarks

const fs = require('fs');
const {Image} = require('canvas');
const decodeBmp = require('decode-bmp');

const source = fs.readFileSync('10000x10000-monochrome.bmp');
const img = new Image;

console.time('BMPParser');
img.src = source; // Synchronous
console.timeEnd('BMPParser');

console.time('decode-bmp');
decodeBmp(source);
console.timeEnd('decode-bmp');
> node test.js
BMPParser: 812.537ms
decode-bmp: 5899.919ms

@Hakerh400
Copy link
Contributor Author

Update: added support for another common BMP header.
While we are awaiting @zbjornson's detailed review, here is the parser comparison:

ArashPartow/bitmap decode-bmp BMPParser
Supports 1-bit images ✔️ ✔️
Supports 32-bit images ✔️ ✔️
Supports pre-multiplied alpha ✔️
Fast ✔️ ✔️
Easy to add new features ✔️ ✔️
Parses the minimal BMP ✔️ ✔️

Feel free to correct me if anything in the table is incorrect.

Copy link
Collaborator

@zbjornson zbjornson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for being slow (and I know I still owe you a review of your other PR). Looking good. Given what's available and the simplicity of this, I support having our own parser. Found a few issues so far.

Might also find some useful bits from the Go parser: https://github.com/golang/image/blob/master/bmp/reader.go and TF parser: https://github.com/tensorflow/tensorflow/blob/3f23f4ddeabbdc0704444d84c158bd6c348a9f10/tensorflow/core/kernels/decode_bmp_op.cc#L101.

I've attached a 30-byte "bmp bomb" I made that claims it encodes a 67 MB image. There's no uninitialized mem exposure or buffer overrun 👍 , but I think the handling of this case could be better (see inline comments). Not sure if this is the best way, but a possible way: since you're supporting no compression and bitfield compression, you should know exactly how many bytes you need the buffer to contain before you start the decoding loop. That might also speed things up by moving the "overflow" (overrun) checks out of the loop.

bomb.zip

@Hakerh400
Copy link
Contributor Author

Hakerh400 commented Nov 9, 2018

Thanks for the review.
Added fixes and also support for negative height.
After reading the specification for the second time, it seems that 0 is not valid value for width or height, and also width may not be negative.

I don't think you want to be using std::byte anywhere. (a) It requires C++17 and we can't use past C++14 right now for compatibility. (b) Doing anything but bitwise ops on a byte is undefined behavior (or an implicit cast?). uint8_t is probably what you want.

It's actually uint8_t
In the BMPParser.h header file there is type definition: typedef uint8_t byte;

You've called setErr (good) but you're not breaking the loops where this is used. You're at least returning a 0 instead of uninitialized memory or overrunning the buffer though.

Yes, that is correct. I added check before entering the loop to ensure that image data 1) doesn't overlap with the header and 2) doesn't exceed the buffer size. Also removed checks from the loop.

since you're supporting no compression and bitfield compression, you should know exactly how many bytes you need the buffer to contain before you start the decoding loop.

The specification says that if compression is used, the valid image data size must be present in the header. So, all checks can be moved out of the loop even if compression is used. Edit: seems not to be true for BITMAPCOREHEADER. But anyway, checks are removed, if someone decide to implement compression, it is easy to revert checks.

Copy link
Collaborator

@zbjornson zbjornson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Almost there, a few more small suggestions.

Copy link
Collaborator

@zbjornson zbjornson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏 Nice, that's all from me. @chearon @LinusU anything else?

The ambiguous symbol error might be a compiler bug in MSVS 2015 that's fixed in 2017. (I get the error in the former but not the latter.)

@LinusU
Copy link
Collaborator

LinusU commented Nov 11, 2018

Haven't looked too closely at the code, but feel free to merge if you think it's good @zbjornson 👍

@chearon
Copy link
Collaborator

chearon commented Nov 12, 2018

Neither have I but it looks good and I must say it's quite a relief to not have to update prebuilds for this!

@zbjornson zbjornson merged commit cc6c173 into Automattic:master Nov 12, 2018
@Hakerh400 Hakerh400 deleted the bmp branch November 12, 2018 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants