-
Notifications
You must be signed in to change notification settings - Fork 155
feat(image-adapter): improve error handling and status codes #886
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
commit: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one big issue with this one, it's that in general we don't want to return the error message. Or if we want to do that, it should be under some env variable like OPEN_NEXT_FORWARD_ERROR_MESSAGE
that would default to false or in debug mode only.
Even forwarding the status code could be bad, we need to look into what next start
is doing here. We should do the same, and if they don't forward the status code we could put it under an env variable as well
Agreed @conico974 , we should be choosy with what info we pass down. Although, the main focus for this PR is that nothing that a client passing in a param to this endpoint like this... should return a 5xx. That would mean that any malicious user could just cause a 5xx error by requesting that endpoint with flawed querystring data. And at the end of the day, this is client issue
|
With the above commit.... Image Optimization Adapter Error Handling ImprovementsOverviewThe changes we've made to the image optimization adapter ensure that it behaves like the standard Next.js image optimization while also addressing security concerns about sensitive data leakage and preventing false alarms in monitoring. Key Improvements1. Preserves Next.js's Default Error Behavior:
2. Preserves Correct HTTP Status Codes:
3. Prevents Sensitive Data Leakage:
4. Improves Monitoring Accuracy:
Security ConsiderationsThe security of this approach comes from two layers: 1. Next.js's Built-in Sanitization:
2. Our Additional Error Handling:
|
opennextjs-aws/packages/open-next/src/adapters/image-optimization-adapter.ts Lines 86 to 89 in 6d36c0a
RecoverableError in this one aswell?
|
|
statusCode = e.statusCode; | ||
} else if ("code" in e) { | ||
const code = e.code as string; | ||
if (code === "ENOTFOUND" || code === "ECONNREFUSED") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't ECONNREFUSED
usually a 500 level error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@khuezy I think you might be missing the original PR context?
All these errors (most) are fundamentally due to the server acting on the client's provided URL.
If a client provides a broken url (eg. invalid url, missing image, access denied, connection refused etc) then thats a "client error" 4xx not a fault of the server 5xx.
This is indeed how this works with native NextJS image optimizer. next start
.
You can test this by going to
/_next/image?url=https://someUrl.com
where https://someUrl.com
is a url that returns ECONNREFUSED, ENOTFOUNT, ACCESS DENIED etc.
Open Next server will 500
NextJS (next start
) will return 4xx with the standard message "url" parameter is valid but upstream response is invalid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an important question might be to what extent does the native NextJS Image Optimizer disregard the error codes and message it gets back from this adapter and simply respond to the client in that generic way.
Then perhaps all we need to do in the adapter is ensure we're not throwing or console.error on anything ... while still returning the accurate error messages?
This would change how this code works now.... as its trying to modify the error codes and messages to lower 5xx to 4xx so as not to crash / throw the server due to a client bad request (url)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
gotcha. Very weird that nextjs decided to swallow the 500s into 400. Ooof
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really? This is what this PR is proposing it should be.
Otherwise you give access to end users, malicious or otherwise, to cause a 5xx on your servers but simply giving the endpoint a faulty URL or "Bad Request".
It's no fault of the image optimizer that the end-user passed it a URL to a ZIP file vs an image, or a 404 link, or an access denied one etc etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You'd have that problem no matter what right? You're suppose to whitelist the endpoints, eg even for server actions and images - remotePatterns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Next.js's approach of normalizing upstream 5xx errors to 4xx responses is actually quite sensible for this use case. Here's why:
Monitoring clarity: By converting upstream 5xx errors to 4xx, Next.js ensures that your monitoring systems only alert on problems you can actually fix, not issues with third-party services.
Semantic correctness: From Next.js's perspective, if a user provides a URL that leads to a broken resource, that's fundamentally a client input problem - the client provided a "bad" URL, even if it's "bad" because the upstream server failed.
Practical operations: Development and operations teams need clean separation between "our systems are broken" vs. "external dependencies are broken." The conversion to 4xx helps maintain this boundary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Monitoring clarity
- I think you can argue the opposite too. If Nextjs converts the specific error to a generic, it's hard for you to determine what the problem is.
Semantic correctness
- Most of the time, the "end user" is not the consumer of the image optimization request - the app is. This is why the remotePatterns
exists. Your app service is not a service for people to arbitrarily use to optimize their images.
Practical operations: Development and operations teams need clean separation between "our systems are broken" vs. "external dependencies are broken." The conversion to 4xx helps maintain this boundary.
I'm confused by this statement, it sounds contradicting? If the error is swallowed, how would you know if it's "our system" or "their system" that's broken?
I've simplified the classifyError function to focus on the core principle: distinguishing between client errors and server errors for proper logging, while preserving the original error information. The new implementation: Preserves original error details - We keep the original error message and status code when available This approach has several benefits:
|
Core Changes
Internal vs External Image HandlingNext.js treats internal and external image errors differently, requiring separate approaches:
This approach ensures consistent error behavior across image types while providing appropriate client feedback and maintaining security. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM but I'd like some more eyes on this since I'm not currently using OpenNext.
I think we have some image optimization e2e tests already.
Fixes #885