"Content-Type: application/json" does not use UTF-8 charset by default #352

Closed
@yeDor

Description

  • Framework version: 1.5
  • Implementations: Spring Boot

Scenario

A Lambda returns JSON responses with "Content-Type: application/json".
The Lambda sits behind an API Gateway with a /{proxy+} resource.
A GET on the resource should return the object as JSON encoded in UTF-8, per https://tools.ietf.org/html/rfc4627#section-3

Example controller:

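import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class MyController {

  // Payload containing non-ASCII characters (Latin-1 range and Cyrillic)
  public static class UtfResponse {

    public String s = "öüäß фрыцшщ";
  }

  // Produces "Content-Type: application/json" without a charset parameter
  @GetMapping("/json")
  public ResponseEntity<UtfResponse> getJson() {
    return ResponseEntity.ok(new UtfResponse());
  }

  // Produces "Content-Type: application/json; charset=UTF-8"
  @GetMapping(value = "/json/utf8", produces = MediaType.APPLICATION_JSON_UTF8_VALUE)
  public ResponseEntity<UtfResponse> getUtfJson() {
    return ResponseEntity.ok(new UtfResponse());
  }
}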

Expected behavior

The response body is encoded in UTF-8 whether or not "charset=UTF-8" is specified, per https://tools.ietf.org/html/rfc4627#section-3.
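
The expectation above can be sketched as a charset lookup. This is only an illustration, not the library's code: ResponseCharsetResolver and resolve are hypothetical names, and the rule "application/json without a charset parameter means UTF-8" is the behavior this issue asks for, not something the container is confirmed to do.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper (not part of aws-serverless-java-container).
public class ResponseCharsetResolver {

    // Picks the charset used to turn the response bytes into the body string.
    static Charset resolve(String contentType, Charset containerDefault) {
        if (contentType == null) {
            return containerDefault;
        }
        String[] parts = contentType.split(";");
        String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);

        // An explicit charset parameter always wins.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim().replace("\"", "");
                return Charset.forName(name); // throws if the name is not supported
            }
        }

        // Assumption: JSON without a charset parameter is treated as UTF-8.
        if (mediaType.equals("application/json")) {
            return StandardCharsets.UTF_8;
        }
        return containerDefault;
    }

    public static void main(String[] args) {
        Charset fallback = StandardCharsets.ISO_8859_1;
        System.out.println(resolve("application/json", fallback));
        System.out.println(resolve("application/json; charset=UTF-8", fallback));
        System.out.println(resolve("text/plain", fallback));
    }
}
```

With a rule like this, both endpoints of the example controller would be decoded as UTF-8, while other media types keep the configured default.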

Actual behavior

  public void getJson() {
    InputStream inputStream = new AwsProxyRequestBuilder(BASE_URL + "/json", GET.name()).json().buildStream();
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    handle(inputStream, outputStream);

    AwsProxyResponse response = readResponse(outputStream);
    assertThat(response.getStatusCode()).isEqualTo(OK.value());
    assertThat(response.getMultiValueHeaders().getFirst(CONTENT_TYPE)).isEqualTo(APPLICATION_JSON_VALUE);
    assertThat(response.getBody()).isEqualTo("{\"s\":\"öüäß фрыцшщ\"}");
  }

Test with "Content-Type":"application/json" fails with:

Expected :{"s":"öüäß фрыцшщ"}
Actual :{"s":"öüä� ������"}

Test with "Content-Type":["application/json; charset=UTF-8"] works as expected:

  public void getJsonUtf() {
    InputStream inputStream = new AwsProxyRequestBuilder(BASE_URL + "/json/utf8", GET.name()).json().buildStream();
    ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
    handle(inputStream, outputStream);

    AwsProxyResponse response = readResponse(outputStream);
    assertThat(response.getStatusCode()).isEqualTo(OK.value());
    assertThat(response.getMultiValueHeaders().getFirst(CONTENT_TYPE)).isEqualTo("application/json; charset=UTF-8");
    assertThat(response.getBody()).isEqualTo("{\"s\":\"öüäß фрыцшщ\"}");
  }

Spring MVC Test without "aws-serverless-java-container" works as expected:

  public void readJson() throws Exception {
    mockMvc.perform(get(BASE_URL + "/json"))
           .andExpect(status().isOk())
           .andExpect(content().contentType(APPLICATION_JSON))
           .andExpect(content().json("{\"s\":\"öüäß фрыцшщ\"}", true));
  }

Full log output

[2020-05-25 10:09:49,093] INFO [main] d.p.p.s.SenderStreamLambdaHandler - Output: {"statusCode":200,"multiValueHeaders":{"Cache-Control":["no-cache, no-store, max-age=0, must-revalidate"],"Content-Type":["application/json"],"Expires":["0"],"Pragma":["no-cache"],"Vary":["Origin","Access-Control-Request-Method","Access-Control-Request-Headers","Origin","Access-Control-Request-Method","Access-Control-Request-Headers"],"X-Content-Type-Options":["nosniff"],"X-Frame-Options":["DENY"],"X-XSS-Protection":["1; mode=block"]},"body":"{"s":"öüä� ������"}","isBase64Encoded":false}

Issue

com.amazonaws.serverless.proxy.model.ContainerConfig#getDefaultContentCharset returns ISO-8859-1 regardless of the MIME type.
-> com.amazonaws.serverless.proxy.internal.servlet.AwsHttpServletResponse#flushBuffer
responseBody = new String(bodyOutputStream.toByteArray(), charset);
The bytes in bodyOutputStream are UTF-8 encoded, but responseBody is decoded from them as ISO-8859-1.
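
The mismatch can be reproduced with the JDK alone. The following is a minimal sketch that mirrors the new String(bytes, charset) call quoted above; the class name CharsetMismatchDemo and the decode helper are made up for illustration and are not part of the library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Illustrative only: shows what happens when UTF-8 bytes are decoded as ISO-8859-1.
public class CharsetMismatchDemo {

    // Same operation as the quoted flushBuffer line: bytes -> String with a given charset.
    static String decode(byte[] body, Charset charset) {
        return new String(body, charset);
    }

    public static void main(String[] args) {
        // "öüäß" written with Unicode escapes so the source file encoding does not matter.
        String json = "{\"s\":\"\u00f6\u00fc\u00e4\u00df\"}";
        byte[] body = json.getBytes(StandardCharsets.UTF_8); // what the serializer writes

        String asUtf8 = decode(body, StandardCharsets.UTF_8);
        String asLatin1 = decode(body, StandardCharsets.ISO_8859_1);

        System.out.println(asUtf8.equals(json));   // true: round trip is lossless
        System.out.println(asLatin1.equals(json)); // false: each 2-byte character became 2 chars
        System.out.println(json.length() + " chars vs " + asLatin1.length() + " chars");
    }
}
```

Each non-ASCII character occupies two bytes in UTF-8, and ISO-8859-1 maps every byte to its own character, so the decoded string is longer than the original and no longer matches it.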

Notes

MediaType.APPLICATION_JSON_UTF8_VALUE is deprecated as of Spring Framework 5.2 in favor of APPLICATION_JSON_VALUE, since major browsers like Chrome now comply with the specification and correctly interpret UTF-8 special characters without requiring a charset=UTF-8 parameter.

Possible workaround

Change the default charset in the constructor of your Lambda handler class:
LambdaContainerHandler.getContainerConfig().setDefaultContentCharset("UTF-8");
