JSON API response's Content-Type is text/html;charset=UTF-8 but I want application/json #122

@workji

I am crawling the background JSON data API of a Vue.js site.

  1. The request:
    yield SeleniumRequest(
        url=json_api_url,
        wait_time=3,
        callback=self.parse_api)
  2. The original response:
    {"data":{"list":[{"title":"adidas originals Yeezy 450 &quot;Cloud White&quot; H68038"},{"title":"adidas &quot;Have A Good Game&quot; H68038"}],"next":true,"total":2000},"result":1}
  3. The response I actually get:
    def parse_api(self, response):
        json_str = response.xpath('//body/text()').get()
        json_obj = json.loads(json_str)
    {"data":{"list":[{"title":"adidas originals Yeezy 450 "Cloud White" H68038"},{"title":"adidas "Have A Good Game" H68038"}],"next":true,"total":2000},"result":1}
  4. The problem:
    json_obj = json.loads(json_str)              <- raises an error
    json.decoder.JSONDecodeError: Expecting ',' delimiter: line
  5. The root cause:
    When the response's content-type is text/html, the HTML character entities ( &quot; ) are changed to ( " ), which destroys the JSON format.
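
The cause described in the list above can be reproduced with the standard library alone. This is a minimal sketch, assuming the API really sends the inner quotes as &quot; entities. The shortened `raw_body` string is illustrative, and `html.unescape` stands in for the entity decoding that an HTML parser applies to text nodes.

```python
import html
import json

# Assumed raw body: inner quotes arrive as &quot; entities, so the JSON is valid.
raw_body = (
    '{"data":{"list":[{"title":"adidas originals Yeezy 450 '
    '&quot;Cloud White&quot; H68038"}],"next":true,"total":2000},"result":1}'
)

# Parsing the raw text succeeds; the title still contains the entities.
parsed = json.loads(raw_body)

# Reading the body through an HTML parser decodes the entities,
# which injects unescaped double quotes into the JSON string values.
decoded_body = html.unescape(raw_body)

try:
    json.loads(decoded_body)
    decode_failed = False
    error_message = ""
except json.JSONDecodeError as exc:
    decode_failed = True
    error_message = str(exc)
```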

So my question is: how can I change the content-type from [ text/html; ] to [ application/json; ], or how can I prevent ( &quot; ) from being changed to ( " )?
Thank you very much!
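
One possible workaround, sketched here under assumptions rather than as a confirmed fix: if the JSON endpoint can be fetched without a browser, the raw body text can be parsed before any HTML parsing happens, and the entities can be decoded afterwards, field by field. The helper name `parse_api_body` is hypothetical, and the `data` / `list` / `title` keys are taken from the response shown above.

```python
import html
import json


def parse_api_body(raw_text):
    """Parse the raw JSON text first, then decode HTML entities in each title.

    Hypothetical helper: assumes the payload has the data -> list -> title
    structure shown in this issue.
    """
    payload = json.loads(raw_text)
    for item in payload["data"]["list"]:
        # Decoding after json.loads cannot break the JSON syntax.
        item["title"] = html.unescape(item["title"])
    return payload
```

In a spider this could be called as `parse_api_body(response.text)` from the callback of a plain `scrapy.Request`, assuming the API does not need JavaScript rendering. `response.text` is the undecoded body text, unlike the result of `response.xpath('//body/text()')`.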
