Do not try to decode body of Response #175

Open: wants to merge 6 commits into base: master
1 change: 1 addition & 0 deletions .gitignore
@@ -10,3 +10,4 @@ scrapy_splash.egg-info
 htmlcov
 .hypothesis
 .ipynb_checkpoints
+.pytest_cache
27 changes: 11 additions & 16 deletions README.rst
@@ -260,19 +260,8 @@ to set ``meta['splash']['args']`` use ``SplashRequest(..., args=myargs)``.
 
 * ``meta['splash']['magic_response']`` - when set to True and a JSON
   response is received from Splash, several attributes of the response
-  (headers, body, url, status code) are filled using data returned in JSON:
-
-  * response.headers are filled from 'headers' keys;
-  * response.url is set to the value of 'url' key;
-  * response.body is set to the value of 'html' key,
-    or to base64-decoded value of 'body' key;
-  * response.status is set to the value of 'http_status' key.
-    When ``meta['splash']['http_status_from_error_code']`` is True
-    and ``assert(splash:go(..))`` fails with an HTTP error
-    response.status is also set to HTTP error code.
-
-  Original URL, status and headers are available as ``response.real_url``,
-  ``response.splash_response_status`` and ``response.splash_response_headers``.
+  (headers, body, url, status code) are filled using data returned in JSON,
+  for details see Responses section
 
   This option is set to True by default if you use SplashRequest.
   ``render.json`` and ``execute`` endpoints may not have all the necessary
@@ -326,9 +315,15 @@ SplashJsonResponse provide extra features:
 
   * response.headers are filled from 'headers' keys;
   * response.url is set to the value of 'url' key;
-  * response.body is set to the value of 'html' key,
-    or to base64-decoded value of 'body' key;
-  * response.status is set from the value of 'http_status' key.
+  * response.body is set to the value of 'html' key, utf-8 text expected,
+    or to base64-decoded binary value of 'body' key;
+  * response.status is set to the value of 'http_status' key.
+    When ``meta['splash']['http_status_from_error_code']`` is True
+    and ``assert(splash:go(..))`` fails with an HTTP error
+    response.status is also set to HTTP error code.
+
+  Original URL, status and headers are available as ``response.real_url``,
+  ``response.splash_response_status`` and ``response.splash_response_headers``.
 
 When ``response.body`` is updated in SplashJsonResponse
 (either from 'html' or from 'body' keys) familiar ``response.css``
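The README wording in this diff describes how a magic response takes its body from the Splash JSON: bytes from the base64-encoded 'body' key, or encoded text from the 'html' key. A minimal sketch of that rule, using only the standard library; `body_from_splash_json` is a hypothetical helper written for illustration and is not part of scrapy-splash:

```python
import base64


def body_from_splash_json(data, encoding="utf-8"):
    """Return body bytes following the rule described in the README.

    Hypothetical helper for illustration only; not scrapy-splash API.
    """
    if "body" in data:
        # 'body' carries base64-encoded binary data; keep it as raw bytes
        # and do not attempt to decode it to text.
        return base64.b64decode(data["body"])
    if "html" in data:
        # 'html' is expected to be text; encode it to obtain bytes.
        return data["html"].encode(encoding)
    # Neither key present: nothing to fill in.
    return None


binary_payload = {"body": base64.b64encode(b"\xad").decode("ascii")}
html_payload = {"html": "<p>hi</p>"}

binary_body = body_from_splash_json(binary_payload)
html_body = body_from_splash_json(html_payload)
```

The 'body' branch is checked first, matching the order of the branches in `_load_from_json` as shown in this PR.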
1 change: 0 additions & 1 deletion scrapy_splash/response.py
@@ -176,7 +176,6 @@ def _load_from_json(self):
         # response.body
         if 'body' in self.data:
             self._body = base64.b64decode(self.data['body'])
-            self._cached_ubody = self._body.decode(self.encoding)
         elif 'html' in self.data:
             self._cached_ubody = self.data['html']
             self._body = self._cached_ubody.encode(self.encoding)
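A sketch of why the eager decode removed in this hunk can fail. It assumes the response encoding resolves to UTF-8 at that point, which is an assumption here; the encoding Scrapy actually detects may differ. The byte `b'\xad'` is the same one used in the test added by this PR:

```python
import base64

# Binary body as Splash would return it: base64 text inside the JSON.
encoded = base64.b64encode(b"\xad").decode("ascii")
raw = base64.b64decode(encoded)

# Eagerly decoding arbitrary bytes as UTF-8 (assumed encoding) fails,
# because 0xAD is not a valid UTF-8 start byte.
try:
    raw.decode("utf-8")
    decoded_eagerly = True
except UnicodeDecodeError:
    decoded_eagerly = False
```

Keeping only the raw bytes, as the change does, defers any decoding until text is actually requested.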
16 changes: 16 additions & 0 deletions tests/test_middleware.py
@@ -286,6 +286,22 @@ def test_magic_response():
         if c.name == 'spam':
             assert c.value == 'ham'
 
+    resp_data = {
+        'url': "http://exmaple.com/#id42",
+        'body': base64.b64encode(b'\xad').decode('ascii'),
+        'headers': [
+            {'name': 'Content-Type', 'value': "text/html; charset=cp1251"},
+        ]
+    }
+    resp = TextResponse("http://mysplash.example.com/execute",
+                        headers={b'Content-Type': b'application/json'},
+                        body=json.dumps(resp_data).encode('utf8'))
+
+    try:
+        resp2 = mw.process_response(req, resp, None)
+    except:
+        assert 'process_response raised exception' is None
+
 
 def test_cookies():
     mw = _get_mw()