feat: add an internal HttpClient to be used in send_request for PlaywrightCrawler using APIRequestContext bound to the browser context #1134
After discussions with @janbuchar in Slack, we decided that this approach is best for … However, there is a problem: Playwright does not propagate the headers set with … This is relevant to #1055. Refactoring would be required to pass the headers set when opening the browser context to the crawling context. Alternatively, we could wait for Playwright to address this on their end.
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
tests/unit/crawlers/_playwright/test_playwright_crawler.py:486
- Ensure that send_request_response.read() is decoded (e.g. using .decode('utf-8')) before passing it to json.loads to avoid potential type errors when parsing bytes.
check_data['send_request'] = dict(json.loads(send_request_response.read()))
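For context on the suppressed suggestion, here is a minimal sketch comparing the two forms. It assumes `read()` returns UTF-8 encoded bytes; the `raw` payload below is a hypothetical stand-in for `send_request_response.read()`, not data from the test. Since Python 3.6, `json.loads` accepts `bytes` directly, so both forms should parse to the same result.

```python
import json

# Hypothetical payload standing in for send_request_response.read()
raw = b'{"origin": "127.0.0.1"}'

# Form used in the test: pass bytes straight to json.loads (supported since Python 3.6)
from_bytes = dict(json.loads(raw))

# Form suggested in the review comment: decode to str first
from_text = dict(json.loads(raw.decode('utf-8')))
```

An explicit decode mainly matters when the body may use an encoding other than UTF-8, UTF-16, or UTF-32.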
Nice, just a couple of nits 🙂
LGTM
Description
HttpClient implementation for PlaywrightCrawler using the APIRequestContext from Playwright. This ensures that HTTP requests use the same proxies as the browser context.
Issues
HttpClient implementation #928
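The idea in the description can be sketched roughly as follows. This is not the code from the PR: `SimpleResponse` and `send_request_via_context` are hypothetical names used only for illustration. The sketch assumes `context` is a Playwright async `BrowserContext`, whose `request` attribute is an `APIRequestContext` that shares the context's proxy settings and cookies.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SimpleResponse:
    """Hypothetical container for the parts of the response a caller needs."""
    status_code: int
    headers: dict
    body: bytes


async def send_request_via_context(
    context: Any,
    url: str,
    method: str = "GET",
    headers: Optional[dict] = None,
) -> SimpleResponse:
    # context.request is the APIRequestContext bound to this browser context,
    # so the request goes through the same proxy as the pages in that context.
    api_response = await context.request.fetch(url, method=method, headers=headers)
    try:
        return SimpleResponse(
            status_code=api_response.status,
            headers=dict(api_response.headers),
            body=await api_response.body(),
        )
    finally:
        # Release the response body held by Playwright once it has been copied.
        await api_response.dispose()
```

Copying the body into a plain object before calling `dispose()` lets the caller keep the data after Playwright has released the response.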