Description
The current setup has significant complexity due to multiple ways of passing components like `Configuration`, `EventManager`, and `StorageClient`. This leads to numerous edge cases and unexpected behavior, making the system harder to maintain and use consistently.
Related issues
- This change could address the following issues:
  - Crawler doesn't respect `configuration` argument #539
  - How can I disable cache completely? #369
    - I'm not sure if entirely, but it could be a good first step.
  - A typo(?) causing crashes in newer versions of apify + crawlee[parsel] apify-sdk-python#324 (comment)
    - If we could remove the need to initialize `BasicCrawler` under `async with Actor`, that would be great.
  - Fix the usage of Configuration class #670
    - It would allow a much better approach to resolving this.
Solution
- Standardize access to `Configuration`, `EventManager`, and `StorageClient` exclusively through the `service_container` module.
- This will introduce a breaking change.
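
To make the proposal more concrete, here is a minimal, self-contained sketch of what a single access point could look like. It is not the actual `service_container` implementation: the three service classes are empty stand-ins, and the function names (`get_configuration`, `set_configuration`, `get_event_manager`, `get_storage_client`), the `purge_on_start` field, and the `ServiceConflictError` exception are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass


# Stand-ins for the real classes; their fields and behavior are hypothetical.
@dataclass
class Configuration:
    purge_on_start: bool = True


class EventManager:
    pass


class StorageClient:
    pass


class ServiceConflictError(RuntimeError):
    """Raised when a service is replaced after it has already been handed out."""


class _ServiceContainer:
    """Holds one instance per service type and creates defaults lazily."""

    def __init__(self) -> None:
        self._services: dict[type, object] = {}
        self._retrieved: set[type] = set()

    def get(self, kind: type) -> object:
        # Create a default instance on first access so callers never get None.
        if kind not in self._services:
            self._services[kind] = kind()
        self._retrieved.add(kind)
        return self._services[kind]

    def set(self, kind: type, instance: object) -> None:
        # Refuse to swap a service that some component may already be holding.
        if kind in self._retrieved and self._services[kind] is not instance:
            raise ServiceConflictError(f"{kind.__name__} is already in use.")
        self._services[kind] = instance


# Module-level container: the only place components read services from.
_container = _ServiceContainer()


def get_configuration() -> Configuration:
    return _container.get(Configuration)  # type: ignore[return-value]


def set_configuration(configuration: Configuration) -> None:
    _container.set(Configuration, configuration)


def get_event_manager() -> EventManager:
    return _container.get(EventManager)  # type: ignore[return-value]


def get_storage_client() -> StorageClient:
    return _container.get(StorageClient)  # type: ignore[return-value]
```

With this shape, a crawler would no longer accept a `configuration` argument of its own. A user would call `set_configuration(...)` once, before anything reads it, and every component would then see the same instance. Rejecting a late replacement is one possible way to surface the "argument is silently ignored" class of bugs early, and it is also a large part of why the change would be breaking.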