Fix Bad Gateway and optimize PHP workers #3219
Merged
+200
−244
Motivation for the change, related issues
This PR fixes Bad Gateway (502) errors that occur when multiple concurrent requests are made to the PHP request handler.
Implementation details
The primary PHP instance was not used for enqueued requests. Instead, all enqueued requests needed to wait for the secondary instances, and the primary instance only served new requests directly, never from the queue.
The fix refactors PHPProcessManager to a pool-based design where all spawned instances (including primary) are added to an idle pool, acquirePHPInstance() takes from the pool or spawns new ones if needed, and instances are returned to the pool for reuse. All PHP instances should be automatically rotated every 400 requests.
Increased instance wait timeout from 5s to 30s: When all PHP instances are busy, new requests wait for one to become available. The previous 5s timeout was likely a bit too short for some scenarios. This timeout limits wait time for an available instance, not PHP execution time.
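The pool-based design described above can be sketched roughly as follows. This is a minimal illustration, not the actual PHPProcessManager code: `InstancePool`, `Instance`, `spawn`, and `idleCount` are hypothetical names, rotation is assumed to happen by discarding an instance on release once it has served its quota, and the cap on instance count (and the waiting that happens when every instance is busy) is left out.

```typescript
// Hypothetical sketch of an idle pool with rotation; names are illustrative only.
interface Instance {
  id: number;
  requestsServed: number;
}

class InstancePool {
  private idle: Instance[] = [];
  private nextId = 0;
  private readonly spawn: (id: number) => Instance;
  private readonly maxRequestsPerInstance: number;

  constructor(spawn: (id: number) => Instance, maxRequestsPerInstance = 400) {
    this.spawn = spawn;
    this.maxRequestsPerInstance = maxRequestsPerInstance;
  }

  // Take an idle instance if one exists, otherwise spawn a new one.
  // Every instance, including the first one spawned, goes through this path.
  acquire(): Instance {
    return this.idle.pop() ?? this.spawn(this.nextId++);
  }

  // Return an instance to the pool, or drop it once it has served its quota
  // so that a fresh instance is spawned on the next acquire (assumed rotation).
  release(instance: Instance): void {
    instance.requestsServed++;
    if (instance.requestsServed >= this.maxRequestsPerInstance) {
      return; // rotated out; a real implementation would also dispose of it
    }
    this.idle.push(instance);
  }

  get idleCount(): number {
    return this.idle.length;
  }
}
```

Because no instance is treated specially, a queued request can be served by whichever instance is released first, which is the property the refactor is after.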
Semaphore stale resolver bug: When a semaphore acquire timed out, its resolver remained in the queue. Later releases would notify these stale resolvers instead of actual waiting requests. The fix removes timed-out resolvers from the queue.
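The stale resolver fix can be illustrated with a small sketch. `TimedSemaphore`, `waiters`, and `waiting` are hypothetical names, and the real semaphore may be structured differently; the point is only that a timed-out waiter removes its own resolver from the queue, so a later release goes to a request that is still waiting.

```typescript
// Hypothetical sketch of a semaphore whose acquire() can time out.
class TimedSemaphore {
  private free: number;
  private readonly timeoutMs: number;
  private waiters: Array<() => void> = [];

  constructor(concurrency: number, timeoutMs: number) {
    this.free = concurrency;
    this.timeoutMs = timeoutMs;
  }

  acquire(): Promise<void> {
    if (this.free > 0) {
      this.free--;
      return Promise.resolve();
    }
    return new Promise<void>((resolve, reject) => {
      // Called by release() when a slot is handed to this waiter.
      const waiter = () => {
        clearTimeout(timer);
        resolve();
      };
      const timer = setTimeout(() => {
        // Remove the timed-out resolver so a later release() cannot
        // wake it instead of a request that is still waiting.
        const index = this.waiters.indexOf(waiter);
        if (index !== -1) this.waiters.splice(index, 1);
        reject(new Error("Timed out waiting for a free slot"));
      }, this.timeoutMs);
      this.waiters.push(waiter);
    });
  }

  release(): void {
    const next = this.waiters.shift();
    if (next) {
      next(); // hand the slot directly to the oldest live waiter
    } else {
      this.free++;
    }
  }

  get waiting(): number {
    return this.waiters.length;
  }
}
```

Without the `splice` in the timeout handler, `release()` would shift the dead resolver off the queue and call it, and the slot would be lost to a request that had already given up.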
Testing Instructions (or ideally a Blueprint)
The fix can be verified by running concurrent requests in the browser console:
On trunk, some of these promises would be rejected. In this branch, all should succeed.
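A check of this kind could look roughly like the sketch below. It is not the exact snippet from the PR: `runConcurrently` is a hypothetical helper, and the URL in the usage comment is a placeholder. It fires a batch of requests at once and counts how many promises settle as fulfilled versus rejected.

```typescript
// Hypothetical helper: start `count` requests at the same time and
// report how many of the resulting promises fulfilled or rejected.
async function runConcurrently<T>(
  count: number,
  makeRequest: (index: number) => Promise<T>
): Promise<{ fulfilled: number; rejected: number }> {
  const results = await Promise.allSettled(
    Array.from({ length: count }, (_, index) => makeRequest(index))
  );
  const rejected = results.filter((r) => r.status === "rejected").length;
  return { fulfilled: count - rejected, rejected };
}

// Possible use in the browser console (placeholder URL):
// runConcurrently(20, async () => {
//   const res = await fetch("/index.php");
//   // fetch() does not reject on a 502 by itself, so turn it into a rejection.
//   if (!res.ok) throw new Error(`HTTP ${res.status}`);
// }).then(console.log);
```

Using `Promise.allSettled` rather than `Promise.all` keeps one failed request from hiding the outcome of the others, so the rejected count reflects every request in the batch.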