feat: Allow large Parse File uploads #9286
dalyaidan1 wants to merge 28 commits into parse-community:alpha from
Conversation
Thanks for opening this pull request!
spec/FilesRouter.spec.js
Outdated

afterAll(async () => {
  // clean up the server for resuse
  if (server && server.close) {
Wouldn't it be enough to just call await reconfigureServer() to init the server with the default config?
You're right. Missed that, I'm new to testing here.
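For reference, a minimal sketch of that suggestion (assuming the spec's `server` variable is no longer needed and the global `reconfigureServer` helper from the test harness is available):

```js
afterAll(async () => {
  // Instead of closing the manually created server, re-initialize Parse Server
  // with the default test configuration so later specs see a clean state.
  await reconfigureServer();
});
```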
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@ Coverage Diff @@
## alpha #9286 +/- ##
==========================================
+ Coverage 93.50% 93.52% +0.02%
==========================================
Files 186 186
Lines 14810 14834 +24
==========================================
+ Hits 13848 13874 +26
+ Misses 962 960 -2

☔ View full report in Codecov by Sentry.
There seems to be some test coverage missing, could you add that?
Signed-off-by: Manuel <5673677+mtrezza@users.noreply.github.com>
Could you take a look at the failing postgres tests?
a storage adapter is needed to test postgres
@mtrezza I changed those failing tests to run only on MongoDB. Without pulling in a storage adapter such as the fs adapter, those tests cannot pass on Postgres. Let me know if pulling one in as a dev dependency would be preferable. All the adapters will also need changes similar to the inbuilt GridFS adapter's changes in order to handle the large files. I have adjustments for the fs and s3 adapters and can open PRs in their respective repositories as needed.
I understand the change is really for the GridFSStorageAdapter, and it just so happens that the adapter is bundled with Parse Server. It isn't related to DB adapters, so maybe the tests should be MongoDB-only in this PR. Does that make sense? Only if the Postgres adapter supported Parse.File upload (like MongoDB does with GridFS) would there be Postgres-related changes needed in this PR. Yes, for other storage adapters the PRs in their respective repos would be great.
Yes, that is correct. There would need to be inbuilt storage adapters to test for postgres. I've made the other PRs:
Could you change the test from
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.
📝 Walkthrough

This PR addresses the 512MB file upload limitation by replacing base64 string construction with a Blob-backed approach for large buffers, while adding streaming encryption support in the storage adapter to handle large Blob data efficiently.

Changes
Sequence Diagram(s)

sequenceDiagram
participant Client
participant FilesRouter
participant GridFSBucketAdapter
participant CryptoStream
participant MongoDB
Client->>FilesRouter: POST /files/{filename} (large file)
alt File size >= 512MB
FilesRouter->>FilesRouter: Detect size > V8 limit
FilesRouter->>FilesRouter: Convert Buffer to Blob
else File size < 512MB
FilesRouter->>FilesRouter: Keep as base64 string
end
FilesRouter->>GridFSBucketAdapter: createFile(filename, fileData: Blob|Buffer)
alt Data is Blob + Encryption enabled
GridFSBucketAdapter->>GridFSBucketAdapter: Create Transform stream with cipher
GridFSBucketAdapter->>CryptoStream: Pipe Blob through cipher
CryptoStream->>CryptoStream: Apply encryption chunk-by-chunk
CryptoStream->>CryptoStream: Append IV & auth tag
CryptoStream->>MongoDB: Upload encrypted stream
else Data is Buffer + Encryption enabled
GridFSBucketAdapter->>GridFSBucketAdapter: Single-step cipher operation
GridFSBucketAdapter->>MongoDB: Upload encrypted buffer
else No encryption
GridFSBucketAdapter->>MongoDB: Upload data directly
end
MongoDB-->>GridFSBucketAdapter: Confirmation
GridFSBucketAdapter-->>FilesRouter: Success
FilesRouter-->>Client: 200 OK
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 8
🧹 Nitpick comments (4)
src/Routers/FilesRouter.js (1)

236-240: Redundant optional chaining on line 240. Line 236 already guards that `fileObject.file._source?.file instanceof Blob` is truthy, so on line 240 `_source` is guaranteed non-nullish. The `?.` in `fileObject.file._source?.file` is unnecessary and obscures the intent.

Proposed fix

  // set the file data
- fileData = fileObject.file._source?.file;
+ fileData = fileObject.file._source.file;

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Routers/FilesRouter.js` around lines 236 - 240, The optional chaining on fileObject.file._source?.file is redundant because the preceding check uses fileObject.file._source?.file instanceof Blob; update the assignment in FilesRouter.js to use the guaranteed path fileObject.file._source.file when setting fileData (and keep the fileSize assignment as-is), i.e. remove the "?.", referencing the same fileObject/fileData/fileSize symbols to make intent explicit.

spec/FilesRouter.spec.js (1)
46-77: Temp file cleanup is not guaranteed on test failure. If `postFile` or an `expect` throws, `fs.unlinkSync` is never reached, leaving large temp files (up to ~562 MB) on disk. Use `try/finally` or an `afterEach` hook to ensure cleanup.

Proposed fix (example for one test; apply to all three)

  it('should allow Parse.File uploads under 512MB', async () => {
    const filePath = path.join(__dirname, 'file.txt');
-   await fs.promises.writeFile(filePath, Buffer.alloc(1024 * 1024));
-
-   const response = await postFile('file.txt', filePath);
-   expect(response.ok).toBe(true);
-
-   fs.unlinkSync(filePath);
+   try {
+     await fs.promises.writeFile(filePath, Buffer.alloc(1024 * 1024));
+     const response = await postFile('file.txt', filePath);
+     expect(response.ok).toBe(true);
+   } finally {
+     if (fs.existsSync(filePath)) fs.unlinkSync(filePath);
+   }
  });

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spec/FilesRouter.spec.js` around lines 46 - 77, The tests create large temporary files but call fs.unlinkSync only on the success path, risking leftover files if postFile or expect throws; update each test ("should allow Parse.File uploads under 512MB", "exactly 512MB", "over 512MB") to ensure cleanup by either (a) declaring filePath in the test scope and wrapping the postFile/expect sequence in a try/finally that always calls fs.unlinkSync(filePath) in the finally, or (b) move filePath creation to a beforeEach/outer scope and add an afterEach that checks for existence and unlinks it; reference the tests and the postFile/fs.unlinkSync calls when making the changes.

spec/GridFSBucketStorageAdapter.spec.js (1)
506-533: Mock cipher is missing `getAuthTag()`; the non-blob encrypted path would throw a different error. The mock on line 512 returns `{ update, final }` but omits `getAuthTag()`. For the blob path, if `update` throws in the `transform` callback, `flush` (which calls `getAuthTag`) is never reached, so the test passes. But for the non-blob path (`Buffer.concat([cipher.update(data), cipher.final(), iv, cipher.getAuthTag()])`), if `update` throws, `getAuthTag` is never reached either. This works by coincidence for the `update`-error test.

However, for the `final`-error test (lines 536-563), the non-blob path calls `cipher.update(data)` (succeeds), then `cipher.final()` (throws). `getAuthTag()` is never called. So both tests pass, but the mock is incomplete. If anyone adjusts the error-handling order, this could produce a confusing `TypeError: cipher.getAuthTag is not a function` instead of the expected error. Consider adding `getAuthTag` to the mock for completeness:

Proposed fix

  spyOn(crypto, 'createCipheriv').and.returnValue({
    update: (_chunk) => { throw error; },
    final: () => { return Buffer.from('encryptedData'); },
+   getAuthTag: () => { return Buffer.alloc(16); },
  });

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@spec/GridFSBucketStorageAdapter.spec.js` around lines 506 - 533, The cipher mock used in the GridFSBucketAdapter tests is incomplete (crypto.createCipheriv is mocked to return only update and final), which can spuriously cause "cipher.getAuthTag is not a function" errors; update the mock returned by the spy on crypto.createCipheriv in the tests that simulate cipher errors (the tests calling gfsAdapter.createFile) to also include a getAuthTag() method (e.g., returning a Buffer or appropriate value) so both blob and non-blob paths can call getAuthTag safely, and keep the existing update/final behaviors and the final crypto.createCipheriv.and.callThrough() restore.

src/Adapters/Files/GridFSBucketAdapter.js (1)
81-156: `stream.on('finish', resolve)` should be registered before piping starts. Although practically safe due to Node's single-threaded event loop (pipe is async), it is best practice to attach event listeners before initiating the data flow. Moving the `finish`/`error` listeners on `stream` above the pipe/write section makes the intent clearer and avoids a subtle ordering footgun if the code is later refactored.

Proposed restructure sketch

  return new Promise((resolve, reject) => {
    try {
      const iv = ...;
      const cipher = ...;
+     stream.on('finish', resolve);
+     stream.on('error', reject);
+
      if (data instanceof Blob) {
        // ... pipe logic (remove duplicate stream.on('error', reject) inside)
      } else {
        // ... write logic
      }
-
-     stream.on('finish', resolve);
-     stream.on('error', reject);
    } catch (e) {
      reject(e);
    }
  });

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/Adapters/Files/GridFSBucketAdapter.js` around lines 81 - 156, Move the stream event listeners so they are attached before any piping or writes start: register stream.on('finish', resolve) and stream.on('error', reject) (and any intermediate error handlers like on the cipherTransform) prior to calling readableStream.pipe(...)/stream.write(...)/stream.end() inside the Promise block in GridFSBucketAdapter.js; update the Promise flow around the readableStream/cipherTransform/stream code paths (the branches that use cipher and iv, the Blob branch, and the non-Blob write branch) so listeners are installed first, then start the piping or write operations.
const headers = {
  'X-Parse-Application-Id': Parse.applicationId,
  'X-Parse-Master-Key': Parse.masterKey,
  'Content-Type': 'multipart/form-data',
};
Content-Type: multipart/form-data is semantically incorrect for a raw stream body.
The request sends a raw file stream, not a multipart-encoded body. multipart/form-data requires boundary delimiters. Since express.raw() accepts any content type, this works by accident. Use application/octet-stream (or the actual MIME type) to be accurate.
Proposed fix
const headers = {
'X-Parse-Application-Id': Parse.applicationId,
'X-Parse-Master-Key': Parse.masterKey,
- 'Content-Type': 'multipart/form-data',
+ 'Content-Type': 'application/octet-stream',
};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@spec/FilesRouter.spec.js` around lines 27 - 31, The test's headers object
sets 'Content-Type' to 'multipart/form-data' even though the request body is a
raw file stream; update the headers in FilesRouter.spec.js (the headers constant
used in the test) to use 'application/octet-stream' or the file's real MIME type
so the Content-Type accurately reflects a raw stream rather than
multipart/form-data.
it('should allow Parse.File uploads under 512MB', async done => {
  const filePath = path.join(__dirname, 'file.txt');
  await fs.promises.writeFile(filePath, Buffer.alloc(1024 * 1024));

  const response = await postFile('file.txt', filePath);
  expect(response.ok).toBe(true);

  fs.unlinkSync(filePath);
  done();
});
🛠️ Refactor suggestion | 🟠 Major
Replace done callback with pure async/await.
All three tests use the async done => pattern. Since the test body is fully await-based, the done callback is unnecessary and can mask unhandled rejections. This applies equally to the tests on lines 57 and 68.
Based on learnings: "New tests in the parse-server repository should use async/await with promise-based patterns rather than callback patterns with done()."
Proposed fix (apply same pattern to all three tests)
- it('should allow Parse.File uploads under 512MB', async done => {
+ it('should allow Parse.File uploads under 512MB', async () => {
const filePath = path.join(__dirname, 'file.txt');
await fs.promises.writeFile(filePath, Buffer.alloc(1024 * 1024));
const response = await postFile('file.txt', filePath);
expect(response.ok).toBe(true);
fs.unlinkSync(filePath);
- done();
});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@spec/FilesRouter.spec.js` around lines 46 - 55, The tests use the legacy
async done callback pattern which can mask promise rejections; update each test
(e.g., the it blocks with descriptions "should allow Parse.File uploads under
512MB" and the two following its) by removing the done parameter (change async
done => to async () =>) and deleting the terminal done() call(s), leaving the
existing await calls so Jest handles promise resolution/rejection natively;
apply the same change to the other two it blocks referenced in the diff so all
three tests use pure async/await.
const getDataAsString = async (type, data, encoding = 'utf8') => {
  switch (type) {
    case 'string':
      return data.toString(encoding);
    case 'blob':
      return (data instanceof Blob)
        ? await data.text(encoding) :
        data.toString(encoding);
    default:
      throw new Error(`Invalid type: ${type}`);
  }
}
🧩 Analysis chain
🌐 Web query:
Blob.prototype.text() method parameters encoding Web API MDN
💡 Result:
Blob.prototype.text() has no parameters—its syntax is simply:
blob.text()

It always decodes the blob's bytes as UTF-8 (you can't pass an encoding like "iso-8859-1"). MDN states the data is "always presumed to be in UTF-8 format" and lists Parameters: None. [1] The File API spec defines text() as reading all bytes and then applying UTF-8 decode. [2]
If you need a different encoding, use FileReader.readAsText(blob, encoding) or TextDecoder on an ArrayBuffer instead. [1] [2]
Sources: [1] MDN Blob: text() method https://developer.mozilla.org/en-US/docs/Web/API/Blob/text, [2] W3C File API (Editor’s Draft) 3.3.3 The text() method https://w3c.github.io/FileAPI/
Blob.prototype.text() does not accept an encoding parameter.
Per the Web API spec, Blob.text() takes no arguments and always decodes as UTF-8. The encoding parameter on line 35 is silently ignored. This works by accident since the default is 'utf8', but the function signature is misleading.
Proposed fix
-const getDataAsString = async (type, data, encoding = 'utf8') => {
+const getDataAsString = async (type, data) => {
switch (type) {
case 'string':
- return data.toString(encoding);
+ return data.toString('utf8');
case 'blob':
return (data instanceof Blob)
- ? await data.text(encoding) :
- data.toString(encoding);
+ ? await data.text() :
+ data.toString('utf8');
default:
throw new Error(`Invalid type: ${type}`);
}
}

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@spec/GridFSBucketStorageAdapter.spec.js` around lines 29 - 40,
getDataAsString's handling of Blob is incorrect because Blob.prototype.text()
accepts no encoding; update the 'blob' branch in getDataAsString to call await
data.text() with no args for UTF-8, and if a non-UTF-8 encoding is requested use
await data.arrayBuffer() combined with new TextDecoder(encoding).decode(...) to
respect the encoding; keep the existing fallback for non-Blob values returning
data.toString(encoding).
for (let i = 0; i < fileNames.length; i++) {
  await unencryptedAdapter.deleteFile(fileNames[i]);
  expectMissingFile(unencryptedAdapter, fileNames[i]);
}
🧩 Analysis chain
🏁 Script executed:
# First, find the expectMissingFile function definition
rg -n 'expectMissingFile' spec/GridFSBucketStorageAdapter.spec.js -B 2 -A 5 | head -60
🏁 Script executed:
# Get the remaining occurrences around lines 376-383
rg -n 'expectMissingFile' spec/GridFSBucketStorageAdapter.spec.js -A 2 | tail -30
🏁 Script executed:
# Get full context around the problematic lines to confirm the pattern
sed -n '128,133p' spec/GridFSBucketStorageAdapter.spec.js
echo "---"
sed -n '184,189p' spec/GridFSBucketStorageAdapter.spec.js
echo "---"
sed -n '376,385p' spec/GridFSBucketStorageAdapter.spec.js
Add missing await for expectMissingFile calls in cleanup loops.
The expectMissingFile function (line 7) is async and contains expect assertions, but it's called without await on lines 130, 186, 240, 308, 313, 378, and 383. This allows the test to complete before the assertion runs, risking silent failures or stale state in subsequent iterations.
Locations requiring `await`:
- Line 130: `expectMissingFile(unencryptedAdapter, fileNames[i]);`
- Line 186: `expectMissingFile(oldEncryptedAdapter, fileNames[i]);`
- Line 240: `expectMissingFile(oldEncryptedAdapter, fileNames[i]);`
- Line 308: `expectMissingFile(oldEncryptedAdapter, fileNames[i]);`
- Line 313: `expectMissingFile(unEncryptedAdapter, fileName3);`
- Line 378: `expectMissingFile(oldEncryptedAdapter, fileNames[i]);`
- Line 383: `expectMissingFile(unEncryptedAdapter, fileName3);`
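For illustration, one of the affected cleanup loops with the missing `await` added (a sketch using the variable names quoted above; the surrounding test body is assumed):

```js
for (let i = 0; i < fileNames.length; i++) {
  await unencryptedAdapter.deleteFile(fileNames[i]);
  // await the async assertion so a failure surfaces inside this iteration
  await expectMissingFile(unencryptedAdapter, fileNames[i]);
}
```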
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@spec/GridFSBucketStorageAdapter.spec.js` around lines 128 - 131, The async
helper expectMissingFile is being invoked without awaiting in several cleanup
loops, causing assertions to run asynchronously; update the test cleanup loops
that call expectMissingFile with await (e.g., inside iterations over fileNames
and where fileName3 is used) so calls like expectMissingFile(unencryptedAdapter,
fileNames[i]), expectMissingFile(oldEncryptedAdapter, fileNames[i]) and
expectMissingFile(unEncryptedAdapter, fileName3) become awaited, ensuring the
async assertions complete before proceeding.
if (data instanceof Blob) {
  let readableStream = data.stream();

  // may come in as a web stream, so we need to convert it to a node stream
  if (readableStream instanceof ReadableStream) {
    readableStream = Readable.fromWeb(readableStream);
  }

  if (cipher && iv) {
    // we need to stream the data through the cipher
    const cipherTransform = new Transform({
      transform(chunk, encoding, callback) {
        try {
          const encryptedChunk = cipher.update(chunk);
          callback(null, encryptedChunk);
        } catch (err) {
          callback(err);
        }
      },
      // at the end we need to push the final cipher text, iv, and auth tag
      flush(callback) {
        try {
          this.push(cipher.final());
          this.push(iv);
          this.push(cipher.getAuthTag());
          callback();
        } catch (err) {
          callback(err);
        }
      }
    });
    // pipe the stream through the cipher and then to the gridfs stream
    readableStream
      .pipe(cipherTransform)
      .on('error', reject)
      .pipe(stream)
      .on('error', reject);
  } else {
    // if we don't have a cipher, we can just pipe the stream to the gridfs stream
    readableStream.pipe(stream)
      .on('error', reject)
  }
Missing error handler on readableStream — promise can hang.
Node's .pipe() does not forward 'error' events. If the readable stream from the Blob errors, neither cipherTransform nor stream will receive it, so reject is never called and the returned promise hangs indefinitely.
Add an error listener on readableStream itself:
Proposed fix
if (readableStream instanceof ReadableStream) {
readableStream = Readable.fromWeb(readableStream);
}
+ readableStream.on('error', reject);
+
if (cipher && iv) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/Adapters/Files/GridFSBucketAdapter.js` around lines 93 - 134, The
readableStream from the Blob can emit 'error' which .pipe() won't forward,
causing the promise to hang; add a readableStream.on('error', reject) handler
(and remove it when stream finishes/errors) for both the cipher branch (before
piping into cipherTransform) and the non-cipher branch (before piping into
stream) so any read errors immediately call reject; reference the Blob-derived
readableStream, Readable.fromWeb conversion, the Transform instance
cipherTransform, and the destination gridfs stream variable stream to locate
where to attach the handler.
// may come in as a web stream, so we need to convert it to a node stream
if (readableStream instanceof ReadableStream) {
  readableStream = Readable.fromWeb(readableStream);
}
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check minimum Node version required by Parse Server
echo "=== engines field from package.json ==="
cat package.json | jq '.engines'
🌐 Web query:
Node.js Readable.fromWeb stable release version history
💡 Result:
stream.Readable.fromWeb() stability / release history (Node.js)
| Node.js version | Date (first release) | Readable.fromWeb status / change |
|---|---|---|
| v17.0.0 | Oct 19, 2021 | API added (stream.Readable.fromWeb) (not yet marked stable). [1][2] |
| v22.17.0 | Apr 24, 2024 | API marked stable (backported to the v22 line). [1][3] |
| v24.0.0 | May 6, 2025 | API marked stable in the v24 line (same stabilization milestone as v22.17.0). [1][4] |
Sources: Node.js Stream API docs “History” table for stream.Readable.fromWeb() [1], Node.js releases page for v17 first-release date [2], Node.js v22.17.0 archive page (first released date) [3], Node.js v24.0.0 release post (release date) [4].
Use Readable.fromWeb() only with Node >=24.11.0, or upgrade minimum Node version requirements.
Readable.fromWeb() is experimental in Node 20.x and unstable in Node 22.x versions below 22.17.0. Parse Server currently supports Node 20.19.0+ and 22.12.0+, both of which include versions with experimental/unstable Readable.fromWeb(). The API was stabilized in Node 22.17.0 and 24.0.0+.
Either guard the code path to skip Readable.fromWeb() for older versions, or raise the minimum Node version requirement to >=22.17.0 (or >=24.11.0 for full v24 coverage).
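A minimal sketch of the guarded fallback described below, assuming a Web `ReadableStream` is passed in (the helper name `webStreamToNodeReadable` is illustrative, not part of the PR):

```js
const { Readable } = require('stream');

// Convert a Web ReadableStream into a Node Readable without Readable.fromWeb(),
// by draining the stream's reader inside an async generator.
function webStreamToNodeReadable(webStream) {
  async function* chunks() {
    const reader = webStream.getReader();
    try {
      while (true) {
        const { done, value } = await reader.read();
        if (done) break;
        yield value; // Uint8Array chunks
      }
    } finally {
      reader.releaseLock();
    }
  }
  return Readable.from(chunks());
}
```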
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/Adapters/Files/GridFSBucketAdapter.js` around lines 96 - 99, The code
currently calls Readable.fromWeb(readableStream) unconditionally; instead, gate
that call by checking the Node.js runtime version (process.versions.node) and
only use Readable.fromWeb when the Node version is new enough (>=22.17.0 or your
chosen minimum such as >=24.11.0); for older Node versions implement a safe
fallback that converts a Web ReadableStream into a Node Readable by creating an
async generator that uses readableStream.getReader() to yield chunks and then
pass that generator into Readable.from — update the conversion logic in
GridFSBucketAdapter (the Readable.fromWeb / ReadableStream branch) to perform
the version check and use the manual async-generator->Readable fallback when the
runtime is too old.
// If the request body is a buffer and it's size is greater than the V8 string size limit
// we need to use a Blob to avoid the V8 string size limit
const MAX_V8_STRING_SIZE_BYTES = 536_870_912;

let file;

if (Buffer.isBuffer(req.body) && req.body?.length >= MAX_V8_STRING_SIZE_BYTES) {
  file = new Parse.File(filename, new Blob([req.body]), contentType);
} else {
  file = new Parse.File(filename, { base64: req.body.toString('base64') }, contentType);
}
Large uploads temporarily double memory usage.
new Blob([req.body]) copies the buffer contents, so for a 512 MB+ upload both req.body (Buffer) and the Blob coexist in memory until the handler completes. This means a single ~600 MB upload consumes ~1.2 GB of process memory.
Consider whether the original req.body reference can be released early (e.g., req.body = null after creating the Blob), or whether the Blob wrapping can be deferred closer to the storage adapter call to shorten the overlap window.
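A sketch of the second option (dropping the Buffer reference right after wrapping it), assuming the router's `filename`, `contentType`, and `MAX_V8_STRING_SIZE_BYTES` from the diff above:

```js
let file;
if (Buffer.isBuffer(req.body) && req.body.length >= MAX_V8_STRING_SIZE_BYTES) {
  const blob = new Blob([req.body]);
  // Release the original Buffer so only the Blob copy stays reachable while the
  // upload is processed; actual memory release still depends on GC timing.
  req.body = null;
  file = new Parse.File(filename, blob, contentType);
} else {
  file = new Parse.File(filename, { base64: req.body.toString('base64') }, contentType);
}
```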
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/Routers/FilesRouter.js` around lines 185 - 195, The code creates a Blob
from req.body which doubles memory during large uploads; update the FilesRouter
handling around MAX_V8_STRING_SIZE_BYTES and Parse.File so you either (a) defer
Blob creation until right before the storage adapter upload call (move new
Blob([req.body]) close to where Parse.File is handed to the adapter) or (b)
immediately drop the original Buffer reference by setting req.body = null right
after constructing the Blob (ensure filename and contentType are preserved), so
the temporary overlap of req.body and the Blob is minimized.
let fileData;
// if the file is a blob, get the size from the blob
if (fileObject.file._source?.file instanceof Blob) {
  // get the size of the blob
  fileObject.fileSize = fileObject.file._source.file.size;
  // set the file data
  fileData = fileObject.file._source?.file;
} else {
  const bufferData = Buffer.from(fileObject.file._data, 'base64');
  fileObject.fileSize = Buffer.byteLength(bufferData);
  fileData = bufferData;
}
🧩 Analysis chain
🏁 Script executed:
cat package.json | jq '.dependencies["parse"], .devDependencies["parse"]'
🏁 Script executed:
rg -n '_source\b|_data\b' --type=js src/ | head -30
🏁 Script executed:
# Check if Parse.File is instantiated elsewhere and how
rg -n 'new Parse\.File|Parse\.File\(' --type=js src/ | head -20
🏁 Script executed:
# Look for type definitions or documentation about Parse.File
fd -e d.ts -e ts -e js 'parse' --search-path=node_modules 2>/dev/null | head -10
🌐 Web query:
Parse.File _source _data internal properties JavaScript SDK documentation
💡 Result:
In the Parse JavaScript SDK, Parse.File’s underscored fields like _source and _data are internal implementation details (not part of the public API), so they’re not documented as supported/stable. The supported/public surface is methods like name(), url(), save(), getData() and toJSON(). [1]
What _source is (internal)
From the SDK implementation, Parse.File stores an internal _source object that describes where the file's content came from and how it should be uploaded. It's set by the constructor based on what you pass as data (bytes/base64/blob/uri), e.g. formats like:
- { format: 'base64', base64: '...', type: '...' }
- { format: 'file', file: Blob, type: '...' }
- { format: 'uri', uri: '...', type: '...' } [2]
This mirrors what the constructor docs say you can pass as data (byte array, {base64: ...}, {uri: ...}, browser File/Blob). [1]
What _data is (internal)
In current SDK builds, _data is an internal base64 cache of the file's contents:
- It's populated when you construct the file from bytes/base64 (and after downloading via getData()). getData() returns _data immediately if it's already present; otherwise it downloads from _url and stores result.base64 into _data. [2]
- When uploading from a browser Blob/file, _data is cleared (null) after a successful save() (since the SDK didn't necessarily keep a base64 copy around). [2]
What to use instead of _source / _data
Use the public methods:
- await file.getData() to obtain base64 (rather than reading _data) [1]
- file.url() / file.name() [1]
- file.toJSON() to get the { __type: 'File', name, url } representation stored in objects [2]
Sources:
[1] Parse JS SDK API Reference – Parse.File (v2.6.0) https://parseplatform.org/Parse-SDK-JS/api/2.6.0/Parse.File.html
[2] Parse JS SDK source (ParseFile.js, parse@8.0.3 build on UNPKG) https://app.unpkg.com/parse@8.0.3/files/lib/browser/ParseFile.js
🌐 Web query:
Parse JavaScript SDK Parse.File public API extract file data
💡 Result:
In the Parse JavaScript SDK, Parse.File’s public instance API does not expose the file’s raw bytes/base64 after it’s been created—its documented public methods are essentially name(), save(), and url() (plus options). So the supported way to “extract file data” is to download it from the file URL. [1]
Browser (download as Blob)
const file = obj.get('myFile'); // Parse.File
const res = await fetch(file.url());
const blob = await res.blob(); // <-- file data

Node / Cloud Code (download as Buffer)
const file = request.object.get('myFile'); // Parse.File
const res = await Parse.Cloud.httpRequest({ url: file.url() });
const buf = res.buffer; // <-- file data (Buffer)

This pattern (get file.url() then httpRequest) is commonly used in Cloud Code. [2]
Note: file.url() is only available after the file is saved or loaded from a Parse.Object. [1]
Sources: Parse JS SDK Parse.File API docs [1]; example approach via Parse.Cloud.httpRequest [2].
[1] https://parseplatform.org/Parse-SDK-JS/api/2.1.0/Parse.File.html
[2] https://stackoverflow.com/questions/44604937/get-data-from-parse-file-via-beforesave
🏁 Script executed:
# Check if there's any Parse SDK type definitions or documentation in node_modules
find node_modules/parse -name "*.d.ts" -o -name "*.md" 2>/dev/null | head -20
🏁 Script executed:
# Look at the Parse.File usage pattern more carefully - what fields are publicly documented?
cat node_modules/parse/package.json 2>/dev/null | jq '.version, .main, .types'
Reliance on undocumented Parse.File internals (_source, _data) creates maintenance risk.
This code directly accesses fileObject.file._source.file, fileObject.file._source.type, and fileObject.file._data—all internal implementation details of Parse.File not exposed in the public API. The Parse SDK web search confirms these are undocumented and not guaranteed stable across versions. Even with Parse SDK pinned at 8.0.3, a future minor/patch update could restructure these internals without breaking semver contracts, silently breaking file uploads.
Consider extracting file data through documented APIs where available, or document the tight coupling to Parse SDK internals with a comment explaining why these private fields are required (e.g., Parse.File.getData() requires saved files, which is not applicable in this upload context).
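If the private fields have to stay, a guarded fallback could look roughly like this (a sketch only; `_source` and `_data` remain undocumented Parse SDK internals and may change):

```js
let fileData;
const source = fileObject.file._source;
if (source && source.file instanceof Blob) {
  // Blob-constructed Parse.File: use the raw Blob kept by the SDK
  fileData = source.file;
  fileObject.fileSize = source.file.size;
} else if (typeof fileObject.file._data === 'string') {
  // base64-constructed Parse.File: decode the cached data
  fileData = Buffer.from(fileObject.file._data, 'base64');
  fileObject.fileSize = Buffer.byteLength(fileData);
} else {
  // SDK internals changed: fail loudly instead of uploading nothing
  throw new Parse.Error(Parse.Error.FILE_SAVE_ERROR, 'Unable to read file data from Parse.File');
}
```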
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/Routers/FilesRouter.js` around lines 234 - 245, The code is directly
accessing Parse.File internals (fileObject.file._source and _data) which are
undocumented and brittle; update the upload flow to avoid these private fields
by either (1) passing the original raw file (Blob or Buffer) into the request
payload or into fileObject (e.g., fileObject.rawFile) so this router can read
size and data from that documented source, or (2) if the Parse.File is already
saved, use documented APIs (e.g., fetch the file via fileObject.file.url() over
HTTP) to obtain bytes instead of reading _source/_data; if neither is possible
add a concise comment on why a private field is used and implement a guarded
fallback that throws a clear error when those internals are missing. Ensure
changes touch the code paths that set fileData/fileObject.fileSize and reference
symbols fileObject.file, fileObject.fileSize, and fileData so the coupling to
Parse.File internals is removed or clearly documented and fail-safe.
Pull Request
Issue
Closes: #9283
Approach
During routing, if a file's size is larger than can be handled by the V8 engine as a string, it is used as a Blob. This change touches both the FilesRouter and the GridFSBucketAdapter, where the creation/upload happens and the file is handled as a Buffer. I also added a new FilesRouter test spec for these changes.

Tasks
Summary by CodeRabbit
New Features
Bug Fixes
Tests