feat: support concurrent chunk uploads#94

Closed
TorstenDittmann wants to merge 1 commit into main from concurrent-chunk-uploads-1-9-x

Conversation

@TorstenDittmann

Contributor

This PR updates the SDK to support concurrent chunk uploads.

@greptile-apps

greptile-apps Bot commented May 21, 2026

Greptile Summary

This PR refactors the chunked upload path in Client.cs to upload up to 8 chunks concurrently using Task.WhenAll worker tasks, fixes a thread-safety bug where headers were mutated on DefaultRequestHeaders instead of per-request headers, adds new Account service methods, and removes deprecated Deno 1.x runtime enum entries.

  • Concurrent upload engine: a worker pool driven by Interlocked.Increment dispatches chunks in parallel; the first chunk is always uploaded sequentially to obtain the upload ID before concurrent workers start.
  • Header fix: per-request Content-Range and x-appwrite-id headers are now set on the HttpRequestMessage rather than on DefaultRequestHeaders, correcting a shared-state race that existed before this PR.
  • New Account methods: CreatePushTarget, UpdatePushTarget, DeletePushTarget, and CreateOAuth2Session are added as generated SDK surface.
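
The worker-pool pattern described in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `ChunkPoolSketch`, `UploadAllAsync`, and the simulated `UploadChunkAsync` are hypothetical names, and the network call is replaced by a short delay. It only shows how a shared counter advanced with `Interlocked.Increment` lets a fixed number of workers claim chunk indices without overlap, after the first chunk has been sent on its own.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ChunkPoolSketch
{
    // Hypothetical stand-in for the real HTTP call; it only records the chunk index.
    private static async Task UploadChunkAsync(int index, ConcurrentBag<int> sent)
    {
        await Task.Delay(10); // simulate network latency
        sent.Add(index);
    }

    public static async Task<int[]> UploadAllAsync(int totalChunks, int maxConcurrency)
    {
        var sent = new ConcurrentBag<int>();
        if (totalChunks <= 0)
            return Array.Empty<int>();

        // Chunk 0 is sent alone so the server can return the upload ID first.
        await UploadChunkAsync(0, sent);

        // Shared cursor. Interlocked.Increment returns the incremented value,
        // so starting at 0 means the first claimed index is 1.
        var cursor = 0;

        async Task WorkerAsync()
        {
            while (true)
            {
                var index = Interlocked.Increment(ref cursor);
                if (index >= totalChunks)
                    return; // no chunks left to claim
                await UploadChunkAsync(index, sent);
            }
        }

        // Never start more workers than there are remaining chunks.
        var workerCount = Math.Min(maxConcurrency, totalChunks - 1);
        var workers = Enumerable.Range(0, workerCount).Select(_ => WorkerAsync()).ToArray();
        await Task.WhenAll(workers);

        return sent.OrderBy(i => i).ToArray();
    }

    public static async Task Main()
    {
        var uploaded = await UploadAllAsync(totalChunks: 20, maxConcurrency: 8);
        Console.WriteLine(uploaded.Length); // prints 20
    }
}
```

Because each index is claimed atomically, every chunk is uploaded exactly once regardless of how the workers interleave.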

Confidence Score: 3/5

The concurrent upload logic introduces a genuine file-handle leak that will affect any caller uploading a large file by path, and a shared-variable write that is not guarded by a lock.

For large path-based uploads, info.OpenRead() creates a FileStream stored in input.Data that is immediately orphaned — ReadChunkAsync opens independent handles per chunk and never touches input.Data. The leaked handle persists until GC runs a finalizer, holding a file lock on Windows. Separately, uploadId is written inside UploadChunkAsync without synchronisation; a narrow window where it is still empty when concurrent workers start allows multiple workers to race the assignment, potentially sending chunks without the correct ID.

Appwrite/Client.cs — the concurrent upload path around ReadChunkAsync and UploadChunkAsync.

Important Files Changed

| Filename | Overview |
| --- | --- |
| Appwrite/Client.cs | Concurrent chunk upload implementation with a resource leak (orphaned FileStream for large path uploads), an unsynchronised write to the shared uploadId variable, and a lock-in-async issue for stream reads. |
| Appwrite/Services/Account.cs | Adds CreateOAuth2Session, CreatePushTarget, UpdatePushTarget, and DeletePushTarget API methods; straightforward generated code, no issues. |
| Appwrite/Enums/BuildRuntime.cs | Removes deprecated Deno 1.21, 1.24, and 1.35 runtime entries; no issues. |
| Appwrite/Enums/Runtime.cs | Removes deprecated Deno 1.21, 1.24, and 1.35 runtime entries; mirrors BuildRuntime change, no issues. |
| CHANGELOG.md | Entire 4.1.0 section removed; appears intentional as this PR supersedes that version. |

Reviews (1): Last reviewed commit: "feat: support concurrent chunk uploads"

Comment thread Appwrite/Client.cs
Comment on lines +540 to +550
var readLock = new object();

async Task<byte[]> ReadChunkAsync(long start, long end)
{
    var length = (int)(end - start);
    var chunk = new byte[length];

    switch(input.SourceType)
    {
        case "path":
            using (var chunkStream = System.IO.File.OpenRead(input.Path))

P1 Resource leak for large path-based uploads — the FileStream opened into input.Data at line 459 (info.OpenRead()) is never used or disposed for files larger than ChunkSize. The new concurrent path's ReadChunkAsync opens its own independent FileStream per chunk, completely bypassing input.Data, so the original stream is abandoned for the entire lifetime of the upload. On Windows this will hold a file lock; on any platform it leaks a file descriptor until GC runs a finalizer.

Suggested change
- var readLock = new object();
- async Task<byte[]> ReadChunkAsync(long start, long end)
- {
-     var length = (int)(end - start);
-     var chunk = new byte[length];
-     switch(input.SourceType)
-     {
-         case "path":
-             using (var chunkStream = System.IO.File.OpenRead(input.Path))
+ // For the "path" case, input.Data was opened to determine size but is
+ // not used by ReadChunkAsync (which opens fresh handles per chunk).
+ // Dispose it now to avoid leaking the file descriptor.
+ if (input.SourceType == "path" && input.Data is IDisposable pathStream)
+ {
+     pathStream.Dispose();
+     input.Data = null;
+ }
+ var readLock = new object();
+ async Task<byte[]> ReadChunkAsync(long start, long end)
+ {
+     var length = (int)(end - start);
+     var chunk = new byte[length];
+     switch(input.SourceType)
+     {
+         case "path":
+             using (var chunkStream = System.IO.File.OpenRead(input.Path))

Comment thread Appwrite/Client.cs
Comment on lines +616 to +621
if (index == 0 || string.IsNullOrEmpty(uploadId))
{
    uploadId = chunkResult.ContainsKey("$id")
        ? chunkResult["$id"]?.ToString() ?? string.Empty
        : string.Empty;
}

P1 Unsynchronised write to uploadId across concurrent workers — the check-then-write on string.IsNullOrEmpty(uploadId) and the subsequent assignment are not protected by any lock. If uploadId is still empty when the concurrent phase starts (e.g. idParamName was absent or the GET check failed), multiple workers can race through the condition simultaneously. Any worker that reads uploadId between another worker's condition check and its assignment will pass an empty x-appwrite-id header, causing the server to treat that chunk as a new independent upload rather than part of the same file.
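
One possible way to close the racing assignment is to publish the ID through a compare-and-swap so that only the first non-empty value is ever stored. The sketch below is an assumption, not code from the PR: `UploadIdHolder` and `PublishOnce` are hypothetical names. It addresses only the check-then-write race; workers would still need to wait until `Value` is non-empty before sending the x-appwrite-id header.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical holder for the shared upload ID.
public sealed class UploadIdHolder
{
    private string? _value; // null until an ID has been published

    public string? Value => Volatile.Read(ref _value);

    // Stores candidate only if no ID has been stored yet.
    // Returns the ID that is in effect after the call.
    public string? PublishOnce(string? candidate)
    {
        if (string.IsNullOrEmpty(candidate))
            return Value;

        // Atomically replace null with candidate; returns the previous value.
        var previous = Interlocked.CompareExchange(ref _value, candidate, null);
        return previous ?? candidate;
    }
}

public static class UploadIdDemo
{
    public static async Task Main()
    {
        var holder = new UploadIdHolder();

        // Eight workers race to publish different IDs.
        var seen = await Task.WhenAll(
            Enumerable.Range(0, 8)
                .Select(i => Task.Run(() => holder.PublishOnce($"id-{i}"))));

        // Every worker observes the same winning ID.
        Console.WriteLine(seen.Distinct().Count()); // prints 1
    }
}
```

A plain `lock` around the check and the assignment would work equally well here, since the critical section contains no `await`.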

Comment thread Appwrite/Client.cs
Comment on lines 563 to 579
  case "stream":
      var stream = input.Data as Stream;
      if (stream == null)
          throw new InvalidOperationException("Stream data is null");
-     stream.Seek(offset, SeekOrigin.Begin);
-     await stream.ReadAsync(buffer, 0, ChunkSize);
+     lock (readLock)
+     {
+         stream.Seek(start, SeekOrigin.Begin);
+         var read = 0;
+         while (read < length)
+         {
+             var count = stream.Read(chunk, read, length - read);
+             if (count == 0)
+                 break;
+             read += count;
+         }
+     }
      break;

P2 Using lock inside an async method to serialize stream reads is problematic. The Monitor lock is held across synchronous stream.Read calls, so on slow I/O every waiting worker blocks a thread-pool thread, which can starve the pool. SemaphoreSlim with WaitAsync is the idiomatic async-safe alternative.

Suggested change
- case "stream":
-     var stream = input.Data as Stream;
-     if (stream == null)
-         throw new InvalidOperationException("Stream data is null");
-     stream.Seek(offset, SeekOrigin.Begin);
-     await stream.ReadAsync(buffer, 0, ChunkSize);
-     lock (readLock)
-     {
-         stream.Seek(start, SeekOrigin.Begin);
-         var read = 0;
-         while (read < length)
-         {
-             var count = stream.Read(chunk, read, length - read);
-             if (count == 0)
-                 break;
-             read += count;
-         }
-     }
-     break;
+ case "stream":
+     var stream = input.Data as Stream;
+     if (stream == null)
+         throw new InvalidOperationException("Stream data is null");
+     await streamSemaphore.WaitAsync();
+     try
+     {
+         stream.Seek(start, SeekOrigin.Begin);
+         var read = 0;
+         while (read < length)
+         {
+             var count = await stream.ReadAsync(chunk, read, length - read);
+             if (count == 0)
+                 break;
+             read += count;
+         }
+     }
+     finally
+     {
+         streamSemaphore.Release();
+     }
+     break;
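
The suggestion above does not show where `streamSemaphore` is declared. A self-contained sketch of the same idea, assuming a single-permit `SemaphoreSlim` created alongside the stream, might look like the following. `StreamChunkReaderSketch` and the `gate` parameter are hypothetical names, and a `MemoryStream` stands in for the caller's input stream.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class StreamChunkReaderSketch
{
    // Reads bytes [start, end) from a shared seekable stream.
    // The semaphore ensures only one reader seeks and reads at a time.
    public static async Task<byte[]> ReadChunkAsync(
        Stream stream, SemaphoreSlim gate, long start, long end)
    {
        var length = (int)(end - start);
        var chunk = new byte[length];

        await gate.WaitAsync(); // waits asynchronously, without blocking a thread
        try
        {
            stream.Seek(start, SeekOrigin.Begin);
            var read = 0;
            while (read < length)
            {
                var count = await stream.ReadAsync(chunk, read, length - read);
                if (count == 0)
                    break; // end of stream reached early
                read += count;
            }
        }
        finally
        {
            gate.Release();
        }
        return chunk;
    }

    public static async Task Main()
    {
        var data = Enumerable.Range(0, 100).Select(i => (byte)i).ToArray();
        using var stream = new MemoryStream(data);
        using var gate = new SemaphoreSlim(1, 1); // one permit: reads are serialized

        // Ten concurrent readers, each taking a 10-byte slice.
        var chunks = await Task.WhenAll(
            Enumerable.Range(0, 10)
                .Select(i => ReadChunkAsync(stream, gate, i * 10, (i + 1) * 10)));

        Console.WriteLine(chunks.SelectMany(c => c).SequenceEqual(data)); // prints True
    }
}
```

The semaphore has to cover both the `Seek` and the read loop, because the stream position is shared state; releasing between the two would let another reader move the position.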

@TorstenDittmann

Contributor Author

Closing in favor of focused PR #95.
