Severe push latency (~20s) for trivial changes in very large monorepo. #1464

@CrazyFanFan

Hello josh-proxy team,

First, thank you for this great tool. We are evaluating josh-proxy for our very large monorepo and have encountered significant latency during push operations that we would like to understand better.

Environment & Setup

  • josh-proxy version: r24.10.04
  • josh-proxy Server: 64 vCPUs, 256GB RAM
  • Client: git version 2.51.0
  • Network: Low-latency, high-bandwidth internal network
  • josh-proxy Subscription Usage: We cloned the repository with the following command to subscribe to specific prefixes:
    git clone http://server/repo.git:[::subdira,::subdirb].git
    Our working copy only includes content under the subdira and subdirb directories (the "subscribed prefixes" referenced in subsequent sections).

Repository Scale

Our repository is exceptionally large, which we understand is a primary use case for josh-proxy:

  • Full repo size: 5.3 GB
  • Git LFS objects size: 55 GB
  • Total commits: ~568,000
  • Total branches: ~6,300
  • Total tags: ~1,500
  • Number of files in subscribed prefixes (subdira + subdirb): ~74,000

Problem Description

We are observing consistently high latency (~20,000 ms) when pushing trivial changes through josh-proxy. A typical case is modifying a single line in a single file (located in either subdira or subdirb) and pushing the change; this takes approximately 20 seconds.

To isolate the issue, we performed the same push operation directly against our GitLab backend (bypassing josh-proxy): pushing the identical single-line change to the same branch completed in under 2 seconds. This indicates that roughly 18 seconds of overhead is introduced by josh-proxy's processing.
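
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two pushes can be timed. The helper names `elapsed_ms` and `overhead_ms`, the remote names `josh` and `gitlab`, and the branch name are illustrative placeholders, not our actual setup, and `date +%s%N` assumes GNU date.

```shell
#!/usr/bin/env bash
# Sketch only: "josh", "gitlab", and "my-branch" are placeholder names.

# Run a command and print its wall-clock duration in milliseconds.
# Assumes GNU date, which supports %N (nanoseconds).
elapsed_ms() {
  local start end status
  start=$(date +%s%N)
  "$@" >/dev/null 2>&1
  status=$?
  end=$(date +%s%N)
  echo $(( (end - start) / 1000000 ))
  return "$status"
}

# Difference between the proxied and the direct push, in milliseconds.
overhead_ms() {
  echo $(( $1 - $2 ))
}

# Example usage (each push must carry a fresh commit to be comparable):
#   via_proxy=$(elapsed_ms git push josh HEAD:refs/heads/my-branch)
#   direct=$(elapsed_ms git push gitlab HEAD:refs/heads/my-branch)
#   overhead_ms "$via_proxy" "$direct"
```

With the figures reported above, `overhead_ms 20000 2000` gives 18000, which is the roughly 18 seconds attributed to the proxy.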

Our Preliminary Analysis & Core Questions

We have reviewed the code and logs and suspect the bottleneck is related to the complex history and scale of our repository.

Our primary goal in opening this issue is to clarify the following:

  1. Expected Behavior
    Is 20s of latency for a one-line change in subdira/subdirb an expected outcome for a repository of our size given our server hardware? Or does this indicate a problem in our setup or a known josh-proxy bottleneck?

  2. Configuration Optimization
    Are we missing critical configuration parameters for multi-prefix subscriptions? Are there tuning options tailored to large monorepos with multiple subscribed directories?

  3. Profiling & Further Assistance
    What is the best way for us to help profile this latency? We have collected debug logs and are ready to provide additional traces, performance profiles, or any other data needed. We are eager to collaborate on resolving this.
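
As a starting point on the client side, a push can be wrapped with git's standard tracing variables, roughly as sketched below. The helper name `with_git_trace`, the log file names, and the remote and branch names are hypothetical; only `GIT_TRACE2_PERF`, `GIT_TRACE_PACKET`, and `GIT_CURL_VERBOSE` are real git environment variables. This only shows how long the client waits on the server, so it would complement rather than replace server-side profiling.

```shell
#!/usr/bin/env bash
# Sketch only: the helper name and log file names are hypothetical.

# Run a command with git's client-side tracing enabled, writing logs
# into the given directory. GIT_TRACE2_PERF, GIT_TRACE_PACKET, and
# GIT_CURL_VERBOSE are standard git environment variables.
with_git_trace() {
  local log_dir
  mkdir -p "$1"
  log_dir=$(cd "$1" && pwd)   # trace file targets must be absolute paths
  shift
  GIT_TRACE2_PERF="$log_dir/trace2-perf.log" \
  GIT_TRACE_PACKET="$log_dir/packet.log" \
  GIT_CURL_VERBOSE=1 \
    "$@" 2>"$log_dir/stderr.log"
}

# Example usage (remote and branch names are placeholders):
#   with_git_trace ./push-trace git push josh HEAD:refs/heads/my-branch
```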

Thank you for your time and expertise. We are excited about josh-proxy and hope to help make it work seamlessly for our large-scale use case.
