Commit 5d11a62

jeffhostetler authored and dscho committed
survey: started TODO list at bottom of source file
1 parent f36fb84 commit 5d11a62

1 file changed: +46 -0 lines changed

builtin/survey.c

Lines changed: 46 additions & 0 deletions
@@ -1685,3 +1685,49 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 	clear_survey_context(&ctx);
 	return 0;
 }
+
+/*
+ * NEEDSWORK: The following is a bit of a laundry list of things
+ * that I'd like to add.
+ *
+ * [] Dump stats on all of the packfiles. The number and size of each.
+ *    Whether each is in the .git directory or in an alternate. The state
+ *    of the IDX or MIDX files and etc. Delta chain stats. All of this
+ *    data is relative to the "lived-in" state of the repository. Stuff
+ *    that may change after a GC or repack.
+ *
+ * [] Dump stats on each remote. When we fetch from a remote the size
+ *    of the response is related to the set of haves on the server. You
+ *    can see this in `GIT_TRACE_CURL=1 git fetch`. We get a `ls-refs`
+ *    payload that lists all of the branches and tags on the server, so
+ *    at a minimum the RefName and SHA for each. But for annotated tags
+ *    we also get the peeled SHA. The size of this overhead on every
+ *    fetch is proportional to the size of the `git ls-remote` response
+ *    (roughly, although the latter repeats the RefName of the peeled
+ *    tag). If, for example, you have 500K refs on a remote, you're
+ *    going to have a long "haves" message, so every fetch will be slow
+ *    just because of that overhead (not counting new objects to be
+ *    downloaded).
+ *
+ *    Note that the local set of tags in "refs/tags/" is a union over all
+ *    remotes. However, since most people only have one remote, we can
+ *    probably estimate the overhead value directly from the size of the
+ *    set of "refs/tags/" that we visited while building the `ref_info`
+ *    and `ref_array` and not need to ask the remote.
+ *
+ * [] Dump info on the complexity of the DAG. Criss-cross merges.
+ *    The number of edges that must be touched to compute merge bases.
+ *    Edge length. The number of parallel lanes in the history that must
+ *    be navigated to get to the merge base. What affects the cost of
+ *    the Ahead/Behind computation? How often do criss-crosses occur and
+ *    do they cause various operations to slow down?
+ *
+ * [] If there are primary branches (like "main" or "master") are they
+ *    always on the left side of merges? Does the graph have a clean
+ *    left edge? Or are there normal and "backwards" merges? Do these
+ *    cause problems at scale?
+ *
+ * [] If we have a hierarchy of FI/RI branches like "L1", "L2", ...,
+ *    can we learn anything about the shape of the repo around these FI
+ *    and RI integrations?
+ */
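
The per-fetch overhead described in the second TODO item could be estimated locally along these lines. This is only a rough sketch, not code from survey.c: `struct ref_sample` and `estimate_ls_refs_bytes()` are hypothetical names, and the byte counts assume the protocol v2 `ls-refs` line layout (a 4-byte pkt-line length header, the hex object ID, a space, the refname, an optional ` peeled:<oid>` attribute, and a trailing LF, with a final flush packet). It ignores symref attributes and any transport-level framing or compression.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record for illustration; not a struct from survey.c. */
struct ref_sample {
	const char *name;	/* full refname, e.g. "refs/tags/v1.0" */
	int has_peeled;		/* nonzero for an annotated tag */
};

/*
 * Estimate the size in bytes of an ls-refs response that lists `nr` refs.
 * `hexsz` is the length of a hex object ID: 40 for SHA-1, 64 for SHA-256.
 * Assumes the client asked for peeled values; symrefs are not counted.
 */
size_t estimate_ls_refs_bytes(const struct ref_sample *refs, size_t nr,
			      size_t hexsz)
{
	size_t total = 0;
	size_t i;

	assert(hexsz == 40 || hexsz == 64);

	for (i = 0; i < nr; i++) {
		/* pkt-line header + oid + SP + refname + LF */
		total += 4 + hexsz + 1 + strlen(refs[i].name) + 1;

		/* annotated tags add " peeled:<oid>" to the same line */
		if (refs[i].has_peeled)
			total += 1 + strlen("peeled:") + hexsz;
	}

	/* the response ends with a flush packet ("0000") */
	return total + 4;
}
```

Under these assumptions, 500K refs with an assumed average refname length of 30 bytes and SHA-1 object IDs come to 76 bytes per ref, or about 38 MB per fetch before any peeled attributes are counted.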
