@@ -1685,3 +1685,49 @@ int cmd_survey(int argc, const char **argv, const char *prefix, struct repositor
 	clear_survey_context(&ctx);
 	return 0;
 }
+
+/*
+ * NEEDSWORK: The following is a bit of a laundry list of things
+ * that I'd like to add.
+ *
+ * [] Dump stats on all of the packfiles. The number and size of each.
+ * Whether each is in the .git directory or in an alternate. The state
+ * of the IDX or MIDX files, etc. Delta chain stats. All of this
+ * data is relative to the "lived-in" state of the repository. Stuff
+ * that may change after a GC or repack.
+ *
+ * [] Dump stats on each remote. When we fetch from a remote the size
+ * of the response is related to the set of haves on the server. You
+ * can see this in `GIT_TRACE_CURL=1 git fetch`. We get a `ls-refs`
+ * payload that lists all of the branches and tags on the server, so
+ * at a minimum the RefName and SHA for each. But for annotated tags
+ * we also get the peeled SHA. The size of this overhead on every
+ * fetch is proportional to the size of the `git ls-remote` response
+ * (roughly, although the latter repeats the RefName of the peeled
+ * tag). If, for example, you have 500K refs on a remote, you're
+ * going to have a long "haves" message, so every fetch will be slow
+ * just because of that overhead (not counting new objects to be
+ * downloaded).
+ *
+ * Note that the local set of tags in "refs/tags/" is a union over all
+ * remotes. However, since most people only have one remote, we can
+ * probably estimate the overhead value directly from the size of the
+ * set of "refs/tags/" that we visited while building the `ref_info`
+ * and `ref_array` and not need to ask the remote.
+ *
+ * [] Dump info on the complexity of the DAG. Criss-cross merges.
+ * The number of edges that must be touched to compute merge bases.
+ * Edge length. The number of parallel lanes in the history that must
+ * be navigated to get to the merge base. What affects the cost of
+ * the Ahead/Behind computation? How often do criss-crosses occur and
+ * do they cause various operations to slow down?
+ *
+ * [] If there are primary branches (like "main" or "master"), are they
+ * always on the left side of merges? Does the graph have a clean
+ * left edge? Or are there normal and "backwards" merges? Do these
+ * cause problems at scale?
+ *
+ * [] If we have a hierarchy of FI/RI branches like "L1", "L2", ...,
+ * can we learn anything about the shape of the repo around these FI
+ * and RI integrations?
+ */
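
The per-fetch overhead described in the "each remote" item could be approximated along these lines. This is only a rough sketch and not part of the patch: `struct ref_entry`, `estimate_ref_line()`, and `estimate_advertisement()` are hypothetical names, and the line layout is assumed to be a 4-byte pkt-line header, a 40-character SHA-1 hex object ID, a space, the ref name, and a newline, with an extra ` peeled:<oid>` attribute for annotated tags.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: SHA-1 repository; a SHA-256 repository would use 64. */
#define HEX_OID_LEN 40
/* Assumption: each advertised ref is one pkt-line with a 4-byte length header. */
#define PKT_HDR_LEN 4

/* Hypothetical stand-in for whatever the survey code records per ref. */
struct ref_entry {
	const char *name;      /* full ref name, e.g. "refs/tags/v1.0" */
	int is_annotated_tag;  /* nonzero if a peeled object ID is also sent */
};

/* Estimated bytes on the wire for a single advertised ref. */
static size_t estimate_ref_line(const struct ref_entry *ref)
{
	/* header + oid + SP + refname + LF */
	size_t len = PKT_HDR_LEN + HEX_OID_LEN + 1 + strlen(ref->name) + 1;

	/* annotated tags additionally carry " peeled:<oid>" */
	if (ref->is_annotated_tag)
		len += strlen(" peeled:") + HEX_OID_LEN;
	return len;
}

/* Estimated bytes for the whole ref advertisement. */
static size_t estimate_advertisement(const struct ref_entry *refs, size_t nr)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		total += estimate_ref_line(&refs[i]);
	return total;
}
```

Feeding this the refs already visited while building `ref_info` and `ref_array` would give a local estimate without contacting the remote, under the single-remote assumption the comment mentions.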