Skip to content

[RFC] Create 'core.featureAdoptionRate' setting to update config defaults #254

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
34 changes: 33 additions & 1 deletion Documentation/config/core.txt
Original file line number Diff line number Diff line change
Expand Up @@ -577,7 +577,8 @@ the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1].

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jeff Hostetler wrote (reply to this):



On 6/3/2019 4:18 PM, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <[email protected]>
> 
> Several advanced config settings are highly recommended for clients
> using large repositories. Power users learn these one-by-one and
> enable them as they see fit. This could be made simpler, to allow
> more users to have access to these almost-always beneficial features
> (and more beneficial in larger repos).
> 
> Create a 'repo.size' config setting whose only accepted value is
> 'large'. When a repo.size=large is given, change the default values
> of some config settings. If the setting is given explicitly, then
> take the explicit value.
> 
> This change adds these two defaults to the repo.size=large setting:
> 
>   * core.commitGraph=true
>   * gc.writeCommitGraph=true
> 
> To centralize these config options and properly set the defaults,
> create a repo_settings that contains chars for each config variable.
> Use -1 as "unset", with 0 for false and 1 for true.
> 
> The prepare_repo_settings() method ensures that this settings
> struct has been initialized, and avoids double-scanning the config
> settings.
> 
> Signed-off-by: Derrick Stolee <[email protected]>
[...]
> diff --git a/repo-settings.c b/repo-settings.c
> new file mode 100644
> index 0000000000..6f5e18d92e
> --- /dev/null
> +++ b/repo-settings.c
> @@ -0,0 +1,44 @@
> +#include "cache.h"
> +#include "repository.h"
> +#include "config.h"
> +#include "repo-settings.h"
> +
> +
> +#define UPDATE_DEFAULT(s,v) if (s != -1) { s = v; }

We should guard this with a "do { ... } while (0)"

> +
> +static int git_repo_config(const char *key, const char *value, void *cb)
> +{
> +	struct repo_settings *rs = (struct repo_settings *)cb;
> +
> +	if (!strcmp(key, "core.size")) {
> +		if (!strcmp(value, "large")) {
> +			UPDATE_DEFAULT(rs->core_commit_graph, 1);
> +			UPDATE_DEFAULT(rs->gc_write_commit_graph, 1);
> +		}
> +		return 0;
> +	}
> +	if (!strcmp(key, "core.commitgraph")) {
> +		rs->core_commit_graph = git_config_bool(key, value);
> +		return 0;
> +	}
> +	if (!strcmp(key, "gc.writecommitgraph")) {
> +		rs->gc_write_commit_graph = git_config_bool(key, value);
> +		return 0;
> +	}
> +
> +	return 1;
> +}
[...]

Jeff

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> +core.featureAdoptionRate::
> +	Set an integer value on a scale from 0 to 10 describing your
> +	desire to adopt new performance features. Defaults to 0. As
> +	the value increases, features are enabled by changing the
> +	default values of other config settings. If a config variable
> +	is specified explicitly, the explicit value will override these
> +	defaults:
> ++
> +If the value is at least 3, then the following defaults are modified.
> +These represent relatively new features that have existed for multiple
> +major releases, and present significant performance benefits. They do
> +not modify the user-facing output of porcelain commands.
> ++
> +* `core.commitGraph=true` enables reading commit-graph files.
> ++
> +* `gc.writeCommitGraph=true` eneables writing commit-graph files during
> +`git gc`.

I was re-reading the whole series, and found that the phrase
"present significant benefits" was somewhat overselling.  Wouldn't
that claim largely depend on the end-user's workflow?  The same
comment applies to the description of "at least 5" level, too.

I would not mind if we say "enabling this may present performance
benefits", with or without "significant" before "performance
benefits", and with or without ", depending how your repository is
used" at the end.

Thanks.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 6/28/2019 4:50 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> 
>> +core.featureAdoptionRate::
>> +	Set an integer value on a scale from 0 to 10 describing your
>> +	desire to adopt new performance features. Defaults to 0. As
>> +	the value increases, features are enabled by changing the
>> +	default values of other config settings. If a config variable
>> +	is specified explicitly, the explicit value will override these
>> +	defaults:
>> ++
>> +If the value is at least 3, then the following defaults are modified.
>> +These represent relatively new features that have existed for multiple
>> +major releases, and present significant performance benefits. They do
>> +not modify the user-facing output of porcelain commands.
>> ++
>> +* `core.commitGraph=true` enables reading commit-graph files.
>> ++
>> +* `gc.writeCommitGraph=true` eneables writing commit-graph files during
>> +`git gc`.
> 
> I was re-reading the whole series, and found that the phrase
> "present significant benefits" was somewhat overselling.  Wouldn't
> that claim largely depend on the end-user's workflow?  The same
> comment applies to the description of "at least 5" level, too.
> 
> I would not mind if we say "enabling this may present performance
> benefits", with or without "significant" before "performance
> benefits", and with or without ", depending how your repository is
> used" at the end.

Thanks for taking such a close look. Indeed, it is not appropriate
to over-sell here. I will take another stab at this documentation
next week.

-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> @@ -41,7 +42,6 @@ static int aggressive_depth = 50;
>  static int aggressive_window = 250;
>  static int gc_auto_threshold = 6700;
>  static int gc_auto_pack_limit = 50;
> -static int gc_write_commit_graph;
>  static int detach_auto = 1;
>  static timestamp_t gc_log_expire_time;
>  static const char *gc_log_expire = "1.day.ago";
> @@ -148,7 +148,6 @@ static void gc_config(void)
>  	git_config_get_int("gc.aggressivedepth", &aggressive_depth);
>  	git_config_get_int("gc.auto", &gc_auto_threshold);
>  	git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
> -	git_config_get_bool("gc.writecommitgraph", &gc_write_commit_graph);
>  	git_config_get_bool("gc.autodetach", &detach_auto);
>  	git_config_get_expiry("gc.pruneexpire", &prune_expire);
>  	git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
> @@ -685,7 +684,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
>  		clean_pack_garbage();
>  	}
>  
> -	if (gc_write_commit_graph)
> +	prepare_repo_settings(the_repository);
> +	if (the_repository->settings->gc_write_commit_graph == 1)
>  		write_commit_graph_reachable(get_object_directory(), 0,
>  					     !quiet && !daemonized);

OK, so the general idea is to move per-subsystem local variables to
new fields in the repository structure, stop parsing the configuration
in per-subsystem config callback, which is how the configuration for
the writing side is done above, and ...

>  
> diff --git a/commit-graph.c b/commit-graph.c
> index 7c5e54875f..b09c465a7a 100644
> --- a/commit-graph.c
> +++ b/commit-graph.c
> @@ -16,6 +16,7 @@
>  #include "hashmap.h"
>  #include "replace-object.h"
>  #include "progress.h"
> +#include "repo-settings.h"
>  
>  #define GRAPH_SIGNATURE 0x43475048 /* "CGPH" */
>  #define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */
> @@ -311,7 +312,6 @@ static void prepare_commit_graph_one(struct repository *r, const char *obj_dir)
>  static int prepare_commit_graph(struct repository *r)
>  {
>  	struct object_directory *odb;
> -	int config_value;
>  
>  	if (git_env_bool(GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD, 0))
>  		die("dying as requested by the '%s' variable on commit-graph load!",
> @@ -321,9 +321,10 @@ static int prepare_commit_graph(struct repository *r)
>  		return !!r->objects->commit_graph;
>  	r->objects->commit_graph_attempted = 1;
>  
> +	prepare_repo_settings(r);
> +
>  	if (!git_env_bool(GIT_TEST_COMMIT_GRAPH, 0) &&
> -	    (repo_config_get_bool(r, "core.commitgraph", &config_value) ||
> -	    !config_value))
> +	    r->settings->core_commit_graph != 1)
>  		/*
>  		 * This repository is not configured to use commit graphs, so
>  		 * do not load one. (But report commit_graph_attempted anyway

... how the reading side is done here.  And then instead let the new
repo-settings module read them ...

> diff --git a/repo-settings.c b/repo-settings.c
> new file mode 100644
> index 0000000000..f7fc2a1959
> --- /dev/null
> +++ b/repo-settings.c
> @@ -0,0 +1,44 @@
> +#include "cache.h"
> +#include "repository.h"
> +#include "config.h"
> +#include "repo-settings.h"
> +
> +#define UPDATE_DEFAULT(s,v) do { if (s == -1) { s = v; } } while(0)
> +
> +static int git_repo_config(const char *key, const char *value, void *cb)
> +{
> +	struct repo_settings *rs = (struct repo_settings *)cb;
> +
> +	if (!strcmp(key, "core.featureadoptionrate")) {
> +		int rate = git_config_int(key, value);
> +		if (rate >= 3) {
> +			UPDATE_DEFAULT(rs->core_commit_graph, 1);
> +			UPDATE_DEFAULT(rs->gc_write_commit_graph, 1);
> +		}
> +		return 0;
> +	}
> +	if (!strcmp(key, "core.commitgraph")) {
> +		rs->core_commit_graph = git_config_bool(key, value);
> +		return 0;
> +	}
> +	if (!strcmp(key, "gc.writecommitgraph")) {
> +		rs->gc_write_commit_graph = git_config_bool(key, value);
> +		return 0;
> +	}
> +
> +	return 1;
> +}

... like this.  If a concrete/underlying configuration variable
appears in the configuration data stream, the fields would get their
values overwritten, and if the adoption rate setting appears
_before_ concreate ones, they affect the value in the fields.  So
the assignment to these fields when the adoption rate setting is
seen survives only when the underlying variable does not appear
anywhere in the configuration dasta stream at all.

Which is exactly what we want.  Nicely written.

> +void prepare_repo_settings(struct repository *r)
> +{
> +	if (r->settings)
> +		return;
> +
> +	r->settings = xmalloc(sizeof(*r->settings));
> +
> +	/* Defaults */
> +	r->settings->core_commit_graph = -1;
> +	r->settings->gc_write_commit_graph = -1;
> +
> +	repo_config(r, git_repo_config, r->settings);
> +}
> diff --git a/repo-settings.h b/repo-settings.h
> new file mode 100644
> index 0000000000..11d08648e1
> --- /dev/null
> +++ b/repo-settings.h
> @@ -0,0 +1,13 @@
> +#ifndef REPO_SETTINGS_H
> +#define REPO_SETTINGS_H
> +
> +struct repo_settings {
> +	char core_commit_graph;
> +	char gc_write_commit_graph;
> +};

I do not see a particular reason to favor type "char" here. "char"
is wider than e.g. "signed int :2", if you wanted to save space
compared to a more obvious and naive type, e.g. "int".  More
importantly, the language does not guarantee signedness of "char",
so the sentinel value of "-1" is a bit tricky to use, as you have to
be prepared to see your code used on an "unsigned char" platform.

Use of "signed char" would be OK, but this is a singleton instance
per repository, so I am not sure how much it matters to save a few
words here by not using the most natural "int" type.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 6/28/2019 5:42 PM, Junio C Hamano wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>
>> +struct repo_settings {
>> +	char core_commit_graph;
>> +	char gc_write_commit_graph;
>> +};
> 
> I do not see a particular reason to favor type "char" here. "char"
> is wider than e.g. "signed int :2", if you wanted to save space
> compared to a more obvious and naive type, e.g. "int".  More
> importantly, the language does not guarantee signedness of "char",
> so the sentinel value of "-1" is a bit tricky to use, as you have to
> be prepared to see your code used on an "unsigned char" platform.

I was unaware that platforms could change the signedness of "char".
Thanks for guarding against it.

You're right that it probably isn't worth saving space here, as these
values are replacing existing globals somewhere anyway. If we start
worrying about these being present for each of thousands of submodules
then we probably have bigger problems.

> Use of "signed char" would be OK, but this is a singleton instance
> per repository, so I am not sure how much it matters to save a few
> words here by not using the most natural "int" type.

I'll use 'int' in v2.

Thanks,
-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Carlo Arenas wrote (reply to this):

On Fri, Jun 28, 2019 at 6:44 PM Derrick Stolee <[email protected]> wrote:
>
> On 6/28/2019 5:42 PM, Junio C Hamano wrote:
> > "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> >
> > Use of "signed char" would be OK, but this is a singleton instance
> > per repository, so I am not sure how much it matters to save a few
> > words here by not using the most natural "int" type.
>
> I'll use 'int' in v2.

FWIW, this broke the build in (at least) Linux AArch64, unless the followin=
g
is applied on top of ds/early-access

Carlo
-- >8 --
Subject: [PATCH] repo-settings: explicitly make flags using char as signed

used as a three state boolean and therefore could also be set/compared
with -1, which will break in architectures that use unsigned char by
default (ex: ARM)

Signed-off-by: Carlo Marcelo Arenas Bel=C3=B3n <[email protected]>
---
 repo-settings.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/repo-settings.h b/repo-settings.h
index b50228f992..544873fff4 100644
--- a/repo-settings.h
+++ b/repo-settings.h
@@ -2,9 +2,9 @@
 #define REPO_SETTINGS_H

 struct repo_settings {
-       char core_commit_graph;
-       char gc_write_commit_graph;
-       char pack_use_sparse;
+       signed char core_commit_graph;
+       signed char gc_write_commit_graph;
+       signed char pack_use_sparse;
        int index_version;
 };

--=20
2.22.0

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 6/30/2019 2:35 PM, Carlo Arenas wrote:
> On Fri, Jun 28, 2019 at 6:44 PM Derrick Stolee <[email protected]> wrote:
>>
>> On 6/28/2019 5:42 PM, Junio C Hamano wrote:
>>> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>>>
>>> Use of "signed char" would be OK, but this is a singleton instance
>>> per repository, so I am not sure how much it matters to save a few
>>> words here by not using the most natural "int" type.
>>
>> I'll use 'int' in v2.
> 
> FWIW, this broke the build in (at least) Linux AArch64, unless the following
> is applied on top of ds/early-access

Thanks, Carlo, for reporting this. I'm glad we have someone running early
builds on platforms with these special cases!

I'm working on rerolling to use int.

-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Carlo Arenas wrote (reply to this):

On Mon, Jul 1, 2019 at 8:32 AM Derrick Stolee via GitGitGadget
<[email protected]> wrote:
>
> To centralize these config options and properly set the defaults,
> create a repo_settings that contains chars for each config variable.
> Use -1 as "unset", with 0 for false and 1 for true.

minor nitpick that hopefully Junio can fix: s/chars/ints

> +* `gc.writeCommitGraph=true` eneables writing commit-graph files during

typo: s/eneables/enable

Carlo

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Duy Nguyen wrote (reply to this):

On Mon, Jul 1, 2019 at 10:32 PM Derrick Stolee via GitGitGadget
<[email protected]> wrote:
> @@ -601,3 +602,22 @@ core.abbrev::
>         in your repository, which hopefully is enough for
>         abbreviated object names to stay unique for some time.
>         The minimum length is 4.
> +
> +core.featureAdoptionRate::
> +       Set an integer value on a scale from 0 to 10 describing your
> +       desire to adopt new performance features. Defaults to 0. As
> +       the value increases, features are enabled by changing the
> +       default values of other config settings. If a config variable
> +       is specified explicitly, the explicit value will override these
> +       defaults:

This is because I'd like to keep core.* from growing too big (it's
already big), hard to read, search and maintain. Perhaps this should
belong to a separate group? Something like tuning.something or
defaults.something.

> +If the value is at least 3, then the following defaults are modified.
> +These represent relatively new features that have existed for multiple
> +major releases, and may present performance benefits. These benefits
> +depend on the amount and kind of data in your repo and how you use it.

Then instead of numeric values, maybe the user should write some sort
description about the repo and we optimize for that, similar to gcc
-Os optimized for size, -Ofast for compiler speed (-O<n> is all about
execution speed).

We could write, for example, tuning.commitHistory = {small, medium,
large} and tuning.worktree = {small, large, medium} and maybe
tuning.refSize and use that to optimize. We can still have different
optimization levels (probably just "none", "recommended" vs
"aggressive" where agressive enables most new stuff),
-- 
Duy

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):


On Tue, Jul 02 2019, Duy Nguyen wrote:

> On Mon, Jul 1, 2019 at 10:32 PM Derrick Stolee via GitGitGadget
> <[email protected]> wrote:
>> @@ -601,3 +602,22 @@ core.abbrev::
>>         in your repository, which hopefully is enough for
>>         abbreviated object names to stay unique for some time.
>>         The minimum length is 4.
>> +
>> +core.featureAdoptionRate::
>> +       Set an integer value on a scale from 0 to 10 describing your
>> +       desire to adopt new performance features. Defaults to 0. As
>> +       the value increases, features are enabled by changing the
>> +       default values of other config settings. If a config variable
>> +       is specified explicitly, the explicit value will override these
>> +       defaults:
>
> This is because I'd like to keep core.* from growing too big (it's
> already big), hard to read, search and maintain. Perhaps this should
> belong to a separate group? Something like tuning.something or
> defaults.something.

The main thing users look at is "man git-config" (or its web rendering)
which renders it all in one page anyway.

I think in general adding more things to core.* sucks less than
explaining the special-case that "tuning.*" isn't a config for
git-tuning(1) (although we have some of that already, e.g. with
trace2.*).

Documentation/config/core.txt is ~600 lines. Maybe it would be a good
idea to split it up, similar to your split of
Documentation/config/*.txt, but let's not conflate how we'd like to
maintain stuff in git.git with a config interface we expose externally.

It's going to be very confusing for users if some settings that
otherwise would be in core aren't there because a file in git.git was
"too big" at the time. Users (mostly) aren't going to know/care in what
chronological order we added config keys.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jakub Narebski wrote (reply to this):

Duy Nguyen <[email protected]> writes:
> On Mon, Jul 1, 2019 at 10:32 PM Derrick Stolee via GitGitGadget
> <[email protected]> wrote:
>> @@ -601,3 +602,22 @@ core.abbrev::
>>         in your repository, which hopefully is enough for
>>         abbreviated object names to stay unique for some time.
>>         The minimum length is 4.
>> +
>> +core.featureAdoptionRate::
>> +       Set an integer value on a scale from 0 to 10 describing your
>> +       desire to adopt new performance features. Defaults to 0. As
>> +       the value increases, features are enabled by changing the
>> +       default values of other config settings. If a config variable
>> +       is specified explicitly, the explicit value will override these
>> +       defaults:
>
> This is because I'd like to keep core.* from growing too big (it's
> already big), hard to read, search and maintain. Perhaps this should
> belong to a separate group? Something like tuning.something or
> defaults.something.

I'm not sure if I consider core.* too big.  Well, there are 55 or more
entries in this namespace.

>> +If the value is at least 3, then the following defaults are modified.
>> +These represent relatively new features that have existed for multiple
>> +major releases, and may present performance benefits. These benefits
>> +depend on the amount and kind of data in your repo and how you use it.
>
> Then instead of numeric values, maybe the user should write some sort
> description about the repo and we optimize for that, similar to gcc
> -Os optimized for size, -Ofast for compiler speed (-O<n> is all about
> execution speed).

I also do not like those magic numbers.

>
> We could write, for example, tuning.commitHistory =3D {small, medium,
> large} and tuning.worktree =3D {small, large, medium} and maybe
> tuning.refSize and use that to optimize. We can still have different
> optimization levels (probably just "none", "recommended" vs
> "aggressive" where agressive enables most new stuff),

I think we have three different things that are currently conflated in
one config variable and one value.

First is what we want to optimize for; is it on-disk repository size,
command performance / execution speed, or maybe convenient information.

Second is what type of repository we are dealing with.  Is there a
problem with long history, large number of files in checkout, large
and/or binary files, or all together?  The original `core.size=3Dlarge`
(or proposed core.repositorySize) was all about this issue.  Another
issue that might be important is that if it is leaf developer
repository, or is it maintainer repository, etc. (which affects for
example how the push looks like).

Third is what tradeoffs we are willing to accept to get required
performance.  Are we willing to use additional stable optional features;
are we willing to use new experimental optional features; are we
willing; are we willing to sacrifice convenience (ahead/behind
information in status, information bout forced updates in push output,
etc.) for performance?  This what current proposal is about.

It may not nnned to be a separate confi variable for a separate aspect;
it may be enough to have value that is space-separated list, or
something like that.

Best,
--
Jakub Nar=EAbski

core.commitGraph::
If true, then git will read the commit-graph file (if it exists)
to parse the graph structure of commits. Defaults to false. See
to parse the graph structure of commits. Defaults to false, unless
`core.featureAdoptionRate` is at least three. See
linkgit:git-commit-graph[1] for more information.

core.useReplaceRefs::
Expand All @@ -601,3 +602,34 @@ core.abbrev::
in your repository, which hopefully is enough for
abbreviated object names to stay unique for some time.
The minimum length is 4.

core.featureAdoptionRate::
Set an integer value on a scale from 0 to 10 describing your
desire to adopt new performance features. Defaults to 0. As
the value increases, features are enabled by changing the
default values of other config settings. If a config variable
is specified explicitly, the explicit value will override these
defaults:
+
If the value is at least 3, then the following defaults are modified.
These represent relatively new features that have existed for multiple
major releases, and may present performance benefits. These benefits
depend on the amount and kind of data in your repo and how you use it.
The settings do not modify the user-facing output of porcelain commands.
+
* `core.commitGraph=true` enables reading commit-graph files.
+
* `gc.writeCommitGraph=true` eneables writing commit-graph files during
`git gc`.
+
* `index.version=4` uses prefix-compression to reduce the size of the
.git/index file.
+
If the value is at least 5, then all of the defaults above are included,
plus the defaults below. These represent new features that present
significant performance benefits, but may not have been released for
multiple major versions.
+
* `pack.useSparse=true` uses the sparse tree-walk algorithm, which is
optimized for enumerating objects during linkgit:git-push[1] from a
client machine.
4 changes: 2 additions & 2 deletions Documentation/config/gc.txt
Original file line number Diff line number Diff line change
Expand Up @@ -63,8 +63,8 @@ gc.writeCommitGraph::
If true, then gc will rewrite the commit-graph file when
linkgit:git-gc[1] is run. When using `git gc --auto`
the commit-graph will be updated if housekeeping is
required. Default is false. See linkgit:git-commit-graph[1]
for details.
required. Default is false, unless `core.featureAdoptionRage`
is at least three. See linkgit:git-commit-graph[1] for details.

gc.logExpiry::
If the file gc.log exists, then `git gc --auto` will print
Expand Down
2 changes: 2 additions & 0 deletions Documentation/config/index.txt
Original file line number Diff line number Diff line change
Expand Up @@ -24,3 +24,5 @@ index.threads::
index.version::
Specify the version with which new index files should be
initialized. This does not affect existing repositories.
If `core.featureAdoptionRate` is at least three, then the
default value is 4.
3 changes: 2 additions & 1 deletion Documentation/config/pack.txt
Original file line number Diff line number Diff line change
Expand Up @@ -112,7 +112,8 @@ pack.useSparse::
objects. This can have significant performance benefits when
computing a pack to send a small change. However, it is possible
that extra objects are added to the pack-file if the included
commits contain certain types of direct renames.
commits contain certain types of direct renames. Defaults to
false, unless `core.featureAdoptionRate` is at least five.

pack.writeBitmaps (deprecated)::
This is a deprecated synonym for `repack.writeBitmaps`.
Expand Down
1 change: 1 addition & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -967,6 +967,7 @@ LIB_OBJS += refspec.o
LIB_OBJS += ref-filter.o
LIB_OBJS += remote.o
LIB_OBJS += replace-object.o
LIB_OBJS += repo-settings.o
LIB_OBJS += repository.o
LIB_OBJS += rerere.o
LIB_OBJS += resolve-undo.o
Expand Down
6 changes: 3 additions & 3 deletions builtin/gc.c
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@
#include "pack-objects.h"
#include "blob.h"
#include "tree.h"
#include "repo-settings.h"

#define FAILED_RUN "failed to run %s"

Expand All @@ -41,7 +42,6 @@ static int aggressive_depth = 50;
static int aggressive_window = 250;
static int gc_auto_threshold = 6700;
static int gc_auto_pack_limit = 50;
static int gc_write_commit_graph;
static int detach_auto = 1;
static timestamp_t gc_log_expire_time;
static const char *gc_log_expire = "1.day.ago";
Expand Down Expand Up @@ -148,7 +148,6 @@ static void gc_config(void)
git_config_get_int("gc.aggressivedepth", &aggressive_depth);
git_config_get_int("gc.auto", &gc_auto_threshold);
git_config_get_int("gc.autopacklimit", &gc_auto_pack_limit);
git_config_get_bool("gc.writecommitgraph", &gc_write_commit_graph);
git_config_get_bool("gc.autodetach", &detach_auto);
git_config_get_expiry("gc.pruneexpire", &prune_expire);
git_config_get_expiry("gc.worktreepruneexpire", &prune_worktrees_expire);
Expand Down Expand Up @@ -685,7 +684,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
clean_pack_garbage();
}

if (gc_write_commit_graph)
prepare_repo_settings(the_repository);
if (the_repository->settings->gc_write_commit_graph == 1)
write_commit_graph_reachable(get_object_directory(), 0,
!quiet && !daemonized);

Expand Down
9 changes: 5 additions & 4 deletions builtin/pack-objects.c
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@
#include "dir.h"
#include "midx.h"
#include "trace2.h"
#include "repo-settings.h"

#define IN_PACK(obj) oe_in_pack(&to_pack, obj)
#define SIZE(obj) oe_size(&to_pack, obj)
Expand Down Expand Up @@ -2707,10 +2708,6 @@ static int git_pack_config(const char *k, const char *v, void *cb)
use_bitmap_index_default = git_config_bool(k, v);
return 0;
}
if (!strcmp(k, "pack.usesparse")) {
sparse = git_config_bool(k, v);
return 0;
}
if (!strcmp(k, "pack.threads")) {
delta_search_threads = git_config_int(k, v);
if (delta_search_threads < 0)
Expand Down Expand Up @@ -3330,6 +3327,10 @@ int cmd_pack_objects(int argc, const char **argv, const char *prefix)
read_replace_refs = 0;

sparse = git_env_bool("GIT_TEST_PACK_SPARSE", 0);
prepare_repo_settings(the_repository);
if (!sparse && the_repository->settings->pack_use_sparse != -1)
sparse = the_repository->settings->pack_use_sparse;

reset_pack_idx_option(&pack_idx_opts);
git_config(git_pack_config, NULL);

Expand Down
7 changes: 4 additions & 3 deletions commit-graph.c
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
#include "hashmap.h"
#include "replace-object.h"
#include "progress.h"
#include "repo-settings.h"

#define GRAPH_SIGNATURE 0x43475048 /* "CGPH" */
#define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */
Expand Down Expand Up @@ -311,7 +312,6 @@ static void prepare_commit_graph_one(struct repository *r, const char *obj_dir)
static int prepare_commit_graph(struct repository *r)
{
struct object_directory *odb;
int config_value;

if (git_env_bool(GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD, 0))
die("dying as requested by the '%s' variable on commit-graph load!",
Expand All @@ -321,9 +321,10 @@ static int prepare_commit_graph(struct repository *r)
return !!r->objects->commit_graph;
r->objects->commit_graph_attempted = 1;

prepare_repo_settings(r);

if (!git_env_bool(GIT_TEST_COMMIT_GRAPH, 0) &&
(repo_config_get_bool(r, "core.commitgraph", &config_value) ||
!config_value))
r->settings->core_commit_graph != 1)
/*
* This repository is not configured to use commit graphs, so
* do not load one. (But report commit_graph_attempted anyway
Expand Down
12 changes: 7 additions & 5 deletions read-cache.c
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@
#include "fsmonitor.h"
#include "thread-utils.h"
#include "progress.h"
#include "repo-settings.h"

/* Mask for the name length in ce_flags in the on-disk index */

Expand Down Expand Up @@ -1599,16 +1600,17 @@ struct cache_entry *refresh_cache_entry(struct index_state *istate,

#define INDEX_FORMAT_DEFAULT 3

static unsigned int get_index_format_default(void)
static unsigned int get_index_format_default(struct repository *r)
{
char *envversion = getenv("GIT_INDEX_VERSION");
char *endp;
int value;
unsigned int version = INDEX_FORMAT_DEFAULT;

if (!envversion) {
if (!git_config_get_int("index.version", &value))
version = value;
prepare_repo_settings(r);

if (r->settings->index_version >= 0)
version = r->settings->index_version;
if (version < INDEX_FORMAT_LB || INDEX_FORMAT_UB < version) {
warning(_("index.version set, but the value is invalid.\n"
"Using version %i"), INDEX_FORMAT_DEFAULT);
Expand Down Expand Up @@ -2765,7 +2767,7 @@ static int do_write_index(struct index_state *istate, struct tempfile *tempfile,
}

if (!istate->version) {
istate->version = get_index_format_default();
istate->version = get_index_format_default(the_repository);
if (git_env_bool("GIT_TEST_SPLIT_INDEX", 0))
init_split_index(istate);
}
Expand Down
58 changes: 58 additions & 0 deletions repo-settings.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,58 @@
#include "cache.h"
#include "repository.h"
#include "config.h"
#include "repo-settings.h"

#define UPDATE_DEFAULT(s,v) do { if (s == -1) { s = v; } } while(0)

static int git_repo_config(const char *key, const char *value, void *cb)
{
struct repo_settings *rs = (struct repo_settings *)cb;

if (!strcmp(key, "core.featureadoptionrate")) {
int rate = git_config_int(key, value);
if (rate >= 3) {
UPDATE_DEFAULT(rs->core_commit_graph, 1);
UPDATE_DEFAULT(rs->gc_write_commit_graph, 1);
UPDATE_DEFAULT(rs->index_version, 4);
}
if (rate >= 5) {
UPDATE_DEFAULT(rs->pack_use_sparse, 1);
}
return 0;
}
if (!strcmp(key, "core.commitgraph")) {
rs->core_commit_graph = git_config_bool(key, value);
return 0;
}
if (!strcmp(key, "gc.writecommitgraph")) {
rs->gc_write_commit_graph = git_config_bool(key, value);
return 0;
}
if (!strcmp(key, "pack.usesparse")) {
rs->pack_use_sparse = git_config_bool(key, value);
return 0;
}
if (!strcmp(key, "index.version")) {
rs->index_version = git_config_int(key, value);
return 0;
}

return 1;
}

void prepare_repo_settings(struct repository *r)
{
if (r->settings)
return;

r->settings = xmalloc(sizeof(*r->settings));

/* Defaults */
r->settings->core_commit_graph = -1;
r->settings->gc_write_commit_graph = -1;
r->settings->pack_use_sparse = -1;
r->settings->index_version = -1;

repo_config(r, git_repo_config, r->settings);
}
15 changes: 15 additions & 0 deletions repo-settings.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
#ifndef REPO_SETTINGS_H
#define REPO_SETTINGS_H

struct repo_settings {
int core_commit_graph;
int gc_write_commit_graph;
int pack_use_sparse;
int index_version;
};

struct repository;

void prepare_repo_settings(struct repository *r);

#endif /* REPO_SETTINGS_H */
3 changes: 3 additions & 0 deletions repository.h
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
#include "path.h"

struct config_set;
struct repo_settings;
struct git_hash_algo;
struct index_state;
struct lock_file;
Expand Down Expand Up @@ -72,6 +73,8 @@ struct repository {
*/
char *submodule_prefix;

struct repo_settings *settings;

/* Subsystems */
/*
* Repository's config which contains key-value pairs from the usual
Expand Down
34 changes: 29 additions & 5 deletions t/t1600-index.sh
Original file line number Diff line number Diff line change
Expand Up @@ -59,17 +59,41 @@ test_expect_success 'out of bounds index.version issues warning' '
)
'

test_expect_success 'GIT_INDEX_VERSION takes precedence over config' '
test_index_version () {
INDEX_VERSION_CONFIG=$1 &&
REPO_ADOPTION_RATE=$2 &&
ENV_VAR_VERSION=$3
EXPECTED_OUTPUT_VERSION=$4 &&
(
rm -f .git/index &&
GIT_INDEX_VERSION=4 &&
export GIT_INDEX_VERSION &&
git config --add index.version 2 &&
rm -f .git/config &&
if test "$INDEX_VERSION_CONFIG" -ne 0
then
git config --add index.version $INDEX_VERSION_CONFIG
fi &&
if test "$REPO_ADOPTION_RATE" -ne 0
then
git config --add core.featureAdoptionRate $REPO_ADOPTION_RATE
fi &&
if test "$ENV_VAR_VERSION" -ne 0
then
GIT_INDEX_VERSION=$ENV_VAR_VERSION &&
export GIT_INDEX_VERSION
else
unset GIT_INDEX_VERSION
fi &&
git add a 2>&1 &&
echo 4 >expect &&
echo $EXPECTED_OUTPUT_VERSION >expect &&
test-tool index-version <.git/index >actual &&
test_cmp expect actual
)
}

test_expect_success 'index version config precedence' '
test_index_version 2 0 4 4 &&
test_index_version 2 3 0 2 &&
test_index_version 0 3 0 4 &&
test_index_version 0 3 2 2
'

test_done