-
Notifications
You must be signed in to change notification settings - Fork 140
[RFC] Create 'core.featureAdoptionRate' setting to update config defaults #254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Create 'core.featureAdoptionRate' setting to update config defaults #254
Conversation
28cbf45
to
d4ff987
Compare
/submit |
Submitted as [email protected] |
On the Git mailing list, Derrick Stolee wrote (reply to this):
|
@@ -577,8 +577,9 @@ the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1]. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Jeff Hostetler wrote (reply to this):
On 6/3/2019 4:18 PM, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <[email protected]>
>
> Several advanced config settings are highly recommended for clients
> using large repositories. Power users learn these one-by-one and
> enable them as they see fit. This could be made simpler, to allow
> more users to have access to these almost-always beneficial features
> (and more beneficial in larger repos).
>
> Create a 'repo.size' config setting whose only accepted value is
> 'large'. When a repo.size=large is given, change the default values
> of some config settings. If the setting is given explicitly, then
> take the explicit value.
>
> This change adds these two defaults to the repo.size=large setting:
>
> * core.commitGraph=true
> * gc.writeCommitGraph=true
>
> To centralize these config options and properly set the defaults,
> create a repo_settings that contains chars for each config variable.
> Use -1 as "unset", with 0 for false and 1 for true.
>
> The prepare_repo_settings() method ensures that this settings
> struct has been initialized, and avoids double-scanning the config
> settings.
>
> Signed-off-by: Derrick Stolee <[email protected]>
[...]
> diff --git a/repo-settings.c b/repo-settings.c
> new file mode 100644
> index 0000000000..6f5e18d92e
> --- /dev/null
> +++ b/repo-settings.c
> @@ -0,0 +1,44 @@
> +#include "cache.h"
> +#include "repository.h"
> +#include "config.h"
> +#include "repo-settings.h"
> +
> +
> +#define UPDATE_DEFAULT(s,v) if (s != -1) { s = v; }
We should guard this with a "do { ... } while (0)"
> +
> +static int git_repo_config(const char *key, const char *value, void *cb)
> +{
> + struct repo_settings *rs = (struct repo_settings *)cb;
> +
> + if (!strcmp(key, "core.size")) {
> + if (!strcmp(value, "large")) {
> + UPDATE_DEFAULT(rs->core_commit_graph, 1);
> + UPDATE_DEFAULT(rs->gc_write_commit_graph, 1);
> + }
> + return 0;
> + }
> + if (!strcmp(key, "core.commitgraph")) {
> + rs->core_commit_graph = git_config_bool(key, value);
> + return 0;
> + }
> + if (!strcmp(key, "gc.writecommitgraph")) {
> + rs->gc_write_commit_graph = git_config_bool(key, value);
> + return 0;
> + }
> +
> + return 1;
> +}
[...]
Jeff
On the Git mailing list, Johannes Schindelin wrote (reply to this):
|
On the Git mailing list, Derrick Stolee wrote (reply to this):
|
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
On the Git mailing list, Derrick Stolee wrote (reply to this):
|
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
d4ff987
to
4abb634
Compare
4abb634
to
5bba906
Compare
/submit |
Submitted as [email protected] |
This branch is now known as |
This patch series was integrated into pu via git@5956c60. |
This patch series was integrated into pu via git@824356a. |
This patch series was integrated into pu via git@e8a49a8. |
This patch series was integrated into pu via git@08638cf. |
This patch series was integrated into pu via git@426fac7. |
This patch series was integrated into pu via git@35c55df. |
This patch series was integrated into pu via git@0f9ba5f. |
This patch series was integrated into pu via git@76c9e34. |
This patch series was integrated into pu via git@612dc94. |
This patch series was integrated into pu via git@489feba. |
This patch series was integrated into pu via git@ae81f0a. |
If a repo is large, it likely has many paths in its working directory. This means the index could be compressed using version 4. Set this as a default when core.featureAdoptionRate is at least three. Since the index version is written to a file, this is an excellent opportunity to test that the config settings are working correctly with the different precedence rules. Adapt a test from t1600-index.sh to verify the version is set properly with different values of index.version config, core.featureAdoptionRate, and GIT_INDEX_VERSION. Signed-off-by: Derrick Stolee <[email protected]>
If a repo is large, then it probably has a very large working directory. In this case, a typical developer's edits usually impact many fewer paths than the full path set. The sparse treewalk algorithm is optimized for this case, speeding up 'git push' calls. Use pack.useSparse=true when core.featureAdoptionRate is at least five. This is the first setting where the feature has only been out for a single major version. This could be moved to the "at least three" category after another major version. Signed-off-by: Derrick Stolee <[email protected]>
5bba906
to
d080065
Compare
/submit |
Submitted as [email protected] |
This patch series was integrated into pu via git@7ffae11. |
@@ -577,7 +577,8 @@ the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1]. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Carlo Arenas wrote (reply to this):
On Mon, Jul 1, 2019 at 8:32 AM Derrick Stolee via GitGitGadget
<[email protected]> wrote:
>
> To centralize these config options and properly set the defaults,
> create a repo_settings that contains chars for each config variable.
> Use -1 as "unset", with 0 for false and 1 for true.
minor nitpick that hopefully Junio can fix: s/chars/ints
> +* `gc.writeCommitGraph=true` eneables writing commit-graph files during
typo: s/eneables/enable
Carlo
@@ -577,7 +577,8 @@ the `GIT_NOTES_REF` environment variable. See linkgit:git-notes[1]. | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Duy Nguyen wrote (reply to this):
On Mon, Jul 1, 2019 at 10:32 PM Derrick Stolee via GitGitGadget
<[email protected]> wrote:
> @@ -601,3 +602,22 @@ core.abbrev::
> in your repository, which hopefully is enough for
> abbreviated object names to stay unique for some time.
> The minimum length is 4.
> +
> +core.featureAdoptionRate::
> + Set an integer value on a scale from 0 to 10 describing your
> + desire to adopt new performance features. Defaults to 0. As
> + the value increases, features are enabled by changing the
> + default values of other config settings. If a config variable
> + is specified explicitly, the explicit value will override these
> + defaults:
This is because I'd like to keep core.* from growing too big (it's
already big), hard to read, search and maintain. Perhaps this should
belong to a separate group? Something like tuning.something or
defaults.something.
> +If the value is at least 3, then the following defaults are modified.
> +These represent relatively new features that have existed for multiple
> +major releases, and may present performance benefits. These benefits
> +depend on the amount and kind of data in your repo and how you use it.
Then instead of numeric values, maybe the user should write some sort
description about the repo and we optimize for that, similar to gcc
-Os optimized for size, -Ofast for compiler speed (-O<n> is all about
execution speed).
We could write, for example, tuning.commitHistory = {small, medium,
large} and tuning.worktree = {small, large, medium} and maybe
tuning.refSize and use that to optimize. We can still have different
optimization levels (probably just "none", "recommended" vs
"aggressive" where agressive enables most new stuff),
--
Duy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Ævar Arnfjörð Bjarmason wrote (reply to this):
On Tue, Jul 02 2019, Duy Nguyen wrote:
> On Mon, Jul 1, 2019 at 10:32 PM Derrick Stolee via GitGitGadget
> <[email protected]> wrote:
>> @@ -601,3 +602,22 @@ core.abbrev::
>> in your repository, which hopefully is enough for
>> abbreviated object names to stay unique for some time.
>> The minimum length is 4.
>> +
>> +core.featureAdoptionRate::
>> + Set an integer value on a scale from 0 to 10 describing your
>> + desire to adopt new performance features. Defaults to 0. As
>> + the value increases, features are enabled by changing the
>> + default values of other config settings. If a config variable
>> + is specified explicitly, the explicit value will override these
>> + defaults:
>
> This is because I'd like to keep core.* from growing too big (it's
> already big), hard to read, search and maintain. Perhaps this should
> belong to a separate group? Something like tuning.something or
> defaults.something.
The main thing users look at is "man git-config" (or its web rendering)
which renders it all in one page anyway.
I think in general adding more things to core.* sucks less than
explaining the special-case that "tuning.*" isn't a config for
git-tuning(1) (although we have some of that already, e.g. with
trace2.*).
Documentation/config/core.txt is ~600 lines. Maybe it would be a good
idea to split it up, similar to your split of
Documentation/config/*.txt, but let's not conflate how we'd like to
maintain stuff in git.git with a config interface we expose externally.
It's going to be very confusing for users if some settings that
otherwise would be in core aren't there because a file in git.git was
"too big" at the time. Users (mostly) aren't going to know/care in what
chronological order we added config keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Jakub Narebski wrote (reply to this):
Duy Nguyen <[email protected]> writes:
> On Mon, Jul 1, 2019 at 10:32 PM Derrick Stolee via GitGitGadget
> <[email protected]> wrote:
>> @@ -601,3 +602,22 @@ core.abbrev::
>> in your repository, which hopefully is enough for
>> abbreviated object names to stay unique for some time.
>> The minimum length is 4.
>> +
>> +core.featureAdoptionRate::
>> + Set an integer value on a scale from 0 to 10 describing your
>> + desire to adopt new performance features. Defaults to 0. As
>> + the value increases, features are enabled by changing the
>> + default values of other config settings. If a config variable
>> + is specified explicitly, the explicit value will override these
>> + defaults:
>
> This is because I'd like to keep core.* from growing too big (it's
> already big), hard to read, search and maintain. Perhaps this should
> belong to a separate group? Something like tuning.something or
> defaults.something.
I'm not sure if I consider core.* too big. Well, there are 55 or more
entries in this namespace.
>> +If the value is at least 3, then the following defaults are modified.
>> +These represent relatively new features that have existed for multiple
>> +major releases, and may present performance benefits. These benefits
>> +depend on the amount and kind of data in your repo and how you use it.
>
> Then instead of numeric values, maybe the user should write some sort
> description about the repo and we optimize for that, similar to gcc
> -Os optimized for size, -Ofast for compiler speed (-O<n> is all about
> execution speed).
I also do not like those magic numbers.
>
> We could write, for example, tuning.commitHistory =3D {small, medium,
> large} and tuning.worktree =3D {small, large, medium} and maybe
> tuning.refSize and use that to optimize. We can still have different
> optimization levels (probably just "none", "recommended" vs
> "aggressive" where agressive enables most new stuff),
I think we have three different things that are currently conflated in
one config variable and one value.
First is what we want to optimize for; is it on-disk repository size,
command performance / execution speed, or maybe convenient information.
Second is what type of repository we are dealing with. Is there a
problem with long history, large number of files in checkout, large
and/or binary files, or all together? The original `core.size=3Dlarge`
(or proposed core.repositorySize) was all about this issue. Another
issue that might be important is that if it is leaf developer
repository, or is it maintainer repository, etc. (which affects for
example how the push looks like).
Third is what tradeoffs we are willing to accept to get required
performance. Are we willing to use additional stable optional features;
are we willing to use new experimental optional features; are we
willing; are we willing to sacrifice convenience (ahead/behind
information in status, information bout forced updates in push output,
etc.) for performance? This what current proposal is about.
It may not nnned to be a separate confi variable for a separate aspect;
it may be enough to have value that is space-separated list, or
something like that.
Best,
--
Jakub Nar=EAbski
This patch series was integrated into pu via git@15e2861. |
This patch series was integrated into pu via git@091ac6a. |
This patch series was integrated into pu via git@1ac2906. |
On the Git mailing list, Derrick Stolee wrote (reply to this):
|
On the Git mailing list, Taylor Blau wrote (reply to this):
|
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
On the Git mailing list, Derrick Stolee wrote (reply to this):
|
This patch series was integrated into pu via git@f852a7a. |
On the Git mailing list, Junio C Hamano wrote (reply to this):
|
This patch series was integrated into pu via git@9424e2b. |
On the Git mailing list, Jakub Narebski wrote (reply to this):
|
This patch series was integrated into pu via git@c0d71ed. |
This patch series was integrated into pu via git@0522edb. |
This patch series was integrated into pu via git@365cd2f. |
This patch series was integrated into pu via git@6ec1ead. |
This patch series was integrated into pu via git@e2c26bc. |
This patch series was integrated into pu via git@af5be84. |
On the Git mailing list, Derrick Stolee wrote (reply to this):
|
Here is a second run at this RFC, which aims to create a "meta" config setting that automatically turns on other settings according to a user's willingness to trade new Git behavior or new feature risk for performance benefits. The new name for the setting is "core.featureAdoptionRate" and is an integer scale from 0 to 10. There will be multiple "categories" of settings, and the intention is to allow more granular levels as necessary.
The first category is "3 or higher" which means that the user is willing to adopt features that have been tested in multiple major releases. The settings to include here are core.commitGraph=true, gc.writeCommitGraph=true, and index.version=4.
The second category is "5 or higher" which means the user is willing to adopt features that have not been out for multiple major releases. The setting included here is pack.useSparse=true.
In the future, I would add a "7 or higher" setting which means the user is willing to have a change of behavior in exchange for performance benefits. The two settings to place here are 'status.aheadBehind=false' and 'fetch.showForcedUpdates=false'. Instead of including these settings in the current series, I've submitted them independently for full review [1, 2].
Hopefully this direction is amenable to allow "early adopters" gain access to new performance features even if they are not necessary reading every line of the release notes.
Thanks,
-Stolee
[1] https://public-inbox.org/git/[email protected]/
[2] https://public-inbox.org/git/[email protected]/
Cc: [email protected], [email protected]