Skip to content

Remove special casing for PSEUDOREF updates #673

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
13 changes: 7 additions & 6 deletions Documentation/git-update-ref.txt
Original file line number Diff line number Diff line change
Expand Up @@ -148,12 +148,13 @@ still see a subset of the modifications.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Johannes Schindelin wrote (reply to this):

Hi Han-Wen,

On Thu, 16 Jul 2020, Han-Wen Nienhuys via GitGitGadget wrote:

> From: Han-Wen Nienhuys <[email protected]>
>
> The previous behavior was introduced in commit 74ec19d4be
> ("pseudorefs: create and use pseudoref update and delete functions",
> Jul 31, 2015), with the justification "alternate ref backends still
> need to store pseudorefs in GIT_DIR".
>
> Refs such as REBASE_HEAD are read through the ref backend. This can
> only work consistently if they are written through the ref backend as
> well. Tooling that works directly on files under .git should be
> updated to use git commands to read refs instead.
>
> The following behaviors change:
>
> * Updates to pseudorefs (eg. ORIG_HEAD) with
>   core.logAllRefUpdates=always will create reflogs for the pseudoref.
>
> * non-HEAD pseudoref symrefs are also dereferenced on deletion. Update
>   t1405 accordingly.

Unfortunately, there is still a breakage in t1405, although it only really
shows up on Windows and macOS because of case-insensitive filesystems.
https://github.com/gitgitgadget/git/runs/879772942?check_suite_focus=true
shows the concrete problem: t1405.17 'delete_ref(refs/heads/foo)' will
fail thusly:

	++ test-tool ref-store main delete-ref msg refs/heads/foo c90e4dc5e12224a428dedfbd45ba11e5531706a2 0
	error: cannot lock ref 'refs/heads/foo': is at 1e995a94b5a60488b6ebaf78ec779d85a55ea700 but expected c90e4dc5e12224a428dedfbd45ba11e5531706a2
	error: last command exited with $?=1

The reason is that there exists a `refs/heads/foo` _and_ the symref `FOO`
(which points to `refs/heads/master`, which is at 1e995a).

An earlier symptom of this bug is visible on Linux, too, though: when
you run t1405.4 'delete_refs(FOO, refs/tags/new-tag)', it shows this:

	+ git rev-parse FOO --
	warning: ignoring dangling symref FOO
	fatal: bad revision 'FOO'

(Seeing this warning suggests to me that this should probably be caught
_by the regression test_, i.e. stderr should be redirected, and it should
be verified via `test_i18ngrep` that that warning is not shown.)

The reason for this warning is that the symref `.git/FOO` is no longer
removed as part of t1405.4 'delete_refs(FOO, refs/tags/new-tag)'. I've
used this patch to bisect the breakage down to this here patch:

-- snip --
diff --git a/t/t1405-main-ref-store.sh b/t/t1405-main-ref-store.sh
index a8d695cd4f1..779906d607f 100755
--- a/t/t1405-main-ref-store.sh
+++ b/t/t1405-main-ref-store.sh
@@ -34,6 +34,7 @@ test_expect_success 'delete_refs(FOO, refs/tags/new-tag)' '
 	m=$(git rev-parse master) &&
 	$RUN delete-refs 0 nothing FOO refs/tags/new-tag &&
 	test_must_fail git rev-parse FOO -- &&
+	test_path_is_missing .git/FOO &&
 	test_must_fail git rev-parse refs/tags/new-tag --&&
 	test_must_fail git rev-parse master -- &&
 	git update-ref refs/heads/master $m
-- snap --

Hopefully you can figure out what is going wrong.

Ciao,
Dscho

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Han-Wen Nienhuys wrote (reply to this):

On Fri, Jul 17, 2020 at 12:18 PM Johannes Schindelin
<[email protected]> wrote:
> The reason is that there exists a `refs/heads/foo` _and_ the symref `FOO`
> (which points to `refs/heads/master`, which is at 1e995a).
>
> An earlier symptom of this bug is visible on Linux, too, though: when
> you run t1405.4 'delete_refs(FOO, refs/tags/new-tag)', it shows this:
>
>         + git rev-parse FOO --
>         warning: ignoring dangling symref FOO
>         fatal: bad revision 'FOO'

Thanks, this was very helpful for debugging.

The reason is that we do split_symref_update(), where a transaction
that modifies SYMREF (pointing to REFERENT) without REF_NO_DEREF is
changed to a transaction that modifies REFERENT and a reflog-only
update for SYMREF. In this case, that leaves the FOO symref in place.

This seems very specific to the test (is there any use of symrefs
beyond HEAD?) , and I fixed it by passing REF_NO_DEREF to the
delete-refs call.

It would also be nice if this behavior was lifted out of the files
backend and into the generic layer, but it's not yet clear to me if
this is possible.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

LOGGING UPDATES
---------------
If config parameter "core.logAllRefUpdates" is true and the ref is one under
"refs/heads/", "refs/remotes/", "refs/notes/", or the symbolic ref HEAD; or
the file "$GIT_DIR/logs/<ref>" exists then `git update-ref` will append
a line to the log file "$GIT_DIR/logs/<ref>" (dereferencing all
symbolic refs before creating the log name) describing the change
in ref value. Log lines are formatted as:
If config parameter "core.logAllRefUpdates" is true and the ref is one
under "refs/heads/", "refs/remotes/", "refs/notes/", or a pseudoref
like HEAD or ORIG_HEAD; or the file "$GIT_DIR/logs/<ref>" exists then
`git update-ref` will append a line to the log file
"$GIT_DIR/logs/<ref>" (dereferencing all symbolic refs before creating
the log name) describing the change in ref value. Log lines are
formatted as:

oldsha1 SP newsha1 SP committer LF

Expand Down
127 changes: 10 additions & 117 deletions refs.c
Original file line number Diff line number Diff line change
Expand Up @@ -708,10 +708,9 @@ int dwim_log(const char *str, int len, struct object_id *oid, char **log)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Han-Wen Nienhuys via GitGitGadget" <[email protected]> writes:

> From: Han-Wen Nienhuys <[email protected]>
>
> This is consistent with the definition of REF_TYPE_PSEUDOREF
> (uppercase in the root ref namespace).

I wonder if some special casing we saw for "HEAD" in 1/2 (e.g. we no
longer do this to pseudorefs in this codepath but HEAD is different
and will continue to be dealt with in the codepath) needs adjustment
after this change?  HEAD still needs to be able to have a different
value (either detached or pointing to another ref) per worktree, but
it is unclear where that knowledge now resides in the new code.  Are
we declaring that all pseudorefs are per worktree (which I think is
not wrong)?

> Signed-off-by: Han-Wen Nienhuys <[email protected]>
> ---
>  refs.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/refs.c b/refs.c
> index a2fd42364f..265767a234 100644
> --- a/refs.c
> +++ b/refs.c
> @@ -676,10 +676,9 @@ int dwim_log(const char *str, int len, struct object_id *oid, char **log)
>  
>  static int is_per_worktree_ref(const char *refname)
>  {
> -	return !strcmp(refname, "HEAD") ||
> -		starts_with(refname, "refs/worktree/") ||
> -		starts_with(refname, "refs/bisect/") ||
> -		starts_with(refname, "refs/rewritten/");
> +	return starts_with(refname, "refs/worktree/") ||
> +	       starts_with(refname, "refs/bisect/") ||
> +	       starts_with(refname, "refs/rewritten/");
>  }
>  
>  static int is_pseudoref_syntax(const char *refname)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Han-Wen Nienhuys wrote (reply to this):

On Mon, Jul 6, 2020 at 10:30 PM Junio C Hamano <[email protected]> wrote:
> > This is consistent with the definition of REF_TYPE_PSEUDOREF
> > (uppercase in the root ref namespace).
>
> I wonder if some special casing we saw for "HEAD" in 1/2 (e.g. we no
> longer do this to pseudorefs in this codepath but HEAD is different
> and will continue to be dealt with in the codepath) needs adjustment
> after this change?  HEAD still needs to be able to have a different
> value (either detached or pointing to another ref) per worktree, but
> it is unclear where that knowledge now resides in the new code.  Are
> we declaring that all pseudorefs are per worktree (which I think is
> not wrong)?

the source code already says

enum ref_type {
 REF_TYPE_PER_WORKTREE, /* refs inside refs/ but not shared */
 REF_TYPE_PSEUDOREF, /* refs outside refs/ in current worktree */

but considers HEAD a REF_TYPE_PER_WORKTREE, counter to the comment.

IIRC, this change actually has almost no effect, because most code
treats both enum values as equivalent. (see eg. builtin/reflog.c),
except for the code that the preceding commit changes.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--
Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

static int is_per_worktree_ref(const char *refname)
{
return !strcmp(refname, "HEAD") ||
starts_with(refname, "refs/worktree/") ||
starts_with(refname, "refs/bisect/") ||
starts_with(refname, "refs/rewritten/");
return starts_with(refname, "refs/worktree/") ||
starts_with(refname, "refs/bisect/") ||
starts_with(refname, "refs/rewritten/");
}

static int is_pseudoref_syntax(const char *refname)
Expand Down Expand Up @@ -771,102 +770,6 @@ long get_files_ref_lock_timeout_ms(void)
return timeout_ms;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Han-Wen Nienhuys via GitGitGadget" <[email protected]> writes:

> Subject: Re: [PATCH 1/2] Modify pseudo refs through ref backend storage
> From: Han-Wen Nienhuys <[email protected]>

With what definition of "pseudo refs" has this change been made?
Those things like HEAD, CHERRY_PICK_HEAD, FETCH_HEAD etc. that have
traditionally been written as a plain text file in $GIT_DIR and are
used to name objects by having a full object name in it?  

Or the entities that behave like refs and stored in ref backends,
with all-uppercase-names but do not sit inside refs/ hierarchy?

I think it is OK (and possibly a good move in the longer term, but
that is just my gut feeling) to make ref backends resopnsible for
enumerating, reading and writing them (i.e. I think we want to use
the latter definition for the longer term health of the project).
And we would want to ...

> The previous behavior was introduced in commit 74ec19d4be
> ("pseudorefs: create and use pseudoref update and delete functions",
> Jul 31, 2015), with the justification "alternate ref backends still
> need to store pseudorefs in GIT_DIR".

... declare that justification invalid for that purpose.

Is that what is going on?  I just want to make sure I am following
your flow of thought.

> Refs such as REBASE_HEAD are read through the ref backend. This can
> only work consistently if they are written through the ref backend as
> well. Tooling that works directly on files under .git should be
> updated to use git commands to read refs instead.

OK.

> diff --git a/refs/files-backend.c b/refs/files-backend.c
> index 6516c7bc8c..9951c2e403 100644
> --- a/refs/files-backend.c
> +++ b/refs/files-backend.c
> @@ -1228,7 +1228,6 @@ static int files_delete_refs(struct ref_store *ref_store, const char *msg,
>  
>  	for (i = 0; i < refnames->nr; i++) {
>  		const char *refname = refnames->items[i].string;
> -
>  		if (refs_delete_ref(&refs->base, msg, refname, NULL, flags))
>  			result |= error(_("could not remove reference %s"), refname);
>  	}

Unreleated change?

> @@ -2436,7 +2435,9 @@ static int lock_ref_for_update(struct files_ref_store *refs,
>  	update->backend_data = lock;
>  
>  	if (update->type & REF_ISSYMREF) {
> -		if (update->flags & REF_NO_DEREF) {
> +		if (update->flags & REF_NO_DEREF ||
> +		    (ref_type(update->refname) == REF_TYPE_PSEUDOREF &&
> +		     strcmp(update->refname, "HEAD"))) {
>  			/*
>  			 * We won't be reading the referent as part of
>  			 * the transaction, so we have to read it here

The old "if we are not dereferencing" condition in if() exactly
matched the comment, but the condition in if() after this change is
not "if we are not dereferencing".  Even if we are dereferencing,
under some new condition, we would still drop into this block and do
not follow the "else" side that creates a new update for the
referent.  Is this part of "modify pseudo refs via backend" topic,
or should it be a separate modification?  Why is this change needed?

It seems that no matter where in the refs/ hierarchy (or even
outside) a symbolic ref resides, the way to update itself (not the
referent through the symbolic ref) should be the same, which is what
the original says, and we want to change that reasoning here, but it
is not quite clear to me why.

> @@ -2782,8 +2783,10 @@ static int files_transaction_finish(struct ref_store *ref_store,
>  		struct ref_update *update = transaction->updates[i];
>  		struct ref_lock *lock = update->backend_data;
>  
> -		if (update->flags & REF_NEEDS_COMMIT ||
> -		    update->flags & REF_LOG_ONLY) {
> +		if ((ref_type(lock->ref_name) != REF_TYPE_PSEUDOREF ||
> +		     !strcmp(lock->ref_name, "HEAD")) &&
> +		    (update->flags & REF_NEEDS_COMMIT ||
> +		     update->flags & REF_LOG_ONLY)) {

And this one stops the files backend from touching all pseudorefs
other than HEAD with this codepath.  That somehow feels totally
opposite from what the log message explained above---weren't we
updating the code to write these pseudorefs through the individual
backends, which the files backend is one example of?  Isn't this
change stopping the backend from writing the pseudorefs other than
HEAD instead?

Puzzled.

>  			if (files_log_ref_write(refs,
>  						lock->ref_name,
>  						&lock->old_oid,


> diff --git a/t/t1400-update-ref.sh b/t/t1400-update-ref.sh
> index 27171f8261..6b8030e8fe 100755
> --- a/t/t1400-update-ref.sh
> +++ b/t/t1400-update-ref.sh
> @@ -476,7 +476,7 @@ test_expect_success 'git cat-file blob master@{2005-05-26 23:42}:F (expect OTHER
>  test_expect_success 'given old value for missing pseudoref, do not create' '
>  	test_must_fail git update-ref PSEUDOREF $A $B 2>err &&
>  	test_path_is_missing .git/PSEUDOREF &&

The reason why I asked what this patch thinks the definition of
pseudoref is is because of this thing.  Shouldn't this line be fixed
not to depend on the files backend?  Likewise for $(cat .git/PSEUDOREF)
in the remaining two hunks.

> -	test_i18ngrep "could not read ref" err
> +	test_i18ngrep "unable to resolve reference" err
>  '
>  
>  test_expect_success 'create pseudoref' '
> @@ -497,7 +497,7 @@ test_expect_success 'overwrite pseudoref with correct old value' '
>  test_expect_success 'do not overwrite pseudoref with wrong old value' '
>  	test_must_fail git update-ref PSEUDOREF $D $E 2>err &&
>  	test $C = $(cat .git/PSEUDOREF) &&
> -	test_i18ngrep "unexpected object ID" err
> +	test_i18ngrep "cannot lock ref.*expected" err
>  '
>  
>  test_expect_success 'delete pseudoref' '
> @@ -509,7 +509,7 @@ test_expect_success 'do not delete pseudoref with wrong old value' '
>  	git update-ref PSEUDOREF $A &&
>  	test_must_fail git update-ref -d PSEUDOREF $B 2>err &&
>  	test $A = $(cat .git/PSEUDOREF) &&
> -	test_i18ngrep "unexpected object ID" err
> +	test_i18ngrep "cannot lock ref.*expected" err
>  '
>  
>  test_expect_success 'delete pseudoref with correct old value' '

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Han-Wen Nienhuys wrote (reply to this):

On Mon, Jul 6, 2020 at 10:27 PM Junio C Hamano <[email protected]> wrote:
>
> "Han-Wen Nienhuys via GitGitGadget" <[email protected]> writes:
>
> > Subject: Re: [PATCH 1/2] Modify pseudo refs through ref backend storage
> > From: Han-Wen Nienhuys <[email protected]>
>
> With what definition of "pseudo refs" has this change been made?
> Those things like HEAD, CHERRY_PICK_HEAD, FETCH_HEAD etc. that have
> traditionally been written as a plain text file in $GIT_DIR and are
> used to name objects by having a full object name in it?
>
> Or the entities that behave like refs and stored in ref backends,
> with all-uppercase-names but do not sit inside refs/ hierarchy?

I thought these were the same? - see below :)

> I think it is OK (and possibly a good move in the longer term, but
> that is just my gut feeling) to make ref backends resopnsible for
> enumerating, reading and writing them (i.e. I think we want to use
> the latter definition for the longer term health of the project).
> And we would want to ...
>
> > The previous behavior was introduced in commit 74ec19d4be
> > ("pseudorefs: create and use pseudoref update and delete functions",
> > Jul 31, 2015), with the justification "alternate ref backends still
> > need to store pseudorefs in GIT_DIR".
>
> ... declare that justification invalid for that purpose.
>
> Is that what is going on?  I just want to make sure I am following
> your flow of thought.

yep.

> >               const char *refname = refnames->items[i].string;
> > -
> >               if (refs_delete_ref(&refs->base, msg, refname, NULL, flags))
> >                       result |= error(_("could not remove reference %s"), refname);
> >       }
>
> Unreleated change?

oops.

> > @@ -2436,7 +2435,9 @@ static int lock_ref_for_update(struct files_ref_store *refs,
> >       update->backend_data = lock;
> >
> >       if (update->type & REF_ISSYMREF) {
> > -             if (update->flags & REF_NO_DEREF) {
> > +             if (update->flags & REF_NO_DEREF ||
> > +                 (ref_type(update->refname) == REF_TYPE_PSEUDOREF &&
> > +                  strcmp(update->refname, "HEAD"))) {
> >                       /*
> >                        * We won't be reading the referent as part of
> >                        * the transaction, so we have to read it here
>
> The old "if we are not dereferencing" condition in if() exactly
> matched the comment, but the condition in if() after this change is
> not "if we are not dereferencing".  Even if we are dereferencing,
> under some new condition, we would still drop into this block and do
> not follow the "else" side that creates a new update for the
> referent.  Is this part of "modify pseudo refs via backend" topic,
> or should it be a separate modification?  Why is this change needed?

There is a  test (mentioned in the commit message), t1405, which does

$RUN create-symref FOO refs/heads/master nothing &&
# calls refs_create_symref()
..
$RUN delete-refs 0 nothing FOO refs/tags/new-tag &&
# calls refs_delete_refs, which calls delete_pseudoref
..
git rev-parse master >expected &&

Previously, the delete_pseudoref code would delete just .git/FOO file.
Without the change here, going through the ref backend ends up in the
!REF_NO_DEREF case, deleting the master branch.

I don't know what the correct semantics are (are symrefs used for
anything except HEAD ?), so


> It seems that no matter where in the refs/ hierarchy (or even
> outside) a symbolic ref resides, the way to update itself (not the
> referent through the symbolic ref) should be the same, which is what
> the original says, and we want to change that reasoning here, but it
> is not quite clear to me why.
>
> > @@ -2782,8 +2783,10 @@ static int files_transaction_finish(struct ref_store *ref_store,
> >               struct ref_update *update = transaction->updates[i];
> >               struct ref_lock *lock = update->backend_data;
> >
> > -             if (update->flags & REF_NEEDS_COMMIT ||
> > -                 update->flags & REF_LOG_ONLY) {
> > +             if ((ref_type(lock->ref_name) != REF_TYPE_PSEUDOREF ||
> > +                  !strcmp(lock->ref_name, "HEAD")) &&
> > +                 (update->flags & REF_NEEDS_COMMIT ||
> > +                  update->flags & REF_LOG_ONLY)) {
>
> And this one stops the files backend from touching all pseudorefs
> other than HEAD with this codepath.  That somehow feels totally
> opposite from what the log message explained above---weren't we
> updating the code to write these pseudorefs through the individual
> backends, which the files backend is one example of?  Isn't this
> change stopping the backend from writing the pseudorefs other than
> HEAD instead?

There is a test

  'core.logAllRefUpdates=always creates no reflog for ORIG_HEAD'

that wants there to not be reflog updates for ORIG_HEAD. I don't see
why that behavior exists, and would be happy to change it, but I'm too
much of a newbie to decide what is right here.

> > --- a/t/t1400-update-ref.sh
> > +++ b/t/t1400-update-ref.sh
> > @@ -476,7 +476,7 @@ test_expect_success 'git cat-file blob master@{2005-05-26 23:42}:F (expect OTHER
> >  test_expect_success 'given old value for missing pseudoref, do not create' '
> >       test_must_fail git update-ref PSEUDOREF $A $B 2>err &&
> >       test_path_is_missing .git/PSEUDOREF &&
>
> The reason why I asked what this patch thinks the definition of
> pseudoref is is because of this thing.  Shouldn't this line be fixed
> not to depend on the files backend?  Likewise for $(cat .git/PSEUDOREF)
> in the remaining two hunks.

This patch doesn't introduce reftable yet, so both definitions are
equivalent for the sake of this  patch.

-- 
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Han-Wen Nienhuys <[email protected]> writes:

>> Or the entities that behave like refs and stored in ref backends,
>> with all-uppercase-names but do not sit inside refs/ hierarchy?
>
> I thought these were the same? - see below :)

>> > --- a/t/t1400-update-ref.sh
>> > +++ b/t/t1400-update-ref.sh
>> > @@ -476,7 +476,7 @@ test_expect_success 'git cat-file blob master@{2005-05-26 23:42}:F (expect OTHER
>> >  test_expect_success 'given old value for missing pseudoref, do not create' '
>> >       test_must_fail git update-ref PSEUDOREF $A $B 2>err &&
>> >       test_path_is_missing .git/PSEUDOREF &&
>>
>> The reason why I asked what this patch thinks the definition of
>> pseudoref is is because of this thing.  Shouldn't this line be fixed
>> not to depend on the files backend?  Likewise for $(cat .git/PSEUDOREF)
>> in the remaining two hunks.
>
> This patch doesn't introduce reftable yet, so both definitions are
> equivalent for the sake of this patch.

But at the end of the tunnel, we do want to see people stop
expecting that .git/PSEUDOREF file is what PSEUDOREF ref is, no?  I
think that is what "pseudo refs are not just files directly under
$GIT_DIR, but managed by the backend" means, and that in turn is a
valid justification for other changes introduced in this patch.
Expecting the effect of modifying pseudo refs _after_ the code is
changed that such modifications are properly done through the ref
API to appear on the filesystem feels like it goes against the
reason why we are making this change, compared to checking to see if
the pseudoref really got updated as desired via the ref API, at
least to me.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Han-Wen Nienhuys wrote (reply to this):

On Tue, Jul 7, 2020 at 5:20 PM Junio C Hamano <[email protected]> wrote:

> >> The reason why I asked what this patch thinks the definition of
> >> pseudoref is is because of this thing.  Shouldn't this line be fixed
> >> not to depend on the files backend?  Likewise for $(cat .git/PSEUDOREF)
> >> in the remaining two hunks.
> >
> > This patch doesn't introduce reftable yet, so both definitions are
> > equivalent for the sake of this patch.
>
> But at the end of the tunnel, we do want to see people stop
> expecting that .git/PSEUDOREF file is what PSEUDOREF ref is, no?  I
> think that is what "pseudo refs are not just files directly under
> $GIT_DIR, but managed by the backend" means, and that in turn is a
> valid justification for other changes introduced in this patch.
> Expecting the effect of modifying pseudo refs _after_ the code is
> changed that such modifications are properly done through the ref
> API to appear on the filesystem feels like it goes against the
> reason why we are making this change, compared to checking to see if
> the pseudoref really got updated as desired via the ref API, at
> least to me.

I can fix this specific instance here (which is what I think you
want), for this commit to practice what it preaches. At the same time
there are probably about 100 or so other places where the tests check
the file system directly for ref(log) existence, so it would never be
totally consistent.

The only way to systematically find the offending places is to
introduce a new ref backend and then fix all the tests, and I think
that goes outside the scope of this small series.

--
Han-Wen Nienhuys - Google Munich
I work 80%. Don't expect answers from me on Fridays.
--

Google Germany GmbH, Erika-Mann-Strasse 33, 80636 Munich

Registergericht und -nummer: Hamburg, HRB 86891

Sitz der Gesellschaft: Hamburg

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Han-Wen Nienhuys <[email protected]> writes:

> I can fix this specific instance here (which is what I think you
> want), for this commit to practice what it preaches. At the same time
> there are probably about 100 or so other places where the tests check
> the file system directly for ref(log) existence, so it would never be
> totally consistent.
>
> The only way to systematically find the offending places is to
> introduce a new ref backend and then fix all the tests, and I think
> that goes outside the scope of this small series.

Good.  I think we are on the same page.  I suggested to make the
patch internally consistent, nothing more.  And fixing everything in
the world is outside the scope of these two patches.

Thanks.

}

static int write_pseudoref(const char *pseudoref, const struct object_id *oid,
const struct object_id *old_oid, struct strbuf *err)
{
const char *filename;
int fd;
struct lock_file lock = LOCK_INIT;
struct strbuf buf = STRBUF_INIT;
int ret = -1;

if (!oid)
return 0;

strbuf_addf(&buf, "%s\n", oid_to_hex(oid));

filename = git_path("%s", pseudoref);
fd = hold_lock_file_for_update_timeout(&lock, filename, 0,
get_files_ref_lock_timeout_ms());
if (fd < 0) {
strbuf_addf(err, _("could not open '%s' for writing: %s"),
filename, strerror(errno));
goto done;
}

if (old_oid) {
struct object_id actual_old_oid;

if (read_ref(pseudoref, &actual_old_oid)) {
if (!is_null_oid(old_oid)) {
strbuf_addf(err, _("could not read ref '%s'"),
pseudoref);
rollback_lock_file(&lock);
goto done;
}
} else if (is_null_oid(old_oid)) {
strbuf_addf(err, _("ref '%s' already exists"),
pseudoref);
rollback_lock_file(&lock);
goto done;
} else if (!oideq(&actual_old_oid, old_oid)) {
strbuf_addf(err, _("unexpected object ID when writing '%s'"),
pseudoref);
rollback_lock_file(&lock);
goto done;
}
}

if (write_in_full(fd, buf.buf, buf.len) < 0) {
strbuf_addf(err, _("could not write to '%s'"), filename);
rollback_lock_file(&lock);
goto done;
}

commit_lock_file(&lock);
ret = 0;
done:
strbuf_release(&buf);
return ret;
}

static int delete_pseudoref(const char *pseudoref, const struct object_id *old_oid)
{
const char *filename;

filename = git_path("%s", pseudoref);

if (old_oid && !is_null_oid(old_oid)) {
struct lock_file lock = LOCK_INIT;
int fd;
struct object_id actual_old_oid;

fd = hold_lock_file_for_update_timeout(
&lock, filename, 0,
get_files_ref_lock_timeout_ms());
if (fd < 0) {
error_errno(_("could not open '%s' for writing"),
filename);
return -1;
}
if (read_ref(pseudoref, &actual_old_oid))
die(_("could not read ref '%s'"), pseudoref);
if (!oideq(&actual_old_oid, old_oid)) {
error(_("unexpected object ID when deleting '%s'"),
pseudoref);
rollback_lock_file(&lock);
return -1;
}

unlink(filename);
rollback_lock_file(&lock);
} else {
unlink(filename);
}

return 0;
}

int refs_delete_ref(struct ref_store *refs, const char *msg,
const char *refname,
const struct object_id *old_oid,
Expand All @@ -875,11 +778,6 @@ int refs_delete_ref(struct ref_store *refs, const char *msg,
struct ref_transaction *transaction;
struct strbuf err = STRBUF_INIT;

if (ref_type(refname) == REF_TYPE_PSEUDOREF) {
assert(refs == get_main_ref_store(the_repository));
return delete_pseudoref(refname, old_oid);
}

transaction = ref_store_transaction_begin(refs, &err);
if (!transaction ||
ref_transaction_delete(transaction, refname, old_oid,
Expand Down Expand Up @@ -1202,18 +1100,13 @@ int refs_update_ref(struct ref_store *refs, const char *msg,
struct strbuf err = STRBUF_INIT;
int ret = 0;

if (ref_type(refname) == REF_TYPE_PSEUDOREF) {
assert(refs == get_main_ref_store(the_repository));
ret = write_pseudoref(refname, new_oid, old_oid, &err);
} else {
t = ref_store_transaction_begin(refs, &err);
if (!t ||
ref_transaction_update(t, refname, new_oid, old_oid,
flags, msg, &err) ||
ref_transaction_commit(t, &err)) {
ret = 1;
ref_transaction_free(t);
}
t = ref_store_transaction_begin(refs, &err);
if (!t ||
ref_transaction_update(t, refname, new_oid, old_oid, flags, msg,
&err) ||
ref_transaction_commit(t, &err)) {
ret = 1;
ref_transaction_free(t);
}
if (ret) {
const char *str = _("update_ref failed for ref '%s': %s");
Expand Down
30 changes: 15 additions & 15 deletions t/t1400-update-ref.sh
Original file line number Diff line number Diff line change
Expand Up @@ -160,10 +160,10 @@ test_expect_success 'core.logAllRefUpdates=always creates reflog by default' '
git reflog exists $outside
'

test_expect_success 'core.logAllRefUpdates=always creates no reflog for ORIG_HEAD' '
test_expect_success 'core.logAllRefUpdates=always creates reflog for ORIG_HEAD' '
test_config core.logAllRefUpdates always &&
git update-ref ORIG_HEAD $A &&
test_must_fail git reflog exists ORIG_HEAD
git reflog exists ORIG_HEAD
'

test_expect_success '--no-create-reflog overrides core.logAllRefUpdates=always' '
Expand Down Expand Up @@ -475,57 +475,57 @@ test_expect_success 'git cat-file blob master@{2005-05-26 23:42}:F (expect OTHER

test_expect_success 'given old value for missing pseudoref, do not create' '
test_must_fail git update-ref PSEUDOREF $A $B 2>err &&
test_path_is_missing .git/PSEUDOREF &&
test_i18ngrep "could not read ref" err
test_must_fail git rev-parse PSEUDOREF &&
test_i18ngrep "unable to resolve reference" err
'

test_expect_success 'create pseudoref' '
git update-ref PSEUDOREF $A &&
test $A = $(cat .git/PSEUDOREF)
test $A = $(git rev-parse PSEUDOREF)
'

test_expect_success 'overwrite pseudoref with no old value given' '
git update-ref PSEUDOREF $B &&
test $B = $(cat .git/PSEUDOREF)
test $B = $(git rev-parse PSEUDOREF)
'

test_expect_success 'overwrite pseudoref with correct old value' '
git update-ref PSEUDOREF $C $B &&
test $C = $(cat .git/PSEUDOREF)
test $C = $(git rev-parse PSEUDOREF)
'

test_expect_success 'do not overwrite pseudoref with wrong old value' '
test_must_fail git update-ref PSEUDOREF $D $E 2>err &&
test $C = $(cat .git/PSEUDOREF) &&
test_i18ngrep "unexpected object ID" err
test $C = $(git rev-parse PSEUDOREF) &&
test_i18ngrep "cannot lock ref.*expected" err
'

test_expect_success 'delete pseudoref' '
git update-ref -d PSEUDOREF &&
test_path_is_missing .git/PSEUDOREF
test_must_fail git rev-parse PSEUDOREF
'

test_expect_success 'do not delete pseudoref with wrong old value' '
git update-ref PSEUDOREF $A &&
test_must_fail git update-ref -d PSEUDOREF $B 2>err &&
test $A = $(cat .git/PSEUDOREF) &&
test_i18ngrep "unexpected object ID" err
test $A = $(git rev-parse PSEUDOREF) &&
test_i18ngrep "cannot lock ref.*expected" err
'

test_expect_success 'delete pseudoref with correct old value' '
git update-ref -d PSEUDOREF $A &&
test_path_is_missing .git/PSEUDOREF
test_must_fail git rev-parse PSEUDOREF
'

test_expect_success 'create pseudoref with old OID zero' '
git update-ref PSEUDOREF $A $Z &&
test $A = $(cat .git/PSEUDOREF)
test $A = $(git rev-parse PSEUDOREF)
'

test_expect_success 'do not overwrite pseudoref with old OID zero' '
test_when_finished git update-ref -d PSEUDOREF &&
test_must_fail git update-ref PSEUDOREF $B $Z 2>err &&
test $A = $(cat .git/PSEUDOREF) &&
test $A = $(git rev-parse PSEUDOREF) &&
test_i18ngrep "already exists" err
'

Expand Down
5 changes: 4 additions & 1 deletion t/t1405-main-ref-store.sh
Original file line number Diff line number Diff line change
Expand Up @@ -31,7 +31,10 @@ test_expect_success 'create_symref(FOO, refs/heads/master)' '
test_expect_success 'delete_refs(FOO, refs/tags/new-tag)' '
git rev-parse FOO -- &&
git rev-parse refs/tags/new-tag -- &&
$RUN delete-refs 0 nothing FOO refs/tags/new-tag &&
m=$(git rev-parse master) &&
REF_NO_DEREF=1 &&
$RUN delete-refs $REF_NO_DEREF nothing FOO refs/tags/new-tag &&
test_must_fail git rev-parse --symbolic-full-name FOO &&
test_must_fail git rev-parse FOO -- &&
test_must_fail git rev-parse refs/tags/new-tag --
'
Expand Down