-
Notifications
You must be signed in to change notification settings - Fork 141
vreportf: Fix interleaving issues #407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
dd828a6
to
756b1fc
Compare
Please ignore the osx failures for now... working on it. |
756b1fc
to
5fd06fa
Compare
/submit |
Submitted as [email protected] |
5fd06fa
to
54f0d6f
Compare
/submit |
Submitted as [email protected] |
@@ -6,17 +6,159 @@ | |||
#include "git-compat-util.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Hi Alex,
On Tue, 22 Oct 2019, Alexandr Miloslavskiy via GitGitGadget wrote:
> From: Alexandr Miloslavskiy <[email protected]>
>
> This also fixes t5516 on Windows VS build.
Maybe this could do with an example? I could imagine that we might want
to use the log of
https://dev.azure.com/git/git/_build/results?buildId=3D1264&view=3Dms.vss-=
test-web.build-test-results-tab&runId=3D1016906&resultId=3D101011&paneView=
=3Dattachments:
=2D- snip --
[...]
++ eval '
want_trace && set -x
mkdir -p "$TRASH_DIRECTORY/prereq-test-dir" &&
(
cd "$TRASH_DIRECTORY/prereq-test-dir" &&
! git env--helper --type=3Dbool --default=3D0 --exit-code GIT_TEST_GETTEX=
T_POISON
)'
+++ want_trace
+++ test t =3D t
+++ test '' =3D t
+++ test t =3D t
+++ set -x
+++ mkdir -p '/d/a/1/s/t/trash directory.t5516-fetch-push/prereq-test-dir'
+++ cd '/d/a/1/s/t/trash directory.t5516-fetch-push/prereq-test-dir'
+++ git env--helper --type=3Dbool --default=3D0 --exit-code GIT_TEST_GETTE=
XT_POISON
prerequisite C_LOCALE_OUTPUT ok
error: 'grep remote error:.*not our ref.*64ea4c133d59fa98e86a771eda009872d=
6ab2886$ err' didn't find a match in:
fatal: git upload-pack: not our ref 64ea4c133d59fa98e86afatal: 771eda00987=
2d6abremote error: upload-pack2: not our re886
f 64ea4c133d59fa98e86a771eda009872d6ab2886
error: last command exited with $?=3D1
=2D- snap --
It is quite obvious that this `fatal:` line is garbled ;-)
> For detailed explanation please refer to code comments in this commit.
>
> There was a lot of back-and-forth already in vreportf():
> d048a96e (2007-11-09) - 'char msg[256]' is introduced to avoid interleav=
ing
> 389d1767 (2009-03-25) - Buffer size increased to 1024 to avoid truncatio=
n
> 625a860c (2009-11-22) - Buffer size increased to 4096 to avoid truncatio=
n
> f4c3edc0 (2015-08-11) - Buffer removed to avoid truncation
> b5a9e435 (2017-01-11) - Reverts f4c3edc0 to be able to replace control
> chars before sending to stderr
>
> This fix attempts to solve all issues:
> 1) avoid multiple fprintf() interleaving
> 2) avoid truncation
> 3) avoid char interleaving in fprintf() on some platforms
> 4) avoid buffer block interleaving when output is large
> 5) avoid out-of-order messages
> 6) replace control characters in output
>
> Other commits worthy of notice:
> 9ac13ec9 (2006-10-11) - Another attempt to solve interleaving.
> This is seemingly related to d048a96e.
> 137a0d0e (2007-11-19) - Addresses out-of-order for display()
> 34df8aba (2009-03-10) - Switches xwrite() to fprintf() in recv_sideband(=
)
> to support UTF-8 emulation
> eac14f89 (2012-01-14) - Removes the need for fprintf() for UTF-8 emulati=
on,
> so it's safe to use xwrite() again
> 5e5be9e2 (2016-06-28) - recv_sideband() uses xwrite() again
So far, it makes a lot of sense, and is well-researched. Thank you for
being very diligent.
>
> Signed-off-by: Alexandr Miloslavskiy <[email protected]>
> ---
> usage.c | 154 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 148 insertions(+), 6 deletions(-)
>
> diff --git a/usage.c b/usage.c
> index 2fdb20086b..ccdd91a7b9 100644
> --- a/usage.c
> +++ b/usage.c
> @@ -6,17 +6,159 @@
> #include "git-compat-util.h"
> #include "cache.h"
>
> +static void replace_control_chars(char* str, size_t size, char replacem=
ent)
> +{
> + size_t i;
> +
> + for (i =3D 0; i < size; i++) {
> + if (iscntrl(str[i]) && str[i] !=3D '\t' && str[i] !=3D '\n')
> + str[i] =3D replacement;
> + }
> +}
So this is just factored out from `vreportf()`, right?
> +
> +/*
> + * Atomically report (prefix + vsnprintf(err, params) + '\n') to stderr=
.
> + * Always returns desired buffer size.
> + * Doesn't write to stderr if content didn't fit.
> + *
> + * This function composes everything into a single buffer before
> + * sending to stderr. This is to defeat various non-atomic issues:
> + * 1) If stderr is fully buffered:
> + * the ordering of stdout and stderr messages could be wrong,
> + * because stderr output waits for the buffer to become full.
> + * 2) If stderr has any type of buffering:
> + * buffer has fixed size, which could lead to interleaved buffer
> + * blocks when two threads/processes write at the same time.
> + * 3) If stderr is not buffered:
> + * There are two problems, one with atomic fprintf() and another
> + * for non-atomic fprintf(), and both occur depending on platform
> + * (see details below). If atomic, this function still writes 3
> + * parts, which could get interleaved with multiple threads. If
> + * not atomic, then fprintf() will basically write char-by-char,
> + * which leads to unreadable char-interleaved writes if two
> + * processes write to stderr at the same time (threads are OK
> + * because fprintf() usually locks file in current process). This
> + * for example happens in t5516 where 'git-upload-pack' detects
> + * an error, reports it to parent 'git fetch' and both die() at the
> + * same time.
> + *
> + * Behaviors, at the moment of writing:
> + * a) libc - fprintf()-interleaved
> + * fprintf() enables temporary stream buffering.
> + * See: buffered_vfprintf()
> + * b) VC++ - char-interleaved
> + * fprintf() enables temporary stream buffering, but only if
> + * stream was not set to no buffering. This has no effect,
> + * because stderr is not buffered by default, and git takes
> + * an extra step to ensure that in swap_osfhnd().
> + * See: _iob[_IOB_ENTRIES],
> + * __acrt_stdio_temporary_buffering_guard,
> + * has_any_buffer()
> + * c) MinGW - char-interleaved (console), full buffering (file)
> + * fprintf() obeys stderr buffering. But it uses old MSVCRT.DLL,
> + * which eventually calls _flsbuf(), which enables buffering unle=
ss
> + * isatty(stderr) or buffering is disabled. Buffering is not disa=
bled
> + * by default for stderr. Therefore, buffering is enabled for
> + * file-redirected stderr.
> + * See: __mingw_vfprintf(),
> + * __pformat_wcputs(),
> + * _fputc_nolock(),
> + * _flsbuf(),
> + * _iob[_IOB_ENTRIES]
> + * 4) If stderr is line buffered: MinGW/VC++ will enable full
> + * buffering instead. See MSDN setvbuf().
> + */
> +static size_t vreportf_buf(char *buf, size_t buf_size, const char *pref=
ix, const char *err, va_list params)
> +{
> + int printf_ret =3D 0;
> + size_t prefix_size =3D 0;
> + size_t total_size =3D 0;
> +
> + /*
> + * NOTE: Can't use strbuf functions here, because it can be called whe=
n
> + * malloc() is no longer possible, and strbuf will recurse die().
> + */
> +
> + /* Prefix */
> + prefix_size =3D strlen(prefix);
> + if (total_size + prefix_size <=3D buf_size)
> + memcpy(buf + total_size, prefix, prefix_size);
> +
> + total_size +=3D prefix_size;
> +
> + /* Formatted message */
> + if (total_size <=3D buf_size)
> + printf_ret =3D vsnprintf(buf + total_size, buf_size - total_size, err=
, params);
> + else
> + printf_ret =3D vsnprintf(NULL, 0, err, params);
> +
> + if (printf_ret < 0)
> + BUG("your vsnprintf is broken (returned %d)", printf_ret);
> +
> + /*
> + * vsnprintf() returns _desired_ size (without terminating null).
> + * If vsnprintf() was truncated that will be seen when appending '\n'.
> + */
> + total_size +=3D printf_ret;
> +
> + /* Trailing \n */
> + if (total_size + 1 <=3D buf_size)
> + buf[total_size] =3D '\n';
> +
> + total_size +=3D 1;
> +
> + /* Send the buffer, if content fits */
> + if (total_size <=3D buf_size) {
> + replace_control_chars(buf, total_size, '?');
> + fwrite(buf, total_size, 1, stderr);
> + }
> +
> + return total_size;
> +}
> +
> void vreportf(const char *prefix, const char *err, va_list params)
> {
> + size_t res =3D 0;
> + char *buf =3D NULL;
> + size_t buf_size =3D 0;
> +
> + /*
> + * NOTE: This can be called from failed xmalloc(). Any malloc() can
> + * fail now. Let's try to report with a fixed size stack based buffer.
> + * Also, most messages should fit, and this path is faster.
> + */
> char msg[4096];
> - char *p;
> + res =3D vreportf_buf(msg, sizeof(msg), prefix, err, params);
> + if (res <=3D sizeof(msg)) {
> + /* Success */
> + return;
> + }
>
> - vsnprintf(msg, sizeof(msg), err, params);
> - for (p =3D msg; *p; p++) {
> - if (iscntrl(*p) && *p !=3D '\t' && *p !=3D '\n')
> - *p =3D '?';
> + /*
> + * Try to allocate a suitable sized malloc(), if possible.
> + * NOTE: Do not use xmalloc(), because on failure it will call
> + * die() or warning() and lead to recursion.
> + */
> + buf_size =3D res;
> + buf =3D malloc(buf_size);
Why not `alloca()`?
And to take a step back, I think that previous rounds of patches trying
to essentially address the same problem made the case that it is okay to
cut off insanely-long messages, so I am not sure we would want to open
that can of worms again...
> + if (buf) {
> + res =3D vreportf_buf(buf, buf_size, prefix, err, params);
> + FREE_AND_NULL(buf);
> +
> + if (res <=3D buf_size) {
> + /* Success */
> + return;
> + }
> }
> - fprintf(stderr, "%s%s\n", prefix, msg);
> +
> + /*
> + * When everything fails, report in parts.
> + * This can have all problems prevented by vreportf_buf().
> + */
> + fprintf(stderr, "vreportf: not enough memory (tried to allocate %lu by=
tes)\n", (unsigned long)buf_size);
> + fputs(prefix, stderr);
> + vfprintf(stderr, err, params);
> + fputc('\n', stderr);
Quite honestly, I would love to avoid that amount of complexity,
certainly in a part of the code that we would like to have rock solid
because it is usually exercised when things go very, very wrong and we
need to provide the user who is bitten by it enough information to take
to the Git contributors to figure out the root cause(s).
So let's take another step back and look at the original `vreportf()`:
void vreportf(const char *prefix, const char *err, va_list params)
{
char msg[4096];
char *p;
vsnprintf(msg, sizeof(msg), err, params);
for (p =3D msg; *p; p++) {
if (iscntrl(*p) && *p !=3D '\t' && *p !=3D '\n')
*p =3D '?';
}
fprintf(stderr, "%s%s\n", prefix, msg);
}
Is the problem that causes those failures with VS the fact that
`fprintf(stderr, ...)` might be interleaved with the output of another
process that _also_ wants to write to `stderr`? I assume that this _is_
the problem.
Further, I guess that the problem is compounded by the fact that we
usually run the tests in a Git Bash on Windows, i.e. in a MinTTY that
emulates a console (there _is_ work under way to support the newly
introduces ptys, but that work is far from done), so the standard error
file handle might behave in unexpected ways in that scenario.
But I do wonder whether replacing that `fprintf()` by a `write()` would
work better. After all, we could write the `prefix` into the `msg`
already:
size_t off =3D strlcpy(msg, prefix, sizeof(msg));
int ret =3D vsnprintf(msg + off, sizeof(msg) - off, err, params);
[...]
if (ret > 0)
write(2, msg, off + ret);
Would that also work around the problem?
Ciao,
Dscho
> }
>
> static NORETURN void usage_builtin(const char *err, va_list params)
> --
> gitgitgadget
>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Alexandr Miloslavskiy wrote (reply to this):
> Maybe this could do with an example?
I myself observed results like this when running t5516:
------
fatal: git fatal: remote errouploar: upload-pack: not our ref
64ea4c133d59fa98e86a771eda009872d6ab2886d-pack: not o
ur ref 64ea4c133d59fa98e86a771eda009872d6ab2886
------
Do you want me to add this garbled string to commit message?
>> +static void replace_control_chars(char* str, size_t size, char replacement)
> So this is just factored out from `vreportf()`, right?
Yes.
>> + buf = malloc(buf_size);
> Why not `alloca()`?
Allocating large chunks on stack is usually not recommended. There is a
funny test "init allows insanely long --template" in t0001 which
demonstrates that sometimes vreportf() can attempt to print very long
strings. Crashing due to stack overflow doesn't sound like a good thing.
> And to take a step back, I think that previous rounds of patches trying
> to essentially address the same problem made the case that it is okay to
> cut off insanely-long messages, so I am not sure we would want to open
> that can of worms again...
I draw a different conclusion here. Each author thought that "1024 must
definitely be enough!" but then discovered that it's not enough again,
for example due to long "usage" output. At some point, f4c3edc0 even
tried to remove all limits after a chain of limits that were too small.
So I would say that this is still a problem.
> Quite honestly, I would love to avoid that amount of complexity,
> certainly in a part of the code that we would like to have rock solid
> because it is usually exercised when things go very, very wrong and we
> need to provide the user who is bitten by it enough information to take
> to the Git contributors to figure out the root cause(s).
It's a choice between simpler code and trying to account for everything
that could happen. I think we'd rather have more complex code that
handles more cases, exactly to try and deliver output to user no matter
what.
> Is the problem that causes those failures with VS the fact that
> `fprintf(stderr, ...)` might be interleaved with the output of another
> process that _also_ wants to write to `stderr`? I assume that this _is_
> the problem.
This is where I started. But if you look at comment in vreportf_buf,
there are more problems, such as interleaving blocks of larger messages,
which could happen on any platform. I tried to make it work in most
cases possible.
> Further, I guess that the problem is compounded by the fact that we
> usually run the tests in a Git Bash on Windows, i.e. in a MinTTY that
> emulates a console (there _is_ work under way to support the newly
> introduces ptys, but that work is far from done), so the standard error
> file handle might behave in unexpected ways in that scenario.
To my knowledge, this is not related. t5516 failures are because git
explicitly wants stderr to be unbuffered. VC++ and MinGW runtimes take
that literally. fprintf() outputs char-by-char, and all of that results
in char-interleaving.
> But I do wonder whether replacing that `fprintf()` by a `write()` would
> work better. After all, we could write the `prefix` into the `msg`
> already:
>
> size_t off = strlcpy(msg, prefix, sizeof(msg));
> int ret = vsnprintf(msg + off, sizeof(msg) - off, err, params);
> [...]
> if (ret > 0)
> write(2, msg, off + ret);
>
> Would that also work around the problem?
You forgot to add '\'n. But yes, that would solve many problems, except
truncation to 4096. Then I would expect a patch to increase buffer size
to 8192 in the next couple years. And if you also try to solve
truncation, it will get you very close to my code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Hi Alex,
On Fri, 25 Oct 2019, Alexandr Miloslavskiy wrote:
> > Maybe this could do with an example?
>
> I myself observed results like this when running t5516:
> ------
> fatal: git fatal: remote errouploar: upload-pack: not our ref
> 64ea4c133d59fa98e86a771eda009872d6ab2886d-pack: not o
> ur ref 64ea4c133d59fa98e86a771eda009872d6ab2886
> ------
>
> Do you want me to add this garbled string to commit message?
I think that would make things a lot more relatable ;-)
My example is even worse (read: more convincing), though:
fatal: git uploadfata-lp: raemcokte :error: upload-pnot our arcef k6: n4o=
t our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
8e86a771eda009872d6ab2886
So maybe you want to use that?
> > > + buf =3D malloc(buf_size);
>
> > Why not `alloca()`?
>
> Allocating large chunks on stack is usually not recommended. There is a =
funny
> test "init allows insanely long --template" in t0001 which demonstrates =
that
> sometimes vreportf() can attempt to print very long strings. Crashing du=
e to
> stack overflow doesn't sound like a good thing.
Indeed. A clipped message will be a lot more helpful in such a scenario.
> > And to take a step back, I think that previous rounds of patches tryin=
g
> > to essentially address the same problem made the case that it is okay =
to
> > cut off insanely-long messages, so I am not sure we would want to open
> > that can of worms again...
>
> I draw a different conclusion here. Each author thought that "1024 must
> definitely be enough!" but then discovered that it's not enough again, f=
or
> example due to long "usage" output. At some point, f4c3edc0 even tried t=
o
> remove all limits after a chain of limits that were too small. So I woul=
d say
> that this is still a problem.
The most important aspect of `vreportf()` is, at least as far as I am
concerned, to "get the message out".
As such, no, I don't want it to fail, neither due to `alloca()` nor due
to `malloc()`. I'd rather have it produce a cut-off message that at
least shows the first 4095 bytes of the message.
> > Quite honestly, I would love to avoid that amount of complexity,
> > certainly in a part of the code that we would like to have rock
> > solid because it is usually exercised when things go very, very
> > wrong and we need to provide the user who is bitten by it enough
> > information to take to the Git contributors to figure out the root
> > cause(s).
>
> It's a choice between simpler code and trying to account for
> everything that could happen. I think we'd rather have more complex
> code that handles more cases, exactly to try and deliver output to
> user no matter what.
Complex code usually has more bugs than simple code. I don't want
`vreportf()` to have potential bugs that we don't know about.
> > Is the problem that causes those failures with VS the fact that
> > `fprintf(stderr, ...)` might be interleaved with the output of
> > another process that _also_ wants to write to `stderr`? I assume
> > that this _is_ the problem.
>
> This is where I started. But if you look at comment in vreportf_buf,
> there are more problems, such as interleaving blocks of larger
> messages, which could happen on any platform. I tried to make it work
> in most cases possible.
Again, I don't think that it is wise to try to make this work for
arbitrary sizes of error messages.
At some stage, the scrollback of the console won't be large enough to
fix that message!
So I think it is very sane to say that at some point, enough is enough.
Four thousand bytes seems a really long message, anyway.
> > Further, I guess that the problem is compounded by the fact that we
> > usually run the tests in a Git Bash on Windows, i.e. in a MinTTY
> > that emulates a console (there _is_ work under way to support the
> > newly introduces ptys, but that work is far from done), so the
> > standard error file handle might behave in unexpected ways in that
> > scenario.
>
> To my knowledge, this is not related. t5516 failures are because git
> explicitly wants stderr to be unbuffered. VC++ and MinGW runtimes take
> that literally. fprintf() outputs char-by-char, and all of that
> results in char-interleaving.
Yes, as my example above demonstrates. (Ugh!)
> > But I do wonder whether replacing that `fprintf()` by a `write()` woul=
d
> > work better. After all, we could write the `prefix` into the `msg`
> > already:
> >
> > size_t off =3D strlcpy(msg, prefix, sizeof(msg));
> > int ret =3D vsnprintf(msg + off, sizeof(msg) - off, err, params);
> > [...]
> > if (ret > 0)
> > write(2, msg, off + ret);
> >
> > Would that also work around the problem?
>
> You forgot to add '\'n. But yes, that would solve many problems,
... and indeed, I verified that this patch fixes the problem:
=2D- snip --
diff --git a/usage.c b/usage.c
index 2fdb20086bd..7f5bdfb0f40 100644
=2D-- a/usage.c
+++ b/usage.c
@@ -10,13 +10,16 @@ void vreportf(const char *prefix, const char *err, va_=
list params)
{
char msg[4096];
char *p;
-
- vsnprintf(msg, sizeof(msg), err, params);
+ size_t off =3D strlcpy(msg, prefix, sizeof(msg));
+ int ret =3D vsnprintf(msg + off, sizeof(msg) - off, err, params);
for (p =3D msg; *p; p++) {
if (iscntrl(*p) && *p !=3D '\t' && *p !=3D '\n')
*p =3D '?';
}
- fprintf(stderr, "%s%s\n", prefix, msg);
+ if (ret > 0) {
+ msg[off + ret] =3D '\n'; /* we no longer need a NUL */
+ write_in_full(2, msg, off + ret + 1);
+ }
}
static NORETURN void usage_builtin(const char *err, va_list params)
=2D- snap --
> except truncation to 4096. Then I would expect a patch to increase
> buffer size to 8192 in the next couple years. And if you also try to
> solve truncation, it will get you very close to my code.
My point is: I don't want to "fix" truncation. I actually think of it as
a feature. An error message that is longer than the average news article
I read is too long, period.
BTW I have a couple more tidbits to add to the commit message, if you
would be so kind as to pick them up: I know _which_ two processes battle
for `stderr`. I instrumented the code slightly, and this is what I got:
=2D- snip --
$ GIT_TRACE=3D1 GIT_TEST_PROTOCOL_VERSION=3D ../git.exe --exec-path=3D$PWD=
/.. -C trash\ directory.t5516-fetch-push/shallow2/ fetch ../testrepo/.gi=
t 64ea4c133d59fa98e86a771eda009872d6ab2886
14:55:55.360382 exec-cmd.c:238 trace: resolved executable dir: C:=
/git-sdk-64/usr/src/vs2017-test
14:55:55.362379 exec-cmd.c:54 RUNTIME_PREFIX requested, but pref=
ix computation failed. Using static fallback '/mingw64'.
14:55:55.387189 git.c:444 trace: built-in (pid=3D21620): git=
fetch ../testrepo/.git 64ea4c133d59fa98e86a771eda009872d6ab2886
14:55:55.392644 run-command.c:663 trace: run_command: unset GIT_PREF=
IX; 'git-upload-pack '\''../testrepo/.git'\'''
14:55:55.659992 exec-cmd.c:238 trace: resolved executable dir: C:=
/git-sdk-64/usr/src/vs2017-test
14:55:55.661762 exec-cmd.c:54 RUNTIME_PREFIX requested, but pref=
ix computation failed. Using static fallback '/mingw64'.
14:55:55.662759 git.c:444 trace: built-in (pid=3D27452): git=
upload-pack ../testrepo/.git
14:55:55.681188 run-command.c:663 trace: run_command: git rev-list -=
-stdin
fatal: git upload-pack: not our ref 64ea4c133d59fa98e86a771eda009872d6ab28=
86
fatal: remote error (pid=3D21620): upload-pack: not our ref 64ea4c133d59fa=
98e86a771eda009872d6ab2886
=2D- snap --
As you can see, the two error messages stem from the `git fetch` process
(with the prefix "remote error:") and the process it spawned, `git upload-=
pack`.
BTW if you pick up the indicated patch and the tidbits for the commit
message and then send out a new iteration via GitGitGadget, I would not
mind being co-author at all ;-)
Ciao,
Dscho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Alexandr Miloslavskiy wrote (reply to this):
On 25.10.2019 16:02, Johannes Schindelin wrote:
> My example is even worse (read: more convincing), though:
>
> fatal: git uploadfata-lp: raemcokte :error: upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
> 8e86a771eda009872d6ab2886
>
> So maybe you want to use that?
OK.
> Again, I don't think that it is wise to try to make this work for
> arbitrary sizes of error messages.
> My point is: I don't want to "fix" truncation. I actually think of it
> as a feature
It would be helpful to hear opinions from someone else, before the patch
is reworked significantly.
> I know _which_ two processes battle for `stderr`.
I think I said the same in code comment, bullet 3, near t5516?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Hi Alex,
On Fri, 25 Oct 2019, Alexandr Miloslavskiy wrote:
> On 25.10.2019 16:02, Johannes Schindelin wrote:
> > My example is even worse (read: more convincing), though:
> >
> > fatal: git uploadfata-lp: raemcokte :error: upload-pnot our arcef k6:=
n4ot
> > our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9
> > 8e86a771eda009872d6ab2886
> >
> > So maybe you want to use that?
>
> OK.
>
> > Again, I don't think that it is wise to try to make this work for
> > arbitrary sizes of error messages.
>
> > My point is: I don't want to "fix" truncation. I actually think of it
> > as a feature
>
> It would be helpful to hear opinions from someone else, before the patch=
is
> reworked significantly.
If you must wait, well, then you must.
The commits you found seem to suggest already that there is support for
clipping the message, but hey, what do I know, maybe the mood changed
over the years.
Since I have to re-run CI/PR builds regularly that failed due to t5516,
I will be very tempted _not_ to wait, though.
> > I know _which_ two processes battle for `stderr`.
>
> I think I said the same in code comment, bullet 3, near t5516?
Probably.
A code comment about a test case that is not in the very vicinity of
said comment is _prone_ to get stale.
In other words: this information does not belong into a code comment. It
belongs into the commit message.
If you needed any indication that this is true: I would not have missed
this important piece if it had been in the commit message (instead of
the code with whose added complexity I disagree).
Ciao,
Dscho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Hi Peff,
On Fri, 25 Oct 2019, Jeff King wrote:
> On Fri, Oct 25, 2019 at 04:02:36PM +0200, Johannes Schindelin wrote:
>
> > ... and indeed, I verified that this patch fixes the problem:
> >
> > -- snip --
> > diff --git a/usage.c b/usage.c
> > index 2fdb20086bd..7f5bdfb0f40 100644
> > --- a/usage.c
> > +++ b/usage.c
> > @@ -10,13 +10,16 @@ void vreportf(const char *prefix, const char *err,=
va_list params)
> > {
> > char msg[4096];
> > char *p;
> > -
> > - vsnprintf(msg, sizeof(msg), err, params);
> > + size_t off =3D strlcpy(msg, prefix, sizeof(msg));
> > + int ret =3D vsnprintf(msg + off, sizeof(msg) - off, err, params);
> > for (p =3D msg; *p; p++) {
> > if (iscntrl(*p) && *p !=3D '\t' && *p !=3D '\n')
> > *p =3D '?';
> > }
> > - fprintf(stderr, "%s%s\n", prefix, msg);
> > + if (ret > 0) {
> > + msg[off + ret] =3D '\n'; /* we no longer need a NUL */
> > + write_in_full(2, msg, off + ret + 1);
> > + }
> > }
>
> Heh. This is quite similar to what I posted in:
>
> https://public-inbox.org/git/[email protected].=
net/
>
> though I missed the cleverness with "we no longer need a NUL" to get an
> extra byte. ;)
:-)
I also use `xwrite()` instead of `write()`...
> > > except truncation to 4096. Then I would expect a patch to increase
> > > buffer size to 8192 in the next couple years. And if you also try to
> > > solve truncation, it will get you very close to my code.
> >
> > My point is: I don't want to "fix" truncation. I actually think of it =
as
> > a feature. An error message that is longer than the average news artic=
le
> > I read is too long, period.
>
> Yeah. As the person responsible for many of the "avoid truncation" works
> referenced in the original patch, I have come to the conclusion that it
> is not worth the complexity. Even when we do manage to produce a
> gigantic error message correctly, it's generally not very readable.
>
> That's basically what I came here to say, and I was pleased to find that
> you had already argued for it quite well. So I'll just add my support
> for the direction you've taken the conversation.
Thank you for affirming. I have to admit that I would have loved for my
argument to work on its own, and not require the additional force of a
second opinion. In my mind, there is little opinion required here.
> I _do_ wish we could do the truncation more intelligently. I'd much
> rather see:
>
> error: unable to open 'absurdly-long-file-name...': permission denied
>
> than:
>
> error: unable to open 'absurdly-long-file-name-that-goes-on-forever-an=
d-ev
>
> But I don't think it's possible without reimplementing snprintf
> ourselves.
Indeed. I _did_ start to implement `strbuf_vaddf()` from scratch, over
ten years ago:
https://public-inbox.org/git/[email protected]=
/
I am not sure whether we want to resurrect it, it would need to grow
support _at least_ for `%PRIuMAX` and `%PRIdMAX`, but that should not be
hard.
Back to the issue at hand: I did open a GitGitGadget PR with my proposed
change, in the hopes that I could somehow fast-track this fix into the
CI/PR builds over at https://github.com/gitgitgadget/git, but there are
problems: it seems that now there is an at least occasional broken pipe
in the same test when run on macOS.
There _also_ seems to be something spooky going on in t3510.12 and .13,
where the expected output differs from the actual output only by a
re-ordering of the lines:
=2D- snip --
[...]
+++ diff -u expect advice
=2D-- expect 2019-10-25 22:17:44.982884700 +0000
+++ advice 2019-10-25 22:17:45.278884500 +0000
@@ -1,3 +1,3 @@
error: cherry-pick is already in progress
-hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
fatal: cherry-pick failed
+hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
=2D- snap --
For details, see:
https://dev.azure.com/gitgitgadget/git/_build/results?buildId=3D19336&view=
=3Dms.vss-test-web.build-test-results-tab
and
https://dev.azure.com/Git-for-Windows/git/_build/results?buildId=3D44549&v=
iew=3Dms.vss-test-web.build-test-results-tab
(You need to click on a test case title to open the logs, then inspect
the Attachments to get to the full trace)
So much as I would love to see the flakiness of t5516 be fixed as soon
as possible, I fear we will have to look at the underlying issue a bit
closer: there are two processes writing to `stderr` concurrently. I
don't know whether there would be a good way for the `stderr` of the
`upload-pack` process to be consumed by the `fetch` process, and to be
printed by the latter.
Ciao,
Dscho
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Jeff King wrote (reply to this):
On Sat, Oct 26, 2019 at 10:56:45PM +0200, Johannes Schindelin wrote:
> Back to the issue at hand: I did open a GitGitGadget PR with my proposed
> change, in the hopes that I could somehow fast-track this fix into the
> CI/PR builds over at https://github.com/gitgitgadget/git, but there are
> problems: it seems that now there is an at least occasional broken pipe
> in the same test when run on macOS.
Yes, I think that's another issue in the same test. There's more
discussion further down in the thread I linked earlier, starting here:
https://public-inbox.org/git/[email protected]/
and I think Gábor's solution here:
https://public-inbox.org/git/[email protected]/
is the right direction (and note that this _isn't_ just a test artifact,
but a bug that occasionally hits real-world cases, too).
> There _also_ seems to be something spooky going on in t3510.12 and .13,
> where the expected output differs from the actual output only by a
> re-ordering of the lines:
>
> -- snip --
> [...]
> +++ diff -u expect advice
> --- expect 2019-10-25 22:17:44.982884700 +0000
> +++ advice 2019-10-25 22:17:45.278884500 +0000
> @@ -1,3 +1,3 @@
> error: cherry-pick is already in progress
> -hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
> fatal: cherry-pick failed
> +hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
> -- snap --
Hrm. I'd have thought those are both coming from the same process. Which
implies that we're not fflushing stderr before calling write(2). But
your patch seems to do so...
<scratches head> Aha. I think you force-pushed up as I am typing this.
:) So I think that is indeed the solution.
> So much as I would love to see the flakiness of t5516 be fixed as soon
> as possible, I fear we will have to look at the underlying issue a bit
> closer: there are two processes writing to `stderr` concurrently. I
> don't know whether there would be a good way for the `stderr` of the
> `upload-pack` process to be consumed by the `fetch` process, and to be
> printed by the latter.
The worst part is that this message already _is_ consumed by fetch: we
send it twice, once over the sideband, and once directly to stderr. In
most cases the stderr version is lost, but some server providers might
be collecting it. I wouldn't mind seeing the direct-to-stderr one
dropped. There's some more discussion in (from the same thread linked
earlier):
https://public-inbox.org/git/[email protected]/
-Peff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Hi Peff,
On Sat, 26 Oct 2019, Johannes Schindelin wrote:
> [...] I did open a GitGitGadget PR with my proposed change,
I should have mentioned the URL:
https://github.com/gitgitgadget/git/pull/428
FWIW, in the meantime I managed to address below-mentioned breakages
(apart from the broken pipe problem that is discussed over here:
https://public-inbox.org/git/[email protected]/) and the
build is green.
Alex asked to be given time to brush his patch up on Monday, so I am
holding off sending my version (for now...).
Ciao,
Dscho
> in the hopes that I could somehow fast-track this fix into the
> CI/PR builds over at https://github.com/gitgitgadget/git, but there are
> problems: it seems that now there is an at least occasional broken pipe
> in the same test when run on macOS.
>
> There _also_ seems to be something spooky going on in t3510.12 and .13,
> where the expected output differs from the actual output only by a
> re-ordering of the lines:
>
> -- snip --
> [...]
> +++ diff -u expect advice
> --- expect 2019-10-25 22:17:44.982884700 +0000
> +++ advice 2019-10-25 22:17:45.278884500 +0000
> @@ -1,3 +1,3 @@
> error: cherry-pick is already in progress
> -hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
> fatal: cherry-pick failed
> +hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
> -- snap --
>
> For details, see:
> https://dev.azure.com/gitgitgadget/git/_build/results?buildId=3D19336&vi=
ew=3Dms.vss-test-web.build-test-results-tab
> and
> https://dev.azure.com/Git-for-Windows/git/_build/results?buildId=3D44549=
&view=3Dms.vss-test-web.build-test-results-tab
> (You need to click on a test case title to open the logs, then inspect
> the Attachments to get to the full trace)
>
> So much as I would love to see the flakiness of t5516 be fixed as soon
> as possible, I fear we will have to look at the underlying issue a bit
> closer: there are two processes writing to `stderr` concurrently. I
> don't know whether there would be a good way for the `stderr` of the
> `upload-pack` process to be consumed by the `fetch` process, and to be
> printed by the latter.
>
> Ciao,
> Dscho
>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
Sorry, me again,
On Sat, 26 Oct 2019, Johannes Schindelin wrote:
> On Sat, 26 Oct 2019, Johannes Schindelin wrote:
>
> > [...] I did open a GitGitGadget PR with my proposed change,
>
> I should have mentioned the URL:
>
> https://github.com/gitgitgadget/git/pull/428
I should have mentioned that I also opened
https://github.com/git-for-windows/git/pull/2373 with the same branch,
which exercizes the Visual Studio build, too, so it is a bit more
informative than the GitGitGadget PR (which only runs the regular GCC
build on Windows which, as per Alex' analysis, is not affected by this
problem).
Ciao,
Dscho
> FWIW, in the meantime I managed to address below-mentioned breakages
> (apart from the broken pipe problem that is discussed over here:
> https://public-inbox.org/git/[email protected]/) and the
> build is green.
>
> Alex asked to be given time to brush his patch up on Monday, so I am
> holding off sending my version (for now...).
>
> Ciao,
> Dscho
>
> > in the hopes that I could somehow fast-track this fix into the
> > CI/PR builds over at https://github.com/gitgitgadget/git, but there ar=
e
> > problems: it seems that now there is an at least occasional broken pip=
e
> > in the same test when run on macOS.
> >
> > There _also_ seems to be something spooky going on in t3510.12 and .13=
,
> > where the expected output differs from the actual output only by a
> > re-ordering of the lines:
> >
> > -- snip --
> > [...]
> > +++ diff -u expect advice
> > --- expect 2019-10-25 22:17:44.982884700 +0000
> > +++ advice 2019-10-25 22:17:45.278884500 +0000
> > @@ -1,3 +1,3 @@
> > error: cherry-pick is already in progress
> > -hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
> > fatal: cherry-pick failed
> > +hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
> > -- snap --
> >
> > For details, see:
> > https://dev.azure.com/gitgitgadget/git/_build/results?buildId=3D19336&=
view=3Dms.vss-test-web.build-test-results-tab
> > and
> > https://dev.azure.com/Git-for-Windows/git/_build/results?buildId=3D445=
49&view=3Dms.vss-test-web.build-test-results-tab
> > (You need to click on a test case title to open the logs, then inspect
> > the Attachments to get to the full trace)
> >
> > So much as I would love to see the flakiness of t5516 be fixed as soon
> > as possible, I fear we will have to look at the underlying issue a bit
> > closer: there are two processes writing to `stderr` concurrently. I
> > don't know whether there would be a good way for the `stderr` of the
> > `upload-pack` process to be consumed by the `fetch` process, and to be
> > printed by the latter.
> >
> > Ciao,
> > Dscho
> >
>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the Git mailing list, Johannes Schindelin wrote (reply to this):
This message is in MIME format. The first part should be readable text,
while the remaining parts are likely unreadable without MIME-aware tools.
--8323328-1973625787-1572278755=:46
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Hi Peff,
On Sat, 26 Oct 2019, Jeff King wrote:
> On Sat, Oct 26, 2019 at 10:56:45PM +0200, Johannes Schindelin wrote:
>
> > Back to the issue at hand: I did open a GitGitGadget PR with my propos=
ed
> > change, in the hopes that I could somehow fast-track this fix into the
> > CI/PR builds over at https://github.com/gitgitgadget/git, but there ar=
e
> > problems: it seems that now there is an at least occasional broken pip=
e
> > in the same test when run on macOS.
>
> Yes, I think that's another issue in the same test. There's more
> discussion further down in the thread I linked earlier, starting here:
>
> https://public-inbox.org/git/[email protected]=
et/
>
> and I think G=C3=A1bor's solution here:
>
> https://public-inbox.org/git/[email protected]/
>
> is the right direction (and note that this _isn't_ just a test artifact,
> but a bug that occasionally hits real-world cases, too).
That sounds good! I guess I should continue _that_ thread.
> > There _also_ seems to be something spooky going on in t3510.12 and .13=
,
> > where the expected output differs from the actual output only by a
> > re-ordering of the lines:
> >
> > -- snip --
> > [...]
> > +++ diff -u expect advice
> > --- expect 2019-10-25 22:17:44.982884700 +0000
> > +++ advice 2019-10-25 22:17:45.278884500 +0000
> > @@ -1,3 +1,3 @@
> > error: cherry-pick is already in progress
> > -hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
> > fatal: cherry-pick failed
> > +hint: try "git cherry-pick (--continue | --skip | --abort | --quit)"
> > -- snap --
>
> Hrm. I'd have thought those are both coming from the same process. Which
> implies that we're not fflushing stderr before calling write(2). But
> your patch seems to do so...
>
> <scratches head> Aha. I think you force-pushed up as I am typing this.
> :) So I think that is indeed the solution.
Yes, sorry, I had this idea and it worked locally, and I wanted to know
whether it would turn the PR build green.
> > So much as I would love to see the flakiness of t5516 be fixed as soon
> > as possible, I fear we will have to look at the underlying issue a bit
> > closer: there are two processes writing to `stderr` concurrently. I
> > don't know whether there would be a good way for the `stderr` of the
> > `upload-pack` process to be consumed by the `fetch` process, and to be
> > printed by the latter.
>
> The worst part is that this message already _is_ consumed by fetch: we
> send it twice, once over the sideband, and once directly to stderr. In
> most cases the stderr version is lost, but some server providers might
> be collecting it. I wouldn't mind seeing the direct-to-stderr one
> dropped. There's some more discussion in (from the same thread linked
> earlier):
>
> https://public-inbox.org/git/[email protected].=
net/
It is tricky all right.
Full disclosure: I am mainly interested in having lots less failing
builds (which I all re-run manually when I see that a known-flaky test
failed).
Ciao,
Dscho
--8323328-1973625787-1572278755=:46--
FWIW I think that I was able to present a much less controversial fix in #428. |
54f0d6f
to
7827b2c
Compare
|
||
ret = vsnprintf(msg + prefix_size, sizeof(msg) - prefix_size, err, params); | ||
if (ret < 0) | ||
BUG("your vsnprintf is broken (returned %d)", ret); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'll cause an infinite loop, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not to my knowledge, because BUG() uses a different format string, which isn't broken.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you agree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, this is too fragile. If we're here, things are already going wrong, and I'd rather not add a recursion into the same function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just exit silently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would say, if BUG()
is trustworthy, then it shouldn't be a problem to have 1 level of recursion. If BUG()
is suspected to do very weird things, then it should rather be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BUG()
calls vreportf()
. This recursion makes me uneasy. You might say that it cannot cause problems. But even you will have to admit that reasoning about this is not easy, and therein lies the opposite of perfect.
In any case, Git requires a vsnprintf()
that returns the desired size, otherwise pretty much every strbuf_addf()
call is broken (and there are many). So yes, if things go so wrong as to let vsnprintf()
return something negative, yes, I would exit silently.
And I would certainly not introduce such a check in a commit that does many other things at the same time, too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will change that to exiting silently.
I'm currently testing this patch. Any feedback welcome, seriously. I do try to achieve high quality, but yes, it takes some time. |
ebaf407
to
862d726
Compare
Previously, `vreportf()` wasn't atomic enough. Depending on `fprintf()` implementation and `stderr` type (console or file), that could lead to interleaved and sometimes hard to read outputs. This fixes user visible character-interleaved output on Windows. This also fixes t5516 on Windows. Example ------- This output was produced by t5516 on Windows, where both `git fetch` and child `git upload-pack` call `die()` at the same time: fatal: git uploadfata-lp: raemcokte :error: upload-pnot our arcef k6: n4ot our ea4cr1e3f 36d45ea94fca1398e86a771eda009872d63adb28598f6a9 8e86a771eda009872d6ab2886 Idea behind this patch ---------------------- Pre-format everything to memory and use low-level `xwrite()` to send string to `stderr` in a more atomic way. I'm not sure if `xwrite()` has guarantees to be atomic, but this is definitely better then `fprintf()` and has shown to work fine. Message trimming ---------------- Initially I wrote code to also support messages of any length. However, during patch discussion both Johannes Schindelin and Jeff King voted against extra complexity added by trying to dodge failing `malloc` and still report something to user. It is not expected that a reasonable error message can exceed 4096 bytes and still be better then a trimmed message. Pitfalls -------- 1) `vsnprintf()` returns desired size instead of trimmed size when full output is too long to fit. In that case, the returned value must not be used for indexing. 2) If `vsnprintf()` was trimmed, `msg` could contain uninitialized bytes at the end if last character that didn't fit was a multibyte character. In this case, fitting '\n' as last character of `msg` could result in sending corrupt multibyte data. 3) `strlcpy()`, which could be used instead of `memcpy()`, also returns desired size instead of trimmed size. This again makes an easy pitfall in the next line with `vsnprintf()`. fprintf() has problems with any buffering settings -------------------------------------------------- 1) If `stderr` is fully buffered: the ordering of `stdout` and `stderr` messages could be wrong, because `stderr` output waits for the buffer to become full. 2) If `stderr` has any type of buffering: buffer has fixed size, which could lead to interleaved buffer blocks when two threads/processes write at the same time. 3) If `stderr` is not buffered: Some implementations, such as VC++ and MinGW, literally disable buffering and `fprintf()` will output char-by-char, which leads to unreadable char-interleaved writes if two processes write to `stderr` at the same time (threads are OK because `fprintf()` usually locks `FILE*` in current process). 4) If stderr is line buffered: MinGW/VC++ will enable full buffering instead. See MSDN for `setvbuf()`. fprintf() behavior in git, per platform --------------------------------------- a) libc - large outputs can be block-interleaved fprintf() enables temporary stream buffering. Code references: buffered_vfprintf() b) VC++ - char-interleaved fprintf() enables temporary stream buffering, but only if stream was not set to no buffering. This has no effect, because stderr is not buffered by default, and git takes an extra step to ensure that in `swap_osfhnd()`. Code references: _iob[_IOB_ENTRIES] __acrt_stdio_temporary_buffering_guard has_any_buffer() c) MinGW - char-interleaved (console), full buffering (file) `fprintf()` obeys `stderr` buffering. But it uses old MSVCRT.DLL, which eventually calls `_flsbuf()`, which enables buffering unless `isatty(stderr)` or buffering is disabled. Buffering is not disabled by default for `stderr`. Therefore, buffering is enabled only for file-redirected `stderr`. Code references: __mingw_vfprintf() __pformat_wcputs() _fputc_nolock() _flsbuf() _iob[_IOB_ENTRIES] History of interleaving in `vreportf()` --------------------------------------- d048a96 (2007-11-09) - `char msg[256]` used to avoid interleaving, but it still uses `fprintf()` and issues remain 389d176 (2009-03-25) - Buffer size increased to 1024 to avoid truncation 625a860 (2009-11-22) - Buffer size increased to 4096 to avoid truncation f4c3edc (2015-08-11) - Buffer removed to avoid truncation b5a9e43 (2017-01-11) - Reverts f4c3edc to be able to replace control chars before sending to `stderr` Non `vreportf()` commits worthy of notice ----------------------------------------- 9ac13ec (2006-10-11) - Another attempt to solve interleaving. This is seemingly related to d048a96. 137a0d0 (2007-11-19) - Addresses out-of-order for `display()` 34df8ab (2009-03-10) - Switches `xwrite()` to `fprintf()` in `recv_sideband()` to support UTF-8 emulation eac14f8 (2012-01-14) - Removes the need for `fprintf()` for UTF-8 emulation, so it's safe to use `xwrite()` again 5e5be9e (2016-06-28) - `recv_sideband()` uses `xwrite()` again Co-authored-by: Johannes Schindelin <[email protected]> Signed-off-by: Alexandr Miloslavskiy <[email protected]>
862d726
to
cb68173
Compare
Looks good to me now. Any objections? |
All of sudden, a failure on Linux in |
It's another known-flaky test... See #258 for details. |
/azp run |
Azure Pipelines successfully started running 2 pipeline(s). |
:( |
BUG("your vsnprintf is broken (returned %d)", ret); | ||
|
||
total_size = strlen(msg); /* vsnprintf() returns _desired_ size */ | ||
msg[total_size++] = '\n'; /* it's ok to overwrite terminating NULL */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a NUL
, not a NULL
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://linux.die.net/man/3/fprintf
... the terminating null ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copied from there, just used capital letters by habit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I don't think you want to hear what I have to say, so I'll just stop.
total_size = strlen(msg); /* vsnprintf() returns _desired_ size */ | ||
msg[total_size++] = '\n'; /* it's ok to overwrite terminating NULL */ | ||
|
||
replace_control_chars(msg, total_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I find it totally unnecessary to refactor this part of the code into its own function. It's not like we introduce another caller, or that those four lines were particularly complex.
In any case, a refactoring needs to go into its own commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked like a good function candidate to me. I guess git coding style also suggests to give names instead of writing comments. As for commit granularity, I'm not used to it. OK, I will inline the function again.
|
||
replace_control_chars(msg, total_size); | ||
|
||
fflush(stderr); /* flush FILE* before writing lowlevel fd */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment just repeats what the statement says, and is therefore non-DRY.
A more interesting bit of information is why this is necessary, and this information naturally should go into the commit message, not into the code, as code comments like ones talking about specifics of a certain platform are prone to become stale, which we try to avoid in Git's source code. The reasoning behind a patch is pretty much tied to the commit itself, and should therefore be described in the commit message.
Speaking of commit messages: the style you chose deviates noticeably from the style found in the existing commit history, and I wonder what reasons there are for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you suggest a better wording? It's necessary because FILE* and fd are different layers of abstraction. I think that problems can occur on any platform.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for commit message, what do you mean by style? Headers? I thought there's too much information to put in one stream, easier to navigate with headers. Just like in any other text.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I cannot suggest a better wording, as my suggestion to remove the unnecessary comment still stands.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't expect that an average reader will understand quickly why fflush() could be needed here, so sounds like exactly when comments are good?
How about this wording? We're going to write to low-level fd while higher-level FILE* might still have something cached
replace_control_chars(msg, total_size); | ||
|
||
fflush(stderr); /* flush FILE* before writing lowlevel fd */ | ||
xwrite(2, msg, total_size); /* writing directly to fd is most atomic */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having seen this function, I see this PR edging closer and closer, step by step, to what I proposed already. There are still a couple of unnecessary bits and pieces that could be chopped off to arrive there, but I guess I really don't see what big improvements this PR offers compared to my proposed code (which was very similar to the code Jeff King proposed originally).
But maybe I am still missing something crucial?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the absence of bugs counts much more then small personal preferences about this and that.
Please refer to commit message for details.