Skip to content

fixup! trace2: collect Windows-specific process information #117

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jeffhostetler
Copy link

Guard against infinite loop while computing the parent process hierarchy.

CreateToolhelp32Snapshot() is used to get a list of all processes on the
system. Each process entry contains the process PID and PPID (alive at the
time of the snapshot). We compute the set of ancestors of the current process
by repeated searches on this list.

Testing revealed that the snapshot can contain PPID cycles. This causes an
infinite loop during the set construction and causes the git.exe command to
hang.

Testing found an instance where 3 processes were in a PPID cycle. The
snapshot implied that each of these processes was its own great-grandparent.
This should not be possible unless a PID was recycled and just happened to
match up.

For full disclosure, the Windows "System Idle Process" has PID and PPID 0.
If it were to launch a Git command, it could cause a similar infinite loop.
Or more properly, if any ancestor of the current Git command has PPID 0, it
will appear to be a descendant of the idle process and trigger the problem.

This commit fixes both cases by maintaining a list of the PIDs seen during
the ancestor walk and stopping if a cycle is detected.

Additionally, code was added to truncate the search after a reasonable fixed
depth limit.

Signed-off-by: Jeff Hostetler [email protected]

Guard against infinite loop while computing the parent process hierarchy.

CreateToolhelp32Snapshot() is used to get a list of all processes on the
system.  Each process entry contains the process PID and PPID (alive at the
time of the snapshot).  We compute the set of ancestors of the current process
by repeated searches on this list.

Testing revealed that the snapshot can contain PPID cycles.  This causes an
infinite loop during the set construction and causes the git.exe command to
hang.

Testing found an instance where 3 processes were in a PPID cycle.  The
snapshot implied that each of these processes was its own great-grandparent.
This should not be possible unless a PID was recycled and just happened to
match up.

For full disclosure, the Windows "System Idle Process" has PID and PPID 0.
If it were to launch a Git command, it could cause a similar infinite loop.
Or more properly, if any ancestor of the current Git command has PPID 0, it
will appear to be a descendant of the idle process and trigger the problem.

This commit fixes both cases by maintaining a list of the PIDs seen during
the ancestor walk and stopping if a cycle is detected.

Additionally, code was added to truncate the search after a reasonable fixed
depth limit.

Signed-off-by: Jeff Hostetler <[email protected]>
@jeffhostetler
Copy link
Author

This change are cherry-picked from my upstream submission: gitgitgadget#108

Copy link
Member

@jrbriggs jrbriggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple questions.


while (find_pid(pid, hSnapshot, &pe32)) {
jw_array_string(jw, pe32.szExeFile);
if (nr_pids == NR_PIDS_LIMIT) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Paranoia nit: if (nr_pids >= NR_PIDS_LIMIT) ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

true, but we'd already have a buffer overrun at that point.

if (!find_pid(pid, hSnapshot, &pe32))
return;
pid = pe32.th32ParentProcessID;
/* Check for cycle in snapshot. (Yes, it happened.) */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check for cycles or should we just give up once we exceed NR_PIDS_LIMIT?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, i could go either way here. my way lets us see "(cycle)" vs "(truncate)" in the Kusto data.
i thought that might be interesting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we check for a cycle (which we know happens), we will avoid having NR_PIDS_LIMIT lines of repeated entries every time it does happen...

Copy link

@benpeart benpeart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a reasonable compromise. Process IDs can and are reused. Most code paths follow the parent to child link so must not suffer from the same potential for a loop.

@jeffhostetler jeffhostetler merged commit 178619c into microsoft:vfs-2.20.1 Feb 11, 2019
@jeffhostetler jeffhostetler deleted the gvfs-trace2-next-fixup branch July 1, 2019 18:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants