Skip to content

Store the Windows process HANDLE in Execution to avoid pid reuse races #93

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 30, 2025

Conversation

jakepetroules
Copy link
Contributor

Unlike Unix, Windows doesn't have a "zombie" state where the numeric pid remains valid after the subprocess has exited. Instead, the numeric process ID becomes invalid immediately once the process has exited. Per the Windows documentation of GetProcessId and related APIs, "Until a process terminates, its process identifier uniquely identifies it on the system."

On Windows, instead of closing/discarding the process HANDLE immediately after CreateProcess returns, retain the HANDLE in Execution and only close it once monitorProcessTermination returns. This resolve a race condition where the process could have exited between the period where CloseProcess was called and monitorProcessTermination is called. Now we simply use the original process handle to wait for the process to exit and retrieve its exit code, rather than "reverse engineering" its HANDLE from its numeric pid using OpenProcess.

Closes #92

jakepetroules added a commit that referenced this pull request Jun 24, 2025
…ier and make the HANDLE public

Currently runDetached returns a ProcessIdentifier, which on Unix is a pid and on Windows is a numeric process identifier. Similar to #92, there is a Windows-specific race condition here. On Unix the pid will be valid until someone waitid's it. But on Windows the ProcessIdentifier may already be invalid by the time runDetached returns.

This patch builds on #93 (where the HANDLE is stored in the Execution instead of being closed immediately), and now provides that HANDLE as part of the public Execution API. This allows callers to properly free the handle using CloseHandle, as well as not having to reverse engineer the handle from the numeric pid. #93 by itself would solve the race condition but introduce a resource leak. The API now also better matches the Unix behavior since it is the caller's responsibility to waitid the pid to clean up the resources, which is semantically similar to calling CloseHandle on Windows.

Closes #94
jakepetroules added a commit that referenced this pull request Jun 24, 2025
…ier and make the HANDLE public

Currently runDetached returns a ProcessIdentifier, which on Unix is a pid and on Windows is a numeric process identifier. Similar to #92, there is a Windows-specific race condition here. On Unix the pid will be valid until someone waitid's it. But on Windows the ProcessIdentifier may already be invalid by the time runDetached returns.

This patch builds on #93 (where the HANDLE is stored in the Execution instead of being closed immediately), and now provides that HANDLE as part of the public Execution API. This allows callers to properly free the handle using CloseHandle, as well as not having to reverse engineer the handle from the numeric pid. #93 by itself would solve the race condition but introduce a resource leak. The API now also better matches the Unix behavior since it is the caller's responsibility to waitid the pid to clean up the resources, which is semantically similar to calling CloseHandle on Windows.

Closes #94
jakepetroules added a commit that referenced this pull request Jun 24, 2025
…ier and make the HANDLE public

Currently runDetached returns a ProcessIdentifier, which on Unix is a pid and on Windows is a numeric process identifier. Similar to #92, there is a Windows-specific race condition here. On Unix the pid will be valid until someone waitid's it. But on Windows the ProcessIdentifier may already be invalid by the time runDetached returns.

This patch builds on #93 (where the HANDLE is stored in the Execution instead of being closed immediately), and now provides that HANDLE as part of the public Execution API. This allows callers to properly free the handle using CloseHandle, as well as not having to reverse engineer the handle from the numeric pid. #93 by itself would solve the race condition but introduce a resource leak. The API now also better matches the Unix behavior since it is the caller's responsibility to waitid the pid to clean up the resources, which is semantically similar to calling CloseHandle on Windows.

Closes #94
@jakepetroules jakepetroules force-pushed the eng/PR-always-monitor branch from b66271c to 59caa66 Compare June 25, 2025 00:12
jakepetroules added a commit that referenced this pull request Jun 25, 2025
…ier and make the HANDLE public

Currently runDetached returns a ProcessIdentifier, which on Unix is a pid and on Windows is a numeric process identifier. Similar to #92, there is a Windows-specific race condition here. On Unix the pid will be valid until someone waitid's it. But on Windows the ProcessIdentifier may already be invalid by the time runDetached returns.

This patch builds on #93 (where the HANDLE is stored in the Execution instead of being closed immediately), and now provides that HANDLE as part of the public Execution API. This allows callers to properly free the handle using CloseHandle, as well as not having to reverse engineer the handle from the numeric pid. #93 by itself would solve the race condition but introduce a resource leak. The API now also better matches the Unix behavior since it is the caller's responsibility to waitid the pid to clean up the resources, which is semantically similar to calling CloseHandle on Windows.

Closes #94
@jakepetroules
Copy link
Contributor Author

@compnerd Any thoughts on this or the associated GitHub issue?

Copy link
Member

@compnerd compnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels a lot more safe to me. I think that the execution needs to hold on to the process handle. Once the execution is torn down, we can close the handle.

@jakepetroules
Copy link
Contributor Author

This feels a lot more safe to me. I think that the execution needs to hold on to the process handle. Once the execution is torn down, we can close the handle.

Yep, that's what it's doing. monitorProcessTermination calls CloseHandle once the process is terminated and we've collected its output.

jakepetroules added a commit that referenced this pull request Jun 27, 2025
…ier and make the HANDLE public

Currently runDetached returns a ProcessIdentifier, which on Unix is a pid and on Windows is a numeric process identifier. Similar to #92, there is a Windows-specific race condition here. On Unix the pid will be valid until someone waitid's it. But on Windows the ProcessIdentifier may already be invalid by the time runDetached returns.

This patch builds on #93 (where the HANDLE is stored in the Execution instead of being closed immediately), and now provides that HANDLE as part of the public Execution API. This allows callers to properly free the handle using CloseHandle, as well as not having to reverse engineer the handle from the numeric pid. #93 by itself would solve the race condition but introduce a resource leak. The API now also better matches the Unix behavior since it is the caller's responsibility to waitid the pid to clean up the resources, which is semantically similar to calling CloseHandle on Windows.

Closes #94
Base automatically changed from eng/PR-always-monitor to main June 27, 2025 18:21
jakepetroules added a commit that referenced this pull request Jun 27, 2025
…ier and make the HANDLE public

Currently runDetached returns a ProcessIdentifier, which on Unix is a pid and on Windows is a numeric process identifier. Similar to #92, there is a Windows-specific race condition here. On Unix the pid will be valid until someone waitid's it. But on Windows the ProcessIdentifier may already be invalid by the time runDetached returns.

This patch builds on #93 (where the HANDLE is stored in the Execution instead of being closed immediately), and now provides that HANDLE as part of the public Execution API. This allows callers to properly free the handle using CloseHandle, as well as not having to reverse engineer the handle from the numeric pid. #93 by itself would solve the race condition but introduce a resource leak. The API now also better matches the Unix behavior since it is the caller's responsibility to waitid the pid to clean up the resources, which is semantically similar to calling CloseHandle on Windows.

Closes #94
@jakepetroules jakepetroules force-pushed the eng/PR-windows-pid-reuse-eng/PR-always-monitor branch from 7f1597d to 6b7b547 Compare June 27, 2025 18:28
jakepetroules added a commit that referenced this pull request Jun 27, 2025
…ier and make the HANDLE public

Currently runDetached returns a ProcessIdentifier, which on Unix is a pid and on Windows is a numeric process identifier. Similar to #92, there is a Windows-specific race condition here. On Unix the pid will be valid until someone waitid's it. But on Windows the ProcessIdentifier may already be invalid by the time runDetached returns.

This patch builds on #93 (where the HANDLE is stored in the Execution instead of being closed immediately), and now provides that HANDLE as part of the public Execution API. This allows callers to properly free the handle using CloseHandle, as well as not having to reverse engineer the handle from the numeric pid. #93 by itself would solve the race condition but introduce a resource leak. The API now also better matches the Unix behavior since it is the caller's responsibility to waitid the pid to clean up the resources, which is semantically similar to calling CloseHandle on Windows.

Closes #94
@@ -35,13 +35,16 @@ public struct Execution: Sendable {
public let processIdentifier: ProcessIdentifier

#if os(Windows)
internal nonisolated(unsafe) let processHandle: HANDLE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should save the process HANDLE but I don't think we should put it on Execution. Why not leave it in ProcessIdentifier since it serves that purpose? Therefore you don't have to make other changes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at that, but ProcessIdentifier is Codable and Hashable, so HANDLE shouldn't go in there because it's a pointer (which we should not be serializing to disk)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, since the HANDLE is only valid for the lifetime of the process and becomes invalid once CloseHandle is called, it seemed a better for to be on Execution.

@@ -35,13 +35,16 @@ public struct Execution: Sendable {
public let processIdentifier: ProcessIdentifier

#if os(Windows)
internal nonisolated(unsafe) let processHandle: HANDLE
Copy link
Contributor

@iCharlesHu iCharlesHu Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand you used nonisolated(unsafe) here because HANDLE isn't Sendable, and we want Execution and ProcessIdentifier to be. @compnerd do you know what the best approach is? Should we use nonisolated(unsafe) or mark the struct @unchecked Sendable (or other approaches)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HANDLE is an opaque pointer, so it's effectively Sendable in practice in the logical context of how it's used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, you meant @compnerd not @ compend

jakepetroules added a commit that referenced this pull request Jun 27, 2025
…ier and make the HANDLE public

Currently runDetached returns a ProcessIdentifier, which on Unix is a pid and on Windows is a numeric process identifier. Similar to #92, there is a Windows-specific race condition here. On Unix the pid will be valid until someone waitid's it. But on Windows the ProcessIdentifier may already be invalid by the time runDetached returns.

This patch builds on #93 (where the HANDLE is stored in the Execution instead of being closed immediately), and now provides that HANDLE as part of the public Execution API. This allows callers to properly free the handle using CloseHandle, as well as not having to reverse engineer the handle from the numeric pid. #93 by itself would solve the race condition but introduce a resource leak. The API now also better matches the Unix behavior since it is the caller's responsibility to waitid the pid to clean up the resources, which is semantically similar to calling CloseHandle on Windows.

Closes #94
@jakepetroules jakepetroules force-pushed the eng/PR-windows-pid-reuse-eng/PR-always-monitor branch 2 times, most recently from 0ddc201 to 2b8c265 Compare June 29, 2025 00:48
Unlike Unix, Windows doesn't have a "zombie" state where the numeric pid remains valid after the subprocess has exited. Instead, the numeric process ID becomes invalid immediately once the process has exited. Per the Windows documentation of GetProcessId and related APIs, "Until a process terminates, its process identifier uniquely identifies it on the system."

On Windows, instead of closing/discarding the process HANDLE immediately after CreateProcess returns, retain the HANDLE in Execution and only close it once monitorProcessTermination returns. This resolve a race condition where the process could have exited between the period where CloseProcess was called and monitorProcessTermination is called. Now we simply use the original process handle to wait for the process to exit and retrieve its exit code, rather than "reverse engineering" its HANDLE from its numeric pid using OpenProcess.

Closes #92
@jakepetroules jakepetroules force-pushed the eng/PR-windows-pid-reuse-eng/PR-always-monitor branch from 2b8c265 to 6422350 Compare June 29, 2025 05:34
@jakepetroules jakepetroules merged commit 2988b8b into main Jun 30, 2025
21 checks passed
@jakepetroules jakepetroules deleted the eng/PR-windows-pid-reuse-eng/PR-always-monitor branch June 30, 2025 00:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Execution should store the HANDLE internally on Windows, not just the numeric PID
4 participants