Skip to content

[Explicit Module Builds] Add support for creating a reproducer when clang process crashes. #455

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

vsapsai
Copy link

@vsapsai vsapsai commented Apr 29, 2025

For the explicitly built modules generated script contains commands how to build all necessary .pcm files and how to compile the final translation unit.

rdar://59743925

…lang process crashes.

For the explicitly built modules generated script contains commands how
to build all necessary .pcm files and how to compile the final translation unit.

rdar://59743925
@vsapsai
Copy link
Author

vsapsai commented Apr 29, 2025

This is an initial version that is lacking comments and tests. At first I want to get the general feedback if such approach is reasonable or if it won't work on some basic level. If you have better ideas to implement it, that would be great too.

argc, const_cast<const char**>(argv), workingDirectory,
/*ReproducerLocation=*/NULL, /*UseUniqueReproducerName=*/true, &messageString);
if (message) {
*message = strdup(lib->fns.clang_getCString(messageString));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is libclang guaranteed to return a non null string in messageString here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For success case I think we can make it mandatory for messageString to be present (you need to tell where to find the generated reproducer). Error cases might not populate messageString. Even though they should provide more details, it is fairly easy to return InvalidArgument error without any extra message.

But we can guarantee messageString isn't corrupted and in the worst case it would be an empty string.

I'll think about these cases and would document API contract on libclang side.

serializer.serialize(kind)
serializer.serialize(files)
serializer.serialize(includeTreeID)
serializer.serialize(modules)
serializer.serialize(workingDirectory)
serializer.serialize(commands)
serializer.serialize(scanningCommandLine)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should try to measure the impact of serializing an extra command line here, it should be shorter than the cc1 command but we do want to keep this data small...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the impact in the consumed storage space? Just to know what to measure.

@@ -301,6 +301,19 @@ public final class ClangCompileTaskAction: TaskAction, BuildValueValidatingTaskA
outputDelegate.emitOutput("Failed frontend command:\n")
outputDelegate.emitOutput(ByteString(encodingAsUTF8: commandString) + "\n")
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to the question about serialization overhead, we should consider whether this should be opt-in or opt-out. I'm not sure I have a strong opinion right now

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We also have plans to add reproducer-on-error. That one should be opt-in. But crashes seem to be rare enough that generating a reproducer can be don unconditionally. Honestly, the crashes seem to be rare enough that this entire feature in its current form isn't particularly useful. The value should come from iterating on the feature.

@owenv
Copy link
Collaborator

owenv commented Apr 29, 2025

this approach lgtm, just had a few minor comments

@@ -126,6 +126,8 @@ bool libclang_has_cas_up_to_date_checks_feature(libclang_t lib);
/// Whether the libclang has current working directory optimization support.
bool libclang_has_current_working_directory_optimization(libclang_t lib);

bool libclang_has_reproducer_feature(libclang_t lib);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rebase and tag this with CSUPPORT_EXPORT, and do the same for libclang_scanner_generate_reproducer so it gets the __declspec(dllexport) modifier on windows

@@ -301,6 +301,19 @@ public final class ClangCompileTaskAction: TaskAction, BuildValueValidatingTaskA
outputDelegate.emitOutput("Failed frontend command:\n")
outputDelegate.emitOutput(ByteString(encodingAsUTF8: commandString) + "\n")
}

if case .some(.exit(.uncaughtSignal, _)) = outputDelegate.result {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We treat some signals as cancellation; should we avoid creating reproducers in that case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have a cancellation, does TaskProcessDelegate.commandResult == .cancelled? We can generate a reproducer only when lastResult == .failed to avoid useless reproducers. I think in my testing I've encountered signals to some compilations after the build encountered crashes for a different file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants