Skip to content

3C inserts cast into unwritable inline function body that calls function pointer with itype #423

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
john-h-kastner opened this issue Feb 10, 2021 · 5 comments
Assignees
Labels

Comments

@john-h-kastner
Copy link
Collaborator

john-h-kastner commented Feb 10, 2021

Running 3c stdlib_checked.h -- -O2 gives an error

In file included from /home/cc/checkedc/include/stdlib_checked.h:14:
In file included from /usr/include/stdlib.h:825:
/usr/include/x86_64-linux-gnu/bits/stdlib-bsearch.h:1:1: error: 3C internal error: 3C generated changes to this file even though it is not allowed to write to the file (https://github.com/correctcomputation/checkedc-clang/issues/387)
/* Perform binary search - inline version.
^
/usr/include/x86_64-linux-gnu/bits/stdlib-bsearch.h:1:1: note: use the -dump-unwritable-changes option to see the new version of the file
1 error generated.

This error is currently affecting the benchmark programs tested in our actions repository. -O2 makes gcc set __USE_EXTERN_INLINES which causes the definition for bsearch to be inlined in the stdlib.h. 3C then tries to insert a cast in the definition which triggers the unwritable change error.

The example below demonstrates the same error.

#define __USE_EXTERN_INLINES
#include <stdlib.h>
_Itype_for_any(T) void *bsearch(const void *key : itype(_Ptr<const T>),
                                const void *base : itype(_Array_ptr<const T>) byte_count(nmemb * size),
                                size_t nmemb, size_t size,
                                int ((*compar)(const void *, const void *)) :
                                  itype(_Ptr<int(_Ptr<const T>, _Ptr<const T>)>)
                                ) : itype(_Ptr<T>);
@mattmccutchen-cci
Copy link
Member

@john-h-kastner Want me to look at this or do you think you will be faster? Hopefully it's just a matter of adding an if (canWrite(PSL)) in the right place.

@mattmccutchen-cci mattmccutchen-cci self-assigned this Feb 10, 2021
@mattmccutchen-cci
Copy link
Member

mattmccutchen-cci commented Feb 10, 2021

I tried this and suppressing the cast is indeed just a matter of adding a canWrite check to CastPlacementVisitor::surroundByCast. While this is probably acceptable as a quick fix, we should consider whether it's the right thing to do in general. In more detail:

  1. If 3C thinks it should insert a cast but we don't let it, could that lead to a Checked C type checking error that would violate any of the usual expectations that 3C output will pass the type checker? E.g., we know that -alltypes may lead to type checking failures; might casts be another potential source of type checking failures, or can we (John?) argue that that will never happen?

    In the bsearch case, the cast insertion seems to be caused by the redeclaration of bsearch in stdlib_checked.h with itypes; if I remove the declaration, 3C doesn't insert the cast. What if the function were redeclared with an outright checked type? Of course, the Checked C system headers won't do that because they want to maintain compatibility with plain C code. Until now I thought it was OK for a user to redeclare an external function with checked types if they are willing to commit to making all calls in their own project checked, but if doing so breaks calls from other inline external functions, that might be a problem. If the answer is "don't redeclare external functions with checked types; instead use itypes and use #pragma CHECKED_SCOPE or similar if you want to enforce that you make only checked calls", then we may need to document that somewhere. Update 2021-02-16: I tested and the Checked C compiler doesn't allow a function to be declared once with unchecked types and once with checked types. AFAIK, an itype is the only thing that can differ.

  2. Even if suppressing the cast won't cause a compile error, is there any situation in which it would be better for 3C to generate constraints that remove the motivation for the cast? I'm guessing not. The normal case with unwritable files is that the user's program depends on an external library, not the other way around; in that case, the library shouldn't even reference elements of the user's program. Running 3C on a writable library and an unwritable dependent program is a scenario that AFAIK we currently don't support (and may never support).

  3. Unwritability aside, is it appropriate for 3C to insert an _Assume_bounds_cast when calling a function with an itype? (This would be a separate issue but probably makes sense to decide on here.) I'm guessing "yes": while a fully unchecked call would be allowed by Checked C's rules for itypes, we assume the user's goal is to make the code checked, so we insert _Assume_bounds_cast to that end, and the user will be able to remove the cast once they convert the code so that the original argument has a checked type. Update 2021-02-16: I learned that 3C tries not to insert a cast unless it believes it is necessary according to Checked C's rules. 3C has a bug where it believes the cast is needed in this case even though it isn't: see my later comment.

@mattmccutchen-cci mattmccutchen-cci added bug Something isn't working cast labels Feb 10, 2021
@mattmccutchen-cci mattmccutchen-cci changed the title 3C generates unwritable changes for stdlib_checked.h with -O2 3C inserts cast into unwritable inline function body that calls function with itype Feb 13, 2021
@john-h-kastner john-h-kastner added the benchmark failure A bug causing a failure in our nightly benchmark tests label Feb 15, 2021
@mattmccutchen-cci
Copy link
Member

I was trying to write a minimal regression test for this bug, and I found out that the circumstances that trigger it are narrower than I thought. In fact, I haven't found a way to trigger it at all without 3C thinking that the definition of bsearch is in a system header. I reduced the test case to the following:

#include "../bsearch_preproc_test.h"

where bsearch_preproc_test.h contains:

# 1 "foo.h" 3

const void *global_key;

void bsearch (void (*compar)(const void *));
inline void bsearch (void (*compar)(const void *)) {
  (*compar) (global_key);
}
_Itype_for_any(T) void bsearch(void ((*compar)(const void *)) :
                                 itype(_Ptr<void(_Ptr<const T>)>)
                               );

The # 1 "foo.h" 3 line is a marker generated by the preprocessor saying where the code originated. The 3 at the end indicates a system header (go figure). If I remove the 3, the bug doesn't trigger.

In fact, I found one place where 3C does something different based on whether code is marked as being in a system header. @john-h-kastner It looks like you added that check in the liberal itypes change (#356); why? Did it mitigate a problem that was later fixed in a better way via canWrite constraints (#391)? I'd guess that from now on, we should be relying on canWrite constraints and should not treat system headers any differently from other unwritable files. Furthermore, I'd think that setting Hasbody = false in 3C for a function that actually has a body, and whose body is actually traversed for cast insertion, would risk confusing us and leading to further bugs as we modify 3C. I tried removing the system header check and the regression tests still passed; I didn't try the benchmark tests.

Should I go ahead and submit a PR that removes the system header check? That may leave us without a known way to trigger the cast insertion bug. Should I still submit a PR to skip cast insertion in unwritable files even though I may not know of an example on which it makes a difference? What would be the appropriate regression tests for any of this? Obviously we don't want the bug to come back, but it's a little strange to write a regression test that tries to use the system header flag to trigger different behavior in 3C when we have no intention of 3C checking that flag ever again.

mattmccutchen-cci added a commit that referenced this issue Feb 16, 2021
The liberal itypes change (#356) added code to flag them as not having a
body even though they did. This may have been a useful mitigation for
certain problems with unwritable files at the time, but it risks causing
confusion and violating 3C invariants in the future. From now on, we
want to handle all unwritable files (including system headers) via the
new canWrite constraints approach.

It appears that removing this code will immediately fix one problem with
cast insertion that is currently blocking the benchmark tests (#423), so
we'll go ahead and make this change even though we don't fully
understand the cast insertion problem or how to appropriately
regression-test it.
@mattmccutchen-cci
Copy link
Member

I'm continuing to investigate exactly how the system header check led to the undesired cast insertion. The rabbit hole only gets deeper: it appears there may be additional bugs coming into play into this example, and once those bugs are fixed, it may become clearer whether there is still a reasonable example that can trigger cast insertion in an unwritable file. (Specifically, the system header check is altering the way the itype of compar is handled during declaration merging, but I'm now thinking the way that itype is handled without the system header check may be wrong in a different way.) In the meantime, I submitted #431 so we can continue to make progress on fixing the benchmark tests.

mattmccutchen-cci added a commit that referenced this issue Feb 16, 2021
The liberal itypes change (#356) added code to flag them as not having a
body even though they did. This may have been a useful mitigation for
certain problems with unwritable files at the time, but it risks causing
confusion and violating 3C invariants in the future. From now on, we
want to handle all unwritable files (including system headers) via the
new canWrite constraints approach.

It appears that removing this code will immediately fix one problem with
cast insertion that is currently blocking the benchmark tests (#423), so
we'll go ahead and make this change even though we don't fully
understand the cast insertion problem or how to appropriately
regression-test it.
@mattmccutchen-cci mattmccutchen-cci removed the benchmark failure A bug causing a failure in our nightly benchmark tests label Feb 16, 2021
@mattmccutchen-cci mattmccutchen-cci changed the title 3C inserts cast into unwritable inline function body that calls function with itype 3C inserts cast into unwritable inline function body that calls function pointer with itype Feb 17, 2021
@mattmccutchen-cci
Copy link
Member

I finally found a way to reproduce the problem that doesn't depend on any of the system header or declaration merging stuff: 3c bsearchlike.c -- where:

bsearchlike.c:

#include "../bsearchlike.h"

../bsearchlike.h:

void f(void ((*g)(int *q)) : itype(_Ptr<void(_Ptr<int>)>)) {
  int *p = 0;
  (*g)(_Assume_bounds_cast<_Ptr<int>>(p));
}

3C inserts a cast only if it believes the cast is necessary according to Checked C's rules, but apparently 3C doesn't correctly implement the Checked C rules in this case because 3C thinks a cast is needed but it actually isn't (the Checked C compiler accepts bsearchlike.h as written). When 3C processes the parameter g, it initializes both the internal and the external PVConstraints based on the itype and attaches a note to those PVConstraints that there was originally an itype. But the nested PVConstraint for q just has type _Ptr<int> with no note about an itype, so of course 3C thinks that p needs to be casted to _Ptr<int>. The Checked C compiler must be doing something different that allows g to be called using the unchecked type for q.

So, in what is becoming a theme, if (A) an unwritable file never references a writable file (which 3C might modify), (B) the existing unwritable files pass the Checked C type checker, and (C) 3C's cast insertion logic perfectly matches the Checked C type checker's logic, then 3C should never determine that an existing unwritable file needs any more casts inserted. But if any of these conditions fails ((C) in the example above), then we get an unwritable change.

AFAIK, none of our real-world code currently violates condition (C) on the 3C main branch, so we can hold off a bit on addressing this issue, but once #433 is fixed, I believe the original bsearch example will violate condition (C) again. At that point, we can choose to:

  1. Sweep the problem under the rug once and for all by making 3C not insert casts in unwritable files.
  2. If feasible, fix 3C's cast insertion logic in the bsearch case and leave any other violations of condition (C) to generate unwritable change errors so we can review them and decide what to do.
  3. Some compromise: for example, when 3C tries to insert a cast in an unwritable file, skip it with a warning rather than generating an unwritable change that will cause an error later. We may or may not ever prioritize monitoring these warnings, but at least they will be available. Optionally, we could also fix the logic in the bsearch case.

Opinions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants