Skip to content

[checked-c-convert] Porting Tool Updates #638

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
secure-sw-dev-bot opened this issue Jan 16, 2022 · 3 comments
Closed

[checked-c-convert] Porting Tool Updates #638

secure-sw-dev-bot opened this issue Jan 16, 2022 · 3 comments

Comments

@secure-sw-dev-bot
Copy link

This issue was copied from checkedc/checkedc-clang#642


This PR adds several new features to checked-c-convert, including NT_array detection.

@secure-sw-dev-bot
Copy link
Author

Comment from @Machiry:

This will fix a few regressions caused by the recent commits. We also fix the resolution of function types by following the function subtyping rules as discussed here: https://gist.github.com/Machiry/962bf8c24117bc5f56a31598e6782100

@secure-sw-dev-bot
Copy link
Author

Comment from @Machiry:

I've started to take a look at this. Here is some initial feedback:

  • Is there a design document or note somewhere on the inference algorithm changes? I didn't notice one, but maybe I missed it. It would be helpful in reviewing the code.

We do not have a specific design document for this pull request. But, we tried to faithfully implement the highlevel idea as described in the paper (Section 5.3 of https://www.microsoft.com/en-us/research/uploads/prod/2019/05/checkedc-post2019.pdf)

If you see the Section 5.3 of the paper, there depending on how a parameter to a function is used by the callers and used inside the callee, we use itypes or casting. To do this, we need to have two sets of constraint variables for each of the function parameters i.e., one for the calleers (i.e., as seen outside the function) and one for the function body. Once we solve the constraints we can assign itype or casts. Similarly for return types.

However, our old implementation had inconsistencies in inferring checked types of the function parameters and returns. Specifically, when a function do not have a corresponding declaration e.g., static methods,

We changed this so that irrespective of whether a function has a explicit declaration or not we always maintain two sets of constraint variables for parameters and returns.

Function Subtyping

We introduced the support for _Nt_array_ptrs, this resulted in function subtyping issues as discussed in detail in: https://gist.github.com/Machiry/962bf8c24117bc5f56a31598e6782100
We implemented support for this.

  • It is great that you have written additional functionality tests. Could you have move them over to the clang\test\CheckedCRewriter directory and use the LLVM lit framework, instead of having a custom python script?

Agree. I converted the tests to lit and added them to CheckedCRewriter directory.

  • We try to follow to the clang coding convention. I know that the code for the rewriter hasn't always followed this. The things that I noticed about the new code include: there need to be space after if keywords, a space before and after = in default initializers, and * for pointer types should be next to the identifier (T *v, not T* v).

Sorry, I wasn't aware of this. I fixed them.

  • What is the runtime performance of the converter now? I noticed some potential sources of inefficiency.

Actually current implementation is faster. This is mainly because of a single line change (Thanks to @rchyena) in ConstraintVariables.cpp:
from:

214: auto env = CS.getVariables();

to

214: auto &env = CS.getVariables();

Numbers for Icecast :

New

real    3m51.869s  
user    3m50.132s  
sys    0m0.804s

Old

real    6m9.581s  
user    6m0.688s  
sys    0m8.004s

In general the current version of the tool is sufficiently fast, for libarchive (~170K lines) it took ~10min.

@secure-sw-dev-bot
Copy link
Author

Comment from @Machiry:

Overall, this looks great! Great work!

I have some changes that I've requested:

  • There's a space appearing after the itype constraint when it is added: `f(int * : itype(_Ptr) ). Could you remove it? The output of the tool is much nicer to read.

✅ Fixed.

  • The new code often uses maybe 30 characters a line for comments, spacing the comment across multiple lines. That is hard to read. The clang line limit is 80 characters. Could you take advantage of that and have longer lines?

✅ Fixed.

  • There were a number of typos in the comments that I pointed.

✅ Fixed. I used a spell checker this time :)

  • The formatting for if statements is still off in the new code that you've written. There are a number of places where you wrote if(, and a space needs to be placed after the if . I pointed a few out, and left many more for you to fix.

✅ Fixed. I used a format checker this time.

There are markdown files for Checked C in clang\docs\checkedc. Could you create one that capture the content that you posted about the algorithm in the conversion tool? That was really helpful for me. If you want, you can post it later as a separate commit. I just want to make it gets documented).

📚 Created one.

dtarditi pushed a commit that referenced this issue Jan 21, 2022
#638)

And migrate ConstraintVariable::mkString to use it.

Fixes #621.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant