-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Debugtypes #3046
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Debugtypes #3046
Conversation
This PR implements issue #2986. Note that the debug info model relies on the use of data_location attributes on class, array, interface and header info record types to ensure gdb translates oop references to memory addresses. There is a bug in printing of objects which employ data_location attributes in gdb releases 7.x, 8.x and 9.x. This means that attempts to print objects field by field fail with an internal error. This problem has been fixed in gdb 10.1. I am currently negotiating with the Red Hat gdb team to try to get the relevant patch back-ported to gdb 9.x and 8.x. Meanwhile, anyone wanting to use this upgrade to the debuginfo feature on Linux will need to ensure that they employ gdb 10. This also means that the debuginfo gate test probably needs to be updated so it is skipped when the gdb release is less that 10. Unless the machines used to run the gate tests can be configuerd to use gdb 10. |
1c263a8
to
58a9ea2
Compare
@olpaw I have fixed all the outstanding style issues and corrected the debuginfo script so that it only tests printing of objects field by field when gdb10+ is available. So, this PR is now complete and ready for review. |
@olpaw I forgot to mention that this version works whether or not the build employs isolates. |
Hi @adinn looking forward to go through your PR! |
@olpaw Thanks. There is no rush. I am officially on holiday for all of December (official return Jan 4th). However, I will be around up to Christmas week keeping tabs on a few things, the most important being this PR and the patch backport I requested from our gdb team. I'll do my best to respond to any feedback you can give in a timely fashion so this doesn't go cold and require you to effectively restart the review in January. |
Hi @adinn Unfortunately I was not able to look into your PR. Looking closer into serialization more & more work was/is needed to get serialization support into some form that will allow us to support all possible serialization usage patterns. I will be off from next week on till Jan 6th. Debuginfo will have to wait till then. Happy holidays & best wishes for the new year! |
@olpaw Sure, no problem. Have a good holiday and new year! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @adinn :)
I have left a few minor comments
...tratevm/src/com.oracle.objectfile/src/com/oracle/objectfile/debuginfo/DebugInfoProvider.java
Show resolved
Hide resolved
...tevm/src/com.oracle.objectfile/src/com/oracle/objectfile/elf/dwarf/DwarfInfoSectionImpl.java
Outdated
Show resolved
Hide resolved
...tevm/src/com.oracle.objectfile/src/com/oracle/objectfile/elf/dwarf/DwarfInfoSectionImpl.java
Outdated
Show resolved
Hide resolved
.../src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/image/NativeImageDebugInfoProvider.java
Show resolved
Hide resolved
.../src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/image/NativeImageDebugInfoProvider.java
Outdated
Show resolved
Hide resolved
@olpaw Hi Paul, any chance of a review for this PR soon? |
Hi @adinn, I have a bit more backlog to go through this week but I should be able to take a look sometime next week. |
Hi @olpaw. Great. Thanks for the update. |
Hi @adinn, I get a NPE when I build the native-image image with debuginfo on this PR. It only happens when I build with
|
Hmm, thanks for letting me know. I'll take a look and see what I can find. |
@olpaw Hi Paul. I am confused by the error you got because it happens on line 216 where there is this code
So, an NPE at that line can only imply that
So, this is a bit of an odd result. Perhaps you could check your build and also that you have the right code for the patch? Also, are you running with -J-ea -J-esa If all that passes then perhaps yo ucan give me th eful lcommand line and test code. However, before doing that ... yes, wait, there's more! I ran native-image on Hello.java with -O0 and got an assertion rather than an NPE. I rebased my PR over the latest upstream repo and it was still there. So, there is a bug but not the one you mentioned: The problem I encountered was that with -O0 I end up seeing two compilation ranges for what is essentially the same source method. There are two distinct code ranges for code compiled from method
One of these code ranges appears to be a generated stub. It is associated with file If -O0 is omitted I only see the compiled range for the Isolates stub so the problem does not arise. |
n.b. I have pushed a fix for the assert that happens when -O0 is specified. |
I'm on e745c48. To reproduce you have to use a Java11 JVMCI JDK. It happens when
fails (as expected).
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good
Ok, thanks.
Well, I have been using jvmci-20.3-b05. I'll switch to b09 and see if that reproduces the problem.
Ok, good. Glad to see I had the foresight to plant a correct sanity check even if I didn't get the code right! I'll see if I can catch that specific case and work out what is happening. Is there a specific app you are compiling that causes this error? I suspect that what is happening is that there is a disparity in the determination of the class name between the point where the ClassEntry is created and this point where it is looked up. The difficulty here is that there are two classes in play (Module and Target_java_lang_Module) contributing code elements to what must appear as a single class in the debug view of the native image. The general policy adopted in my implementation is to use the originating class (i.e Module) to group any such original and susbtituted elements (n.b. the methods that identify the relevant class by name is in the NativeImageDebugInfo suite of classes). It looks like this is a case where a substituted method or a newly introduced method is being reported as belonging to the original class at one point and to the substituted (Target) class at another point. |
Yes. The native-image image (the image that gets built out of the native-image driver). Build it with
If you are building this with JAVA_HOME set to a Java 11 JVMCI JDK you should have no problem replicating the issue. |
@olpaw I found the problem with resolving the type name -- the lookup of the original class name for a method was being sidetracked into using the Target class when the method came from the Target. The same problem was present for fields. Constructor name substitutuion (class name to replace for init) was also being done wrong. I have pushed a fix for all of them. |
@olpaw Thanks for the javadoc, Paul. I now have a patch implemented and tested that should work for CE and EE. I'm working on a minor re-organization to make the array length field an element of each array layout type rather than bring it in from a special array header. I'll push them both together 'real soon'. |
@olpaw I have pushed all the changes needed to 1) correct indirect oop conversion and 2) make arrays inherit a normal object header with the length field overlaid in each array implementation layout. I also updated the gdb test script used by mx debuginfotest to 1) expect the new layouts and 2) bypass testing of value printing for all gdb versions (including gdb 10.1). This is temporary until I can get the fedora/RHEL gdb bug fixed. If you have a local gdb built without the fedora/RHEL bug you can override to use that gdb and to enable printing by setting two env vars:
So this version ought to be complete and to address all review comments. |
Thanks for the hint. I will re-review and test your PR today. |
...tevm/src/com.oracle.objectfile/src/com/oracle/objectfile/elf/dwarf/DwarfInfoSectionImpl.java
Outdated
Show resolved
Hide resolved
.../src/com.oracle.svm.hosted/src/com/oracle/svm/hosted/image/NativeImageDebugInfoProvider.java
Outdated
Show resolved
Hide resolved
substratevm/src/com.oracle.objectfile/src/com/oracle/objectfile/debugentry/ClassEntry.java
Outdated
Show resolved
Hide resolved
substratevm/src/com.oracle.objectfile/src/com/oracle/objectfile/debugentry/DebugInfoBase.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much better now. Only a few more changes required (see prev comments).
In addition please
- Rebase to latest master
- Squash commits where it makes sense and make messages start uppercase
Btw, I have tested a bit locally and also -H:+SpawnIsolates
seems to work as expected (running debuginfotest fails though).
If you rebase to master I can easily test some more locally with the other reference access modes.
first cut of debug info model and DWARF generator employing class layouts rebase on updated master + doc and style fixes fix problem with method name computation test type and instance printing from debuginfotest script building without isolates fix method name and file lookup update documentation to include details of debug type info support apply consistent file lookup rules
…oops ensure gdb can decode object references to addresses style fixes rebase and correct object header layout style fixes fix debuginfotest script eclipse style fixes disable test of object layout printing if gdb version less than 10 correct pattern for gdb version match corrections after zakkak review eclipse style fixes fix index key duplication when a stub and the method it is derived from both get included with -O0 fix error in lookup of original class for methods and fields fix related problem with constructor method names use gdb from env var GDB_BIN if available Fix typos and out of date content simplify code add check for printing of statci fields avoid numerous static imports of constants style fixes add details of currently missing features
minor tweak to object sizing routines
…test gdb bug minor code cleanups style fixes fix comment text further fixes after latest review
Hi @olpaw, Sorry for the delay in pushing the remaining fixes -- a meeting got in the way.
Now done.
I'm not sure what you are saying is failing here, the test in any circumstances or when using -H:+SpawnIsolates. Testing with mx debuginfotest only ever tests the -H:-SpawnIsolates case. It works for me when I have a vanilla gdb in the path or specified using GDB_BIN and I also set env var GDB_CAN_PRINT=True. It should work with any gdb if GDB_CAN_PRINT is unset or set to another value. Can you explain precisely how you are testing and how it is failing? The python driver script (mx_substratevm.py) only builds the native image for Hello using command line option -H:-SpawnIsolates. That's because the gdb script (testhello.py) is written to expect output in the format you would see for that mode. That includes details of both printed objects and printed types. Testing an image built with -H:+SpawnIsolates will require either 1) providing a new gdb script which looks for print object/type output in a different format or 2) modifying the existing script to check the output for a format determined by a config setting supplied when the script is run. I plan to implement such a change. Does that need to be done as part of this commit or provided as a follow-up?
Should be possible now. |
Only when I run with
With --- a/substratevm/mx.substratevm/mx_substratevm.py
+++ b/substratevm/mx.substratevm/mx_substratevm.py
@@ -734,7 +734,7 @@ def _debuginfotest(native_image, path, build_only, args):
'-cp', classpath('com.oracle.svm.test'),
'-Dgraal.LogFile=graal.log',
'-g',
- '-H:-SpawnIsolates',
+ '-H:+SpawnIsolates',
'-H:DebugInfoSourceSearchPath=' + sourcepath,
'-H:DebugInfoSourceCacheRoot=' + join(path, 'sources'),
'hello.Hello'] + args I get
I figured that this needs tweaking of the checking script. It's fine if this will be added in a follow up PR. |
Alright. Looks good now. I will create the internal PR and run it through our gates. Thanks a lot for your great contribution! |
Excellent. Let's hope it all just sails through :-) Of course, I'll be happy to help when it turns out there is more rework :-p
Thanks for your patience and very careful reviewing. Your help in getting this right (and right for all cases) is very much appreciated. |
@adinn your PR is on master. Our gates found one small issue. See 912d73e I also tested your implementation against our third reference access usecase were we have:
Unfortunately that did not work. But I found the reason and have already fixed it locally. |
@olpaw Hurrah! Thanks very much for providing the requisite patch-ups. I'll keep an eye out for the patch. I have a patch ready for the debuginfotypes test script and driver code that tests both settings for SpawnIsolates. I'll create a new PR for that asap. |
No description provided.