[GR-47106] Loading DWARF debug info into debugger can be very slow after GR-45654 #6936

Open
@adinn

Description

Commit GR-45654 (#6414) modified DWARF debug info generation to encode Java info in a single compilation unit (CU). This was done to allow code for top-level methods belonging to different classes to be interleaved in the code cache. However, for moderate to large programs it has caused a noticeable slowdown when gdb first starts executing.

The problem is only severe when full debug info generation is enabled (passing flags -H:GenerateDebugInfo=1, -H:+SourceLevelDebug, -H:+DebugCodeInfoUseSourceMappings, -H:-DeleteLocalSymbols, -H:+TrackNodeSourcePosition). However, it is also likely to cause problems for very large programs even when only a subset of those flags is provided. Mandrel integration test issue 160 details the severity for a moderately sized Quarkus application, where placement of the initial breakpoint takes up to 300 seconds.

The problem arises from two stages of DWARF processing. The first is line info processing, which is polynomial in the number of files in a CU's file table. The single CU DWARF for the above example includes > 10,000 files, whereas the per-class CU DWARF has tables that usually contain 1-10 files, with rare cases of a few hundred. With single CU DWARF gdb has to process all the line info at startup. With multi-CU DWARF gdb only needs to load a subset of the CUs. Although this subset is large, the time is significantly reduced because the polynomial cost is paid per CU, on much smaller file tables.
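To give a feel for the scale of the difference, here is a rough cost model. It is only a sketch: the issue says the cost is polynomial without stating the degree, so the exponent of 2 is an assumption, as are the function name `line_info_cost` and the figure of 4,000 per-class CUs with 5 files each. The 10,000-file table is the only number taken from the example above.

```python
def line_info_cost(file_counts, exponent=2):
    """Toy cost model: each CU costs (files in its table) ** exponent.

    The exponent is an assumption; the real degree of the polynomial
    is not stated in this issue.
    """
    return sum(n ** exponent for n in file_counts)


# One CU whose file table holds every file in the image.
single_cu = line_info_cost([10_000])

# Hypothetical split: 4,000 per-class CUs with 5 files each.
# Files can appear in more than one table, so the counts need not
# add up to the single-CU total.
per_class = line_info_cost([5] * 4_000)

ratio = single_cu // per_class
print(single_cu, per_class, ratio)
```

Under these assumptions, loading every per-class CU would still cost far less than loading the single CU, and gdb would normally load only a subset of them.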

The second costly stage is inline method tree processing, which is proportional to the number of inline tree nodes in the DWARF info. Splitting the DWARF into multiple CUs does not result in fewer nodes. However, with single CU DWARF gdb has to process all the inline entries at startup, while with multi-CU DWARF gdb only processes a subset of them.

It would be beneficial to return to multiple, per-class CUs and find some other way to deal with the fact that the code range for a specific class's CU is split into subranges which may interleave with code ranges belonging to another class/CU. It should be possible to achieve this by adding a DWARF debug_ranges section and labelling each CU with a DW_AT_ranges attribute that references the relevant details in the debug_ranges section.
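As a rough illustration of the idea, the sketch below builds range lists for two classes whose code is interleaved. It assumes the DWARF v4 range list layout (pairs of address-sized begin/end offsets, closed by a pair of zeros) on a 64-bit little-endian target, with offsets relative to a CU base address of 0. The helper names and the addresses are hypothetical. This is not the native image generator's implementation.

```python
import struct


def merge_ranges(ranges):
    """Sort half-open (lo, hi) ranges and coalesce those that touch or overlap."""
    merged = []
    for lo, hi in sorted(ranges):
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def encode_range_list(ranges, base=0):
    """Encode ranges as (begin, end) offset pairs followed by a (0, 0) terminator.

    Assumes 8-byte little-endian addresses and non-empty ranges.
    """
    out = bytearray()
    for lo, hi in merge_ranges(ranges):
        out += struct.pack("<QQ", lo - base, hi - base)
    out += struct.pack("<QQ", 0, 0)  # end-of-list entry
    return bytes(out)


# Hypothetical layout: a method of class B sits between methods of class A.
class_a = [(0x1000, 0x1040), (0x1080, 0x10C0), (0x1040, 0x1060)]
class_b = [(0x1060, 0x1080)]

a_list = encode_range_list(class_a)  # two merged ranges plus terminator
b_list = encode_range_list(class_b)  # one range plus terminator
```

Each CU's DW_AT_ranges attribute would then hold the offset of its own list within the section, so a debugger can map an address to the owning CU even though the code ranges are not contiguous.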
