ghc-stack-profiler

A lightweight profiler that doesn't need to compile your program with profiling information (i.e. -prof).

The main idea is to periodically sample the Haskell callstack and use IPE (info table provenance entry) and stack annotation information to map the stack frames to their source locations.

The profiling samples can be exported to speedscope.app for rendering:

[Image: GHC stack trace obtained with ghc-stack-profiler and viewed in speedscope.app]

🚧 Under Construction 🚧

This project is under active development. The API is unstable and may change at any moment.

This project requires the WIP GHC branch wip/fendor/ghc-stack-profiler.

Usage

To profile a program, add the ghc-stack-profiler package as a dependency and wrap the code you want to profile:

import GHC.Stack.Profiler.Sampler

main :: IO ()
main = withSampleProfilerForMyThread (SampleIntervalMs 10) $ do
    ...

This spawns a profiling thread that periodically takes a snapshot of the current RTS callstack of your program and serialises it to the eventlog.
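The general shape of such a sampling wrapper can be sketched with base alone. This is a minimal illustration, not the implementation used by ghc-stack-profiler: `withPeriodicSampler` and `takeSample` are hypothetical names, and the "snapshot" here only increments a counter instead of capturing a stack.

```haskell
import Control.Concurrent (forkIOWithUnmask, killThread, threadDelay)
import Control.Exception (bracket)
import Control.Monad (forever)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- | Run an action while a background thread calls 'takeSample' every
-- 'intervalMs' milliseconds. The sampler thread is killed when the action
-- finishes, whether it returns normally or throws.
-- (Hypothetical helper, not part of ghc-stack-profiler.)
withPeriodicSampler :: Int -> IO () -> IO a -> IO a
withPeriodicSampler intervalMs takeSample action =
  bracket
    -- 'bracket' masks async exceptions while acquiring, so unmask the
    -- sampler thread explicitly to keep it killable at any point.
    (forkIOWithUnmask (\unmask -> unmask samplerLoop))
    killThread
    (\_ -> action)
  where
    samplerLoop = forever $ do
      threadDelay (intervalMs * 1000) -- threadDelay takes microseconds
      takeSample

-- | Naive Fibonacci, used only to keep the main thread busy.
fib :: Int -> Integer
fib n = if n < 2 then toInteger n else fib (n - 1) + fib (n - 2)

main :: IO ()
main = do
  samples <- newIORef (0 :: Int)
  -- Stand-in for "snapshot the callstack": just count the ticks.
  let takeSample = atomicModifyIORef' samples (\n -> (n + 1, ()))
  result <- withPeriodicSampler 10 takeSample $ do
    let r = fib 27
    r `seq` pure r
  n <- readIORef samples
  putStrLn ("fib 27 = " ++ show result ++ ", samples taken: " ++ show n)
```

The number of samples varies from run to run, since it depends on how long the workload takes and on how the RTS schedules the sampler thread.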

To improve readability of the profile, compile the program with -finfo-table-map and -fdistinct-constructor-tables. Using cabal, this can be achieved with an appropriate cabal.project file:

packages: ...

...

package *
    ghc-options: -finfo-table-map -fdistinct-constructor-tables

For the profiler's messages to be written to the eventlog, you need to run your program with the -l RTS flag, for example via:

./<program> ... +RTS -l -RTS

This writes an eventlog to <program>.eventlog, which can be converted for speedscope.app with the script ghc-stack-profiler-speedscope:

ghc-stack-profiler-speedscope <program>.eventlog

The resulting profile <program>.eventlog.json can be viewed and further analysed in speedscope.app.
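To give a feel for what such a conversion involves: as far as we are aware, speedscope's sampled-profile format stores each distinct frame once in a shared table and represents every sample as a list of indices into that table. The sketch below shows that interning step on hand-written stacks. It is an illustration under those assumptions, not the code of ghc-stack-profiler-speedscope; `Stack`, `frameTable` and `indexSamples` are hypothetical names, and the sample data is made up.

```haskell
import Data.List (elemIndex, nub)
import Data.Maybe (mapMaybe)

-- | A sampled call stack: frame names ordered from outermost to innermost.
type Stack = [String]

-- | Build the shared frame table: each distinct frame name appears once,
-- in order of first appearance.
frameTable :: [Stack] -> [String]
frameTable = nub . concat

-- | Rewrite every stack as a list of indices into the frame table.
indexSamples :: [String] -> [Stack] -> [[Int]]
indexSamples table = map (mapMaybe (`elemIndex` table))

main :: IO ()
main = do
  -- Made-up samples, for illustration only.
  let stacks =
        [ ["main", "fib"]
        , ["main", "fib", "fib"]
        , ["main", "print"]
        ]
      table = frameTable stacks
  print table                       -- ["main","fib","print"]
  print (indexSamples table stacks) -- [[0,1],[0,1,1],[0,2]]
```

`nub` and `elemIndex` are quadratic and linear respectively, which is fine for a sketch; a real converter would more likely intern frames through a map.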

Note that the results are affected by compilation and optimisation options, such as -fno-omit-yields.

Example simple

The simple project is a typical Fibonacci implementation. Run it via cabal, assuming a supported GHC version is on $PATH:

cabal run exe:simple -- +RTS -l
cabal run exe:ghc-stack-profiler-speedscope -- simple.eventlog

Uploading the resulting simple.eventlog.json to speedscope.app shows something similar to:

[Image: Profile of the simple program]

Performance

Our initial testing consisted of instrumenting GHC itself.

When compiling Cabal-syntax with a sampling interval of 10 milliseconds, we observe an overhead of around 5% to 10%. The overhead depends on the callstack depth, the number of live threads and the sampling interval.

More sophisticated benchmarks are expected soon.
