runtime/pprof cpu profiling is not implemented for windows #2041

Closed
alexbrainman opened this issue Jul 6, 2011 · 6 comments
@alexbrainman

See https://groups.google.com/d/topic/golang-nuts/-5Pu5MTxRng/discussion for suggestions.
@gopherbot

Comment 1 by john.arbash.meinel:

A few direct links:
http://www.osronline.com/showthread.cfm?link=186286
http://msdn.microsoft.com/en-us/library/aa363784%28VS.85%29.aspx
http://undocumented.ntinternals.net/UserMode/Undocumented%20Functions/NT%20Objects/Profile/NtCreateProfile.html
It looks like "Event Tracing for Windows" (ETW) is the official API, using the NT Kernel
Logger session. There may be some issues, though, because there are comments like "there
can be only one consumer," which might mean you can only profile one application at a
time. (This may not be true when using private logging.)
There are also the ZwCreateProfile and NtCreateProfile functions, which are officially
undocumented.
It may depend on whether we just want to get updates and do the sampling ourselves, or
whether we want to ask the OS to do the sampling and just turn the results into file/line
offsets ourselves.

@gopherbot

Comment 2 by john.arbash.meinel:

The main limitation of Event Tracing for Windows seems to be that it wants
Administrator/Performance Users access. I wrote a C++ executable, and it only works if I
RunAsAdministrator. I'm on Vista Home Basic, though, so it isn't easy to add myself to
the Performance Users group. Also, the headers from Express Edition don't seem to
#define some of the constants described in the online docs.
Digging more, I came across SetTimer and CreateWaitableTimer.
SetTimer doesn't work very well, because it requires creating a "window" and then
polling for messages on that window. (Which is, I believe, how "events" occur on
Windows: nothing is async; a master thread polls for new messages.)
CreateWaitableTimer could work, and SetWaitableTimer lets you specify a callback.
However, the callback can only run on a thread that is waiting/sleeping, in which case
spawning a thread and just sleeping seems like a better method for this use case.
So in the end, the best I could find was just spawning a thread and calling Sleep() in
it. Sleep takes a millisecond parameter, which has 'clock-tick' granularity. On my
system that appears to be 1ms; on XP I think it was 15ms.
You could use a waitable timer, which has 100ns granularity. I don't know how well it
sleeps, though.
So I would say, if you can find the right headers, using OpenTrace
http://msdn.microsoft.com/en-us/library/aa364089%28VS.85%29.aspx
to get at the built-in performance monitoring would be nice, but it requires that the
user also has elevated privileges. Falling back to just CreateThread + Sleep seems a more
immediately useful option.
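The "spawn a thread and sleep" fallback described above can be sketched in portable Go. This is only an illustration of the idea, not what the runtime does: a goroutine with a `time.Ticker` stands in for CreateThread + Sleep, and `runSampler` and `countSamplesMillis` are hypothetical names introduced for this example. The number of callbacks actually delivered depends on the platform's timer granularity, as noted in the comment above.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"time"
)

// runSampler calls sample once per period on a dedicated goroutine until
// stop is closed. The returned channel is closed when the goroutine exits.
// It is a stand-in for a native thread that loops on Sleep().
func runSampler(period time.Duration, sample func(), stop <-chan struct{}) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for {
			select {
			case <-ticker.C:
				sample() // a real profiler would record thread IP/SP here
			case <-stop:
				return
			}
		}
	}()
	return done
}

// countSamplesMillis runs the sampler for roughly durationMs milliseconds
// with a period of periodMs milliseconds and reports how many times the
// callback fired.
func countSamplesMillis(periodMs, durationMs int) int64 {
	var n int64
	stop := make(chan struct{})
	done := runSampler(time.Duration(periodMs)*time.Millisecond,
		func() { atomic.AddInt64(&n, 1) }, stop)
	time.Sleep(time.Duration(durationMs) * time.Millisecond)
	close(stop)
	<-done
	return atomic.LoadInt64(&n)
}

func main() {
	// With a coarse system clock tick, far fewer than 50 samples may arrive.
	fmt.Println("samples taken:", countSamplesMillis(1, 50))
}
```

The sampler runs on its own goroutine so that a slow callback only delays later samples rather than blocking the code being observed.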

@gopherbot

Comment 3 by [email protected]:

I created a CL (http://golang.org/cl/4983048/) that implements pprof using
CreateTimerQueueTimer().
It is a timer-queue timer with millisecond resolution.
It can work as either a one-shot or a periodic timer.
Its callbacks run on a dedicated thread or thread pool (so a callback can run longer
than the timer period).
It does not require Administrator privileges.
It works on Windows 2000.
So it looks no worse than SIGPROF and seems a good replacement.
On each timer callback, the IP and SP of all threads are recorded by the same procedure
that does this job on Linux.
This may lead to wrong results.
First, sampling is performed more often than with SIGPROF: each timer period, as many
samples are recorded as there are threads executing Go code.
Second, absolute time is used, although it is possible to get a thread's working time.
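To make the first caveat concrete, here is a small sketch of the arithmetic: if every timer tick records one sample per thread executing Go code, the total sample count scales with the thread count. The function name `samplesRecorded` and the tick count are hypothetical values chosen for illustration; they are not taken from the CL.

```go
package main

import "fmt"

// samplesRecorded returns the number of samples collected when every timer
// tick records one sample for each thread executing Go code.
func samplesRecorded(ticks, goThreads int) int {
	return ticks * goThreads
}

func main() {
	const ticks = 100 // hypothetical: one second of a 100 Hz timer
	for _, threads := range []int{1, 2, 8} {
		fmt.Printf("%d thread(s): %d samples\n", threads, samplesRecorded(ticks, threads))
	}
}
```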

@gopherbot

Comment 4 by [email protected]:

Also, a good option would be to emit debugging info in Microsoft's format (.PDB).
It would allow the use of many external profilers (for example, Intel VTune).

@rsc

rsc commented Oct 6, 2011

Comment 5:

Pretty sure this is fixed now?

@gopherbot

Comment 6 by hectorchu:

Yep, fixed by http://code.google.com/p/go/source/detail?r=9c5c0cbadb4d.

Status changed to Fixed.
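
For readers landing here, a minimal sketch of collecting a CPU profile with runtime/pprof, the feature this issue tracked. `pprof.StartCPUProfile` and `pprof.StopCPUProfile` are the package's real entry points; `busyWork`, `profileCPU`, and `profiledBytes` are hypothetical helpers for this example, and the profile is written to an in-memory buffer rather than a file purely to keep the sketch self-contained.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"runtime/pprof"
)

// busyWork burns some CPU so the profiler has something to sample.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// profileCPU runs work while a CPU profile is being written to w.
// The profile data is flushed to w when StopCPUProfile returns.
func profileCPU(w io.Writer, work func()) error {
	if err := pprof.StartCPUProfile(w); err != nil {
		return err
	}
	defer pprof.StopCPUProfile()
	work()
	return nil
}

// profiledBytes profiles busyWork(n) into a buffer and returns the size of
// the resulting profile, or -1 if profiling could not be started.
func profiledBytes(n int) int {
	var buf bytes.Buffer
	result := 0
	if err := profileCPU(&buf, func() { result = busyWork(n) }); err != nil {
		return -1
	}
	_ = result // keep the work's result live
	return buf.Len()
}

func main() {
	fmt.Println("profile bytes written:", profiledBytes(50000000))
}
```

In a real program the writer would usually be a file created with `os.Create`, which can then be inspected with `go tool pprof`.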

@golang golang locked and limited conversation to collaborators Jun 24, 2016
This issue was closed.