-
Notifications
You must be signed in to change notification settings - Fork 1.2k
OTel experimentation #49409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
OTel experimentation #49409
Conversation
| // Exceptions during telemetry shouldn't cause anything else to fail | ||
| // Exceptions during _telemetry shouldn't cause anything else to fail |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overeager find & replace 😸
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exactly right 😄 Part of me cutting out parts of this PR will involve minimizing those diffs
|
Rebased on top of #49749 |
0a0d749 to
a923755
Compare
… CLI application and codebase
…nts and several nullable disable removals. Changed s_source to Source as a static property.
…ader and ToolPackageDownloaderBase since those will be updated in main. Not interested in cleaning up test files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR explores replacing Application Insights telemetry with OpenTelemetry (OTel) for the .NET CLI. The change removes the AppInsights dependency and custom persistence channel implementation, switching to OTel's Azure Monitor exporter with built-in durable storage. This enables AOT compilation support while maintaining telemetry functionality.
Key Changes:
- Replaced Application Insights with OpenTelemetry SDK, adding Azure Monitor and OTLP exporters
- Removed custom persistence channel infrastructure (storage, transmission, sender classes)
- Converted telemetry events to OTel Activities with events
- Implemented activity context propagation to child processes via environment variables
Reviewed Changes
Copilot reviewed 45 out of 46 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Cli/dotnet/dotnet.csproj | Added OpenTelemetry package references |
| src/Cli/dotnet/Program.cs | Major refactor: initialized OTel providers, removed AppInsights code, added activity tracking and signal handlers |
| src/Cli/dotnet/Telemetry/Telemetry.cs | Simplified telemetry to use Activities instead of AppInsights TelemetryClient |
| src/Cli/dotnet/Telemetry/TelemetryCommonProperties.cs | Updated return types to support nullable object values |
| src/Cli/dotnet/Telemetry/ITelemetry.cs | Removed Flush/Dispose methods, updated TrackEvent signature |
| src/Cli/dotnet/Telemetry/PersistenceChannel/* | Deleted entire persistence channel implementation |
| test/dotnet.Tests/TelemetryTests/* | Removed tests for deleted persistence channel |
| test/dotnet.Tests/TelemetryCommandTest.cs | Updated test calls removing startup time parameter |
| src/Cli/dotnet/Commands/MSBuild/MSBuildLogger.cs | Removed sentinel dependency, simplified telemetry initialization |
| src/Cli/dotnet/CommandFactory/CommandResolution/ActivityContext.cs | Added new class for propagating activity context to child processes |
src/Cli/dotnet/Program.cs
Outdated
| PerformanceLogEventSource.Log.CLIStop(); | ||
| commandParsingException.ParseResult.ShowHelp(); | ||
| } | ||
| _mainActivity.AddTag("process.exit.code", exitCode); |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable 'exitCode' is used but not declared in this scope. The exit code should be captured from the catch block above (line 127 returns hardcoded 1). Consider adding int exitCode = 1; before this line or removing this tag addition from the catch block.
src/Cli/dotnet/Program.cs
Outdated
| // If telemetry object has not been initialized yet. It cannot be collected | ||
| TelemetryEventEntry.SendFiltered(e); | ||
| Reporter.Error.WriteLine(e.ToString().Red().Bold()); | ||
| _mainActivity.AddTag("process.exit.code", exitCode); |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable 'exitCode' is used but not declared in this scope. The exit code should be captured from the catch block (line 136 returns hardcoded 1). Consider adding int exitCode = 1; before this line or removing this tag addition from the catch block.
| [Fact] | ||
| public void TelemetryCommonPropertiesShouldReturnIsLLMDetection() | ||
| { | ||
| var unitUnderTest = new TelemetryCommonProperties(getMACAddress: () => null, userLevelCacheWriter: new NothingCache()); | ||
| unitUnderTest.GetTelemetryCommonProperties("dummySessionId")["llm"].Should().BeOneOf("claude", null); | ||
| } |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test method is a duplicate of the test at lines 158-161. The method name and implementation are identical, which will cause test execution issues. One of these duplicate tests should be removed.
| [Fact] | |
| public void TelemetryCommonPropertiesShouldReturnIsLLMDetection() | |
| { | |
| var unitUnderTest = new TelemetryCommonProperties(getMACAddress: () => null, userLevelCacheWriter: new NothingCache()); | |
| unitUnderTest.GetTelemetryCommonProperties("dummySessionId")["llm"].Should().BeOneOf("claude", null); | |
| } |
| // Licensed to the .NET Foundation under one or more agreements. | ||
| // The .NET Foundation licenses this file to you under the MIT license. | ||
|
|
||
| #nullable disable |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a new file with #nullable disable at the top, which is discouraged for new code. Consider enabling nullable reference types and properly annotating nullability throughout the file.
| public FrozenDictionary<string, object?> GetTelemetryCommonProperties(string currentSessionId) | ||
| { | ||
| return new Dictionary<string, string> | ||
| return new Dictionary<string, object?> |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method returns FrozenDictionary<string, object?> but creates a Dictionary<string, object?>. The dictionary should be converted to a frozen dictionary by calling .ToFrozenDictionary() on the dictionary before returning to match the return type and provide the immutability/performance benefits of FrozenDictionary.
|
|
||
| public static void TrackEvent( | ||
| string? eventName = null, | ||
| string eventName, |
Copilot
AI
Oct 23, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter 'eventName' is now non-nullable (removed the ?), but the method body still checks for null at line 18 with if (string.IsNullOrEmpty(eventName)). Either the parameter should remain nullable with string?, or the null check should be removed since non-nullable parameters cannot be null when nullable reference types are enabled.
Fixes #49668
This is entirely exploratory so far - we can't use AppInsights libraries if we do AOT in the SDK, so I'm exploring what an OTel-based way of doing telemetry might look like.
Q: telemetry filtering/hashing
A: for now we should keep everything as it is. As we make new activities, we should use
Activity.IsAllDataRequestedas a check before adding/computing expensive data.Q: do we keep the store-out-of-band-then-rehydrate-and-send approach, or just do a best-case flush?
A: We can be very optimistic! We can have multiple collectors, each for different purposes. The default OTel collector can be for local use only (by default) and can be verbose, while the Azure Monitor collector can be for production only and be more restrained. Each Collector has its own implementation, and the Azure Monitor collector even handles durable message persistence, so we can delete all of our current implementation. We should still have it store its durable store to our owned directory, though.
Example Aspire Dashboard outputs for

dotnet new classlib:Telemetry events are visible in this view if the CLI is opted in to telemetry.