-
Notifications
You must be signed in to change notification settings - Fork 401
Reducing allocations in parsing #2001
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… when it's actually null
* don't allocate new string to remove brackets * don't allocate an array of characters (string.Split argument) * don't allocate an array of strings (result of string.Split)
…nternal methods that we control)
…it's creating a copy every time it's called!
@@ -99,7 +100,7 @@ public class ParseResult | |||
/// Gets the directives found while parsing command line input. | |||
/// </summary> | |||
/// <remarks>If <see cref="CommandLineConfiguration.EnableDirectives"/> is set to <see langword="false"/>, then this collection will be empty.</remarks> | |||
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives { get; } | |||
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new (); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in the future we are going to use IReadOnlyDictionary<TKey, TValue>.Empty
introduced very recently in .NET 8: dotnet/runtime#76028
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Static Empty
properties were not added to the interfaces after all, but ReadOnlyDictionary<TKey, TValue>.Empty
was added.
@@ -23,9 +23,10 @@ public TypoCorrection(int maxLevenshteinDistance) | |||
|
|||
public void ProvideSuggestions(ParseResult result, IConsole console) | |||
{ | |||
for (var i = 0; i < result.UnmatchedTokens.Count; i++) | |||
var unmatchedTokens = result.UnmatchedTokens; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need to store the result, because currently it's always allocating:
public IReadOnlyList<string> UnmatchedTokens => _unmatchedTokens.Select(t => t.Value).ToArray(); |
ofc I am going to propose a change to this API
{ | ||
throw new ArgumentException($"Incorrect token type: {token}"); | ||
} | ||
Debug.Assert(token.Type == TokenType.Argument, $"Incorrect token type: {token}"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these types are internal and we control the input, so there is no need to the throw condition. Debug.Assert
will keep us safe (the CI runs all the tests in debug) with 0 cost for the Release
build (less IL to compile, possibility to inline the method)
@@ -41,9 +41,11 @@ public Parser() : this(new RootCommand()) | |||
/// <param name="rawInput">The complete command line input prior to splitting and tokenization. This input is not typically available when the parser is called from <c>Program.Main</c>. It is primarily used when calculating completions via the <c>dotnet-suggest</c> tool.</param> | |||
/// <returns>A <see cref="ParseResult"/> providing details about the parse operation.</returns> | |||
public ParseResult Parse( | |||
IReadOnlyList<string> arguments, | |||
IReadOnlyList<string>? arguments, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a test that verifies that the user can pass null
here. So I made it more visible.
|
||
var knownTokens = configuration.RootCommand.ValidTokens(); | ||
|
||
for (var i = 0; i < argList.Count; i++) | ||
int i = FirstArgumentIsRootCommand(args, configuration.RootCommand, inferRootCommand) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the NormalizeRootCommand
was allocating a new list just to ensure that the list of arguments starts with the root command alias.
argList.InsertRange(i + 1, newTokens ?? Array.Empty<string>()); | ||
if (newTokens is not null && newTokens.Count > 0) | ||
{ | ||
List<string> listWithReplacedTokens = args.ToList(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be rare
var errorList = new List<string>(); | ||
const int FirstArgIsNotRootCommand = -1; | ||
|
||
List<string>? errorList = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
usually there are no errors, so we delay the allocation until first error is found
… the parsing, we take advantage of its mutability and remove the root command token instead of creating a copy of the whole list.
…allocates a list)
… allocates a list)
…ntResult> duplicate the code rather than introducing new public type like IdentifierSymbolResult
{ | ||
AddCompletionsFor(commands[i]); | ||
var commands = Subcommands; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For all the var
s that don't comply with dotnet/runtime guidelines, are you planning on addressing them all at once in the future?
var commands = Subcommands; | |
IList<Command> commands = Subcommands; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to discuss it with other contributors. And the usage of is { }
instead is not null
which drives me crazy ;p
@@ -99,7 +98,7 @@ public class ParseResult | |||
/// Gets the directives found while parsing command line input. | |||
/// </summary> | |||
/// <remarks>If <see cref="CommandLineConfiguration.EnableDirectives"/> is set to <see langword="false"/>, then this collection will be empty.</remarks> | |||
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives { get; } | |||
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new (); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new (); | |
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this is a matter of personal preference. We should discuss it with others, choose one policy and enforce it with static code analysis tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there an analyzer for that yet? I don't see one listed at https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/style-rules/csharp-formatting-options#spacing-options. (There is IDE0090 "Simplify new
expression" for target-typed new
, though.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
…ting a Union of the same hash set
While reading parsing logic I found plenty of places where we could remove some small allocations.