Skip to content

Reducing allocations in parsing #2001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Dec 23, 2022
Merged

Conversation

adamsitnik
Copy link
Member

While reading parsing logic I found plenty of places where we could remove some small allocations.

@adamsitnik adamsitnik requested a review from jozkee December 21, 2022 11:56
@@ -99,7 +100,7 @@ public class ParseResult
/// Gets the directives found while parsing command line input.
/// </summary>
/// <remarks>If <see cref="CommandLineConfiguration.EnableDirectives"/> is set to <see langword="false"/>, then this collection will be empty.</remarks>
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives { get; }
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new ();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the future we are going to use IReadOnlyDictionary<TKey, TValue>.Empty introduced very recently in .NET 8: dotnet/runtime#76028

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Static Empty properties were not added to the interfaces after all, but ReadOnlyDictionary<TKey, TValue>.Empty was added.

@@ -23,9 +23,10 @@ public TypoCorrection(int maxLevenshteinDistance)

public void ProvideSuggestions(ParseResult result, IConsole console)
{
for (var i = 0; i < result.UnmatchedTokens.Count; i++)
var unmatchedTokens = result.UnmatchedTokens;
Copy link
Member Author

@adamsitnik adamsitnik Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to store the result, because currently it's always allocating:

public IReadOnlyList<string> UnmatchedTokens => _unmatchedTokens.Select(t => t.Value).ToArray();

ofc I am going to propose a change to this API

{
throw new ArgumentException($"Incorrect token type: {token}");
}
Debug.Assert(token.Type == TokenType.Argument, $"Incorrect token type: {token}");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these types are internal and we control the input, so there is no need to the throw condition. Debug.Assert will keep us safe (the CI runs all the tests in debug) with 0 cost for the Release build (less IL to compile, possibility to inline the method)

@@ -41,9 +41,11 @@ public Parser() : this(new RootCommand())
/// <param name="rawInput">The complete command line input prior to splitting and tokenization. This input is not typically available when the parser is called from <c>Program.Main</c>. It is primarily used when calculating completions via the <c>dotnet-suggest</c> tool.</param>
/// <returns>A <see cref="ParseResult"/> providing details about the parse operation.</returns>
public ParseResult Parse(
IReadOnlyList<string> arguments,
IReadOnlyList<string>? arguments,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is a test that verifies that the user can pass null here. So I made it more visible.


var knownTokens = configuration.RootCommand.ValidTokens();

for (var i = 0; i < argList.Count; i++)
int i = FirstArgumentIsRootCommand(args, configuration.RootCommand, inferRootCommand)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the NormalizeRootCommand was allocating a new list just to ensure that the list of arguments starts with the root command alias.

argList.InsertRange(i + 1, newTokens ?? Array.Empty<string>());
if (newTokens is not null && newTokens.Count > 0)
{
List<string> listWithReplacedTokens = args.ToList();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be rare

var errorList = new List<string>();
const int FirstArgIsNotRootCommand = -1;

List<string>? errorList = null;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

usually there are no errors, so we delay the allocation until first error is found

…ntResult>

duplicate the code rather than introducing new public type like IdentifierSymbolResult
{
AddCompletionsFor(commands[i]);
var commands = Subcommands;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all the vars that don't comply with dotnet/runtime guidelines, are you planning on addressing them all at once in the future?

Suggested change
var commands = Subcommands;
IList<Command> commands = Subcommands;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to discuss it with other contributors. And the usage of is { } instead is not null which drives me crazy ;p

@@ -99,7 +98,7 @@ public class ParseResult
/// Gets the directives found while parsing command line input.
/// </summary>
/// <remarks>If <see cref="CommandLineConfiguration.EnableDirectives"/> is set to <see langword="false"/>, then this collection will be empty.</remarks>
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives { get; }
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new ();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new ();
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new();

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is a matter of personal preference. We should discuss it with others, choose one policy and enforce it with static code analysis tools.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an analyzer for that yet? I don't see one listed at https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/style-rules/csharp-formatting-options#spacing-options. (There is IDE0090 "Simplify new expression" for target-typed new, though.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@adamsitnik adamsitnik merged commit 88fb455 into dotnet:main Dec 23, 2022
@adamsitnik adamsitnik deleted the allocations branch December 23, 2022 13:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants