Skip to content

Reducing allocations in parsing #2001

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Dec 23, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
26 commits
Select commit Hold shift + click to select a range
52af22d
delay error list allocation until it's needed
adamsitnik Dec 20, 2022
105e592
add nullable annotation for Parser.Parse(arguments), avoid allocation…
adamsitnik Dec 20, 2022
02c0daf
minor Token cleanup
adamsitnik Dec 20, 2022
9741080
don't allocate a new list to normalize RootCommand
adamsitnik Dec 20, 2022
d6a5026
don't call EndsWith just to check one character
adamsitnik Dec 20, 2022
bc087b6
avoid allocations for parsing directives:
adamsitnik Dec 20, 2022
c38ccc1
delay List<SyntaxNode> allocation
adamsitnik Dec 20, 2022
d0e99ee
token type checks can be performed only for debug builds (these are i…
adamsitnik Dec 20, 2022
e919201
remove unused property that was allocating a list
adamsitnik Dec 20, 2022
11d72b0
delay List<OptionResult> allocation
adamsitnik Dec 20, 2022
4a71f26
delay List<ArgumentResult allocation
adamsitnik Dec 20, 2022
c0e3594
delay Dictionary<string, IReadOnlyList<string>> allocation
adamsitnik Dec 20, 2022
e353656
store the results of ParseResult.UnmatchedTokens and re-use it since …
adamsitnik Dec 20, 2022
b9c4d42
avoid allocations when there are no unmatched tokens
adamsitnik Dec 20, 2022
ac725a3
Since TokenizeResult.Tokens is not public and not used anywhere after…
adamsitnik Dec 21, 2022
fea7753
remove TokenizeResult to reduce JIT time and allocations
adamsitnik Dec 21, 2022
a31e1ec
delay List<ParseError> allocation
adamsitnik Dec 21, 2022
1002343
don't access Command.Options if there are no options defined (it allo…
adamsitnik Dec 21, 2022
30671e8
don't access Command.Arguments if there are no arguments defined (it …
adamsitnik Dec 21, 2022
8e096c8
don't access Command.Subcommands if there are no commands defined (it…
adamsitnik Dec 21, 2022
7eefbeb
delay List<SymbolResult> allocation
adamsitnik Dec 21, 2022
8e04f50
delay List<Token> allocation
adamsitnik Dec 21, 2022
d6f38a2
not every SymbolResult needs to contain a Dictionary<Argument, Argume…
adamsitnik Dec 21, 2022
4afe5b5
Aliases are internally backed by a HashSet, there is no point in crea…
adamsitnik Dec 23, 2022
96c8e77
avoid list allocation
adamsitnik Dec 23, 2022
0bb3b51
address code review feedback
adamsitnik Dec 23, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
46 changes: 31 additions & 15 deletions src/System.CommandLine/Command.cs
Original file line number Diff line number Diff line change
Expand Up @@ -57,18 +57,22 @@ public IEnumerable<Symbol> Children
/// </summary>
public IList<Argument> Arguments => _arguments ??= new(this);

internal bool HasArguments => _arguments is not null;
internal bool HasArguments => _arguments is not null && _arguments.Count > 0 ;

/// <summary>
/// Represents all of the options for the command, including global options that have been applied to any of the command's ancestors.
/// </summary>
public IList<Option> Options => _options ??= new (this);

internal bool HasOptions => _options is not null && _options.Count > 0;

/// <summary>
/// Represents all of the subcommands for the command.
/// </summary>
public IList<Command> Subcommands => _subcommands ??= new(this);

internal bool HasSubcommands => _subcommands is not null && _subcommands.Count > 0;

/// <summary>
/// Validators to the command. Validators can be used
/// to create custom validation logic.
Expand Down Expand Up @@ -151,34 +155,44 @@ public override IEnumerable<CompletionItem> GetCompletions(CompletionContext con

if (context.WordToComplete is { } textToMatch)
{
var commands = Subcommands;
for (int i = 0; i < commands.Count; i++)
if (HasSubcommands)
{
AddCompletionsFor(commands[i]);
var commands = Subcommands;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all the vars that don't comply with dotnet/runtime guidelines, are you planning on addressing them all at once in the future?

Suggested change
var commands = Subcommands;
IList<Command> commands = Subcommands;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to discuss it with other contributors. And the usage of is { } instead is not null which drives me crazy ;p

for (int i = 0; i < commands.Count; i++)
{
AddCompletionsFor(commands[i]);
}
}

var options = Options;
for (int i = 0; i < options.Count; i++)
if (HasOptions)
{
AddCompletionsFor(options[i]);
var options = Options;
for (int i = 0; i < options.Count; i++)
{
AddCompletionsFor(options[i]);
}
}

var arguments = Arguments;
for (int i = 0; i < arguments.Count; i++)
if (HasArguments)
{
var argument = arguments[i];
foreach (var completion in argument.GetCompletions(context))
var arguments = Arguments;
for (int i = 0; i < arguments.Count; i++)
{
if (completion.Label.ContainsCaseInsensitive(textToMatch))
var argument = arguments[i];
foreach (var completion in argument.GetCompletions(context))
{
completions.Add(completion);
if (completion.Label.ContainsCaseInsensitive(textToMatch))
{
completions.Add(completion);
}
}
}
}

foreach (var parent in Parents.FlattenBreadthFirst(p => p.Parents))
ParentNode? parent = FirstParent;
while (parent is not null)
{
if (parent is Command parentCommand)
if (parent.Symbol is Command parentCommand && parentCommand.HasOptions)
{
for (var i = 0; i < parentCommand.Options.Count; i++)
{
Expand All @@ -190,6 +204,8 @@ public override IEnumerable<CompletionItem> GetCompletions(CompletionContext con
}
}
}

parent = parent.Next;
}
}

Expand Down
20 changes: 13 additions & 7 deletions src/System.CommandLine/Help/HelpBuilder.Default.cs
Original file line number Diff line number Diff line change
Expand Up @@ -201,11 +201,14 @@ public static Action<HelpContext> OptionsSection() =>
// by making this logic more complex, we were able to get some nice perf wins elsewhere
List<TwoColumnHelpRow> options = new();
HashSet<Option> uniqueOptions = new();
foreach (Option option in ctx.Command.Options)
if (ctx.Command.HasOptions)
{
if (!option.IsHidden && uniqueOptions.Add(option))
foreach (Option option in ctx.Command.Options)
{
options.Add(ctx.HelpBuilder.GetTwoColumnRow(option, ctx));
if (!option.IsHidden && uniqueOptions.Add(option))
{
options.Add(ctx.HelpBuilder.GetTwoColumnRow(option, ctx));
}
}
}

Expand All @@ -218,12 +221,15 @@ public static Action<HelpContext> OptionsSection() =>
{
if ((parentCommand = parent.Symbol as Command) is not null)
{
foreach (var option in parentCommand.Options)
if (parentCommand.HasOptions)
{
// global help aliases may be duplicated, we just ignore them
if (option.IsGlobal && !option.IsHidden && uniqueOptions.Add(option))
foreach (var option in parentCommand.Options)
{
options.Add(ctx.HelpBuilder.GetTwoColumnRow(option, ctx));
// global help aliases may be duplicated, we just ignore them
if (option.IsGlobal && !option.IsHidden && uniqueOptions.Add(option))
{
options.Add(ctx.HelpBuilder.GetTwoColumnRow(option, ctx));
}
}
}

Expand Down
11 changes: 7 additions & 4 deletions src/System.CommandLine/Help/HelpBuilder.cs
Original file line number Diff line number Diff line change
Expand Up @@ -125,22 +125,25 @@ IEnumerable<string> GetUsageParts()
{
if (!displayOptionTitle)
{
displayOptionTitle = parentCommand.Options.Any(x => x.IsGlobal && !x.IsHidden);
displayOptionTitle = parentCommand.HasOptions && parentCommand.Options.Any(x => x.IsGlobal && !x.IsHidden);
}

yield return parentCommand.Name;

yield return FormatArgumentUsage(parentCommand.Arguments);
if (parentCommand.HasArguments)
{
yield return FormatArgumentUsage(parentCommand.Arguments);
}
}

var hasCommandWithHelp = command.Subcommands.Any(x => !x.IsHidden);
var hasCommandWithHelp = command.HasSubcommands && command.Subcommands.Any(x => !x.IsHidden);

if (hasCommandWithHelp)
{
yield return LocalizationResources.HelpUsageCommand();
}

displayOptionTitle = displayOptionTitle || command.Options.Any(x => !x.IsHidden);
displayOptionTitle = displayOptionTitle || (command.HasOptions && command.Options.Any(x => !x.IsHidden));

if (displayOptionTitle)
{
Expand Down
24 changes: 16 additions & 8 deletions src/System.CommandLine/Invocation/TypoCorrection.cs
Original file line number Diff line number Diff line change
Expand Up @@ -23,31 +23,39 @@ public TypoCorrection(int maxLevenshteinDistance)

public void ProvideSuggestions(ParseResult result, IConsole console)
{
for (var i = 0; i < result.UnmatchedTokens.Count; i++)
var unmatchedTokens = result.UnmatchedTokens;
Copy link
Member Author

@adamsitnik adamsitnik Dec 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to store the result, because currently it's always allocating:

public IReadOnlyList<string> UnmatchedTokens => _unmatchedTokens.Select(t => t.Value).ToArray();

ofc I am going to propose a change to this API

for (var i = 0; i < unmatchedTokens.Count; i++)
{
var token = result.UnmatchedTokens[i];
var suggestions = GetPossibleTokens(result.CommandResult.Command, token).ToList();
if (suggestions.Count > 0)
var token = unmatchedTokens[i];

bool first = true;
foreach (string suggestion in GetPossibleTokens(result.CommandResult.Command, token))
{
console.Out.WriteLine(result.CommandResult.LocalizationResources.SuggestionsTokenNotMatched(token));
foreach(string suggestion in suggestions)
if (first)
{
console.Out.WriteLine(suggestion);
console.Out.WriteLine(result.CommandResult.LocalizationResources.SuggestionsTokenNotMatched(token));
first = false;
}

console.Out.WriteLine(suggestion);
}
}
}

private IEnumerable<string> GetPossibleTokens(Command targetSymbol, string token)
{
if (!targetSymbol.HasOptions && !targetSymbol.HasSubcommands)
{
return Array.Empty<string>();
}

IEnumerable<string> possibleMatches = targetSymbol
.Children
.OfType<IdentifierSymbol>()
.Where(x => !x.IsHidden)
.Where(x => x.Aliases.Count > 0)
.Select(symbol =>
symbol.Aliases
.Union(symbol.Aliases)
.OrderBy(x => GetDistance(token, x))
.ThenByDescending(x => GetStartsWithDistance(token, x))
.First()
Expand Down
32 changes: 16 additions & 16 deletions src/System.CommandLine/ParseResult.cs
Original file line number Diff line number Diff line change
Expand Up @@ -14,44 +14,41 @@ namespace System.CommandLine
/// </summary>
public class ParseResult
{
private readonly List<ParseError> _errors;
private readonly IReadOnlyList<ParseError> _errors;
private readonly RootCommandResult _rootCommandResult;
private readonly IReadOnlyList<Token> _unmatchedTokens;
private Dictionary<string, IReadOnlyList<string>>? _directives;
private CompletionContext? _completionContext;

internal ParseResult(
Parser parser,
RootCommandResult rootCommandResult,
CommandResult commandResult,
IReadOnlyDictionary<string, IReadOnlyList<string>> directives,
TokenizeResult tokenizeResult,
Dictionary<string, IReadOnlyList<string>>? directives,
List<Token> tokens,
IReadOnlyList<Token>? unmatchedTokens,
List<ParseError>? errors,
string? commandLineText = null)
{
Parser = parser;
_rootCommandResult = rootCommandResult;
CommandResult = commandResult;
Directives = directives;
_directives = directives;

// skip the root command when populating Tokens property
if (tokenizeResult.Tokens.Count > 1)
if (tokens.Count > 1)
{
var tokens = new Token[tokenizeResult.Tokens.Count - 1];
for (var i = 0; i < tokenizeResult.Tokens.Count - 1; i++)
{
var token = tokenizeResult.Tokens[i + 1];
tokens[i] = token;
}

// Since TokenizeResult.Tokens is not public and not used anywhere after the parsing,
// we take advantage of its mutability and remove the root command token
// instead of creating a copy of the whole list.
tokens.RemoveAt(0);
Tokens = tokens;
}
else
{
Tokens = Array.Empty<Token>();
}

_errors = errors ?? new List<ParseError>();
CommandLineText = commandLineText;

if (unmatchedTokens is null)
Expand All @@ -67,10 +64,12 @@ internal ParseResult(
for (var i = 0; i < _unmatchedTokens.Count; i++)
{
var token = _unmatchedTokens[i];
_errors.Add(new ParseError(parser.Configuration.LocalizationResources.UnrecognizedCommandOrArgument(token.Value), rootCommandResult));
(errors ??= new()).Add(new ParseError(parser.Configuration.LocalizationResources.UnrecognizedCommandOrArgument(token.Value), rootCommandResult));
}
}
}

_errors = errors is not null ? errors : Array.Empty<ParseError>();
}

internal static ParseResult Empty() => new RootCommand().Parse(Array.Empty<string>());
Expand Down Expand Up @@ -99,7 +98,7 @@ internal ParseResult(
/// Gets the directives found while parsing command line input.
/// </summary>
/// <remarks>If <see cref="CommandLineConfiguration.EnableDirectives"/> is set to <see langword="false"/>, then this collection will be empty.</remarks>
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives { get; }
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new ();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the future we are going to use IReadOnlyDictionary<TKey, TValue>.Empty introduced very recently in .NET 8: dotnet/runtime#76028

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Static Empty properties were not added to the interfaces after all, but ReadOnlyDictionary<TKey, TValue>.Empty was added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new ();
public IReadOnlyDictionary<string, IReadOnlyList<string>> Directives => _directives ??= new();

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is a matter of personal preference. We should discuss it with others, choose one policy and enforce it with static code analysis tools.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an analyzer for that yet? I don't see one listed at https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/style-rules/csharp-formatting-options#spacing-options. (There is IDE0090 "Simplify new expression" for target-typed new, though.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


/// <summary>
/// Gets the tokens identified while parsing command line input.
Expand All @@ -115,7 +114,8 @@ internal ParseResult(
/// <summary>
/// Gets the list of tokens used on the command line that were not matched by the parser.
/// </summary>
public IReadOnlyList<string> UnmatchedTokens => _unmatchedTokens.Select(t => t.Value).ToArray();
public IReadOnlyList<string> UnmatchedTokens
=> _unmatchedTokens.Count == 0 ? Array.Empty<string>() : _unmatchedTokens.Select(t => t.Value).ToArray();

/// <summary>
/// Gets the completion context for the parse result.
Expand Down
11 changes: 7 additions & 4 deletions src/System.CommandLine/Parsing/ArgumentResult.cs
Original file line number Diff line number Diff line change
Expand Up @@ -64,11 +64,14 @@ public void OnlyTake(int numberOfTokens)
throw new InvalidOperationException($"{nameof(OnlyTake)} can only be called once.");
}

var passedOnTokensCount = _tokens.Count - numberOfTokens;
if (_tokens is not null)
{
var passedOnTokensCount = _tokens.Count - numberOfTokens;

PassedOnTokens = new List<Token>(_tokens.GetRange(numberOfTokens, passedOnTokensCount));

PassedOnTokens = new List<Token>(_tokens.GetRange(numberOfTokens, passedOnTokensCount));

_tokens.RemoveRange(numberOfTokens, passedOnTokensCount);
_tokens.RemoveRange(numberOfTokens, passedOnTokensCount);
}
}

/// <inheritdoc/>
Expand Down
7 changes: 3 additions & 4 deletions src/System.CommandLine/Parsing/CommandArgumentNode.cs
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
// Copyright (c) .NET Foundation and contributors. All rights reserved.
// Licensed under the MIT license. See LICENSE file in the project root for full license information.

using System.Diagnostics;

namespace System.CommandLine.Parsing
{
internal class CommandArgumentNode : SyntaxNode
Expand All @@ -10,10 +12,7 @@ public CommandArgumentNode(
Argument argument,
CommandNode parent) : base(token, parent)
{
if (token.Type != TokenType.Argument)
{
throw new ArgumentException($"Incorrect token type: {token}");
}
Debug.Assert(token.Type == TokenType.Argument, $"Incorrect token type: {token}");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

these types are internal and we control the input, so there is no need to the throw condition. Debug.Assert will keep us safe (the CI runs all the tests in debug) with 0 cost for the Release build (less IL to compile, possibility to inline the method)


Argument = argument;
ParentCommandNode = parent;
Expand Down
9 changes: 9 additions & 0 deletions src/System.CommandLine/Parsing/CommandResult.cs
Original file line number Diff line number Diff line change
@@ -1,13 +1,17 @@
// Copyright (c) .NET Foundation and contributors. All rights reserved.
// Licensed under the MIT license. See LICENSE file in the project root for full license information.

using System.Collections.Generic;

namespace System.CommandLine.Parsing
{
/// <summary>
/// A result produced when parsing a <see cref="Command" />.
/// </summary>
public class CommandResult : SymbolResult
{
private Dictionary<Argument, ArgumentResult>? _defaultArgumentValues;

internal CommandResult(
Command command,
Token token,
Expand Down Expand Up @@ -36,5 +40,10 @@ internal override bool UseDefaultValueFor(Argument argument) =>
arg.Tokens.Count == 0,
_ => false
};

internal ArgumentResult GetOrCreateDefaultArgumentResult(Argument argument) =>
(_defaultArgumentValues ??= new()).GetOrAdd(
argument,
arg => new ArgumentResult(arg, this));
}
}
7 changes: 3 additions & 4 deletions src/System.CommandLine/Parsing/DirectiveNode.cs
Original file line number Diff line number Diff line change
@@ -1,6 +1,8 @@
// Copyright (c) .NET Foundation and contributors. All rights reserved.
// Licensed under the MIT license. See LICENSE file in the project root for full license information.

using System.Diagnostics;

namespace System.CommandLine.Parsing
{
internal class DirectiveNode : SyntaxNode
Expand All @@ -11,10 +13,7 @@ public DirectiveNode(
string name,
string? value) : base(token, parent)
{
if (token.Type != TokenType.Directive)
{
throw new ArgumentException($"Incorrect token type: {token}");
}
Debug.Assert(token.Type == TokenType.Directive, $"Incorrect token type: {token}");

Name = name;
Value = value;
Expand Down
Loading