Skip to content

[ApiXmlAdjuster] Use different data model structures for better performance #756

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 10, 2020

Conversation

jpobst
Copy link
Contributor

@jpobst jpobst commented Dec 8, 2020

Today, our ApiXmlAdjuster process builds an in-memory model of every Java type we know about, and then queries this model many times in order to ensure we can resolve every needed Java type to build a binding. The data structure of this model is:

public class JavaApi
{
  public List<JavaPackage> Packages { get; }
}

public class JavaPackage
{
  public List<JavaType> Types { get; }
}

Then it is queried using LINQ like this:

// Bad
var type = api.Packages.SelectMany (p => p.Types).FirstOrDefault (t => t.Name == type_name);

// Less bad
var type = api.Packages.FirstOrDefault (p => p.Name == pkg_name)?.Types.FirstOrDefault (t => t.Name == type_name);

In the random GPS package I used for testing, JavaApi contained 310 packages and a total of 5941 types. Repeatedly looping through them looking for the correct type takes a considerable amount of time.

Instead, we can use a Dictionary to store packages and types keyed by name to significantly speed up type resolution:

public class JavaApi
{
  public Dictionary<string, JavaPackage> Packages { get; }
}

public class JavaPackage
{
  public Dictionary<string, List<JavaType>> Types { get; }
}

For the GPS project, this reduced time taken considerably:

Version Time
master 9.3s
This PR 1.4s

The only "interesting" detail is that we can have multiple types with the same Java name, such as MyInterface and MyInterfaceConsts. Thus we need to use Dictionary<string, List<JavaType>> to ensure we collect them all.

For good measure I ran a XA build with this to ensure Mono.Android ApiCompat didn't find any issues: https://devdiv.visualstudio.com/DevDiv/_build/results?buildId=4282925.

@jpobst jpobst changed the title [ApiAdjuster] Use different data model structures for better performance [ApiXmlAdjuster] Use different data model structures for better performance Dec 8, 2020
@dotnet dotnet deleted a comment from azure-pipelines bot Dec 9, 2020
@jpobst jpobst marked this pull request as ready for review December 9, 2020 15:45
// It's debatable whether we handle this "properly", as most callers just
// do `First ()`, but it's been working for years so I'm not changing it.
// Exposes an IReadOnlyDictionary so caller cannot bypass our AddType/RemoveType code.
public IReadOnlyDictionary<string, List<JavaType>> Types { get; }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…alternatively, should this be Set<JavaType> instead of List<JavaType>? Does it make sense to allow for the same instance to be added multiple times?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same "compatibility"/"diff minimization" question/concern applies to Types vs. AllTypes: I suspect that if Types became an IEnumerable<JavaType> property and AllTypes were the "Dictionary-like" property, fewer changes would be required.

@jpobst
Copy link
Contributor Author

jpobst commented Dec 9, 2020

IMO, the changed Packages and Types properties are the canonical API here that every consumer should use, and are the correct types.

I added AllPackages/AllTypes as hacks to minimize the diffs. In an ideal world every consumer would be updated to use the canonical API and ensure they properly handle any duplicate types if needed, but I opted for a smaller diff instead. 😄

}

public partial class JavaPackage
{
private Dictionary<string, List<JavaType>> types = new Dictionary<string, List<JavaType>> (StringComparer.OrdinalIgnoreCase);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java type names are not filesystem-bound constructs; they are case sensitive, and it is entirely valid to have a package example and package Example in the same app. (It'll cause pain & suffering on case-insensitive filesystems! But Java allows it.)

I think this should use StringComparer.Ordinal.

Copy link
Contributor Author

@jpobst jpobst Dec 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, fixed.

@@ -67,11 +70,11 @@ public static void Load (this JavaPackage package, XmlReader reader, bool isRefe
if (reader.LocalName == "class") {
var kls = new JavaClass (package) { IsReferenceOnly = isReferenceOnly };
kls.Load (reader);
package.Types.Add (kls);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

…back on the "smaller diffs!" front, if we did have a class JavaTypes : KeyedCollection<string, JavaType> class, this statement wouldn't need to change, nor the many other statements just like it…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested out KeyedCollection, but it cannot handle duplicate keys:

System.ArgumentException: 'An item with the same key has already been added. Key: key1'

@jonpryor jonpryor merged commit 0cb8e2d into master Dec 10, 2020
@jonpryor jonpryor deleted the adjuster-perf branch December 10, 2020 18:39
@jpobst jpobst added this to the 11.1 (16.9 / 8.9) milestone Jan 5, 2021
@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants