Skip to content

[API Proposal]: System.Numerics Distance, Vector, Math Operations #83209

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
luisquintanilla opened this issue Mar 9, 2023 · 13 comments
Closed
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Numerics
Milestone

Comments

@luisquintanilla
Copy link

luisquintanilla commented Mar 9, 2023

Backgound and motivation

System.Numerics is a powerful library in .NET that provides support for mathematical operations on vectors, matrices, and complex numbers. Because these are SIMD-enabled types, operations on those types can be optimized. The library already contains a wide range of mathematical operations, but there are some common operations that are missing. In this proposal, we propose the addition of APIs to System.Numerics for the following operations:

  • Distance Metrics
    • Cosine Similarity
    • Euclidean Distance: An implementation is already in System.Numerics but limited to vectors of size 2-4
  • Vector Operations
    • Dot Product: already in System.Numerics but limited to vectors.
    • Normalize (L2 / Euclidean Length)
  • Math Operations
    • Softmax
    • Sigmoid
    • Tanh: Already in System.Math but only on scalars.
    • Exponential. Already in System.Math but only for scalars.

By providing a common optimized implementation for these operations, .NET library authors and developers can focus on high-level components of their application.

API proposal

The following are the proposed APIs for the operations listed above.

Additional details:

Property Value
Expected Vector Size 1-8191
Precision FP32, FP16, INT8

Cosine Similarity

Cosine similarity is a measure of similarity between two non-zero vectors of an inner product space. It is widely used in natural language processing and information retrieval. The proposed API will take two floating point collections as input and return the cosine similarity between them.

// Assembly System.Numerics.Vectors
namespace System.Numerics;

public static T CosineSimilarity<T>(ReadOnlySpan<T> vector1, ReadOnlySpan<T> vector2)
	
public static T CosineSimilarity<T>(T[] vector1, T[] vector2)
	
public static T CosineSimilarity<T>(Vector<T> vector1, Vector<T> vector2)

Euclidean Distance

Euclidean distance is a measure of distance between two points in a Euclidean space. It is widely used in mathematics, engineering, and machine learning. The proposed API will take two floating point collections as input and return the euclidean distance between the collections.

// Assembly System.Numerics.Vectors
namespace System.Numerics;

public static T EuclideanDistance<T>(ReadOnlySpan<T> vector1, ReadOnlySpan<T> vector2)
	
public static T EuclideanDistance<T>(T[] vector1, T[] vector2)
	
public static T EuclideanDistance<T>(Vector<T> vector1, Vector<T> vector2)

Dot Product

Dot product is a mathematical operation that takes two vectors and returns a scalar. It is widely used in linear algebra and machine learning. The proposed API will take two floating point collections as input and return the dot product of the vectors.

// Assembly System.Numerics.Vectors
namespace System.Numerics;

public static T Dot<T>(ReadOnlySpan<T> vector1, ReadOnlySpan<T> vector2)

public static T Dot<T>(T[] vector1, T[] vector2)

public static T Dot<T>(Vector<T> vector1, Vector<T> vector2)

Normalize (L2 / Euclidean Length)

Normalize is a mathematical operation that takes a vector and returns a unit vector in the same direction. It is widely used in linear algebra and machine learning. The proposed API will take a vector as input and return the normalized vector.

// Assembly System.Numerics.Vectors
namespace System.Numerics;

public static ReadOnlySpan<T> Normalize<T>(ReadOnlySpan<T> vector)

public static T[] Normalize<T>(T[] vector)

public static Vector<T> Normalize<T>(Vector<T> vector)

Softmax

Softmax is a function that takes a collection of real numbers and returns a probability distribution. It is widely used in machine learning and deep learning. The proposed API will take a real number collection as input and return the softmax of the collection.

// Assembly System.Numerics.Vectors
namespace System.Numerics;

public static ReadOnlySpan<T> Softmax<T>(ReadOnlySpan<T> vector)

public static T[] Softmax<T>(T[] vector)

public static Vector<T> Softmax<T>(Vector<T> vector)

Sigmoid

Sigmoid is a function that takes a real number and returns a value between 0 and 1. It is widely used in machine learning and deep learning. The proposed API will take a real number collection as input and return the sigmoid of the number for each element in the collection.

// Assembly System.Numerics.Vectors
namespace System.Numerics;

public static ReadOnlySpan<T> Sigmoid<T>(ReadOnlySpan<T> vector)

public static T[] Sigmoid<T>(T[] vector)

public static Vector<T> Sigmoid<T>(Vector<T> vector)

Tanh

Tanh is a function that takes a real number and returns a value between -1 and 1. It is widely used in machine learning and deep learning as an activation function. The proposed API will take a real number collection as input and return the tanh for each element in the collection.

// Assembly System.Numerics.Vectors
namespace System.Numerics;

public static ReadOnlySpan<T> Tanh<T>(ReadOnlySpan<T> vector)

public static T[] Tanh<T>(T[] vector)

public static Vector<T> Tanh<T>(Vector<T> vector)

Exponential (Exp)

Exponential is a function that takes a real number and returns e raised to the power of the number. It is widely used in mathematics, engineering, and machine learning. The proposed API will take a real number collectionas input and return the exponential for each element in the collection.

// Assembly System.Numerics.Vectors
namespace System.Numerics;

public static ReadOnlySpan<T> Exp<T>(ReadOnlySpan<T> vector)

public static T[] Exp<T>(T[] vector)

public static Vector<T> Exp<T>(Vector<T> vector)

API Usage

Cosine Similarity

var input1 = new []{-0.4165,  0.1393,  1.1077}

var input2 = new [] {
	new [] {0.5073, -0.2416,  1.7139},
  new [] {-0.1245, -0.4960, -0.0715}
}

// Outputs
var output = 
	input2
		.Select(v => CosineSimilarity(input1,v))

// Outputs
// [0.7694, -0.1568]

Euclidean Distance

var input1 = new []{-0.4165,  0.1393,  1.1077}

var input2 = new [] {
	new [] {0.5073, -0.2416,  1.7139},
  new [] {-0.1245, -0.4960, -0.0715}
}

// Outputs
var output = 
	input2
		.Select(v => Distance(input1,v))
// Outputs
// [1.1687, 1.3709]

Dot Product

//1-D collection
var input1 = new [] {2,3};

//1-D collection
var input2 = new [] {2,1};

// Outputs
var output = 
	Dot(input1, input2)

// Outputs
// 7

Normalize

var input = new [] {
	new [] {0.1146, -0.5104, -1.2457},
	new [] {0.3813, -0.6505, -0.1605}
}

// Outputs
var output = 
	input.Select(v => Normalize(v))

// Outputs
// [
//		[0.0849, -0.3778, -0.9220],
//		[0.4946, -0.8438, -0.2082],	
//	]

Related:
https://github.com/microsoft/semantic-kernel/blob/main/dotnet/src/SemanticKernel/AI/Embeddings/VectorOperations/NormalizeOperation.cs

Softmax

var input = new [] {
	new [] {0.1146, -0.5104, -1.2457},
	new [] {0.3813, -0.6505, -0.1605}
}

// Outputs
var output = 
	input.Select(v => Softmax(v))

// Outputs
// [
//		[0.5581, 0.2987, 0.1432],
//		[0.5160, 0.1839, 0.3001],	
//	]

Sigmoid

var input = new [] {
	new [] {0.1146, -0.5104, -1.2457},
	new [] {0.3813, -0.6505, -0.1605}
}

// Outputs
var output = 
	input.Select(v => Sigmoid(v))

// Outputs
// [
//		[0.5286, 0.3751, 0.2234],
//		[0.5942, 0.3429, 0.4600],	
//	]

Tanh

var input = new [] {
	new [] {0.1146, -0.5104, -1.2457},
	new [] {0.3813, -0.6505, -0.1605}
}

// Outputs
var output = 
	input.Select(v => Tanh(v))

// Outputs
// [
//		[0.1141, -0.4703, -0.8471],
//		[0.3638, -0.5720, -0.1591],	
//	]

Exponential (Exp)

var input = new [] {
	new [] {0.1146, -0.5104, -1.2457},
	new [] {0.3813, -0.6505, -0.1605}
}

// Outputs
var output = 
	input.Select(v => Tanh(v))

// Outputs
// [
//		[0.1141, -0.4703, -0.8471],
//		[0.3638, -0.5720, -0.1591],	
//	]

Use cases

Semantic Search, Clustering, and Recommendations using Embeddings

Distance metrics are useful when performing search, clustering and recommendations because you want to surface related items. The following is an examples from Semantic Kernel using Cosine Similarity to find similar embeddings from an embedding collection.

https://github.com/microsoft/semantic-kernel/blob/344bc54f329dd342bf9836913cdd96e25bbeb25f/samples/dotnet/kernel-syntax-examples/Example23_ReadOnlyMemoryStore.cs#L107-L138

public IAsyncEnumerable<(MemoryRecord, double)> GetNearestMatchesAsync(string collectionName, Embedding<float> embedding, int limit,
	double minRelevanceScore = 0, bool withEmbeddings = false, CancellationToken cancel = default)
{
	// Note: with this simple implementation, the MemoryRecord will always contain the embedding.
	if (this._memoryRecords == null || !this._memoryRecords.Any())
	{
		return AsyncEnumerable.Empty<(MemoryRecord, double)>();
	}


	if (embedding.Count != this._vectorSize)
	{
		throw new Exception($"Embedding vector size {embedding.Count} does not match expected size of {this._vectorSize}");
	}


	EmbeddingReadOnlySpan<float> embeddingSpan = new(embedding.AsReadOnlySpan());


	TopNCollection<MemoryRecord> embeddings = new(limit);


	foreach (var item in this._memoryRecords)
	{
		EmbeddingReadOnlySpan<float> itemSpan = new(item.Embedding.AsReadOnlySpan());
		double similarity = embeddingSpan.CosineSimilarity(itemSpan);
		if (similarity >= minRelevanceScore)
		{
			embeddings.Add(new(item, similarity));
		}
	}


	embeddings.SortByScore();


	return embeddings.Select(x => (x.Value, x.Score.Value)).ToAsyncEnumerable();
}

See the following documents for more details:

Normalize ML Model Outputs

Sometimes, the outputs from a machine learning model have to be post-processed. Some common post-processing operations include calculating the sigmoid or softmax

private float Sigmoid(float value)
{
    var k = (float)Math.Exp(value);
    return k / (1.0f + k);
}

private float[] Softmax(float[] values)
{
    var maxVal = values.Max();
    var exp = values.Select(v => Math.Exp(v - maxVal));
    var sumExp = exp.Sum();

    return exp.Select(v => (float)(v / sumExp)).ToArray();
}

private int GetOffset(int x, int y, int channel)
{
    // YOLO outputs a tensor that has a shape of 125x13x13, which 
    // WinML flattens into a 1D array.  To access a specific channel 
    // for a given (x,y) cell position, we need to calculate an offset
    // into the array
    return (channel * this.channelStride) + (y * COL_COUNT) + x;
}

private float GetConfidence(float[] modelOutput, int x, int y, int channel)
{
    return Sigmoid(modelOutput[GetOffset(x, y, channel + 4)]);
}

private CellDimensions MapBoundingBoxToCell(int x, int y, int box, BoundingBoxDimensions boxDimensions)
{
    return new CellDimensions
    {
        X = ((float)x + Sigmoid(boxDimensions.X)) * CELL_WIDTH,
        Y = ((float)y + Sigmoid(boxDimensions.Y)) * CELL_HEIGHT,
        Width = (float)Math.Exp(boxDimensions.Width) * CELL_WIDTH * anchors[box * 2],
        Height = (float)Math.Exp(boxDimensions.Height) * CELL_HEIGHT * anchors[box * 2 + 1],
    };
}

public float[] ExtractClasses(float[] modelOutput, int x, int y, int channel)
{
    float[] predictedClasses = new float[CLASS_COUNT];
    int predictedClassOffset = channel + BOX_INFO_FEATURE_COUNT;
    for (int predictedClass = 0; predictedClass < CLASS_COUNT; predictedClass++)
    {
        predictedClasses[predictedClass] = modelOutput[GetOffset(x, y, predictedClass + predictedClassOffset)];
    }
    return Softmax(predictedClasses);
}

In this case, the Sigmoid and Softmax functions are being used to:

  1. Calculate the confidence that an object exists within the detected bounding boxes.
  2. Get the probabilities for all of the potential classes and map that to each of the bounding boxes.

See the ML.NET ONNX Object Detection tutorial for more details.

Alternative Designs

Today, several of these operations are available in libraries like ML.NET, TorchSharp, Semantic Kernel, and TensorFlow.NET.

These implementations have challenges. In the case of ML.NET, the methods live in an internal class that's not publicly exposed. In the case of TorchSharp and TensorFlow.NET, the size of the library is large and the benefit of consuming the library just for a few functions does not outweigh the downsides of taking on such a large dependency.

The alternative is to implement your own as Semantic Kernel has done which although not hard, it's tedious and focused on functionality.

https://github.com/luisquintanilla/OAIDotnetZeroShotClassification/blob/7e37ba4fa3ff9d41387bbabe2cdd6de303135596/Program.cs#L45-L59

luisquintanilla/mlnet-workshop#16 (comment)

Open Questions

  • How do we extend these operations to N-dimensional collections / matrices?
  • For operations like Exp and Tanh which already exist in the .NET BCL, is there anything else that needs to be done here to support Vectors and other types of collections?
  • For operations like Dot which already exist in System.Numerics is there anything else that needs to be done here to support other types of collections?

Risks

  • Should these operations be part of an out-of-band package?
  • How would the implementation of these APIs be affected by down-level consumption?
@luisquintanilla luisquintanilla added the api-suggestion Early API idea and discussion, it is NOT ready for implementation label Mar 9, 2023
@ghost ghost added the area-System.Numerics label Mar 9, 2023
@ghost
Copy link

ghost commented Mar 9, 2023

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

When working with outputs from machine learning models, there are some operations applied to the outputs to normalize the data or measure distances. These operations include but are not limited to:

Today, those operations are available in libraries like ML.NET and TorchSharp.

Both of these implementations have challenges. In the case of ML.NET, the methods live in an internal class that's not publicly exposed. In the case of TorchSharp, the size of the library is large and the benefit of consuming the library just for a few functions does not outweigh the downsides of taking on such a large dependency.

The alternative is to implement your own which although not hard, it's tedious and not as performant as it could be.

https://github.com/luisquintanilla/OAIDotnetZeroShotClassification/blob/7e37ba4fa3ff9d41387bbabe2cdd6de303135596/Program.cs#L45-L59

luisquintanilla/mlnet-workshop#16 (comment)

API Proposal

The proposal would be to leverage existing components from System.Numerics such as Vectors and Generic Math to provide an all-up .NET implementation that can be consumed by other libraries such as ML.NET and others.

namespace System.Numerics;

public static T Softmax(Vector<T> inputs)
	
public static T Softmax(T[] inputs)

public static T Softmax(ReadOnlySpan<T> inputs)
	
public static T CosineSimilarity(Vector<T> a, Vector<T> b)
	
public static T CosineSimilarity(T[] a, T[] b)
	
public static T CosineSimilarity(ReadOnlySpan<T> a, ReadOnlySpan<T> b)	

API Usage

// Softmax
var probabilities = new Vector(new [] {0.89,0.75,0.6})

var v = new Vector(probabilities)
	
Softmax(v)
	
// Cosine Similarity
var item1 = new []{1.0,2.0,3.0};
var item2 = new []{1.2,1.9,4.0};

var v1 = new Vector(item1);
var v2 = new Vector(item2);

CosineSimilarity(v1,v2);

Alternative Designs

No response

Risks

There are a few things to consider:

  • Should these operations be part of an out-of-band package?
  • How would the implementation of these APIs be affected by down-level consumption?
Author: luisquintanilla
Assignees: -
Labels:

api-suggestion, area-System.Numerics

Milestone: -

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 9, 2023
@colombod
Copy link
Member

colombod commented Mar 9, 2023

what about one hot encoder ?

@colombod
Copy link
Member

colombod commented Mar 9, 2023

also assuming it can handle sequence of T and not jump straight into vectors

@luisquintanilla
Copy link
Author

luisquintanilla commented Mar 16, 2023

Additional operations to consider if not already available.

https://github.com/microsoft/semantic-kernel/tree/main/dotnet/src/SemanticKernel/AI/Embeddings/VectorOperations

  • DotProduct
  • Euclidean Distance
  • Normalize (Euclidean Length)

@colombod
Copy link
Member

Additional operations to consider if not already available.

https://github.com/microsoft/semantic-kernel/tree/main/dotnet/src/SemanticKernel/AI/Embeddings/VectorOperations

  • DotProduct
  • Euclidean Distance
  • Normalize (Euclidean Length)

makes total sense!

We start with those and the right dev and testing approach and will be very easy to expand. This is promising

@luisquintanilla luisquintanilla changed the title [API Proposal]: System.Numerics Normalization & Distance Metric Operations [API Proposal]: System.Numerics Normalization, Vector, and Distance Metric Operations Mar 20, 2023
@luisquintanilla luisquintanilla changed the title [API Proposal]: System.Numerics Normalization, Vector, and Distance Metric Operations [API Proposal]: System.Numerics Distance, Vector, Math Operations Apr 17, 2023
@stephentoub
Copy link
Member

stephentoub commented Apr 18, 2023

@luisquintanilla, for all of the APIs your proposing, can you clarify on what types you're expecting these to live? Are you envisioning some new static class of extensions? Would these be on the existing MemoryExtensions? Would anything here be on the generic math interfaces? Are there constraints on the Ts involved here? Etc. Or @tannergooding, do you have a vision for this? I can come up with some recommendations if needed, but I wasn't sure if it'd already been worked through and just not included here.

@luisquintanilla
Copy link
Author

luisquintanilla commented Apr 18, 2023

@stephentoub good questions. Here are my thoughts, though I'd defer to @tannergooding since he's much more familiar with these APIs.

what types you're expecting these to live?

I think we can use the Dot operation as a good example where these might live. Today the Dot operation is a static method on the Vector type.

https://learn.microsoft.com/dotnet/api/system.numerics.vector.dot?view=net-8.0

However, since we'd be looking to support other types like ReadOnlySpan and System.Collections like array, is Vectors still the best place for them?

Would anything here be on the generic math interfaces...Are there constraints on the Ts involved here?

This is possibly a good opportunity to use Generic Math interfaces here. Although we expect to mostly work with floating point numbers, there are some cases where integers may also be used. I've listed above the types of values we expect:

  • FP32 (most common)
  • FP16
  • INT8

What these map to in the context of Generic Math interfaces, I'm not sure.

Beyond numeric interfaces, I noticed that there are also function interfaces like IExponentialFunctions and IHyperbolicFunctions which I think satisfy some of the operations proposed here.

@stephentoub stephentoub added this to the 8.0.0 milestone Apr 28, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Apr 28, 2023
@ericstj
Copy link
Member

ericstj commented May 16, 2023

Are all of these operands single dimensional, or also 2-dim, n-dim? If the latter, we'll need to also think about how we describe dimension information. I doubt we'll want to use .NET multi-dimensional or jagged arrays (only) since those are not friendly to interop. Samples above show 2-dim jagged arrays.

@stephentoub
Copy link
Member

Samples above show 2-dim jagged arrays.

Which in particular? Skimming again, the inputs I see are all single dim... where there's jagged, they're actually treated as collections of arrays, where the inputs to the functions are all single dim, eg

var input1 = new []{-0.4165,  0.1393,  1.1077}

var input2 = new [] {
	new [] {0.5073, -0.2416,  1.7139},
  new [] {-0.1245, -0.4960, -0.0715}
}

// Outputs
var output = input2.Select(v => CosineSimilarity(input1,v))

// Outputs
// [0.7694, -0.1568]

@colombod
Copy link
Member

colombod commented May 16, 2023

I am still pondering the support for double, if we had proper generic support this would have been easier : dotnet/csharplang#6308 (comment)

@ericstj
Copy link
Member

ericstj commented May 16, 2023

where the inputs to the functions are all single dim

I see - I missed that it was using linq over the rows and listing the output as an array instead of IEnumerable. For single dim the specified inputs seem reasonable 👍. I think @colombod mentioned it might be interesting to add some interface as an input.

@tannergooding tannergooding modified the milestones: 8.0.0, Future Jul 24, 2023
@tannergooding tannergooding added api-ready-for-review API is ready for review, it is NOT ready for implementation api-suggestion Early API idea and discussion, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jul 28, 2023
@tannergooding
Copy link
Member

tannergooding commented Jul 28, 2023

Have opened #89639 since the new design builds on this initial proposal and adjusts it to take into account the overall future direction desired here.

The biggest note is that it separates out the more direct concern of the downlevel support needed from the broader future support for other types and scenarios, which will likely be for modern .NET only and likely generic

@tannergooding
Copy link
Member

Closing as this was superseded by #89639

@ghost ghost locked as resolved and limited conversation to collaborators Nov 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Numerics
Projects
None yet
Development

No branches or pull requests

5 participants