Raku implementation of a Streams Blending Recommender (SBR) framework.
Generally speaking, SBR is a "computer scientist" implementation of a recommendation system based on sparse linear algebra. See the article "Mapping Sparse Matrix Recommender to Streams Blending Recommender", [AA1], for a detailed theoretical description of the data structures and the operations on them.
This implementation is loosely based on:
- Software monad "MonadicSparseMatrixRecommender", [AAp1], in Mathematica
- Software monad "SMRMon-R", [AAp2], in R
- Object-Oriented Programming (OOP) implementation "SparseMatrixRecommender", [AAp3], in Python
Instead of "monads" the implementations in this package and [AAp3] use OOP classes. Instead of "monadic pipelines" method chaining is used.
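As a minimal sketch of what method chaining means here, consider the toy class below. The class `ToyPipeline` and all of its methods are hypothetical and are not part of this package; they only illustrate the pattern in which each method returns the object itself, so that calls can be "dot-chained", and a final method returns the stored result.

```raku
# Hypothetical toy class (not part of the package) that illustrates method chaining.
class ToyPipeline {
    has @.data;
    has $.value;

    # Each "pipeline" method updates the object and returns the invocant,
    # so that the next method can be called on the result.
    method setData(@new-data) { @!data = @new-data; self }

    method keepPositive() {
        my @kept = @!data.grep(* > 0);
        @!data = @kept;
        self
    }

    method total() { $!value = @!data.sum; self }

    # Terminal method: returns the stored result instead of the object.
    method takeValue() { $!value }
}

my $res = ToyPipeline.new.setData([3, -1, 4, -5]).keepPositive.total.takeValue;
say $res;
```

Returning the object by default keeps long chains readable; a result is extracted only at the end of the chain.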
See the org-mode file "Work-plan.org" for the detailed status (including a TODO list).
From GitHub:
zef install https://github.com/antononcube/Raku-ML-StreamsBlendingRecommender.git
From zef-ecosystem:
zef install ML::StreamsBlendingRecommender
In this section we show how to use the package by making a (nearest neighbors) classifier with it.
Here are the steps:
- Take an example dataset
  - Titanic data from "Data::Reshapers", [AAp4].
- Make a recommender for that dataset
- Use the classification method of the recommender, classifyByProfile, over a random selection of rows
  - Classify for the labels "died" or "survived" of the Titanic dataset column "passengerSurvival".
- Evaluate the classification results
  - Using Receiver Operating Characteristic (ROC) statistics via "ML::ROCFunctions", [AAp5].
use Data::Reshapers;
use Data::Summarizers;
my @dsTitanic = get-titanic-dataset();
records-summary(@dsTitanic)
# +-----------------+----------------+---------------+-------------------+----------------+
# | id | passengerClass | passengerSex | passengerSurvival | passengerAge |
# +-----------------+----------------+---------------+-------------------+----------------+
# | 1146 => 1 | 3rd => 709 | male => 843 | died => 809 | 20 => 334 |
# | 753 => 1 | 1st => 323 | female => 466 | survived => 500 | -1 => 263 |
# | 775 => 1 | 2nd => 277 | | | 30 => 258 |
# | 800 => 1 | | | | 40 => 190 |
# | 1263 => 1 | | | | 50 => 88 |
# | 572 => 1 | | | | 60 => 57 |
# | 15 => 1 | | | | 0 => 56 |
# | (Other) => 1302 | | | | (Other) => 63 |
# +-----------------+----------------+---------------+-------------------+----------------+
Here is a sample of the data:
to-pretty-table(@dsTitanic.roll(4));
# +----------------+--------------+-----+-------------------+--------------+
# | passengerClass | passengerAge | id | passengerSurvival | passengerSex |
# +----------------+--------------+-----+-------------------+--------------+
# | 3rd | -1 | 988 | died | female |
# | 2nd | 30 | 380 | survived | female |
# | 1st | 40 | 207 | died | male |
# | 3rd | 20 | 670 | died | male |
# +----------------+--------------+-----+-------------------+--------------+
Here we make the recommender object:
use ML::StreamsBlendingRecommender;
my ML::StreamsBlendingRecommender::CoreSBR $sbrObj .= new;
$sbrObj.makeTagInverseIndexesFromWideForm(@dsTitanic, tagTypes => @dsTitanic[0].keys.grep({ $_ ne 'id' }).Array, itemColumnName => <id>, :!addTagTypesToColumnNames).transposeTagInverseIndexes;
# ML::StreamsBlendingRecommender::CoreSBR.new(SMRMatrix => [])
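To give a rough idea of what a tag inverse index is, here is a minimal plain-Raku sketch. It assumes that each tag maps to the items that have it with a weight of 1, grouped by tag type. The sub `make-inverse-indexes`, its parameter `item-column`, and the three-record dataset are hypothetical; this is not the package's internal representation.

```raku
# Hypothetical miniature of the Titanic data in wide form (one record per item).
my @records =
    { id => '1', passengerClass => '1st', passengerSex => 'female', passengerSurvival => 'survived' },
    { id => '2', passengerClass => '3rd', passengerSex => 'male',   passengerSurvival => 'died' },
    { id => '3', passengerClass => '1st', passengerSex => 'male',   passengerSurvival => 'died' };

# Build tag-type => tag => { item => weight } inverse indexes.
sub make-inverse-indexes(@recs, Str :$item-column = 'id') {
    my %indexes;
    for @recs -> %rec {
        # Every column except the item column is treated as a tag type.
        for %rec.keys.grep(* ne $item-column) -> $tag-type {
            %indexes{$tag-type}{%rec{$tag-type}}{%rec{$item-column}} = 1;
        }
    }
    %indexes
}

my %inv = make-inverse-indexes(@records);

# The items that have the tag "1st" of the tag type "passengerClass".
say %inv<passengerClass><1st>.keys.sort;
```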
Here we classify by profile:
$sbrObj.classifyByProfile('passengerSurvival', ['1st', 'female']):!object
# [survived => 1 died => 0.052632]
Remark: Since we want to see the result rather than "dot-chain" further method calls, we use the adverb :!object.
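The following plain-Raku sketch shows one way a profile-based, nearest-neighbors classification could produce label scores like the ones above. It assumes that items are scored by the number of profile tags they have, that item scores are summed per label, and that the label scores are divided by the largest one. The sub `classify-by-profile` and the data are hypothetical; the package's actual algorithm may differ.

```raku
# Hypothetical inverse indexes: tag => ids of the items that have the tag.
my %items-of-tag =
    '1st'    => <1 3 4>,
    'female' => <1 4 5>;

# Hypothetical labels of the items for the tag type being classified.
my %label-of-item = '1' => 'survived', '3' => 'died', '4' => 'survived', '5' => 'survived';

sub classify-by-profile(@profile, %tag-to-items, %item-to-label) {
    # Score each item by the number of profile tags it has.
    my %item-scores;
    for @profile -> $tag {
        %item-scores{$_} += 1 for (%tag-to-items{$tag} // ()).list;
    }

    # Sum the item scores per label.
    my %label-scores;
    %label-scores{%item-to-label{.key}} += .value for %item-scores.pairs;

    # Normalize by the largest score and sort in descending order of score.
    my $max = %label-scores.values.max;
    %label-scores.map(-> $p { $p.key => $p.value / $max }).sort({ -.value }).Array
}

my @labels = classify-by-profile(['1st', 'female'], %items-of-tag, %label-of-item);
say @labels;
```

With this normalization the top label always has the score 1, which makes results for different profiles easier to compare.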
Here are the classification results of 5 randomly selected rows from the dataset:
my @clRes = @dsTitanic.pick(5).map({ $sbrObj.classifyByProfile('passengerSurvival', $_<passengerAge passengerClass passengerSex>):!object }).Array;
# [[died => 1 survived => 0.162791] [died => 1 survived => 0.098901] [died => 1 survived => 0.098901] [survived => 1 died => 0.15625] [survived => 1 died => 0.152709]]
TBF...
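As an illustration of the evaluation step, here is a minimal plain-Raku sketch that derives basic ROC statistics from actual and predicted labels. It does not use "ML::ROCFunctions", [AAp5]; the sub `roc-counts`, its parameter `positive`, and the label lists are hypothetical and are not results obtained with the recommender.

```raku
# Hypothetical actual and predicted labels (not results from the package).
my @actual    = <survived died died survived died>;
my @predicted = <survived died survived died died>;

# Count the confusion-matrix cells, treating "survived" as the positive label.
sub roc-counts(@act-labels, @pred-labels, Str :$positive = 'survived') {
    my %counts = TruePositive => 0, FalsePositive => 0, TrueNegative => 0, FalseNegative => 0;
    for @act-labels Z @pred-labels -> ($act, $pred) {
        # "True"/"False" says whether the prediction is correct;
        # "Positive"/"Negative" says what was predicted.
        my $cell = ($act eq $pred ?? 'True' !! 'False')
                 ~ ($pred eq $positive ?? 'Positive' !! 'Negative');
        %counts{$cell}++;
    }
    %counts
}

my %roc = roc-counts(@actual, @predicted);

# True positive rate (recall) and false positive rate.
my $tpr = %roc<TruePositive> / (%roc<TruePositive> + %roc<FalseNegative>);
my $fpr = %roc<FalsePositive> / (%roc<FalsePositive> + %roc<TrueNegative>);
say "TPR: $tpr, FPR: $fpr";
```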
Here is a UML diagram that shows the package's structure:
The PlantUML spec and diagram were obtained with the CLI script to-uml-spec of the package "UML::Translators", [AAp6].
Here we get the PlantUML spec:
to-uml-spec ML::StreamsBlendingRecommender > ./resources/class-diagram.puml
Here we get the diagram:
to-uml-spec ML::StreamsBlendingRecommender | java -jar ~/PlantUML/plantuml-1.2022.5.jar -pipe > ./resources/class-diagram.png
[AA1] Anton Antonov, "Mapping Sparse Matrix Recommender to Streams Blending Recommender", (2019), GitHub/antononcube.
[AAp1] Anton Antonov, Monadic Sparse Matrix Recommender Mathematica package, (2018), GitHub/antononcube.
[AAp2] Anton Antonov, Sparse Matrix Recommender Monad R package, (2018), R-packages at GitHub/antononcube.
[AAp3] Anton Antonov, SparseMatrixRecommender Python package, (2021), Python-packages at GitHub/antononcube.
[AAp4] Anton Antonov, Data::Reshapers Raku package, (2021), GitHub/antononcube.
[AAp5] Anton Antonov, ML::ROCFunctions Raku package, (2022), GitHub/antononcube.
[AAp6] Anton Antonov, UML::Translators Raku package, (2022), GitHub/antononcube.