In issue #584, we laid out a set of scenarios that we'd like to cover for V1.0 of ML.NET. We need high-level functional tests to make sure that these work well in the 1.0 library.
Here is a list of tests that cover the scenarios. Let's use this issue as a top-level issue to track coverage of the APIs.
Category | Scenarios | Link to Test | Completed PR | Blocked by Issue
-- | -- | -- | -- | --
Data I/O | I can use objects already in memory (as IEnumerable) as input to my ML pipeline/experiment | Link | #2518 |
Data I/O | I can use locally stored delimited files (.csv, .tsv, etc.) as input to my ML pipeline/experiment | Link | #2518 |
Data I/O | I can use locally stored binary files (.idv) as input to my ML pipeline/experiment | Link | #2518 |
Data I/O | I can go through any arbitrary data transformation / model training and save the output to disk as a delimited file (.csv, .tsv, etc.) | Link | #2518 |
Data I/O | I can go through any arbitrary data transformation / model training and save the output to disk as a binary file (.idv) | Link | #2518 |
Data I/O | I can go through any arbitrary data transformation / model training and convert the output to an IEnumerable | Link | #2518 |
Data I/O | I can use data from a SQL database by reading it into memory or to disk using an existing SQL reader, and then use that as input to my ML pipeline/experiment | (May be a sample) | |
Data Transformation, Feature Engineering | I can take an existing ONNX model and get predictions from it (as both final output and as input to downstream pipelines) | | |
Data Transformation, Feature Engineering | Extensible transformation: it should be possible to write simple row-mapping transforms. Example: "I can add custom steps to my pipeline, such as creating a new column that is the sum of two other columns, or easily add cosine similarity, without having to create my own build of ML.NET." | | #2803 |
Data Transformation, Feature Engineering | I can modify settings in the TextFeaturizer to update the number of word-grams and char-grams used, along with things like normalization | | #2803 | #2802
Data Transformation, Feature Engineering | I can apply normalization to the columns of my data | | #2803 |
Data Transformation, Feature Engineering | I can take an existing TF model and get predictions from it, or from any layer in the model | | WIP Rogan |
Data Transformation, Feature Engineering | P1: I can take an existing TF model and use ML.NET APIs to identify the input and output nodes | | WIP Rogan |
Debugging | I can see how my data was read in, to verify that I specified the schema correctly | | #2937 |
Debugging | I can see the output at the end of my pipeline to see which columns are available (score, probability, predicted label) | | #2937 |
Debugging | I can look at intermediate steps of the pipeline to debug my model. Example: if I had the text "Help I'm a bug!", I should be able to see the steps where it is normalized to "help i'm a bug", then tokenized into ["help", "i'm", "a", "bug"], then mapped into term numbers [203, 25, 3, 511], then projected into the sparse float vector {3:1, 25:1, 203:1, 511:1}, and so on. | | #2937 |
Debugging | P1: I can access the information needed to understand the progress of my training (e.g. number of trees trained so far, out of how many) | | #2937 |
Evaluation | I can evaluate a model trained for any of my tasks on test data. The evaluation outputs metrics that are relevant to the task (e.g. AUC, accuracy, P/R, and F1 for binary classification) | | #2646 |
Evaluation | P1: I can get the data that will allow me to plot PR curves | | #2646 | #2645
Explainability & Interpretability | I can get near-free (local) feature importance for scored examples (Feature Contributions) | | #2584 |
Explainability & Interpretability | I can view how much each feature contributed to each prediction for trees and linear models (Feature Contributions) | | #2584 |
Explainability & Interpretability | I can view the overall importance of each feature (Permutation Feature Importance, GetFeatureWeights) | | #2584 |
Explainability & Interpretability | I can train interpretable models (linear model, GAM) | | |
Introspective training | I can take an existing model file and inspect which transformers were included in the pipeline | | #2859 |
Introspective training | I can inspect the coefficients (weights and bias) of a linear model without much work. Easy to find via auto-complete. | | #2859 |
Introspective training | I can inspect the normalization coefficients of a normalizer in my pipeline without much work. Easy to find via auto-complete. | | #2859 |
Introspective training | I can inspect the trees of a boosted decision tree model without much work. Easy to find via auto-complete. | | #2859 |
Introspective training | I can inspect the topics after training an LDA transform. Easy to find via auto-complete. | | #2859 |
Introspective training | I can inspect a categorical transform and see which feature values map to which key values. Easy to find via auto-complete. | | #2859 |
Introspective training | I can access the GAM feature histograms through APIs | | #2859 |
Model files | I can train a model and save it as a file. This model includes the learner as well as the transforms (e.g. decomposability) | | |
Model files | I can use a model file in a completely different process to make predictions (e.g. decomposability) | | |
Model files | I can use newer versions of ML.NET with ML.NET model files from previous versions (for v1.x) | | test in V1.1 |
Model files | I can easily figure out which NuGets (and versions) I need to score an ML.NET model | | |
Model files | P2: I can move data between NimbusML and ML.NET (using IDV): prepare with NimbusML and load with ML.NET | | V1.1 |
Model files | P2: I can use model files interchangeably between compatible versions of ML.NET and NimbusML | | V1.1 |
Model files | P1: I can export ML.NET models to ONNX (limited to the existing internal functionality) | | |
Model files | I can save a model to text | | V1.1 |
Prediction | I can get predictions (scores, probabilities, predicted labels) for every row in a test dataset | | |
Prediction | I can reconfigure the threshold of my binary classification model based on analysis of the PR curves or other metric scores | Link | | #2465
Prediction | (Might not work?) I can map the score/probability for each class to the original class labels I provided in the pipeline (multiclass, binary classification) | | |
Tasks | I can train a model to do classification (binary and multiclass) | | #2646 |
Tasks | I can train a model to do regression | | #2646 |
Tasks | I can train a model to do anomaly detection | | #2646 |
Tasks | I can train a model to do recommendations | | #2646 |
Tasks | I can train a model to do ranking | | #2646 |
Tasks | I can train a model to do clustering | | #2646 |
Training | I can provide multiple learners and easily compare evaluation metrics between them | | #2921 |
Training | I can use an initial predictor to update/train the model for some trainers (e.g. linear learners like averaged perceptron). Specifically, start the weights for the model from the existing weights. | | #2921 |
Training | Metacomponents smartly restrict their use to compatible components. Example: "When specifying which trainer OVA should use, a user will be able to specify any binary classifier. If they specify a regression or multiclass classifier, ideally that should be a compile error." | | #2921 |
Training | I can train TF models when I bring a TF model topology | | WIP Rogan |
Training | I can use OVA and easily add any binary classifier to it | | #2921 |
Use in web environments | I can use ML.NET models to make predictions in multi-threaded environments like ASP.NET. (This doesn't have to be inherent in the prediction engine, but it should be easy to do.) | | |
Validation | Cross-validation: I can take a pipeline and easily do cross-validation on it without having to know how CV works | Link | #2470 |
Validation | I can use a validation set in a pipeline for learners that support one (e.g. FastTree, GAM) | Link | #2503 |
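To make the Data I/O scenarios concrete, here is a rough sketch of the in-memory round trip (IEnumerable in, pipeline, IEnumerable out) against the V1.0 surface. This is illustrative only, not one of the linked tests; the `HousingRow`/`HousingPrediction` types and the choice of SDCA are assumptions.

```csharp
// Hedged sketch of the in-memory Data I/O scenario. Assumes the ML.NET 1.0 APIs
// (MLContext, LoadFromEnumerable, CreateEnumerable); types here are illustrative.
using System;
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public class HousingRow
{
    public float Size { get; set; }
    public float Price { get; set; }
}

public class HousingPrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}

class Program
{
    static void Main()
    {
        var mlContext = new MLContext(seed: 1);

        // Objects already in memory (IEnumerable) become an IDataView.
        var rows = new List<HousingRow>
        {
            new HousingRow { Size = 1.1f, Price = 1.2f },
            new HousingRow { Size = 1.9f, Price = 2.3f },
            new HousingRow { Size = 2.8f, Price = 3.0f },
        };
        IDataView data = mlContext.Data.LoadFromEnumerable(rows);

        // Any arbitrary transformation + training.
        var pipeline = mlContext.Transforms
            .Concatenate("Features", nameof(HousingRow.Size))
            .Append(mlContext.Regression.Trainers.Sdca(
                labelColumnName: nameof(HousingRow.Price)));
        var model = pipeline.Fit(data);

        // Convert the output back to an IEnumerable.
        IDataView scored = model.Transform(data);
        foreach (var p in mlContext.Data.CreateEnumerable<HousingPrediction>(
            scored, reuseRowObject: false))
        {
            Console.WriteLine(p.PredictedPrice);
        }
    }
}
```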
So we aren't going to support training from an IEnumerable (directly) backed by an EF DbContext in V1? If so, that should be noted somewhere, because it works up to a point and then fails in a confusing way.
@endintiers For V1, we will only support training with IDataView, but that should still be possible with an IEnumerable backed by an EF DbContext. (@singlis and @Ivanidzo4ka)
Would you mind giving an example of what you've been doing and how it's been failing?
See https://github.com/dotnet/machinelearning/issues/2159 for more details.
The actual problem is that EF Core DbContexts aren't thread-safe, so if cursors from multiple threads exhaust the pool and more than one tries to access the enumerable to get more rows: 'boom'.
I believe setting conc to 1 for mlContext should help, but it needs verification.
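For reference, the conc-to-1 suggestion is a one-liner. This is a sketch only: at the time of this thread (pre-1.0), the MLContext constructor exposed a `conc` parameter; whether it survives into 1.0 is exactly what needs verification.

```csharp
// Sketch, assuming the pre-1.0 MLContext(seed, conc) constructor.
// conc: 1 forces single-threaded cursors, so ML.NET never enumerates
// the non-thread-safe EF DbContext from more than one thread at once.
var mlContext = new MLContext(seed: 1, conc: 1);
```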
Adding SQL test back — I had misunderstood the requirements, and it looks like we can use an EF DbContext for it.
Thanks. I have looked at how to modify mlContext.CreateStreamingDataView to detect this case and create an IDataView that could signal 'single-threaded source' downstream. This could be done (it's just the synchronization of buffer re-loads that is an issue). Given where we are release-wise, though, just setting conc to 1 is a good move. Sadly this will slow training (on serious datasets with many available CPUs). I should volunteer to do at least this test... (using EF Core In-Memory). In the real world, generating text files from the DB and training on those instead seems to be the best move.
I have a sample that reads data from a SQL database. I can create another one using one of the datasets used in the samples. Would the connection string be hidden from the sample, since it's not necessary for it?
@jwood803 You should load the (textfile?) dataset into an in-memory database provider such as Microsoft.EntityFrameworkCore.InMemory. These are functionally equivalent to real DB providers and are used to build DB tests. You won't need a connection string.
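Something along these lines, perhaps. This is a hypothetical sketch, assuming EF Core's Microsoft.EntityFrameworkCore.InMemory package and the ML.NET 1.0 `LoadFromEnumerable` API; the `IrisRow`/`IrisContext` names are made up for illustration.

```csharp
// Hedged sketch: feed an EF Core in-memory database into ML.NET without a
// connection string. Assumes Microsoft.EntityFrameworkCore.InMemory and ML.NET 1.0.
using System.Linq;
using Microsoft.EntityFrameworkCore;
using Microsoft.ML;

public class IrisRow
{
    public int Id { get; set; }
    public float SepalLength { get; set; }
    public float SepalWidth { get; set; }
    public string Label { get; set; }
}

public class IrisContext : DbContext
{
    public IrisContext(DbContextOptions<IrisContext> options) : base(options) { }
    public DbSet<IrisRow> Irises { get; set; }
}

class Demo
{
    static void Main()
    {
        // No connection string needed: the in-memory provider stands in for a real DB.
        var options = new DbContextOptionsBuilder<IrisContext>()
            .UseInMemoryDatabase("iris-test")
            .Options;

        using (var db = new IrisContext(options))
        {
            db.Irises.Add(new IrisRow { SepalLength = 5.1f, SepalWidth = 3.5f, Label = "setosa" });
            db.SaveChanges();

            // Materialize with ToList() so ML.NET's (possibly multi-threaded)
            // cursors never touch the non-thread-safe DbContext directly.
            var mlContext = new MLContext();
            var data = mlContext.Data.LoadFromEnumerable(db.Irises.ToList());
        }
    }
}
```

Materializing with `ToList()` sidesteps the thread-safety problem discussed above, at the cost of holding the whole table in memory.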