Machinelearning: Are ML.NET models deterministic?

Created on 18 Nov 2020 · 3 Comments · Source: dotnet/machinelearning

Some models are inherently stochastic, others are deterministic. Are ML.NET models deterministic? In other words, given the same input, will an ML.NET model always return the same output/prediction? If so, to how many decimal places is the prediction reproducible?

question

rebecca-burwei

Most helpful comment

@rebecca-burwei Yes, I believe ML.NET models are deterministic if you use them properly; we have a bunch of tests that verify various model outputs. As for decimal places, I believe that depends on the input data and on model settings such as how many rounds of training are performed, because floating-point error accumulates over the calculation.
cc @justinormont, in case Justin has more insight on this question.

frank-dong-ms on 18 Nov 2020

👍2

All 3 comments

Model prediction

If you have a trained model, its predictions are almost always deterministic. One exception is a model that includes the CountTargetEncoder. See: https://github.com/dotnet/machinelearning/pull/4514#pullrequestreview-330200694

Model training

Some parts of ML.NET can be deterministic, but in general I wouldn't assume that model training is fully deterministic.

Setting a seed in the MLContext and disabling multi-threading gets you close.
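One reason threading matters, independent of any seed: floating-point addition is not associative, so a sum that is split into partial sums (as a multi-threaded reduction might do) can differ in its last bits from the same sum computed left to right. The following is only a minimal Python sketch of that effect, not ML.NET code; the function names `left_to_right_sum` and `sum_of_partial_sums` are made up for illustration, and whether a given ML.NET trainer reduces values this way is an assumption.

```python
import math


def left_to_right_sum(values):
    """Add the values one at a time, in the order given."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_of_partial_sums(chunks):
    """Sum each chunk separately, then add the partial sums together.

    This mimics a reduction where each worker handles one chunk.
    """
    return left_to_right_sum([left_to_right_sum(chunk) for chunk in chunks])


sequential = left_to_right_sum([0.1, 0.2, 0.3])        # (0.1 + 0.2) + 0.3
split = sum_of_partial_sums([[0.1], [0.2, 0.3]])       # 0.1 + (0.2 + 0.3)

print(sequential, split)                               # 0.6000000000000001 0.6
print(sequential == split)                             # False
print(math.isclose(sequential, split, rel_tol=1e-12))  # True
```

The two results agree to about 15 significant digits but are not bit-identical, which is why comparing outputs from separate training runs with a tolerance is usually more practical than testing for exact equality.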

Many components also have their own seed values to set. There's a bit of a usability bug in ML.NET here: non-hashing seeds should fall back to the global seed, but the code to do so wasn't added in all components. See: https://github.com/dotnet/machinelearning/issues/4752#issuecomment-580686290

If you're using the AutoML APIs, there's a bit of a butterfly effect in model sweeping, because small differences between models get amplified. See: https://github.com/dotnet/machinelearning/issues/4986#issuecomment-606521860