Machinelearning: Are ML.NET models deterministic?

Created on 18 Nov 2020 · 3 Comments · Source: dotnet/machinelearning

Some models are inherently stochastic, others are deterministic. Are ML.NET models deterministic? In other words, given the same input, will an ML.NET model always return the same output/prediction? If so, to how many decimal places is this prediction deterministic?

question

All 3 comments

@rebecca-burwei Yes, I believe ML.NET models are deterministic if you use them properly; we have a number of tests verifying various model outputs. As for the decimal places, I believe that depends on the input data and on model settings such as how many rounds of training are performed, since floating-point error accumulates over the calculation.
@justinormont cc Justin, in case he has more insight on this question.
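
To illustrate the point about accumulated floating-point error, here is a minimal, language-agnostic sketch in Python. It is not ML.NET code, and the helper `accumulate` is hypothetical. It shows that adding the same numbers in a different order can change the last few digits, which is why predictions are usually compared with a tolerance rather than to a fixed number of decimal places.

```python
import math


def accumulate(values):
    """Add the values one at a time, left to right, in double precision."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]

forward = accumulate(values)             # (0.1 + 0.2) + 0.3
backward = accumulate(reversed(values))  # (0.3 + 0.2) + 0.1

# Same inputs, different order: the results differ in the last bits.
print(forward == backward)

# Repeating the computation in the same order gives bit-identical output.
print(accumulate(values) == forward)

# Comparing with a relative tolerance is more robust than exact equality.
print(math.isclose(forward, backward, rel_tol=1e-12))
```

The same effect applies to any long chain of floating-point operations, so the number of stable digits depends on how much arithmetic the model performs.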

Model prediction

If you have a trained model, it is almost always deterministic. One example where it's not is if it includes the CountTargetEncoder. See: https://github.com/dotnet/machinelearning/pull/4514#pullrequestreview-330200694

Model training

Some parts of ML.NET can be deterministic. In general practice, I wouldn't assume model training is fully deterministic.

Setting a seed in the MLContext and disabling multi-threading gets you close.
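
As a rough, language-agnostic illustration of why a fixed seed and a single thread help, here is a toy sketch in Python. It is not ML.NET code: `train_toy_model`, the data, and all parameters are hypothetical. The only randomness is the per-epoch shuffle, so seeding the generator and running sequentially makes the result repeatable.

```python
import random


def train_toy_model(data, seed, epochs=5, lr=0.1):
    """Fit y ~ w * x by single-threaded SGD; the shuffle is the only randomness."""
    rng = random.Random(seed)  # private generator, so no hidden global state
    rows = list(data)
    w = 0.0
    for _ in range(epochs):
        rng.shuffle(rows)  # visiting order depends only on the seed
        for x, y in rows:
            w -= lr * (w * x - y) * x  # gradient step on squared error
    return w


data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (0.5, 1.0)]

a = train_toy_model(data, seed=42)
b = train_toy_model(data, seed=42)
c = train_toy_model(data, seed=7)

print(a == b)  # same seed, same sequential order: bit-identical weights
print(a, c)    # a different seed may land on a slightly different weight
```

With several threads, the order in which updates are applied can vary from run to run, so a seed alone would no longer pin down the result.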

Many components also have their own seed values to set. There's a bit of a usability bug in ML.NET: non-hashing seeds should fall back to the global seed, but the code to do so wasn't added to all components. See: https://github.com/dotnet/machinelearning/issues/4752#issuecomment-580686290

If you're using the AutoML APIs, there's a bit of a butterfly effect in model sweeping, because small model differences get amplified. See: https://github.com/dotnet/machinelearning/issues/4986#issuecomment-606521860

Closing this issue as an answer has already been provided. @rebecca-burwei, feel free to reopen if you have any follow-up questions. Thanks.
