Hi, all,
I am wondering how the predict probability is calculated in the SQL Server 2005 algorithms. And what does it stand for?
Any helpful, detailed answer is much appreciated.
Thanks a lot.
In general, PredictProbability is the probability "x" that cases like the current input are in the predicted state.
Specifically, it is calculated differently for each algorithm. For decision trees it is fairly simple: it's the probability at the node used for prediction. This is essentially the number of training cases that reached the node with the predicted state, divided by the total number of training cases that reached the node. The numbers are slightly different from that because of what is called a "prior", a factor acknowledging that all possible states have equal probability given no input; this factor is carried down throughout the tree. The impact of the prior gets smaller as the amount of evidence (data) increases.
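To make the effect of the prior concrete, here is a rough Python sketch. It is not the actual SQL Server formula, which isn't spelled out here. It assumes the prior works like a simple uniform pseudo-count added to every state, and the function name and the prior_weight parameter are made up for illustration.

```python
def node_probability(state_counts, state, prior_weight=1.0):
    """Estimate P(state) at a tree node, smoothed by a uniform prior.

    Assumption (not the documented SQL Server formula): the prior is
    modelled as `prior_weight` imaginary cases added to every state.
    """
    total = sum(state_counts.values())   # training cases that reached the node
    n_states = len(state_counts)         # number of possible states
    return (state_counts[state] + prior_weight) / (total + prior_weight * n_states)


# Both nodes have a raw ratio of 0.75 for "yes".
small_node = {"yes": 3, "no": 1}
large_node = {"yes": 300, "no": 100}

print(node_probability(small_node, "yes"))  # pulled noticeably toward 0.5
print(node_probability(large_node, "yes"))  # very close to the raw 0.75
```

With only 4 cases the prior pulls the estimate well away from the raw ratio, while with 400 cases it barely changes it, which is the "impact gets smaller with more data" behaviour described above.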