waynergf
When using BayesiaLab in the Supervised Learning mode to develop a predictive model for a binary outcome (in my case, whether a clinic outpatient will show up for his appointment or not), a probability of "NoShow" is generated for each patient. To generate the Precision and Reliability statistics, a "cut point" must be chosen (typically in reference to the Receiver Operating Characteristic curve): the value between 0 and 1 above which the patient is predicted to be a NoShow.

I've searched the BayesiaLab book and the Web site to no avail. What cut point value is used in BayesiaLab? Is it 0.50? [Note: It need not be, depending on the balance between Sensitivity and Specificity the healthcare provider chooses in order to weigh risk against consequence.]
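To make the trade-off concrete, here is a minimal Python sketch (not BayesiaLab output; the probabilities and labels below are made up) of how the choice of cut point moves Sensitivity against Specificity:

```python
# Minimal sketch: how a cut point turns predicted P(NoShow) values into
# class predictions, and how it trades sensitivity against specificity.
# The posteriors and labels are hypothetical, for illustration only.
import numpy as np

p_noshow = np.array([0.10, 0.35, 0.55, 0.80, 0.20, 0.65])  # hypothetical posteriors
y_true   = np.array([0,    0,    1,    1,    0,    1])      # 1 = actual NoShow

for cut in (0.25, 0.50, 0.75):
    y_pred = (p_noshow >= cut).astype(int)       # predict NoShow above the cut point
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    sensitivity = tp / (tp + fn)                 # true-positive rate
    specificity = tn / (tn + fp)                 # true-negative rate
    print(f"cut={cut:.2f}  sensitivity={sensitivity:.2f}  specificity={specificity:.2f}")
```

Lowering the cut catches more true NoShows (higher Sensitivity) at the cost of flagging more patients who would have shown up (lower Specificity), which is exactly the risk-versus-consequence balance in question.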
Dan
By default, it is indeed the maximum-likelihood criterion that is used for the Targeted Evaluation, i.e., the predicted state is the most probable one. However, you can set your own threshold manually, or even generate multiple thresholds to compare their performance (see the screenshot below).

Please note also that the evaluation curves are interactive. For example, if you click on the Gains Curve, you can read off the probability threshold p that yields the corresponding precision (90% with a threshold of 0.25 in the example below).

Hope this helps
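For illustration, a minimal Python sketch of the two decision rules described above (illustrative only, not BayesiaLab's internals; the posteriors are made up):

```python
# Sketch of the two decision rules: the default "most probable state"
# (maximum-likelihood) rule, and a manual threshold on P(NoShow).
import numpy as np

posterior = np.array([[0.7, 0.3],   # each row: [P(Show), P(NoShow)] for one patient
                      [0.6, 0.4],
                      [0.2, 0.8]])

map_rule    = posterior.argmax(axis=1)           # default: pick the most probable state
threshold   = 0.25                               # manual cut point, as in the example above
manual_rule = (posterior[:, 1] >= threshold).astype(int)

print(map_rule)     # [0 0 1] -- equivalent to a 0.5 cut for a binary target
print(manual_rule)  # [1 1 1] -- the lower threshold flags more patients as NoShow
```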
waynergf
Thanks, Dan, for the quick - and very helpful - reply! :-)

A follow-up question: What does this mean, when selecting "Evaluate All States" and "Maximum Size of Evidence" = 5 from the "Targeted Evaluation Settings"?

"The posterior probability distribution is uniform or cannot be computed (incompatible set of observations). Choose a state in the list or let the program select one randomly."

It won't accept "Random Choice," and I must select *both* states (0, 1) and "Remember My Choice" for each state selection before BayesiaLab will proceed to the Targeted Evaluation. ???
Dan
The option "Evaluate all states" allows to carry out the targeted evaluation for each state of your variable. Obviously, if you are only interested in the confusion matrix and its related metrics, this is not useful as it's not changing. However, the Gains, Lift, Roc, Calibration curves are state specific.The maximum set of evidence is for evaluation the quality of your model in the context of an Adaptive Questionnaire, where you are not using all the variables for the computation of the posterior, but only the n most informative variables.The uniform posterior is usually generated when the observation you are using are not compatible with your network (i.e. you've learned a deterministic relation that is only true in your training set). To prevent this problem, you have to use a non-informative prior when learning your model, by using the option "Edit | Edit Smooth Probability Estimation". This will add 1 (default value) virtual occurence in your database, spread uniformly across your joint probability distribution, defining then that everything is possible.