• Dan
• Posts 36
http://community.bayesialab.com gone?
Hi Ero,

Indeed, we had two versions of this forum running in parallel until yesterday, while waiting for the migration to be fully completed. Now this one is the only one running.

It seems the question you posted in the old forum is indeed lost, but we have added the bug you described for the BIF format to our to-do list. We will let you know when it's fixed.
Non Binary ROC curve calculation
Hi Andrea,

Here is the workflow:
• Batch Inference to compute the posterior probabilities
• Sort the data from the highest to the lowest posterior probabilities of your positive state
• Let P be the actual number of positive states in your data set
• Let S be the size of your data set
• Let P_i be the sum of the True Positives when processing the first i sorted instances
• Let N_i be the sum of the False Positives when processing the first i sorted instances
• TPR_i = P_i/P
• FPR_i = N_i/(S-P)

Hope this helps,
Dan
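The steps above can be sketched in a few lines of Python, outside BayesiaLab. This is only an illustration: it assumes the posterior probabilities have already been exported (for example after Batch Inference), and the function name `roc_points` and its arguments are hypothetical. Ties between equal posteriors are not treated specially.

```python
def roc_points(posteriors, labels):
    """Return the (FPR_i, TPR_i) pairs described in the workflow.

    posteriors: posterior probability of the positive state, one per row
    labels: True when the actual state of the row is the positive state
    """
    s = len(labels)   # S: size of the data set
    p = sum(labels)   # P: actual number of positive states
    if p == 0 or p == s:
        raise ValueError("need at least one positive and one negative row")

    # Sort row indices from the highest to the lowest posterior probability.
    order = sorted(range(s), key=lambda k: posteriors[k], reverse=True)

    points = [(0.0, 0.0)]
    tp = fp = 0       # P_i and N_i after processing the first i rows
    for k in order:
        if labels[k]:
            tp += 1
        else:
            fp += 1
        points.append((fp / (s - p), tp / p))  # (FPR_i, TPR_i)
    return points
```

Calling `roc_points([0.9, 0.8, 0.3, 0.1], [True, False, True, False])` walks through the four rows in sorted order and ends at the point (1.0, 1.0).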
Multinet option (Multiple Clustering)
This new algorithm is an option in the Clustering tool. You thus have the same features as the original algorithm.
Multinet option (Multiple Clustering)
Hi Andrea,

The original clustering algorithm used in BayesiaLab is based on a Naive structure (the latent variable that represents the segmentation is the parent of all the variables used for clustering) and the Expectation-Maximization algorithm. The Naive structure thus implies the conditional independence of the clustering variables given the latent variable. The Multinet algorithm is a new algorithm that relaxes this conditional independence hypothesis. Instead of using a Naive structure, the algorithm uses an Augmented Naive structure, in which the intra-cluster conditional dependencies between the children are represented. Each cluster thus represents instances that not only "look similar" (Naive) but also "behave similarly" (Augmented Naive).
Supervised Learning --> No network no matter which method used?
This means that either the relationships with your target are too weak with respect to the size of your dataset, or the distribution of your target is very unbalanced.

You can thus try to reduce the structural coefficient (Edit | Edit Structural Coefficient), or try to use the stratification (Learning | Stratification).

Hope this helps,
Dan
Avoiding overfitting to common states
Hi Steve,

Just to let you know that we have just released version 6.0.5, which comes with the option to use the stratification for the parameter estimation.

Best,
Dan
Avoiding overfitting to common states
Indeed, the stratification only changes the marginal distributions of the selected set of nodes in order to get a more complex model. However, at the end of the learning algorithms, the parameters are estimated on the unstratified data. We will add an option to allow the estimation of the parameters on the stratified dataset. In the meantime, you can try to sample your dataset or, even better, associate a higher weight with the data corresponding to the less likely state.

Hope this helps.
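As an illustration of the weighting suggestion, here is a minimal sketch that computes one weight per row outside BayesiaLab. Inverse-frequency weighting is an assumption chosen for the example (it is one common scheme, not necessarily what you should use for your data), and the function name `balancing_weights` is hypothetical. The resulting column could then be used as the weight variable of the dataset.

```python
from collections import Counter


def balancing_weights(target_values):
    """Give each row a weight so that every target state has the same total weight.

    Rows of a rare state receive a weight above 1, rows of a common state
    a weight below 1, and the weights still sum to the number of rows.
    """
    counts = Counter(target_values)
    n_rows = len(target_values)
    n_states = len(counts)
    return [n_rows / (n_states * counts[v]) for v in target_values]
```

With 8 rows in state "no" and 2 rows in state "yes", each "no" row gets a weight of 0.625 and each "yes" row a weight of 2.5, so both states carry a total weight of 5.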
ROC Cut Point?
The option "Evaluate all states" carries out the targeted evaluation for each state of your variable. Obviously, if you are only interested in the confusion matrix and its related metrics, this is not useful, as the matrix does not change. However, the Gains, Lift, ROC, and Calibration curves are state-specific.

The maximum set of evidence is for evaluating the quality of your model in the context of an Adaptive Questionnaire, where you are not using all the variables for the computation of the posterior, but only the n most informative variables.

The uniform posterior is usually generated when the observations you are using are not compatible with your network (i.e. you have learned a deterministic relation that is only true in your training set). To prevent this problem, you have to use a non-informative prior when learning your model, by using the option "Edit | Edit Smooth Probability Estimation". This will add 1 (default value) virtual occurrence to your database, spread uniformly across your joint probability distribution, which states that everything is possible.
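To show why a uniformly spread virtual occurrence removes impossible (zero-probability) cases, here is a minimal sketch for a single conditional probability table. It is an illustration of the general idea only, not BayesiaLab's implementation: the function name `smoothed_cpt` is hypothetical, and splitting the virtual occurrence equally over the cells of one table is an assumption about how the uniform spread over the joint distribution translates to that table.

```python
def smoothed_cpt(counts, virtual_occurrences=1.0):
    """Estimate P(child | parents) from counts plus a uniform prior.

    counts[j][i] is the number of rows with parent configuration j and
    child state i. The virtual occurrences are split equally over all cells.
    """
    n_configs = len(counts)
    n_states = len(counts[0])
    per_cell = virtual_occurrences / (n_configs * n_states)

    table = []
    for row in counts:
        total = sum(row) + n_states * per_cell
        table.append([(n + per_cell) / total for n in row])
    return table
```

For `counts = [[10, 0], [0, 0]]`, the state never seen in the first row gets a small positive probability instead of 0, and the parent configuration never observed in the data gets a uniform distribution.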
ROC Cut Point?
By default, it is indeed the maximum likelihood criterion that is used for the Targeted Evaluation. However, you can set your own threshold manually, or even generate multiple thresholds to compare their performances (see the screenshot below).

Please also note that the evaluation curves are interactive. For example, if you click on the Gains Curve, you will get the probability p that yields the corresponding precision (90% with a threshold of 0.25 in the example below).

Hope this helps.
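The difference between the two decision rules can be sketched as follows. This is a hypothetical illustration that works on an exported posterior distribution, not a BayesiaLab function; the names `predict_max_likelihood` and `predict_with_threshold` are made up for the example.

```python
def predict_max_likelihood(posterior):
    """Pick the state with the highest posterior probability."""
    return max(posterior, key=posterior.get)


def predict_with_threshold(posterior, positive_state, threshold):
    """Predict the positive state as soon as its posterior reaches the threshold.

    Otherwise fall back to the most probable of the remaining states.
    """
    if posterior[positive_state] >= threshold:
        return positive_state
    others = {s: p for s, p in posterior.items() if s != positive_state}
    return max(others, key=others.get)
```

With a posterior of `{"yes": 0.3, "no": 0.7}`, the maximum likelihood rule predicts "no", whereas a threshold of 0.25 on "yes" predicts "yes".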
Function Node Inference Function: Standard Deviation
Our APIs are Java libraries. We have three APIs that allow creating models, doing inference, and carrying out some structural learning via Java programming. Please see http://library.bayesia.com/display/BlabC/Bayesia+Engine+API for a description of the Modeling and Inference APIs (the learning one is brand new and is still under development). As for your credible interval problem, you're right, there is unfortunately no easy way to automate this within BayesiaLab.
Function Node Inference Function: Standard Deviation
The Junction Tree (JT) cannot perform this calculation because the network has been machine-learned, i.e. it is not parametric. As far as I know, there is thus no other solution than sampling your associated continuous dataset. As mentioned before, the direct filtering algorithm will ignore the network structure. As for the 4th step of your workflow, there is indeed no way to directly use the function nodes for that. Your function requires iterating over the states of your node, and you need the API for that.
Function Node Inference Function: Standard Deviation
If the nodes are not directly connected, the probability distributions resulting from the filtering of your data will not be the same as those obtained with the Junction Tree inference. Filtering is a selection and thus simulates a direct link between the variables. Suppose you have two unconnected variables in your network: setting evidence in the Junction Tree on one variable will not impact the distribution of the second one, whereas that distribution can change with data filtering.
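A toy example makes the difference concrete. The data below are invented for the illustration, and the "network" is simply assumed to contain two nodes A and B with no arc between them, so that evidence on A leaves B at its marginal distribution.

```python
from collections import Counter


def distribution(values):
    """Relative frequency of each state in a list of observations."""
    counts = Counter(values)
    return {state: c / len(values) for state, c in counts.items()}


# Invented data set with two variables A and B (one tuple per row).
rows = [("a1", "b1"), ("a1", "b1"), ("a1", "b2"),
        ("a2", "b2"), ("a2", "b2"), ("a2", "b1")]

# Network without an arc between A and B: evidence on A does not change B,
# so the posterior of B stays at the marginal estimated from all rows.
network_posterior_b = distribution([b for _, b in rows])

# Data filtering: keep only the rows where A = a1, then look at B.
filtered_b = distribution([b for a, b in rows if a == "a1"])

print(network_posterior_b)
print(filtered_b)
```

Here the network keeps B at 50% / 50% after observing A = a1, while filtering the rows gives roughly 67% / 33%, as if A and B were directly linked.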
Function Node Inference Function: Standard Deviation
Yes, I meant posterior distributions. We can implement a kind of filtering on your data based on the entered evidence, but this will not be equivalent to the posterior distribution you have in the monitors, unless the nodes on which you enter evidence are directly connected to the node for which you want to get the credible interval.

Dan
Function Node Inference Function: Standard Deviation
Hi Alden,

I like the idea of discretization for getting the confidence interval of any kind of distribution. This is indeed something we can easily do when data is available. However, this would not be possible for conditional distributions. Would that still be useful for you?

Dan
Function Node Inference Function: Standard Deviation
Adding StdDev(v) to the "Inference Functions" is indeed a good idea. It will be added in the next release. This should help you compute your credible intervals. Thanks.
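For reference, the quantity in question can be sketched for a discretized node as follows. This is a generic calculation, not the implementation of StdDev(v) in BayesiaLab; it assumes each state has one representative numeric value, and the function name `mean_and_std` is hypothetical.

```python
import math


def mean_and_std(state_values, probabilities):
    """Mean and standard deviation of a discrete distribution.

    state_values: one representative numeric value per state
    probabilities: probability of each state (must sum to 1)
    """
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    mean = sum(v * p for v, p in zip(state_values, probabilities))
    variance = sum(p * (v - mean) ** 2
                   for v, p in zip(state_values, probabilities))
    return mean, math.sqrt(variance)
```

For two states with values 0 and 10 and probabilities 0.5 each, the mean is 5 and the standard deviation is 5.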