Library
Q1: I am attempting segmentation using BayesiaLab. Is there a way to generate and save cluster memberships in BayesiaLab, as in other segmentation tools?

Q2: This segmentation would be based on aggregated data using 29 variables with a base of 48 data points. Is this even feasible?
Library
Q1: Cluster Membership

BayesiaLab comes with various data clustering algorithms (Learning | Clustering). These algorithms work on a set of selected nodes and create a hidden variable, the Cluster node.

Once this new variable is created, you can:

1. Right-click the Cluster node (painted in white rather than blue, because it is a hidden node) to bring up the node contextual menu and select Imputation. You will be prompted to choose the imputation policy (https://forums.bayesialab.com/viewtopic.php?f=9&t=30), which in this case should be Choose the Values with the Maximum Probability. The Cluster node will then turn blue (it is no longer hidden). You can then save the imputed values (https://forums.bayesialab.com/viewtopic.php?f=9&t=29) with Data | Save Data.
2. Use Batch Labeling (Validation Mode: Inference | Batch Labeling). You will get the most probable state of your target variable (the Cluster node), along with the posterior probability of the predicted state.
3. Use Batch Inference (Validation Mode: Inference | Batch Inference). You will get the full posterior probability distribution of your target node.

Q2: Database Characteristics

The number of variables is not a problem. You can use either Data Clustering or Hierarchical Clustering.

The problem comes from the 48 data points. Clustering is feasible, but keep in mind that data clustering with BayesiaLab consists of summarizing the joint probability distribution of the associated variables with a few discrete states (the states of the Cluster node). In your case, the joint distribution is a hypercube of at least 2^29 cells (about 537 million, if every variable has only 2 states), and this joint distribution is sampled with only 48 points! The result can therefore be unstable.
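
To make the two points above concrete, here is a minimal sketch in plain Python. It is not BayesiaLab's API; the function names and the example posterior are hypothetical and exist only for illustration. The first function does the arithmetic behind the 2^29 figure. The second shows how a max-probability label relates to a full posterior distribution: Batch Inference corresponds to the whole distribution, while Batch Labeling and max-probability imputation keep only its most probable state.

```python
# Illustrative only: plain Python, not BayesiaLab code.
# Function names and the example posterior are assumptions made for this sketch.

def joint_cells(n_variables, states_per_variable=2):
    """Number of cells in the joint distribution of discrete variables."""
    return states_per_variable ** n_variables


def label_from_posterior(posterior):
    """Return (most probable state, its probability) from a posterior dict."""
    state = max(posterior, key=posterior.get)
    return state, posterior[state]


# Q2: 29 binary variables sampled with 48 data points.
cells = joint_cells(29)        # 2**29 = 536870912
points = 48
coverage = points / cells      # at most this fraction of cells can be observed

# Q1: hypothetical posterior over three cluster states for one data row.
posterior = {"C1": 0.15, "C2": 0.70, "C3": 0.15}
label, probability = label_from_posterior(posterior)

print(f"cells in joint: {cells}")
print(f"max fraction of cells observed: {coverage:.2e}")
print(f"max-probability label: {label} (p = {probability})")
```

Even in the best case, where every one of the 48 points falls in a different cell, fewer than one cell in ten million is observed, which is why the resulting clusters can change noticeably from one run or sample to the next.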