Library
To what degree are the results of the data clustering algorithm in BayesiaLab dependent on the specified random seed? I am finding that running the same analysis multiple times yields a different number of segments each time. Is there a recommended number of trials to find a stable solution?
Library
Like most other data clustering algorithms, the (proprietary) BayesiaLab clustering algorithm is stochastic. Unless you use the same random seed, the results can change from run to run. The algorithm starts with a random solution and then uses an Expectation-Maximization algorithm to incrementally optimize that solution.

Any instability of results usually stems from the number of variables and/or from weak relationships between the variables:

- Data Clustering in BayesiaLab consists of finding a compact representation/summary of the Joint Probability Distribution (JPD) defined by all the variables. The size of the JPD (a hypercube) grows exponentially with the number of variables. So, if there are too many variables, this hypercube will be very large and may contain many "local optima". As a result, the algorithm can converge toward different solutions.
- When the probabilistic relationships between the variables are too weak, the samples are spread all over the hypercube without forming any (easily identifiable) cluster.

If there are too many variables, reduce their number:
1. Use one of the unsupervised structural learning algorithms, e.g., Maximum Weight Spanning Tree.
2. Compute the Node Force (Analysis | Graphic | Node Force).
3. Exclude/delete the nodes with the lowest Node Force.

Also, increase the number of trials for finding the optimal number of states (e.g., 50 if the learning time is not too long); see the sketch below for the general idea.

Choosing the ideal clustering is ultimately subjective. In practice, a good clustering is one that you can communicate and that is "actionable".
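The following is a minimal sketch, not BayesiaLab's proprietary algorithm: it uses a generic EM-based mixture model from scikit-learn (GaussianMixture, an assumption standing in for any stochastic clustering method) to illustrate why the random seed matters and how repeating the analysis over several trials helps you judge stability.

```python
# Illustrative sketch only (generic EM clustering, not BayesiaLab's algorithm):
# shows how the random seed used for initialization can change the result
# from run to run, and how multiple trials reveal local optima.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic data with weak separation between groups, which tends to
# produce many local optima for EM, as described above.
X, _ = make_blobs(n_samples=500, centers=4, cluster_std=3.0, random_state=0)

log_likelihoods = []
for seed in range(10):                        # 10 trials, each with a different seed
    gm = GaussianMixture(n_components=4, n_init=1, random_state=seed)
    gm.fit(X)
    log_likelihoods.append(gm.lower_bound_)   # per-sample log-likelihood bound at convergence

# If the solutions were stable, these scores would be (nearly) identical;
# a spread indicates convergence to different local optima.
print("Log-likelihood per trial:", np.round(log_likelihoods, 3))

# Keeping the best of many restarts is the usual remedy; n_init here plays a
# role analogous to the "number of trials" setting mentioned above.
gm_best = GaussianMixture(n_components=4, n_init=50, random_state=0).fit(X)
print("Best-of-50 log-likelihood:", round(gm_best.lower_bound_, 3))
```

With weakly separated data the per-trial scores typically differ, which is the same symptom as getting a different number of segments on each run; increasing the number of restarts (and reducing the number of variables) narrows that spread.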