What algorithms does BayesiaLab use for learning network parameters?
The parameters of a network are computed using Maximum Likelihood Estimation, i.e., the probability of each event (cell) is set to the frequency observed in the dataset. Let's consider the simple network below:

Maximum Likelihood Estimation:

The marginal probability distribution of Pa is estimated as:
[latex:2maglmdt]\hat P(Pa=pa_i)=\frac{N(Pa=pa_i) }{\sum_j N(Pa=pa_j)}[/latex:2maglmdt]
where N(.) represents the number of occurrences of the specified configuration in the dataset.

The conditional probability distribution of X|Pa is estimated as:
[latex:2maglmdt]\hat P(X=x_i|Pa=pa_i)=\frac{N(X=x_i, Pa=pa_i) }{\sum_j N(X=x_j, Pa=pa_i) }[/latex:2maglmdt]

Maximum Likelihood Estimation with Priors:

Priors can also be taken into account when estimating the parameters. Priors reflect an analyst's a-priori knowledge of the domain, i.e., expert knowledge. See also [url=https://forums.bayesialab.com/viewtopic.php?f=5&t=12:2maglmdt]Prior Knowledge for Structural Learning[/url:2maglmdt].

These priors are expressed with an analyst-specified initial Bayesian network (structure and parameters), plus an analyst-specified Equivalent Number of Samples. The Equivalent Number of Samples represents the analyst's degree of confidence in the priors.

[latex:2maglmdt]\hat P(X=x_i|Pa=pa_i)=\frac{N(X=x_i, Pa=pa_i) + M_0 \times P_0(X=x_i, Pa=pa_i) }{\sum_j ( N(X=x_j, Pa=pa_i)+ M_0 \times P_0(X=x_j, Pa=pa_i)) }[/latex:2maglmdt]

where:
- [latex:2maglmdt]M_0[/latex:2maglmdt] is the degree of confidence in the prior.
- [latex:2maglmdt]P_0[/latex:2maglmdt] is the joint probability returned by the prior Bayesian network.

These two terms are used to generate virtual samples, which are subsequently combined with the observed samples from the dataset.

Information: Priors are defined by selecting Learning | Generate Virtual Database. The current Bayesian network is used to compute [latex:2maglmdt]P_0[/latex:2maglmdt], and a text field allows you to set [latex:2maglmdt]M_0[/latex:2maglmdt]. The existence of a Virtual Database is indicated by an icon in the lower-right corner of the graph window, next to the "real dataset" icon. Right-clicking the Virtual Database icon displays the structure of the prior knowledge that was used to generate the virtual samples. The virtual samples are combined with the observed ("real") samples during the learning process.

Information: Smoothed Probability Estimation allows you to define prior knowledge in which all the variables are marginally independent (a fully unconnected network) and the marginal probability distributions of all nodes are uniform. For instance, if the number of Virtual Occurrences is set to 1, one observation ("occurrence") is spread across the states of each node, essentially assigning a fraction of an observation to each node's states. [latex:2maglmdt]M_0[/latex:2maglmdt] specifies the number of Virtual Occurrences.
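To make the plain Maximum Likelihood estimates concrete, here is a minimal sketch in Python. The toy dataset, state names, and variable names are illustrative assumptions, not BayesiaLab's internals; the counts N(.) and the normalizations follow the two formulas above.

```python
from collections import Counter

# Hypothetical toy dataset: each record is (parent_state, child_state).
data = [
    ("a", "x"), ("a", "x"), ("a", "y"),
    ("b", "x"), ("b", "y"), ("b", "y"), ("b", "y"),
]

# N(Pa = pa): marginal counts of the parent variable.
parent_counts = Counter(pa for pa, _ in data)
total = sum(parent_counts.values())

# MLE of the marginal: P(Pa = pa) = N(Pa = pa) / sum_j N(Pa = pa_j)
p_parent = {pa: n / total for pa, n in parent_counts.items()}

# N(X = x, Pa = pa): joint counts, normalized per parent state
# to give the conditional P(X = x | Pa = pa).
joint_counts = Counter(data)
p_cond = {
    (pa, x): n / parent_counts[pa]
    for (pa, x), n in joint_counts.items()
}
```

With this toy data, P(Pa=a) = 3/7 and P(X=y | Pa=b) = 3/4, i.e., each cell is exactly its observed frequency.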
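The smoothed (prior-based) estimate can likewise be sketched for the special case of Smoothed Probability Estimation, where P_0 is uniform over the states of a node. The counts and state names are hypothetical; the formula is the prior-augmented estimator above with a uniform P_0.

```python
# Hypothetical counts N(X = x, Pa = pa) for one parent configuration.
counts = {"x1": 8, "x2": 2, "x3": 0}

M0 = 1.0          # Equivalent Number of Samples (Virtual Occurrences)
k = len(counts)
p0 = 1.0 / k      # uniform prior P0 spreads M0 evenly across the k states

# (N + M0 * P0) / sum_j (N_j + M0 * P0): virtual samples combined
# with the observed samples.
denom = sum(counts.values()) + M0
p_hat = {x: (n + M0 * p0) / denom for x, n in counts.items()}
```

Note that the unobserved state x3 now receives a small but non-zero probability (1/3 of one virtual occurrence out of 11 effective samples), which is the practical point of smoothing.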