Q1: Which is the better measure to report, KL Divergence or Mutual Information?
Q2: Is it true that the mutual information of a variable with itself is 1?
Q1: Mutual Information vs KL Divergence

The Mutual Information between two variables X and Y is defined as follows:

[latex:3t3pf2uy]I(X,Y)=\sum_{x \in X}\sum_{y \in Y} p(x,y)\log_2 \frac{p(x,y)}{p(x)p(y)}[/latex:3t3pf2uy]

The KL Divergence compares two probability distributions, P and Q:

[latex:3t3pf2uy]D_{KL}(P({\cal X})\|Q({\cal X}))=\sum_{\cal X}P({\cal X})\log_2\frac{P({\cal X})}{Q({\cal X})}[/latex:3t3pf2uy]

In BayesiaLab, we use the KL Divergence to measure the strength of a direct relationship between two variables: P is then the Bayesian network with the link, and Q is the same network without the link.

The Mutual Information can be rewritten as a KL Divergence:

[latex:3t3pf2uy]I(X,Y)=D_{KL}(p(x,y)\|p(x)p(y))[/latex:3t3pf2uy]

Therefore, Mutual Information (I) and KL Divergence are identical when no spouses (co-parents) are involved in the measured relationship.

Example: let's take a network with two nodes, X and Z. Analyzing the relationship with Mutual Information (Validation Mode: Analysis | Visual | Arcs' Mutual Information) and with KL (Validation Mode: Analysis | Visual | Arc Force) returns the same value: 0.3436.

However, as soon as other variables are involved in the relationship as co-parents, the KL Divergence integrates them into the analysis, leading to a more precise result.

Let's take a deterministic example where Z is an Exclusive Or between X and Y, i.e. true when X and Y are different. Analyzing the relationships with Mutual Information (Validation Mode: Analysis | Visual | Arcs' Mutual Information) returns a graph where the mutual information between X and Z, and between Y and Z, is null in both cases. Indeed, X and Y do not have any impact on Z when they are analyzed separately. On the other hand, the force of the arcs computed with KL (Validation Mode: Analysis | Visual | Arc Force) perfectly reflects the joint deterministic effect of X and Y on Z.
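The XOR example above can be checked numerically. The sketch below (an illustration, not BayesiaLab's actual implementation) builds the joint distribution p(x, y, z) with Z = X XOR Y and shows that the marginal mutual information I(X;Z) is zero, while the conditional mutual information I(X;Z|Y), which is what the KL-based Arc Force reduces to for this structure when the co-parent Y is kept in the network, equals 1 bit:

```python
import math
from itertools import product

# Joint distribution for the XOR example: X, Y uniform and independent,
# Z = X XOR Y (deterministic). p(x, y, z) = 1/4 when z == x ^ y, else 0.
p = {(x, y, x ^ y): 0.25 for x, y in product((0, 1), repeat=2)}

def marginal(dist, idx):
    """Marginalize a joint dict {tuple: prob} onto the given axes."""
    out = {}
    for k, v in dist.items():
        key = tuple(k[i] for i in idx)
        out[key] = out.get(key, 0.0) + v
    return out

def mutual_information(dist, a, b):
    """I(A;B) in bits, for axis groups a and b of the joint dict."""
    pab = marginal(dist, a + b)
    pa, pb = marginal(dist, a), marginal(dist, b)
    mi = 0.0
    for k, v in pab.items():
        if v > 0:
            ka, kb = k[:len(a)], k[len(a):]
            mi += v * math.log2(v / (pa[ka] * pb[kb]))
    return mi

def conditional_mi(dist, a, b, c):
    """I(A;B|C) = I(A,C;B) - I(C;B), by the chain rule, in bits."""
    return mutual_information(dist, a + c, b) - mutual_information(dist, c, b)

# X alone tells us nothing about Z, so the marginal MI is zero...
print(mutual_information(p, (0,), (2,)))       # I(X;Z)   -> 0.0
# ...but given the co-parent Y, X fully determines Z.
print(conditional_mi(p, (0,), (2,), (1,)))     # I(X;Z|Y) -> 1.0
```

This matches the observations above: Arcs' Mutual Information reports null values for X-Z and Y-Z, while the KL-based Arc Force, which keeps co-parents in both compared networks, captures the full deterministic relationship.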
Q2: Normalized Mutual Information

Two clones will have a Normalized Mutual Information I_N(X, X) = 1, but not necessarily a Mutual Information I(X, X) = 1: the Mutual Information of a variable with itself equals its initial entropy H(X). You will get I(X, X) = 1 with a binary variable X that has a uniform marginal distribution, since H(X) = 1 bit in that case.
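A quick sketch of this point: for a variable and its clone, the joint p(x, x) equals p(x), so the MI sum collapses to the entropy. With a uniform binary variable, I(X, X) = H(X) = 1 bit; with a skewed binary variable, I(X, X) = H(X) < 1, while the normalized version I_N(X, X) = I(X, X) / H(X) is still 1:

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a distribution given as a list of probs."""
    return -sum(q * math.log2(q) for q in dist if q > 0)

def self_information(dist):
    """I(X, X): the MI sum over the clone's joint, which is p(x) on the
    diagonal, so each term is p * log2(p / (p * p)) = p * log2(1 / p)."""
    return sum(q * math.log2(q / (q * q)) for q in dist if q > 0)

# Uniform binary: I(X, X) = H(X) = 1 bit.
uniform = [0.5, 0.5]
print(self_information(uniform), entropy(uniform))  # 1.0 1.0

# Skewed binary: I(X, X) = H(X) < 1 bit, but I_N(X, X) = I(X, X) / H(X) = 1.
skewed = [0.9, 0.1]
i_xx, h_x = self_information(skewed), entropy(skewed)
print(round(i_xx, 4), round(i_xx / h_x, 4))
```

(The normalization by H(X) shown here is one common convention; BayesiaLab's exact normalization is assumed, not verified.)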
I cannot see the JPEG images related to the examples. What should I do?
Dan
The settings of the forum are set in such a way that you need to be logged in to be able to see the images.
Why does the "Search" functionality (top right corner of the page) not find any information related to mutual information? It replies with "The following words in your search query were ignored because they are too common words: mutual information", whereas queries such as "Divergence" or "normalized" return the right results.
This search problem is now solved. Thank you for your comment.