# Comparing BNs

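def dict2html(di1,di2=None):
    # Render one or two dicts as HTML lines of the form "<b>key</b>:value"
    res= "<br/>".join([f"<b>{k:15}</b>:{v}" for k,v in di1.items()])
    if di2 is not None:
        # Separate the second dict from the first with a blank line
        res+="<br/><br/>"
        res+= "<br/>".join([f"<b>{k:15}</b>:{v}" for k,v in di2.items()])
    return res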

import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
import pyAgrum.lib.bn_vs_bn as gcm


## How to compare two BNs

pyAgrum allows you to compare BNs in several ways. This notebook shows some of them:

- a graphical diff between the two BNs
- some scores derived from recall and precision
- distance measures (for more, see notebook 26-klForBNs)

### Between two different structures

bn1=gum.fastBN("A->B->C->D->E<-A->F")
bn2=gum.fastBN("A->B<-C->D->E<-A;F->E")
cmp=gcm.GraphicalBNComparator(bn1,bn2)
kl=gum.ExactBNdistance(bn1,bn2) # exact (brute-force) computation is feasible because the BNs are small
gnb.sideBySide(bn1,bn2,gnb.getBNDiff(bn1,bn2),dict2html(cmp.scores(),cmp.hamming()),cmp.equivalentBNs(),dict2html(kl.compute()),
captions=['bn1','bn2','graphical diff','Scores','equivalent ?','distances'],valign="bottom")

Output of the cell (the three graphs bn1, bn2 and graphical diff are not reproduced here):

- Scores: count: {'tp': 4, 'tn': 22, 'fp': 2, 'fn': 2}, recall: 0.6666666666666666, precision: 0.6666666666666666, fscore: 0.6666666666666666, dist2opt: 0.47140452079103173, hamming: 2, structural hamming: 4
- equivalent ?: B has different parents in the two bns whose names are in {'C'}
- distances: klPQ: 4.196541750177101, errorPQ: 0, klQP: 3.2059483428285143, errorQP: 0, hellinger: 0.9677600218089104, bhattacharya: 0.6316377360196375, jensen-shannon: 0.5606728422545296

The logic for the arcs of the graphical diff is the following. When comparing bn1 with bn2 (in that order):

- full black line: the arc is common to both
- full red line: the arc is common but inverted in bn2
- dotted black line: the arc is added in bn2
- dotted red line: the arc is removed in bn2
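The arc categories above can be sketched in plain Python, without pyAgrum. This is only an illustration: the helpers `classify_arcs` and `confusion_counts` are hypothetical names, not part of pyAgrum, and the arc sets are copied by hand from the two `fastBN` specifications of this section. Counting over all ordered pairs of distinct nodes is an assumption, chosen because it reproduces the `count` values of this example.

```python
from itertools import permutations

# Arcs written out by hand from "A->B->C->D->E<-A->F" and "A->B<-C->D->E<-A;F->E".
arcs1 = {("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("A", "E"), ("A", "F")}
arcs2 = {("A", "B"), ("C", "B"), ("C", "D"), ("D", "E"), ("A", "E"), ("F", "E")}
nodes = "ABCDEF"


def classify_arcs(ref, other):
    """Sort arcs into the four categories of the graphical diff (hypothetical helper)."""
    common = ref & other
    # An arc of ref is "reversed" when other contains the same arc in the opposite direction.
    reversed_arcs = {(a, b) for (a, b) in ref if (b, a) in other}
    removed = (ref - other) - reversed_arcs
    added = {(a, b) for (a, b) in other - ref if (b, a) not in ref}
    return {"common": common, "reversed": reversed_arcs, "added": added, "removed": removed}


def confusion_counts(ref, other, nodes):
    """Count tp/tn/fp/fn over all ordered pairs of distinct nodes (assumed convention)."""
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for pair in permutations(nodes, 2):
        in_ref, in_other = pair in ref, pair in other
        if in_ref and in_other:
            counts["tp"] += 1
        elif in_other:
            counts["fp"] += 1
        elif in_ref:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


diff = classify_arcs(arcs1, arcs2)
counts = confusion_counts(arcs1, arcs2, nodes)
print(diff["reversed"], diff["added"], diff["removed"])
print(counts)  # {'tp': 4, 'tn': 22, 'fp': 2, 'fn': 2}
```

Under this convention a reversed arc counts once as a false negative (the bn1 direction is missing) and once as a false positive (the bn2 direction is extra).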

For the scores:

- precision and recall are computed considering bn1 as the reference
- $$Fscore=\frac{2\cdot recall\cdot precision}{recall+precision}$$ is the harmonic mean of precision and recall
- $$dist2opt=\sqrt{(1-precision)^2+(1-recall)^2}$$ is the Euclidean distance to the ideal point (precision=1, recall=1)
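As a quick check of these formulas, the sketch below recomputes the scores from the `count` values of the first example. The function name `scores_from_counts` is hypothetical, and the definitions of recall and precision in terms of tp, fp and fn are the usual ones, assumed here to match what pyAgrum computes.

```python
import math

# Counts reported for the first example (bn1 is the reference).
counts = {"tp": 4, "tn": 22, "fp": 2, "fn": 2}


def scores_from_counts(c):
    """Recompute the scores from confusion counts (hypothetical helper)."""
    recall = c["tp"] / (c["tp"] + c["fn"])      # share of reference arcs that were found
    precision = c["tp"] / (c["tp"] + c["fp"])   # share of proposed arcs that are correct
    fscore = 2 * recall * precision / (recall + precision)
    dist2opt = math.sqrt((1 - precision) ** 2 + (1 - recall) ** 2)
    return {"recall": recall, "precision": precision,
            "fscore": fscore, "dist2opt": dist2opt}


scores = scores_from_counts(counts)
print(scores)
```

With recall and precision both equal to 2/3, the F-score is also 2/3 and dist2opt is $\sqrt{2}/3 \approx 0.4714$, in agreement with the values displayed for this example.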

`equivalentBNs()` returns "OK" if the two BNs are equivalent, or the reason for non-equivalence otherwise.

Finally, `ExactBNdistance` (the brute-force computation, BruteForceKL) computes several distances at once: I-projection, M-projection, Hellinger and Bhattacharya. For more complex BNs, a Gibbs-sampling version (GibbsKL, `GibbsBNdistance` in recent pyAgrum versions) approximates those distances. Of course, the computations are then much slower.
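To make these distances concrete, here is a minimal sketch on two small discrete distributions, written without pyAgrum. For BNs, `p` and `q` would play the role of the two joint distributions. Several choices are assumptions rather than statements about pyAgrum's implementation: the function names are hypothetical, the base-2 logarithm in the KL divergence is a guess, and the Hellinger distance is taken without the $1/\sqrt{2}$ normalization because that form agrees with the values displayed in this notebook ($hellinger^2 \approx 2(1-e^{-bhattacharya})$).

```python
import math


def kl(p, q, log=math.log2):
    """KL divergence of q from p; assumes q[i] > 0 wherever p[i] > 0 (log base is an assumption)."""
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def bhattacharya(p, q):
    """Bhattacharya distance: minus the natural log of the Bhattacharya coefficient."""
    coefficient = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    return -math.log(coefficient)


def hellinger(p, q):
    """Unnormalized Hellinger distance (no 1/sqrt(2) factor, an assumption)."""
    return math.sqrt(sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q)))


p = [0.5, 0.5]
q = [0.9, 0.1]

print("klPQ        :", kl(p, q))
print("klQP        :", kl(q, p))   # differs from klPQ: KL is not symmetric
print("hellinger   :", hellinger(p, q))
print("bhattacharya:", bhattacharya(p, q))
```

The two KL values differ because the divergence is not symmetric, which is why both directions (klPQ and klQP) are reported. All three measures are 0 when the two distributions are identical, up to floating-point rounding.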

### Same structure, different parameters

bn1=gum.fastBN("A->B->C->D->E<-A->F")
bn2=gum.fastBN("A->B->C->D->E<-A->F")
cmp=gcm.GraphicalBNComparator(bn1,bn2)
kl=gum.ExactBNdistance(bn1,bn2) # exact (brute-force) computation is feasible because the BNs are small
gnb.sideBySide(bn1,bn2,gnb.getBNDiff(bn1,bn2),dict2html(cmp.scores(),cmp.hamming()),cmp.equivalentBNs(),dict2html(kl.compute()),
captions=['bn1','bn2','graphical diff','Scores','equivalent ?','distances'],valign="bottom")

Output of the cell (the three graphs bn1, bn2 and graphical diff are not reproduced here):

- Scores: count: {'tp': 6, 'tn': 24, 'fp': 0, 'fn': 0}, recall: 1.0, precision: 1.0, fscore: 1.0, dist2opt: 0.0, hamming: 0, structural hamming: 0
- equivalent ?: Different CPTs for A
- distances: klPQ: 2.1308173097251513, errorPQ: 0, klQP: 2.051478487550463, errorQP: 0, hellinger: 0.7949618385190523, bhattacharya: 0.37977128326511445, jensen-shannon: 0.40904630057939284

### Identical BNs

bn1=gum.fastBN("A->B->C->D->E<-A->F")
bn2=bn1
cmp=gcm.GraphicalBNComparator(bn1,bn2)
kl=gum.ExactBNdistance(bn1,bn2) # exact (brute-force) computation is feasible because the BNs are small
gnb.sideBySide(bn1,bn2,gnb.getBNDiff(bn1,bn2),dict2html(cmp.scores(),cmp.hamming()),cmp.equivalentBNs(),dict2html(kl.compute()),
captions=['bn1','bn2','graphical diff','Scores','equivalent ?','distances'],valign="bottom")

Output of the cell (the three graphs bn1, bn2 and graphical diff are not reproduced here):

- Scores: count: {'tp': 6, 'tn': 24, 'fp': 0, 'fn': 0}, recall: 1.0, precision: 1.0, fscore: 1.0, dist2opt: 0.0, hamming: 0, structural hamming: 0
- equivalent ?: OK
- distances: klPQ: 0.0, errorPQ: 0, klQP: 0.0, errorQP: 0, hellinger: 0.0, bhattacharya: 1.1102230246251565e-16, jensen-shannon: 0.0

In the notebook Learning_DirichletPriorAndWeightedDatabase, you can find an interesting discussion of how those scores and distances can change.

