Comparing BNs

Creative Commons License

aGrUM

interactive online version

In [1]:
def dict2html(di1,di2=None):
    res= "<br/>".join([f"<b>{k:15}</b>:{v}" for k,v in di1.items()])
    if di2 is not None:
        res+="<br/><br/>"
        res+= "<br/>".join([f"<b>{k:15}</b>:{v}" for k,v in di2.items()])
    return res


In [2]:
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
import pyAgrum.lib.bn_vs_bn as gcm

How to compare two BNs

PyAgrum allows you to compare BNs in several ways. This notebook show you some of them: - a graphical diff between the 2 BNs - some scores form recal and precision - distance measures (for more, see notebook 26-klForBNs for more)

Between two different structures

In [3]:
bn1=gum.fastBN("A->B->C->D->E<-A->F")
bn2=gum.fastBN("A->B<-C->D->E<-A;F->E")
cmp=gcm.GraphicalBNComparator(bn1,bn2)
kl=gum.ExactBNdistance(bn1,bn2) # bruteForce is possible car the BNs are small
gnb.sideBySide(bn1,bn2,gnb.getBNDiff(bn1,bn2),dict2html(cmp.scores(),cmp.hamming()),cmp.equivalentBNs(),dict2html(kl.compute()),
              captions=['bn1','bn2','graphical diff','Scores','equivalent ?','distances'],valign="bottom")
G B B C C B->C E E D D D->E F F A A A->B A->E A->F C->D
bn1
G B B E E D D D->E F F F->E A A A->B A->E C C C->B C->D
bn2
G A A B B A->B E E A->E F F A->F C C C->B D D C->D D->E F->E
graphical diff
count :{'tp': 4, 'tn': 22, 'fp': 2, 'fn': 2}
recall :0.6666666666666666
precision :0.6666666666666666
fscore :0.6666666666666666
dist2opt :0.47140452079103173

hamming :2
structural hamming:4
Scores
B has different parents in the two bns whose names are in {'C'}
equivalent ?
klPQ :4.196541750177101
errorPQ :0
klQP :3.2059483428285143
errorQP :0
hellinger :0.9677600218089104
bhattacharya :0.6316377360196375
jensen-shannon :0.5606728422545296
distances

The logic for the arcs of the graphical diff is the following. When comparaing bn1 with bn2 (in that order) : - full black line: the arc is common for both - full red line: the arc is common but inverted in bn2 - dotted black line: the arc is added in bn2 - dotted red line: the arc is removed in bn2

For the scores : - precision and recall are computed considering BN1 as the reference - \(Fscore=\frac{2\cdot recall\cdot precision}{recall+precision}\) is the weighted average of Precision and Recall. - \(dist2opt=\sqrt{(1-precision)^2+(1-recall)^2}\) represents the euclidian distance to the ideal(precision=1,recall=1)

EquivalentBN return “OK” if equivalent or a reason for non equivalence

Finally, BruteForceKL compute in the same time several distances : I-projection, M-projection, Hellinger and Bhattacharya. For more complex BNs, there exists a GibbsKL to approximate those distances. Of course, the computation are much slower.

Same structure, different parameters

In [4]:
bn1=gum.fastBN("A->B->C->D->E<-A->F")
bn2=gum.fastBN("A->B->C->D->E<-A->F")
cmp=gcm.GraphicalBNComparator(bn1,bn2)
kl=gum.ExactBNdistance(bn1,bn2) # bruteForce is possible car the BNs are small
gnb.sideBySide(bn1,bn2,gnb.getBNDiff(bn1,bn2),dict2html(cmp.scores(),cmp.hamming()),cmp.equivalentBNs(),dict2html(kl.compute()),
              captions=['bn1','bn2','graphical diff','Scores','equivalent ?','distances'],valign="bottom")
G B B C C B->C E E D D D->E F F A A A->B A->E A->F C->D
bn1
G B B C C B->C E E D D D->E F F A A A->B A->E A->F C->D
bn2
G A A B B A->B E E A->E F F A->F C C B->C D D C->D D->E
graphical diff
count :{'tp': 6, 'tn': 24, 'fp': 0, 'fn': 0}
recall :1.0
precision :1.0
fscore :1.0
dist2opt :0.0

hamming :0
structural hamming:0
Scores
Different CPTs for A
equivalent ?
klPQ :2.1308173097251513
errorPQ :0
klQP :2.051478487550463
errorQP :0
hellinger :0.7949618385190523
bhattacharya :0.37977128326511445
jensen-shannon :0.40904630057939284
distances

identical BNs

In [5]:
bn1=gum.fastBN("A->B->C->D->E<-A->F")
bn2=bn1
cmp=gcm.GraphicalBNComparator(bn1,bn2)
kl=gum.ExactBNdistance(bn1,bn2) # bruteForce is possible car the BNs are small
gnb.sideBySide(bn1,bn2,gnb.getBNDiff(bn1,bn2),dict2html(cmp.scores(),cmp.hamming()),cmp.equivalentBNs(),dict2html(kl.compute()),
              captions=['bn1','bn2','graphical diff','Scores','equivalent ?','distances'],valign="bottom")
G B B C C B->C E E D D D->E F F A A A->B A->E A->F C->D
bn1
G B B C C B->C E E D D D->E F F A A A->B A->E A->F C->D
bn2
G A A B B A->B E E A->E F F A->F C C B->C D D C->D D->E
graphical diff
count :{'tp': 6, 'tn': 24, 'fp': 0, 'fn': 0}
recall :1.0
precision :1.0
fscore :1.0
dist2opt :0.0

hamming :0
structural hamming:0
Scores
OK
equivalent ?
klPQ :0.0
errorPQ :0
klQP :0.0
errorQP :0
hellinger :0.0
bhattacharya :1.1102230246251565e-16
jensen-shannon :0.0
distances

In the notebook Learning_DirichletPriorAndWeightedDatabase, you can find an interresting discussion on how can change those scores and distance.

In [ ]:

In [ ]: