Kullback-Leibler for Bayesian networks
In [1]:
import os
%matplotlib inline
from pylab import *
import matplotlib.pyplot as plt
import pyAgrum and pyAgrum.lib.notebook (for … notebooks :-) )
In [2]:
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
Create a first BN : bn
In [3]:
bn=gum.loadBN("res/asia.bif")
# randomly re-generate parameters for every Conditional Probability Table
bn.generateCPTs()
bn
Out[3]:
Create a second BN : bn2
In [4]:
bn2=gum.loadBN("res/asia.bif")
bn2.generateCPTs()
bn2
Out[4]:
bn vs bn2 : different parameters
In [5]:
gnb.flow.row(bn.cpt(3),bn2.cpt(3),
captions=["a CPT in bn","same CPT in bn2 (with different parameters)"])
a CPT in bn:
0.8859 | 0.1141
0.6475 | 0.3525
same CPT in bn2 (with different parameters):
0.5543 | 0.4457
0.6152 | 0.3848
Exact and (Gibbs) approximated KL-divergence
Computing a KL-divergence only requires that the two distributions be defined on the same domain (same variables, etc.).
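To make the "same domain" requirement concrete, here is a minimal pure-Python sketch of a KL-divergence between two small discrete distributions. It does not use pyAgrum: the function `kl_divergence`, the dictionaries `P` and `Q`, and the choice of base-2 logarithms are all assumptions made for illustration.

```python
import math

def kl_divergence(p, q, base=2.0):
    """KL(P||Q) for two discrete distributions given as dicts state -> probability."""
    # Both distributions must be defined over exactly the same states.
    if set(p) != set(q):
        raise ValueError("P and Q must be defined on the same set of states")
    total = 0.0
    for state, p_x in p.items():
        if p_x == 0.0:
            continue  # a term with p(x) = 0 contributes nothing
        if q[state] == 0.0:
            return math.inf  # P puts mass where Q has none
        total += p_x * math.log(p_x / q[state], base)
    return total

# Hypothetical distributions over the same two states.
P = {"yes": 0.5, "no": 0.5}
Q = {"yes": 0.9, "no": 0.1}

kl_pq = kl_divergence(P, Q)
kl_qp = kl_divergence(Q, P)
print(kl_pq, kl_qp)  # the two directions give different values
```

The divergence is not symmetric, which is why results for both directions are usually reported side by side.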
Exact KL
In [6]:
g1=gum.ExactBNdistance(bn,bn2)
print(g1.compute())
{'klPQ': 7.443291682334482, 'errorPQ': 0, 'klQP': 4.9202095271714175, 'errorQP': 0, 'hellinger': 1.1555943226920256, 'bhattacharya': 1.1017144533663605, 'jensen-shannon': 0.7594743897090905}
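The result dictionary also reports Hellinger, Bhattacharya and Jensen-Shannon values. The sketch below gives common textbook definitions on plain lists of probabilities; which exact definitions and logarithm bases pyAgrum uses is an assumption here, and all function names are illustrative. One hint supports the assumption: with an unnormalised Hellinger distance H and a natural-log Bhattacharya distance B, the identity H² = 2(1 − exp(−B)) holds, and the two values printed above agree with it closely.

```python
import math

def hellinger(p, q):
    # Unnormalised form (assumption): sqrt(sum (sqrt(p) - sqrt(q))^2), at most sqrt(2).
    return math.sqrt(sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def bhattacharya(p, q):
    # Minus the natural log of the Bhattacharyya coefficient (assumption).
    return -math.log(sum(math.sqrt(a * b) for a, b in zip(p, q)))

def kl_bits(p, q):
    # KL(P||Q) in bits; assumes q > 0 wherever p > 0.
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def jensen_shannon(p, q):
    # Symmetrised divergence through the mixture M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return 0.5 * kl_bits(p, m) + 0.5 * kl_bits(q, m)

# Hypothetical distributions, not taken from the networks above.
p = [0.5, 0.5]
q = [0.9, 0.1]
h = hellinger(p, q)
b = bhattacharya(p, q)
js = jensen_shannon(p, q)
print(h, b, js)

# Consistency check on the values printed by the exact computation above.
reported_hellinger = 1.1555943226920256
reported_bhattacharya = 1.1017144533663605
implied_hellinger = math.sqrt(2 * (1 - math.exp(-reported_bhattacharya)))
print(implied_hellinger, reported_hellinger)
```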
If the models are not on the same domain :
In [7]:
bn_different_domain=gum.loadBN("res/alarm.dsl")
# g=gum.ExactBNdistance(bn,bn_different_domain) # a KL-divergence between asia and alarm ... :(
#
# would cause
#---------------------------------------------------------------------------
#OperationNotAllowed Traceback (most recent call last)
#
#OperationNotAllowed: this operation is not allowed : KL : the 2 BNs are not compatible (not the same vars : visit_to_Asia?)
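The kind of compatibility check that leads to this error can be sketched on plain dictionaries. This is only an illustration of the idea, not pyAgrum's implementation: `check_same_domain` and the two small variable tables are hypothetical, and the variable names and labels in them are made up for the example.

```python
def check_same_domain(vars_p, vars_q):
    """Raise ValueError unless both models have the same variables and labels.

    vars_p, vars_q: dict mapping variable name -> tuple of labels.
    """
    only_in_one = sorted(set(vars_p) ^ set(vars_q))
    if only_in_one:
        raise ValueError("not the same vars: " + ", ".join(only_in_one))
    for name, labels in vars_p.items():
        if tuple(labels) != tuple(vars_q[name]):
            raise ValueError("not the same labels for: " + name)

# Hypothetical, heavily shortened variable tables.
asia_like = {"visit_to_Asia": ("0", "1"), "smoking": ("0", "1")}
alarm_like = {"HISTORY": ("TRUE", "FALSE")}

check_same_domain(asia_like, dict(asia_like))  # same domain: no error

try:
    check_same_domain(asia_like, alarm_like)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
print(error_message)
```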
Gibbs-approximated KL
In [8]:
g=gum.GibbsBNdistance(bn,bn2)
g.setVerbosity(True)
g.setMaxTime(120)
g.setBurnIn(5000)
g.setEpsilon(1e-7)
g.setPeriodSize(500)
In [9]:
print(g.compute())
print("Computed in {0} s".format(g.currentTime()))
{'klPQ': 7.419965195684456, 'errorPQ': 0, 'klQP': 4.735746557327108, 'errorQP': 0, 'hellinger': 1.1462640194030809, 'bhattacharya': 1.0995583915974099, 'jensen-shannon': 0.747969168578439}
Computed in 1.2953540000000001 s
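The idea behind a sampling-based approximation can be sketched without pyAgrum: draw states from P and average log(p(x)/q(x)). The sketch below uses independent draws from an explicit distribution rather than Gibbs sampling, so it is a simplification and not the algorithm used above; the distributions, function names and sample size are assumptions for illustration.

```python
import math
import random

def exact_kl(p, q):
    # KL(P||Q) in bits by full enumeration; assumes q > 0 wherever p > 0.
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

def sampled_kl(p, q, n_samples, seed=0):
    # Monte Carlo estimate: average of log2(p(x)/q(x)) over draws x ~ P.
    rng = random.Random(seed)
    states = list(range(len(p)))
    draws = rng.choices(states, weights=p, k=n_samples)
    return sum(math.log2(p[x] / q[x]) for x in draws) / n_samples

# Hypothetical distributions over three states.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

exact = exact_kl(p, q)
estimate = sampled_kl(p, q, n_samples=20000)
print(exact, estimate)
```

As in the run above, the sampled value should land close to the exact one without matching it digit for digit.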
In [10]:
print("--")
print(g.messageApproximationScheme())
print("--")
print("Computation time: {0}".format(g.currentTime()))
print("Number of iterations: {0}".format(g.nbrIterations()))
--
stopped with epsilon=1e-07
--
Computation time: 1.2953540000000001
Number of iterations: 384000
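How a burn-in, a period size and an epsilon can interact to stop a sampler is sketched below. This is a simplified assumption about such a stopping rule, not aGrUM's actual approximation scheme: the function `run_until_stable`, its return values and the test distributions are hypothetical, and independent draws stand in for Gibbs sampling.

```python
import math
import random

def run_until_stable(p, q, epsilon, period_size, burn_in, max_iter, seed=0):
    """Estimate KL(P||Q) in bits; test convergence every period_size kept samples."""
    rng = random.Random(seed)
    states = list(range(len(p)))
    total, count = 0.0, 0
    previous = None
    history = []  # change of the estimate between consecutive checkpoints
    for iteration in range(1, max_iter + 1):
        x = rng.choices(states, weights=p)[0]
        if iteration <= burn_in:
            continue  # burn-in draws are discarded (only meaningful for a real Markov chain)
        total += math.log2(p[x] / q[x])
        count += 1
        if count % period_size == 0:
            estimate = total / count
            if previous is not None:
                change = abs(estimate - previous)
                history.append(change)
                if change < epsilon:
                    return estimate, iteration, "stopped with epsilon", history
            previous = estimate
    return total / max(count, 1), max_iter, "stopped with max iterations", history

# Hypothetical distributions and settings.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
estimate, iterations, message, history = run_until_stable(
    p, q, epsilon=1e-3, period_size=500, burn_in=1000, max_iter=200000)
print(message, iterations, estimate)
```

A smaller epsilon or a larger period size makes the rule stricter, at the cost of more iterations before it triggers.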
In [11]:
p=plot(g.history(), 'g')
Animation of Gibbs KL
Since it can be difficult to know what happens while an approximation algorithm runs, pyAgrum lets you follow the iterations in an animated matplotlib figure.
In [12]:
g=gum.GibbsBNdistance(bn,bn2)
g.setMaxTime(60)
g.setBurnIn(500)
g.setEpsilon(1e-7)
g.setPeriodSize(5000)
In [13]:
gnb.animApproximationScheme(g) # logarithmic scale for Y
g.compute()
Out[13]:
{'klPQ': 7.438386018548863,
'errorPQ': 0,
'klQP': 4.93817439050127,
'errorQP': 0,
'hellinger': 1.1570436427400401,
'bhattacharya': 1.095336026622168,
'jensen-shannon': 0.7618873346309015}