# Learning essential graphs

In [1]:

from pylab import *
import matplotlib.pyplot as plt

import os

import pyAgrum as gum
import pyAgrum.lib.notebook as gnb


## Compare learning algorithms

MIIC learns the essential graph (CPDAG, completed partially directed acyclic graph) directly from data. An essential graph is a PDAG (Partially Directed Acyclic Graph) that represents a whole Markov equivalence class of DAGs.
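To make the notion of equivalence class concrete: two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures (colliders whose two parents are not adjacent), and the essential graph keeps an arc directed only when every DAG of the class agrees on its orientation. The following is a minimal pure-Python sketch of that test on toy graphs. It does not use pyAgrum, and the helper names (`skeleton`, `v_structures`, `markov_equivalent`) are hypothetical, chosen for this illustration only.

```python
from itertools import combinations

def skeleton(arcs):
    """Undirected version of a DAG: each arc becomes an unordered pair."""
    return {frozenset(arc) for arc in arcs}

def v_structures(arcs):
    """Colliders a -> c <- b whose parents a and b are not adjacent."""
    skel = skeleton(arcs)
    parents = {}
    for tail, head in arcs:
        parents.setdefault(head, set()).add(tail)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found

def markov_equivalent(arcs1, arcs2):
    """Same skeleton and same v-structures (illustrative check only)."""
    return (skeleton(arcs1) == skeleton(arcs2)
            and v_structures(arcs1) == v_structures(arcs2))

# Toy DAGs given as lists of (tail, head) arcs
chain = [("A", "B"), ("B", "C")]            # A -> B -> C
reversed_chain = [("C", "B"), ("B", "A")]   # A <- B <- C
collider = [("A", "B"), ("C", "B")]         # A -> B <- C

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, collider))        # False
```

The two chains belong to the same class, so their essential graph would leave both edges undirected, whereas the collider forms a class of its own and keeps both arcs oriented.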

In [2]:

learner=gum.BNLearner("res/sample_asia.csv")
learner.useMIIC()
learner.useNMLCorrection()
print(learner)

Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : MIIC
Score          : BDeu  (Not used for constraint-based algorithms)
Correction     : NML  (Not used for score-based algorithms)
Prior          : -


In [3]:

gemiic=learner.learnEssentialGraph()
gnb.show(gemiic)


For the other methods, the essential graph can be obtained from the learned BN.

In [4]:

learner=gum.BNLearner("res/sample_asia.csv")
learner.useGreedyHillClimbing()
bnHC=learner.learnBN()
print(learner)
geHC=gum.EssentialGraph(bnHC)
geHC
gnb.sideBySide(bnHC,geHC)

Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : Greedy Hill Climbing
Score          : BDeu  (Not used for constraint-based algorithms)
Correction     : MDL  (Not used for score-based algorithms)
Prior          : -


[Figure: Bayesian network learned by greedy hill climbing (left) and its essential graph (right). Arcs of the network: tuberculos_or_cancer->positive_XraY, tuberculos_or_cancer->dyspnoea, lung_cancer->tuberculos_or_cancer, lung_cancer->bronchitis, lung_cancer->smoking, bronchitis->smoking, bronchitis->dyspnoea, tuberculosis->tuberculos_or_cancer, tuberculosis->visit_to_Asia]
In [5]:

learner=gum.BNLearner("res/sample_asia.csv")
learner.useLocalSearchWithTabuList()
print(learner)
bnTL=learner.learnBN()
geTL=gum.EssentialGraph(bnTL)
geTL
gnb.sideBySide(bnTL,geTL)

Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : Local Search with Tabu List
Tabu list size : 2
Score          : BDeu  (Not used for constraint-based algorithms)
Correction     : MDL  (Not used for score-based algorithms)
Prior          : -


[Figure: Bayesian network learned by local search with tabu list (left) and its essential graph (right). Arcs of the network: tuberculos_or_cancer->lung_cancer, tuberculos_or_cancer->dyspnoea, tuberculos_or_cancer->tuberculosis, positive_XraY->tuberculos_or_cancer, lung_cancer->smoking, bronchitis->dyspnoea, smoking->bronchitis, tuberculosis->visit_to_Asia, tuberculosis->lung_cancer]

We can now compare the results of the three algorithms (MIIC, greedy hill climbing and local search with tabu list) side by side.
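Besides the visual comparison, the structures can be compared edge by edge. The sketch below is a plain-Python illustration, not part of pyAgrum: it takes the arcs of the two score-based networks as transcribed from the outputs of cells [4] and [5] and lists the adjacencies that appear in only one of them. Since a BN and its essential graph share the same skeleton, the same difference holds between `geHC` and `geTL`. The variable names `arcs_hc` and `arcs_tl` are introduced here for illustration.

```python
# Arcs (tail, head) transcribed from the graph outputs of cells [4] and [5]
arcs_hc = [
    ("tuberculos_or_cancer", "positive_XraY"),
    ("tuberculos_or_cancer", "dyspnoea"),
    ("lung_cancer", "tuberculos_or_cancer"),
    ("lung_cancer", "bronchitis"),
    ("lung_cancer", "smoking"),
    ("bronchitis", "smoking"),
    ("bronchitis", "dyspnoea"),
    ("tuberculosis", "tuberculos_or_cancer"),
    ("tuberculosis", "visit_to_Asia"),
]
arcs_tl = [
    ("tuberculos_or_cancer", "lung_cancer"),
    ("tuberculos_or_cancer", "dyspnoea"),
    ("tuberculos_or_cancer", "tuberculosis"),
    ("positive_XraY", "tuberculos_or_cancer"),
    ("lung_cancer", "smoking"),
    ("bronchitis", "dyspnoea"),
    ("smoking", "bronchitis"),
    ("tuberculosis", "visit_to_Asia"),
    ("tuberculosis", "lung_cancer"),
]

# Skeletons: forget the orientation of every arc
skel_hc = {frozenset(arc) for arc in arcs_hc}
skel_tl = {frozenset(arc) for arc in arcs_tl}

# Adjacencies found by one algorithm but not by the other
only_hc = sorted(sorted(edge) for edge in skel_hc - skel_tl)
only_tl = sorted(sorted(edge) for edge in skel_tl - skel_hc)

print("shared adjacencies:", len(skel_hc & skel_tl))
print("only in GHC       :", only_hc)
print("only in TabuList  :", only_tl)
```

On these transcribed arcs the two skeletons differ by a single adjacency on each side, so the two learned networks cannot belong to the same Markov equivalence class.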

In [6]:

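(gnb.flow.clear()
 .add(gemiic,"Essential graph from miic")
 .add(bnHC,"BayesNet from GHC")
 .add(geHC,"Essential graph from GHC")
 .add(bnTL,"BayesNet from TabuList")
 .add(geTL,"Essential graph from TabuList")
 .display())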


Essential graph from miic

BayesNet from GHC

Essential graph from GHC

BayesNet from TabuList

Essential graph from TabuList