Learning essential graphs

In [1]:
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb


Compare learning algorithms

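MIIC learns the essential graph (CPDAG) directly from data. An essential graph is a PDAG (Partially Directed Acyclic Graph) that represents a whole Markov equivalence class of DAGs: an arc is an orientation shared by every DAG in the class, while an undirected edge is a link whose orientation differs between them.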
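
An essential graph summarizes a Markov equivalence class. By a classical result (Verma and Pearl), two DAGs are Markov equivalent exactly when they share the same skeleton and the same v-structures. The following is a minimal pure-Python sketch of that test, independent of pyAgrum; the function names and the three toy graphs are illustrative assumptions, not part of the notebook.

```python
from itertools import combinations

def skeleton(arcs):
    """Undirected version of a DAG: each (tail, head) arc becomes an unordered pair."""
    return {frozenset(arc) for arc in arcs}

def v_structures(arcs):
    """Colliders a -> c <- b whose two parents a and b are not adjacent."""
    skel = skeleton(arcs)
    parents = {}
    for tail, head in arcs:
        parents.setdefault(head, set()).add(tail)
    return {
        (a, child, b)
        for child, ps in parents.items()
        for a, b in combinations(sorted(ps), 2)
        if frozenset((a, b)) not in skel
    }

def markov_equivalent(arcs1, arcs2):
    """Same skeleton and same v-structures."""
    return (skeleton(arcs1) == skeleton(arcs2)
            and v_structures(arcs1) == v_structures(arcs2))

# Hypothetical toy graphs over A, B, C.
chain = [("A", "B"), ("B", "C")]           # A -> B -> C
reversed_chain = [("C", "B"), ("B", "A")]  # A <- B <- C
collider = [("A", "B"), ("C", "B")]        # A -> B <- C

print(markov_equivalent(chain, reversed_chain))  # True
print(markov_equivalent(chain, collider))        # False
```

The two chains therefore share one essential graph (A - B - C, fully undirected), whereas the collider is alone in its class and keeps both of its arcs.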

In [2]:
learner=gum.BNLearner("res/sample_asia.csv")
learner.useMIIC()
learner.useNMLCorrection()
print(learner)
Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : MIIC
Score          : BDeu  (Not used for constraint-based algorithms)
Correction     : NML  (Not used for score-based algorithms)
Prior          : -

In [3]:
gemiic=learner.learnEssentialGraph()
gnb.show(gemiic)
[Figure: essential graph learned by MIIC]

For the other methods, the essential graph can be obtained from the learned BN.

In [4]:
learner=gum.BNLearner("res/sample_asia.csv")
learner.useGreedyHillClimbing()
bnHC=learner.learnBN()
print(learner)
geHC=gum.EssentialGraph(bnHC)
gnb.sideBySide(bnHC,geHC)
Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : Greedy Hill Climbing
Score          : BDeu  (Not used for constraint-based algorithms)
Correction     : MDL  (Not used for score-based algorithms)
Prior          : -

[Figure: Bayesian network learned by greedy hill climbing (left) and its essential graph (right)]
Arcs of the learned BN: tuberculos_or_cancer->positive_XraY, tuberculos_or_cancer->dyspnoea, lung_cancer->tuberculos_or_cancer, lung_cancer->bronchitis, lung_cancer->smoking, bronchitis->smoking, bronchitis->dyspnoea, tuberculosis->tuberculos_or_cancer, tuberculosis->visit_to_Asia
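
To make the DAG-to-essential-graph step concrete, here is a minimal pure-Python sketch of the textbook construction: keep the orientation of arcs that take part in a v-structure, then propagate orientations with Meek's rules R1 to R3 until nothing changes. This is not pyAgrum's implementation; the function name `cpdag` is hypothetical, and the arc list is transcribed from the greedy hill climbing network shown above.

```python
from itertools import combinations

def cpdag(dag_arcs):
    """Return (arcs, edges) of the essential graph of a DAG given as (tail, head) pairs."""
    skel = {frozenset(arc) for arc in dag_arcs}

    def adjacent(x, y):
        return frozenset((x, y)) in skel

    parents = {}
    for tail, head in dag_arcs:
        parents.setdefault(head, set()).add(tail)

    # Step 1: only arcs taking part in a v-structure keep their orientation.
    arcs = set()
    for child, ps in parents.items():
        for a, b in combinations(sorted(ps), 2):
            if not adjacent(a, b):
                arcs.update({(a, child), (b, child)})
    edges = skel - {frozenset(arc) for arc in arcs}

    def compelled(x, y):
        """True if the undirected edge x - y must be oriented x -> y."""
        into_x = {t for t, h in arcs if h == x}
        into_y = {t for t, h in arcs if h == y}
        out_x = {h for t, h in arcs if t == x}
        # R1: a -> x with a and y non-adjacent (avoids a new v-structure).
        if any(not adjacent(a, y) for a in into_x):
            return True
        # R2: x -> z -> y (avoids a directed cycle).
        if out_x & into_y:
            return True
        # R3: x - b, x - c, b -> y, c -> y with b and c non-adjacent.
        nbrs = {n for n in into_y if frozenset((x, n)) in edges}
        return any(not adjacent(b, c) for b, c in combinations(sorted(nbrs), 2))

    # Step 2: apply the rules until no edge can be oriented any more.
    changed = True
    while changed:
        changed = False
        for edge in list(edges):
            x, y = tuple(edge)
            for tail, head in ((x, y), (y, x)):
                if compelled(tail, head):
                    edges.discard(edge)
                    arcs.add((tail, head))
                    changed = True
                    break
    return sorted(arcs), sorted(tuple(sorted(e)) for e in edges)

# Arcs of the BN learned by greedy hill climbing (transcribed from the output above).
ghc_arcs = [
    ("tuberculos_or_cancer", "positive_XraY"), ("tuberculos_or_cancer", "dyspnoea"),
    ("lung_cancer", "tuberculos_or_cancer"), ("lung_cancer", "bronchitis"),
    ("lung_cancer", "smoking"), ("bronchitis", "smoking"),
    ("bronchitis", "dyspnoea"), ("tuberculosis", "tuberculos_or_cancer"),
    ("tuberculosis", "visit_to_Asia"),
]
arcs, edges = cpdag(ghc_arcs)
print("arcs :", arcs)
print("edges:", edges)
```

On this network the sketch keeps five arcs (the two v-structures on tuberculos_or_cancer and dyspnoea, plus tuberculos_or_cancer->positive_XraY, which rule R1 forces) and leaves four links undirected: tuberculosis - visit_to_Asia and the triangle lung_cancer, bronchitis, smoking.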
In [5]:
learner=gum.BNLearner("res/sample_asia.csv")
learner.useLocalSearchWithTabuList()
print(learner)
bnTL=learner.learnBN()
geTL=gum.EssentialGraph(bnTL)
gnb.sideBySide(bnTL,geTL)
Filename       : res/sample_asia.csv
Size           : (50000,8)
Variables      : visit_to_Asia[2], lung_cancer[2], tuberculosis[2], bronchitis[2], positive_XraY[2], smoking[2], tuberculos_or_cancer[2], dyspnoea[2]
Induced types  : True
Missing values : False
Algorithm      : Local Search with Tabu List
Tabu list size : 2
Score          : BDeu  (Not used for constraint-based algorithms)
Correction     : MDL  (Not used for score-based algorithms)
Prior          : -

[Figure: Bayesian network learned by local search with tabu list (left) and its essential graph (right)]
Arcs of the learned BN: tuberculos_or_cancer->lung_cancer, tuberculos_or_cancer->dyspnoea, tuberculos_or_cancer->tuberculosis, positive_XraY->tuberculos_or_cancer, lung_cancer->smoking, bronchitis->dyspnoea, smoking->bronchitis, tuberculosis->visit_to_Asia, tuberculosis->lung_cancer

We can now compare the results of the three algorithms.

In [6]:
(
  gnb.flow.clear()
  .add(gemiic,"Essential graph from miic")
  .add(bnHC,"BayesNet from GHC")
  .add(geHC,"Essential graph from GHC")
  .add(bnTL,"BayesNet from TabuList")
  .add(geTL,"Essential graph from TabuList")
  .display()
)
[Figure: five graphs displayed in a row, captioned in order: "Essential graph from miic", "BayesNet from GHC", "Essential graph from GHC", "BayesNet from TabuList", "Essential graph from TabuList"]
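
One simple way to make this comparison quantitative is to ignore orientations and compare skeletons. The sketch below is plain Python and does not use pyAgrum; the link lists are transcribed from the graphs displayed in this notebook, the MIIC links are treated as undirected because only adjacency matters here, and the helper names are hypothetical.

```python
def skeleton(links):
    """Set of unordered pairs, ignoring any orientation."""
    return {frozenset(link) for link in links}

def difference(s1, s2):
    """Links present in exactly one of the two skeletons."""
    return sorted(tuple(sorted(link)) for link in s1 ^ s2)

miic = skeleton([
    ("visit_to_Asia", "tuberculosis"), ("lung_cancer", "smoking"),
    ("lung_cancer", "tuberculos_or_cancer"), ("tuberculosis", "tuberculos_or_cancer"),
    ("bronchitis", "smoking"), ("bronchitis", "dyspnoea"),
    ("tuberculos_or_cancer", "positive_XraY"), ("tuberculos_or_cancer", "dyspnoea"),
])
ghc = skeleton([
    ("tuberculos_or_cancer", "positive_XraY"), ("tuberculos_or_cancer", "dyspnoea"),
    ("lung_cancer", "tuberculos_or_cancer"), ("lung_cancer", "bronchitis"),
    ("lung_cancer", "smoking"), ("bronchitis", "smoking"),
    ("bronchitis", "dyspnoea"), ("tuberculosis", "tuberculos_or_cancer"),
    ("tuberculosis", "visit_to_Asia"),
])
tabu = skeleton([
    ("tuberculos_or_cancer", "lung_cancer"), ("tuberculos_or_cancer", "dyspnoea"),
    ("tuberculos_or_cancer", "tuberculosis"), ("positive_XraY", "tuberculos_or_cancer"),
    ("lung_cancer", "smoking"), ("bronchitis", "dyspnoea"),
    ("smoking", "bronchitis"), ("tuberculosis", "visit_to_Asia"),
    ("tuberculosis", "lung_cancer"),
])

shared = miic & ghc & tabu
print("links shared by all three:", len(shared))
print("MIIC vs GHC :", difference(miic, ghc))
print("MIIC vs Tabu:", difference(miic, tabu))
print("GHC vs Tabu :", difference(ghc, tabu))
```

On these graphs the three skeletons agree on eight links. Greedy hill climbing adds a link between lung_cancer and bronchitis, and the tabu search adds one between lung_cancer and tuberculosis.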