Using sklearn to cross-validate a Bayesian network classifier

This notebook shows how pyAgrum’s classifier integrates with the scikit-learn ecosystem. Because BNClassifier can be passed to scikit-learn like any other estimator, scikit-learn’s cross-validation tools can be applied directly to a pyAgrum Bayesian network classifier.

Creative Commons License

In [1]:
import pyAgrum as gum
import pyAgrum.lib.notebook as gnb
from pyAgrum.skbn import BNClassifier
In [2]:
from sklearn.model_selection import cross_validate
from sklearn import datasets
# load the iris dataset (150 samples, 4 features, 3 classes)
iris = datasets.load_iris()
X = iris.data
y = iris.target
In [3]:
model = BNClassifier(learningMethod='MIIC', prior='Smoothing', priorWeight=1,
                     discretizationNbBins=3, discretizationStrategy='kmeans',
                     discretizationThreshold=10)
In [4]:
# 30 folds over 150 samples: each test fold holds 5 samples
cv = cross_validate(model, X, y, cv=30)
print(f"scores with cross-folding : {cv['test_score']}")
print()

print(f"mean score : {cv['test_score'].mean()}")
scores with cross-folding : [1.  1.  1.  1.  1.  1.  1.  1.  1.  1.  0.8 1.  1.  1.  0.8 1.  1.  0.8
 1.  1.  1.  0.8 1.  1.  1.  1.  1.  1.  1.  1. ]

mean score : 0.9733333333333333
In [5]:
# 50 folds over 150 samples: each test fold holds 3 samples
cv = cross_validate(model, X, y, cv=50)
print(f"scores with cross-folding : {cv['test_score']}")
print()

print(f"mean score : {cv['test_score'].mean()}")
scores with cross-folding : [1.         1.         1.         1.         1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.         0.66666667 1.         1.         1.
 1.         1.         1.         0.66666667 1.         1.
 1.         1.         1.         0.66666667 1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.         1.         1.         1.         1.
 1.         1.        ]

mean score : 0.98
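
The per-fold scores above only take a few distinct values (0.8 with 30 folds, 0.66666667 with 50 folds). A minimal sketch of why, assuming that test_score is plain accuracy and that the 150 iris samples are split into equal-sized test folds: with 5 samples per fold a single misclassification gives 4/5, and with 3 samples per fold it gives 2/3. The helper names fold_size and mean_accuracy are hypothetical, introduced only for this illustration; they are not part of pyAgrum or scikit-learn.

```python
from fractions import Fraction


def fold_size(n_samples, n_folds):
    """Number of test samples per fold, assuming equal-sized folds."""
    size, remainder = divmod(n_samples, n_folds)
    if remainder:
        raise ValueError("this sketch assumes equal-sized folds")
    return size


def mean_accuracy(n_samples, n_folds, n_folds_with_one_error):
    """Mean accuracy when some folds contain exactly one misclassified sample
    and every other fold is classified perfectly."""
    size = fold_size(n_samples, n_folds)
    perfect_folds = n_folds - n_folds_with_one_error
    total = perfect_folds + n_folds_with_one_error * Fraction(size - 1, size)
    return total / n_folds


# 30 folds: 5 samples per fold, 4 folds scored 0.8 in the output above
print(fold_size(150, 30), float(mean_accuracy(150, 30, 4)))

# 50 folds: 3 samples per fold, 3 folds scored 0.66666667 in the output above
print(fold_size(150, 50), float(mean_accuracy(150, 50, 3)))
```

Under these assumptions the two means work out to 146/150 and 147/150, matching the reported 0.9733 and 0.98. The two runs therefore differ by a single misclassified sample (4 versus 3), so the higher mean with 50 folds should not be read as a markedly better model.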