other pyAgrum.lib modules

bn2roc

The purpose of this module is to provide tools for building ROC and PR curves from a Bayesian network.

pyAgrum.lib.bn2roc.animPR(bn, datasrc, target='Y', label='1')

Interactive selection of a threshold using precision and recall for a BN and a dataset

Parameters:

bn (pyAgrum.BayesNet) – a Bayesian network
datasrc (str|DataFrame) – a csv filename or a pandas.DataFrame
target (str) – the target
label (str) – the target label

pyAgrum.lib.bn2roc.animROC(bn, datasrc, target='Y', label='1')

Interactive selection of a threshold using TPR and FPR for a BN and a dataset

Parameters:

bn (pyAgrum.BayesNet) – a Bayesian network
datasrc (str|DataFrame) – a csv filename or a pandas.DataFrame
target (str) – the target
label (str) – the target label

pyAgrum.lib.bn2roc.getPRpoints(bn, datasrc, target, label, with_labels=True, significant_digits=10)

Compute the points of the PR curve

Parameters:

bn (pyAgrum.BayesNet) – a Bayesian network
datasrc (str|DataFrame) – a csv filename or a pandas.DataFrame
target (str) – the target
label (str) – the target’s label
with_labels (bool) – whether we use label or id (especially for parameter label)
significant_digits – number of significant digits when computing probabilities

Returns:

List[Tuple[float,float]]: the list of points (precision,recall)
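
The kind of computation behind a list of (precision, recall) points can be sketched in plain Python. This is a minimal illustration, not the pyAgrum implementation: the function name pr_points and its inputs (a list of predicted probabilities for the target label and a list of booleans saying whether each row really has that label) are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def pr_points(probs: Sequence[float], truths: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return one (precision, recall) pair per distinct threshold (hypothetical helper)."""
    total_pos = sum(truths)
    points = []
    # Sweep the thresholds from the most to the least selective one.
    for threshold in sorted(set(probs), reverse=True):
        tp = sum(1 for p, t in zip(probs, truths) if p >= threshold and t)
        fp = sum(1 for p, t in zip(probs, truths) if p >= threshold and not t)
        # tp + fp >= 1 because the threshold is one of the probabilities.
        precision = tp / (tp + fp)
        recall = tp / total_pos if total_pos else 0.0
        points.append((precision, recall))
    return points


# Toy data: probability of the target label and whether the row has that label.
points = pr_points([0.9, 0.8, 0.4, 0.2], [True, False, True, False])
```

In this sketch each distinct probability is used once as a threshold, so the toy data produces four points.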

pyAgrum.lib.bn2roc.getROCpoints(bn, datasrc, target, label, with_labels=True, significant_digits=10)

Compute the points of the ROC curve

Parameters:

bn (pyAgrum.BayesNet) – a Bayesian network
datasrc (str | DataFrame) – a csv filename or a DataFrame
target (str) – the target
label (str) – the target’s label
with_labels (bool) – whether we use label or id (especially for parameter label)
significant_digits – number of significant digits when computing probabilities

Returns:

List[Tuple[float,float]]: the list of points (FalsePositiveRate,TruePositiveRate)
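
As a rough illustration of what (false positive rate, true positive rate) points are, the following sketch sweeps a threshold over predicted probabilities and also integrates the curve with the trapezoidal rule. The names roc_points and auc, as well as the input format, are assumptions for this example and do not come from pyAgrum.

```python
from typing import List, Sequence, Tuple


def roc_points(probs: Sequence[float], truths: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return (FPR, TPR) pairs, starting at (0, 0) (hypothetical helper)."""
    pos = sum(truths)
    neg = len(truths) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present in the data")
    points = [(0.0, 0.0)]
    for threshold in sorted(set(probs), reverse=True):
        tp = sum(1 for p, t in zip(probs, truths) if p >= threshold and t)
        fp = sum(1 for p, t in zip(probs, truths) if p >= threshold and not t)
        points.append((fp / neg, tp / pos))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under a piecewise-linear curve, using the trapezoidal rule."""
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))


roc = roc_points([0.9, 0.8, 0.4, 0.2], [True, False, True, False])
area = auc(roc)
```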

pyAgrum.lib.bn2roc.showPR(bn, datasrc, target, label, *, beta=1, show_progress=True, show_fig=True, save_fig=False, with_labels=True, significant_digits=10)

Compute the PR curve and save the result in the folder of the csv file.

Parameters:

bn (pyAgrum.BayesNet) – a Bayesian network
datasrc (str|DataFrame) – a csv filename or a pandas.DataFrame
target (str) – the target
label (str) – the target label
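beta (float) – the value of beta for the F-beta score
show_progress (bool) – indicates if the progress bar must be printed
save_fig (bool) – whether to save the result
show_fig (bool) – whether to plot the results
with_labels (bool) – whether the csv contains labels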
significant_digits – number of significant digits when computing probabilities

pyAgrum.lib.bn2roc.showROC(bn, datasrc, target, label, show_progress=True, show_fig=True, save_fig=False, with_labels=True, significant_digits=10)

Compute the ROC curve and save the result in the folder of the csv file.

Parameters:

bn (pyAgrum.BayesNet) – a Bayesian network
datasrc (str|DataFrame) – a csv filename or a pandas.DataFrame
target (str) – the target
label (str) – the target label
show_progress (bool) – indicates if the progress bar must be printed
save_fig (bool) – whether to save the result
show_fig (bool) – whether to plot the results
with_labels (bool) – whether the csv contains labels
significant_digits – number of significant digits when computing probabilities

pyAgrum.lib.bn2roc.showROC_PR(bn, datasrc, target, label, *, beta=1, show_progress=True, show_fig=True, save_fig=False, with_labels=True, show_ROC=True, show_PR=True, significant_digits=10, bgcolor=None)

Compute the ROC and PR curves and save the results in the folder of the csv file.

Parameters:

bn (pyAgrum.BayesNet) – a Bayesian network
datasrc (str|DataFrame) – a csv filename or a pandas.DataFrame
target (str) – the target
label (str) – the target label
beta (float) – the value of beta for the F-beta score
show_progress (bool) – indicates if the progress bar must be printed
save_fig (bool) – whether to save the result
show_fig (bool) – whether to plot the results
with_labels (bool) – whether the csv contains labels
show_ROC (bool) – whether we show the ROC figure
show_PR (bool) – whether we show the PR figure
significant_digits – number of significant digits when computing probabilities
bgcolor – HTML background color for the figure (default: None, i.e. transparent)

Returns:

(pointsROC, thresholdROC, pointsPR, thresholdPR)

Return type:

tuple
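
The beta parameter refers to the F-beta score, which weighs recall beta times as much as precision. The sketch below only shows the standard formula; the function name f_beta is an assumption for the example, and how pyAgrum uses the score internally is not described in this reference.

```python
def f_beta(precision: float, recall: float, beta: float = 1.0) -> float:
    """Standard F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Convention chosen for this sketch: the score is 0 when P = R = 0.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


f1 = f_beta(0.5, 0.5)            # beta = 1: harmonic mean of precision and recall
f2 = f_beta(1.0, 0.5, beta=2.0)  # beta = 2: recall counts more than precision
```

With beta=1 the score reduces to the usual F1 score; larger values of beta penalise a low recall more strongly.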

bn2scores

The purpose of this module is to provide tools for computing different scores from a BN.

pyAgrum.lib.bn2scores.checkCompatibility(bn, fields, csv_name)

Check that the variables of the BN are in the fields.

Parameters:

bn (gum.BayesNet) – the model
fields (Dict[str,int]) – Dict of name,position in the file
csv_name (str) – name of the csv file

Raises:

gum.DatabaseError – if a BN variable is not in fields

Returns:

a dictionary mapping positions in fields to the BN variables

Return type:

Dict[int,str]
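
The check described above can be pictured with plain Python containers. This is only a sketch under assumptions: the hypothetical check_compatibility below takes a list of variable names instead of a BayesNet and raises ValueError instead of gum.DatabaseError.

```python
from typing import Dict, Iterable


def check_compatibility(var_names: Iterable[str], fields: Dict[str, int],
                        csv_name: str) -> Dict[int, str]:
    """Map column positions to variable names, failing if a variable is missing."""
    names = list(var_names)
    missing = [name for name in names if name not in fields]
    if missing:
        # The real function raises gum.DatabaseError; ValueError is a stand-in.
        raise ValueError(f"{csv_name}: no column for variables {missing}")
    return {fields[name]: name for name in names}


# Columns of a hypothetical csv file: name -> position.
fields = {"A": 0, "B": 1, "C": 2}
positions = check_compatibility(["A", "C"], fields, "data.csv")
```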

pyAgrum.lib.bn2scores.computeScores(bn_name, csv_name, visible=False, dialect=None)

Compute scores (likelihood, aic, bic, mdl, etc.) from a BN w.r.t. a csv file

Parameters:

bn_name (pyAgrum.BayesNet | str) – a pyAgrum.BayesNet or a filename for a BN
csv_name (str) – a filename for the CSV database
visible (bool) – whether to show the progress
dialect (csv.Dialect) – if not provided, dialect will be inferred using csv.Sniffer().sniff(csvfile.read(1024))

Returns:

percentDatabaseUsed,scores

Return type:

Tuple[float,Dict[str,float]]
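
The dialect inference mentioned for the dialect parameter relies on the standard csv module. A minimal sketch of that step, followed by the construction of a name-to-position dictionary from the header, might look as follows. The helper name read_fields and the restriction of candidate delimiters are assumptions of this example, not part of pyAgrum.

```python
import csv
import io
from typing import Dict, TextIO, Tuple, Type


def read_fields(csvfile: TextIO) -> Tuple[Type[csv.Dialect], Dict[str, int]]:
    """Infer the csv dialect, then map each header name to its column position."""
    sample = csvfile.read(1024)
    # Restricting the candidate delimiters is a choice made for this sketch.
    dialect = csv.Sniffer().sniff(sample, delimiters=";,\t")
    csvfile.seek(0)
    header = next(csv.reader(csvfile, dialect))
    return dialect, {name: position for position, name in enumerate(header)}


# An in-memory file stands in for a csv database on disk.
demo = io.StringIO("A;B;C\n0;1;0\n1;0;1\n")
dialect, fields = read_fields(demo)
```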

pyAgrum.lib.bn2scores.lines_count(filename)

Count the lines in a file

bn_vs_bn

The purpose of this module is to provide tools for comparing different BNs.

class pyAgrum.lib.bn_vs_bn.GraphicalBNComparator(name1, name2, delta=1e-06)

Bases: object

GraphicalBNComparator compares 2 BNs in multiple ways. The minimal assumption is that the names of the variables are the same in the 2 BNs, but some comparisons also have to check the type and domainSize of the variables. The BNs do not have exactly the same role: _bn1 is the reference model for the comparison, whereas _bn2 is the model compared to the reference.

Parameters:

name1 (str or pyAgrum.BayesNet) – a BN or a filename for reference
name2 (str or pyAgrum.BayesNet) – another BN or another filename for comparison

dotDiff()

Return a pydot graph that compares the arcs of _bn1 (reference) with those of _bn2:

full black line: the arc is common for both
full red line: the arc is common but inverted in _bn2
dotted black line: the arc is added in _bn2
dotted red line: the arc is removed in _bn2

Warning

if pydot is not installed, this function just returns None

Returns:

the resulting dot graph, or None if pydot cannot be imported

Return type:

pydot.Dot

equivalentBNs()

Check whether the 2 BNs are equivalent:

same variables
same graphical structure
same parameters

Returns:

“OK” if the BNs are the same, a description of the error otherwise

Return type:

str

hamming()

Compute the Hamming and structural Hamming distances

The Hamming distance is the number of edges that differ between the 2 skeletons; the structural Hamming distance is the number of differences between the CPDAGs, taking the orientation of the arcs into account.

Returns:

a dictionary containing PURE_HAMMING and STRUCTURAL_HAMMING

Return type:

dict[double,double]
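
The skeleton part of this comparison can be illustrated on plain arc lists. The sketch below is an assumption-laden illustration (the helper names skeleton and skeleton_hamming are hypothetical) and it does not cover the structural Hamming distance, which needs the CPDAGs of both networks.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Arc = Tuple[str, str]


def skeleton(arcs: Iterable[Arc]) -> Set[FrozenSet[str]]:
    """Forget the orientation: each arc becomes an unordered pair of nodes."""
    return {frozenset(arc) for arc in arcs}


def skeleton_hamming(arcs_ref: Iterable[Arc], arcs_cmp: Iterable[Arc]) -> int:
    """Number of edges present in exactly one of the two skeletons."""
    return len(skeleton(arcs_ref) ^ skeleton(arcs_cmp))


arcs_ref = [("A", "B"), ("B", "C")]
arcs_cmp = [("B", "A"), ("B", "C"), ("C", "D")]
distance = skeleton_hamming(arcs_ref, arcs_cmp)
```

Here the reversed arc between A and B does not count, since both skeletons contain the same edge; only the extra edge between C and D contributes.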

scores()

Compute Precision, Recall, F-score for self._bn2 compared to self._bn1

Precision and recall are computed considering BN1 as the reference.

Fscore is 2*(recall*precision)/(recall+precision), i.e. the harmonic mean of precision and recall.

dist2opt is the square root of (1-precision)^2+(1-recall)^2, i.e. the Euclidean distance to the ideal point (precision=1, recall=1).

Returns:

a dictionary containing ‘precision’, ‘recall’, ‘fscore’, ‘dist2opt’ and so on.

Return type:

dict[str,double]
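
The formulas above can be checked on two small sets of arcs. This is a sketch under assumptions: the helper arc_scores is hypothetical, and it treats a reversed arc as both a missing and a spurious arc, which may differ from how pyAgrum counts it.

```python
import math
from typing import Dict, Iterable, Tuple

Arc = Tuple[str, str]


def arc_scores(arcs_ref: Iterable[Arc], arcs_cmp: Iterable[Arc]) -> Dict[str, float]:
    """Precision, recall, F-score and distance to (1, 1) for directed arcs."""
    ref, cmp_ = set(arcs_ref), set(arcs_cmp)
    true_positives = len(ref & cmp_)
    precision = true_positives / len(cmp_) if cmp_ else 0.0
    recall = true_positives / len(ref) if ref else 0.0
    total = precision + recall
    fscore = 2 * precision * recall / total if total else 0.0
    dist2opt = math.hypot(1 - precision, 1 - recall)
    return {"precision": precision, "recall": recall,
            "fscore": fscore, "dist2opt": dist2opt}


reference = [("A", "B"), ("B", "C"), ("C", "D")]
compared = [("A", "B"), ("B", "C"), ("D", "C"), ("A", "D")]
result = arc_scores(reference, compared)
```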

skeletonScores()

Compute Precision, Recall, F-score for skeletons of self._bn2 compared to self._bn1

Precision and recall are computed considering BN1 as the reference.

Fscore is 2*(recall*precision)/(recall+precision), i.e. the harmonic mean of precision and recall.

dist2opt is the square root of (1-precision)^2+(1-recall)^2, i.e. the Euclidean distance to the ideal point (precision=1, recall=1).

Returns:

a dictionary containing ‘precision’, ‘recall’, ‘fscore’, ‘dist2opt’ and so on.

Return type:

dict[str,double]

pyAgrum.lib.bn_vs_bn.graphDiff(bnref, bncmp, noStyle=False)

Return a pydot graph that compares the arcs of bnref to bncmp. graphDiff allows bncmp to have fewer nodes than bnref (this is not the case in GraphicalBNComparator.dotDiff()).

If noStyle is False, 4 styles (fixed in pyAgrum.config) are used:

the arc is common for both
the arc is common but inverted in bncmp
the arc is added in bncmp
the arc is removed in bncmp

See graphDiffLegend() to add a legend to the graph.

Warning

if pydot is not installed, this function just returns None

Returns:

the resulting dot graph, or None if pydot cannot be imported

Return type:

pydot.Dot
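
The four arc categories used by the diff graphs can be computed from two arc lists without any drawing library. The following is a minimal sketch; the helper classify_arcs and its output keys are assumptions made for the illustration and are not part of pyAgrum.

```python
from typing import Dict, Iterable, Set, Tuple

Arc = Tuple[str, str]


def classify_arcs(arcs_ref: Iterable[Arc], arcs_cmp: Iterable[Arc]) -> Dict[str, Set[Arc]]:
    """Split arcs into common, reversed, added and removed, relative to the reference."""
    ref, cmp_ = set(arcs_ref), set(arcs_cmp)
    common = ref & cmp_
    # Arcs of the reference that appear with the opposite orientation.
    reversed_arcs = {(a, b) for (a, b) in ref - cmp_ if (b, a) in cmp_}
    removed = ref - cmp_ - reversed_arcs
    # Arcs of the compared graph that match no reference arc, in either direction.
    added = {(a, b) for (a, b) in cmp_ - ref if (b, a) not in ref}
    return {"common": common, "reversed": reversed_arcs,
            "added": added, "removed": removed}


diff = classify_arcs([("A", "B"), ("B", "C"), ("C", "D")],
                     [("A", "B"), ("C", "B"), ("D", "E")])
```

Reversed arcs are reported with the orientation they have in the reference graph.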

pyAgrum.lib.bn_vs_bn.graphDiffLegend()