Generation of database

class pyAgrum.BNDatabaseGenerator(bn)

BNDatabaseGenerator is used to easily generate databases from a pyAgrum.BayesNet.

Parameters: bn (pyAgrum.BayesNet) – the Bayesian network used to generate data.

bn()

Return type: BayesNet

drawSamples(*args)

Generate and store a database by sampling the Bayesian network.

If evs is specified, a sample is stored only if it is compatible with these observations.

Returns the log2likelihood of this database.

Parameters

nbSamples (int) – the number of samples that will be generated
evs ("pyAgrum.Instantiation" or Dict[int|str, int|str]) – (optional) The evidence that the resulting samples must match.

Warning

nbSamples is not the size of the database but the number of generated samples. The evidence may be very rare (or even impossible). In that case, the generated database may contain only a few samples, or may even be empty.
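The gap between the number of draws and the number of stored rows can be illustrated with a small rejection-sampling sketch. It uses only the standard library and is not pyAgrum's implementation: the two-variable network, its probabilities, and the helper names draw and draw_samples are all hypothetical.

```python
import random

# Hypothetical two-variable network A -> B; the numbers are made up for illustration.
P_A = {0: 0.9, 1: 0.1}                          # P(A)
P_B_GIVEN_A = {0: {"yes": 0.05, "no": 0.95},    # P(B | A=0)
               1: {"yes": 0.50, "no": 0.50}}    # P(B | A=1)


def draw(dist, rng):
    """Draw one value from a {value: probability} table."""
    values = list(dist)
    return rng.choices(values, weights=[dist[v] for v in values])[0]


def draw_samples(nb_samples, evs, rng):
    """Forward-sample nb_samples rows, keeping only those compatible with evs."""
    kept = []
    for _ in range(nb_samples):
        a = draw(P_A, rng)
        row = {"A": a, "B": draw(P_B_GIVEN_A[a], rng)}
        # A row is stored only if it agrees with every observed value.
        if all(row[name] == value for name, value in evs.items()):
            kept.append(row)
    return kept


rng = random.Random(0)
rows = draw_samples(100, {"B": "yes"}, rng)
print(len(rows))        # at most 100; P(B=yes) is about 0.095 here
impossible = draw_samples(100, {"B": "maybe"}, rng)
print(len(impossible))  # 0, because "maybe" is never produced by this network
```

With rare evidence, most of the 100 draws are discarded, so a larger nbSamples is needed to obtain a database of a given size.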

Examples

>>> import pyAgrum as gum
>>> bn=gum.fastBN('A->B{yes|maybe|no}<-C->D->E<-F<-B')
>>> g=gum.BNDatabaseGenerator(bn)
>>> g.setRandomVarOrder()
>>> g.drawSamples(100,{'B':'yes','E':'1'})
-233.16554130404904
>>> g.to_pandas()
    D  E  C    B  F  A
0   1  1  0  yes  1  1
1   1  1  0  yes  1  0
2   1  1  1  yes  0  1
3   1  1  0  yes  0  0
4   1  1  0  yes  0  1
5   1  1  0  yes  1  0
6   1  1  0  yes  0  0
7   0  1  1  yes  1  1
8   1  1  0  yes  0  1
9   0  1  0  yes  1  1
10  1  1  0  yes  1  1

Return type: float

log2likelihood()

Return type: float
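As a rough illustration of what a log2-likelihood of a database means, the sketch below sums the base-2 logarithm of the joint probability of each row. It assumes the usual textbook definition; whether pyAgrum accumulates this over the stored rows only or over every generated sample is not stated here. The network, its probabilities, and the function name log2_likelihood are hypothetical.

```python
import math

# Hypothetical network A -> B with made-up probabilities.
P_A = {0: 0.5, 1: 0.5}
P_B_GIVEN_A = {0: {"yes": 0.25, "no": 0.75},
               1: {"yes": 0.50, "no": 0.50}}


def log2_likelihood(rows):
    """Sum of log2 P(A=a, B=b) over all (a, b) rows."""
    return sum(math.log2(P_A[a] * P_B_GIVEN_A[a][b]) for a, b in rows)


rows = [(0, "yes"), (1, "no"), (1, "yes")]
# Joint probabilities are 0.125, 0.25 and 0.25, i.e. -3, -2 and -2 bits.
print(log2_likelihood(rows))  # -7.0
```

The value is always non-positive, and it becomes more negative as rows are added or as the rows become less probable under the network.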

samplesAt(row, col)

Parameters

row (int) –
col (int) –

Return type

int

samplesLabelAt(row, col)

Parameters

row (int) –
col (int) –

Return type

str

samplesNbCols()

Return the number of columns in the samples.

Return type: int

samplesNbRows()

Return the number of rows in the samples.

Return type: int

setAntiTopologicalVarOrder()

Return type: None

setRandomVarOrder()

Return type: None

setTopologicalVarOrder()

Return type: None

setVarOrder(*args)

Return type: None

setVarOrderFromCSV(*args)

Return type: None

toCSV(*args)

Generate a CSV file representing the generated database.

Parameters

csvFilename (str) – the name of the CSV file
useLabels (bool) – whether to write labels (True) or label ids (False) in the CSV file (default True)
append (bool) – whether to append to the file (True) or overwrite it (False) (default False)
csvSeparator (str) – the separator used in the CSV file (default ‘,’)

Return type

None
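The effect of the options listed above can be sketched with the standard csv module. This is only an illustration of the described behaviour, not pyAgrum's code: the LABELS and SAMPLES data and the to_csv helper are hypothetical, and skipping the header when appending to an existing file is an assumption of this sketch.

```python
import csv
import os
import tempfile

# Hypothetical samples stored as label indices, plus the labels of each variable.
LABELS = {"A": ["0", "1"], "B": ["yes", "maybe", "no"]}
SAMPLES = [{"A": 1, "B": 0}, {"A": 0, "B": 2}]


def to_csv(filename, use_labels=True, append=False, separator=","):
    """Write SAMPLES to filename, mimicking the options described above."""
    names = list(LABELS)
    # Assumption of this sketch: no header is written when appending to an existing file.
    write_header = not (append and os.path.exists(filename))
    with open(filename, "a" if append else "w", newline="") as f:
        writer = csv.writer(f, delimiter=separator)
        if write_header:
            writer.writerow(names)
        for row in SAMPLES:
            if use_labels:
                writer.writerow([LABELS[n][row[n]] for n in names])
            else:
                writer.writerow([row[n] for n in names])


path = os.path.join(tempfile.mkdtemp(), "samples.csv")
to_csv(path, use_labels=True, separator=";")                # fresh file, labels
to_csv(path, use_labels=False, append=True, separator=";")  # appended, indices
with open(path) as f:
    lines = f.read().splitlines()
print(lines)  # ['A;B', '1;yes', '0;no', '1;0', '0;2']
```

Mixing labels and indices in one file, as done here, is only meant to show the two encodings side by side; in practice a single encoding per file is easier to read back.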

to_pandas(with_labels=True)

Export the samples as a pandas.DataFrame.

Parameters: with_labels (bool) – whether the DataFrame contains the labels of the variables (True) or the indices of those labels (False)

varOrder()

Return type: object

varOrderNames()

Return type: List[str]