Generation of databases

class pyAgrum.BNDatabaseGenerator(bn)

BNDatabaseGenerator is used to easily generate databases from a pyAgrum.BayesNet.

Parameters

bn (pyAgrum.BayesNet) – the Bayesian network used to generate data.

bn()
Return type

BayesNet

drawSamples(*args)

Generate and store a database by sampling the Bayesian network.

If evs is specified, a sample is stored only if it is compatible with these observations.

Returns the log2likelihood of this database.

Parameters
  • nbSamples (int) – the number of samples that will be generated

  • evs ("pyAgrum.Instantiation" or Dict[int|str, int|str]) – (optional) the evidence that every stored sample must be compatible with.

Warning

nbSamples is not the size of the database but the number of generated samples. The evidence may be very rare (or even impossible). In that case the generated database may contain only a few samples, or may even be empty.
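The warning above can be illustrated with a minimal rejection-sampling sketch in plain Python. This is not pyAgrum's implementation: the two-variable network A -> B, its probabilities, and the helper name draw_samples are all hypothetical and chosen only to show why the number of stored rows can be much smaller than nbSamples.

```python
import random

def draw_samples(nb_samples, evidence, seed=0):
    """Forward-sample a toy network A -> B and keep only rows matching `evidence`."""
    rng = random.Random(seed)
    kept = []
    for _ in range(nb_samples):
        # Sample A first, then B given A (toy probabilities, for illustration only).
        a = 1 if rng.random() < 0.5 else 0
        p_b1 = 0.05 if a == 0 else 0.10  # B=1 is a rare event
        b = 1 if rng.random() < p_b1 else 0
        row = {"A": a, "B": b}
        # Reject the row if it contradicts any observed value.
        if all(row[name] == value for name, value in evidence.items()):
            kept.append(row)
    return kept

rare = draw_samples(1000, {"B": 1})        # rare evidence: few rows survive
impossible = draw_samples(1000, {"B": 2})  # impossible evidence: no row survives
print(len(rare), len(impossible))
```

With rare evidence most of the 1000 generated samples are discarded, and with impossible evidence the result is empty, which is why the size of the database should be read from samplesNbRows() rather than assumed equal to nbSamples.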

Examples

>>> import pyAgrum as gum
>>> bn=gum.fastBN('A->B{yes|maybe|no}<-C->D->E<-F<-B')
>>> g=gum.BNDatabaseGenerator(bn)
>>> g.setRandomVarOrder()
>>> g.drawSamples(100,{'B':'yes','E':'1'})
-233.16554130404904
>>> g.to_pandas()
    D  E  C    B  F  A
0   1  1  0  yes  1  1
1   1  1  0  yes  1  0
2   1  1  1  yes  0  1
3   1  1  0  yes  0  0
4   1  1  0  yes  0  1
5   1  1  0  yes  1  0
6   1  1  0  yes  0  0
7   0  1  1  yes  1  1
8   1  1  0  yes  0  1
9   0  1  0  yes  1  1
10  1  1  0  yes  1  1
Return type

float

log2likelihood()
Return type

float
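As a reminder of what a log2-likelihood measures, here is a small sketch in plain Python. It uses the general definition (the sum over rows of the base-2 logarithm of each row's joint probability) on a hypothetical network A -> B with made-up probabilities. How pyAgrum accumulates the value when evidence causes samples to be rejected is not specified here, so this should be read as an illustration of the quantity, not of the library's internals.

```python
import math

# Toy network A -> B with hypothetical probabilities.
p_a = {0: 0.5, 1: 0.5}
p_b_given_a = {0: {0: 0.75, 1: 0.25}, 1: {0: 0.5, 1: 0.5}}

def log2_likelihood(rows):
    """Sum of log2 P(A=a, B=b) over all rows, using the chain rule P(a) * P(b|a)."""
    total = 0.0
    for a, b in rows:
        total += math.log2(p_a[a] * p_b_given_a[a][b])
    return total

rows = [(0, 0), (1, 1), (0, 1)]
print(log2_likelihood(rows))
```

Each row contributes a non-positive term, so the value is negative and its magnitude grows with the number of rows.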

samplesAt(row, col)
Parameters
  • row (int) –

  • col (int) –

Return type

int

samplesLabelAt(row, col)
Parameters
  • row (int) –

  • col (int) –

Return type

str

samplesNbCols()

Return the number of columns in the samples.

Return type

int

samplesNbRows()

Return the number of rows in the samples.

Return type

int

setAntiTopologicalVarOrder()
Return type

None

setRandomVarOrder()
Return type

None

setTopologicalVarOrder()
Return type

None
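The three ordering methods above carry no description here. Going by their names, they presumably set the column order of the generated database to a topological order of the network, its reverse, or a random permutation. The sketch below only illustrates what a topological and an anti-topological order look like for the structure 'A->B<-C->D->E<-F<-B' used in the drawSamples example. It relies on Python's standard graphlib module, not on pyAgrum, and the particular order pyAgrum picks among the valid ones may differ.

```python
from graphlib import TopologicalSorter

# Parents of each node in 'A->B<-C->D->E<-F<-B'.
parents = {
    "A": set(),
    "C": set(),
    "B": {"A", "C"},
    "D": {"C"},
    "F": {"B"},
    "E": {"D", "F"},
}

# In a topological order every parent comes before its children.
topological = list(TopologicalSorter(parents).static_order())
# Reversing it puts every child before its parents.
anti_topological = list(reversed(topological))

print(topological)
print(anti_topological)
```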

setVarOrder(*args)
Return type

None

setVarOrderFromCSV(*args)
Return type

None

toCSV(*args)

Generate a CSV file representing the generated database.

Parameters
  • csvFilename (str) – the name of the csv file

  • useLabels (bool) – whether to write labels rather than ids in the CSV file (default True)

  • append (bool) – whether to append to the file rather than overwrite it (default False)

  • csvSeparator (str) – separator in the csv file (default ‘,’)

Return type

None

to_pandas(with_labels=True)

Export the samples as a pandas.DataFrame.

Parameters

with_labels (bool) – if True, the DataFrame contains the labels of the variables; otherwise it contains the indices of those labels
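The difference between the two views can be sketched in plain Python. The domains and rows below are hypothetical, and the sketch assumes that a label's index is its position in the variable's declared list of labels (so that 'yes' has index 0 in {yes|maybe|no}). It does not call pyAgrum or pandas.

```python
# Hypothetical domains: B uses the labels from the drawSamples example, A is binary.
domains = {"A": ["0", "1"], "B": ["yes", "maybe", "no"]}

# Rows stored as label indices (the view without labels).
index_rows = [{"A": 1, "B": 0}, {"A": 0, "B": 2}]

def to_labels(row):
    """Replace each index by the corresponding label of its variable."""
    return {name: domains[name][idx] for name, idx in row.items()}

# The same rows expressed with labels.
label_rows = [to_labels(r) for r in index_rows]
print(label_rows)  # [{'A': '1', 'B': 'yes'}, {'A': '0', 'B': 'no'}]
```

The same distinction appears between samplesAt, which returns an int, and samplesLabelAt, which returns a str.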

varOrder()
Return type

object

varOrderNames()
Return type

List[str]