Generation of database
- class pyAgrum.BNDatabaseGenerator(bn)
BNDatabaseGenerator is used to easily generate databases from a pyAgrum.BayesNet.
- Parameters
bn (pyAgrum.BayesNet) – the Bayesian network used to generate data.
- drawSamples(*args)
Generate and store a database by sampling the Bayesian network.
If evs is specified, a sample is stored only if it is compatible with these observations.
Returns the log2-likelihood of this database.
- Parameters
nbSamples (int) – the number of samples that will be generated
evs ("pyAgrum.Instantiation" or Dict[int|str, int|str]) – (optional) the evidence that every stored sample must be compatible with.
Warning
nbSamples is not the size of the database but the number of generated samples. The evidence may be very rare (or even impossible). In that case the generated database may contain only a few samples, or may even be empty.
Examples
>>> import pyAgrum as gum
>>> bn=gum.fastBN('A->B{yes|maybe|no}<-C->D->E<-F<-B')
>>> g=gum.BNDatabaseGenerator(bn)
>>> g.setRandomVarOrder()
>>> g.drawSamples(100,{'B':'yes','E':'1'})
-233.16554130404904
>>> g.to_pandas()
    D  E  C    B  F  A
0   1  1  0  yes  1  1
1   1  1  0  yes  1  0
2   1  1  1  yes  0  1
3   1  1  0  yes  0  0
4   1  1  0  yes  0  1
5   1  1  0  yes  1  0
6   1  1  0  yes  0  0
7   0  1  1  yes  1  1
8   1  1  0  yes  0  1
9   0  1  0  yes  1  1
10  1  1  0  yes  1  1
- Return type
float
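The warning above can be illustrated without pyAgrum. The following is a minimal standard-library sketch of sampling with rejection on evidence; it is not pyAgrum's implementation. The two-variable network, its probabilities, and the helper name draw_compatible_samples are all made up for illustration.

```python
import random

def draw_compatible_samples(nb_samples, evidence, seed=0):
    """Draw nb_samples from a toy network A -> B and keep those matching evidence.

    The network and its probabilities are hypothetical.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(nb_samples):
        a = 1 if rng.random() < 0.3 else 0    # P(A=1) = 0.3
        p_b = 0.9 if a == 1 else 0.05         # P(B=1 | A)
        b = 1 if rng.random() < p_b else 0
        sample = {"A": a, "B": b}
        # Store the sample only if it agrees with every observed value.
        if all(sample[name] == value for name, value in evidence.items()):
            kept.append(sample)
    return kept

rows = draw_compatible_samples(1000, {"B": 1})
print(len(rows))  # never more than 1000, since incompatible samples are dropped
```

With impossible evidence, such as a value that B never takes, the returned list is empty, which mirrors the empty-database case described in the warning.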
- log2likelihood()
- Return type
float
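As a rough illustration of what a log2-likelihood of a database is, the sketch below sums the base-2 logarithm of each row's joint probability, assuming the rows are independent. The network, its probability tables, and the function name log2_likelihood are hypothetical, and this section does not state how pyAgrum computes the value internally.

```python
import math

# Hypothetical toy network A -> B with made-up probabilities.
P_A = {0: 0.5, 1: 0.5}
P_B_GIVEN_A = {0: {0: 0.75, 1: 0.25}, 1: {0: 0.25, 1: 0.75}}

def log2_likelihood(rows):
    """Sum of log2 P(A=a, B=b) over all rows, assuming independent rows."""
    total = 0.0
    for a, b in rows:
        total += math.log2(P_A[a] * P_B_GIVEN_A[a][b])
    return total

database = [(0, 0), (1, 1), (0, 1)]
print(log2_likelihood(database))  # negative, since every row has probability below 1
```

Each additional row adds a non-positive term, which is why the value returned in the example above (for 11 rows) is a large negative number.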
- samplesAt(row, col)
- Parameters
row (int)
col (int)
- Return type
int
- samplesLabelAt(row, col)
- Parameters
row (int)
col (int)
- Return type
str
- samplesNbCols()
Returns the number of columns in the samples.
- Return type
int
- samplesNbRows()
Returns the number of rows in the samples.
- Return type
int
- setAntiTopologicalVarOrder()
- Return type
None
- setRandomVarOrder()
- Return type
None
- setTopologicalVarOrder()
- Return type
None
- setVarOrder(*args)
- Return type
None
- setVarOrderFromCSV(*args)
- Return type
None
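The method names suggest that the column order of the database can follow a topological order of the network's DAG (parents before children) or its reverse, although this section does not spell that out. Under that assumption, the sketch below computes both orders for the example network 'A->B{yes|maybe|no}<-C->D->E<-F<-B' using only the standard library; it does not call pyAgrum.

```python
from graphlib import TopologicalSorter

# Arcs of the example network, stored as child -> set of parents.
parents = {
    "A": set(),
    "C": set(),
    "B": {"A", "C"},
    "D": {"C"},
    "F": {"B"},
    "E": {"D", "F"},
}

# static_order() yields each node after all of its parents.
topological = list(TopologicalSorter(parents).static_order())
# Reversing it yields each node before all of its parents.
anti_topological = topological[::-1]

print(topological)
print(anti_topological)
```

Several valid topological orders usually exist for the same DAG, so the exact sequence printed here need not match the one a particular library picks.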
- toCSV(*args)
Generates a CSV file representing the generated database.
- Parameters
csvFilename (str) – the name of the CSV file
useLabels (bool) – whether to write labels rather than ids in the CSV file (default True)
append (bool) – whether to append to the file rather than overwrite it (default False)
csvSeparator (str) – separator in the csv file (default ‘,’)
- Return type
None
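The difference between writing labels and writing ids can be sketched with the standard csv module. This is only an illustration of the idea behind useLabels and csvSeparator, not pyAgrum's code: the helper rows_to_csv, the LABELS table, and the sample rows are hypothetical.

```python
import csv
import io

# Hypothetical variable domains, mirroring B{yes|maybe|no} from the example above.
LABELS = {"B": ["yes", "maybe", "no"], "E": ["0", "1"]}

def rows_to_csv(rows, names, use_labels=True, separator=","):
    """Write rows of label ids as CSV text, optionally translating ids to labels."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=separator, lineterminator="\n")
    writer.writerow(names)
    for row in rows:
        if use_labels:
            # Replace each id by the label at that position in the variable's domain.
            row = [LABELS[name][index] for name, index in zip(names, row)]
        writer.writerow(row)
    return buffer.getvalue()

samples = [[0, 1], [2, 0]]
with_labels = rows_to_csv(samples, ["B", "E"])
with_ids = rows_to_csv(samples, ["B", "E"], use_labels=False, separator=";")
print(with_labels)
print(with_ids)
```

The same distinction applies to samplesLabelAt, which returns a str, and samplesAt, which returns an int.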
- to_pandas(with_labels=True)
Export the samples as a pandas.DataFrame.
- Parameters
with_labels (bool) – whether the DataFrame contains the labels of the variables (True) or the indices of those labels (False)
- varOrder()
- Return type
object
- varOrderNames()
- Return type
List[str]