The CLG model
- A CLG is composed of:
A pyAgrum.DiGraph that represents the dependencies between random variables. The model does not allow cycles.
A dictionary id2var that maps each NodeId to a pyAgrum.clg.GaussianVariable random variable.
A dictionary name2id that maps each variable’s name to its NodeId.
A dictionary arc2coef that maps each arc to its coefficient.
A CLG is equivalent to a SEM (Structural Equation Model) with Gaussian variables.
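To make the SEM reading concrete, here is a minimal pure-Python sketch (it does not use pyAgrum). It assumes that each variable equals its mean, plus the coefficient-weighted sum of its parents, plus Gaussian noise with the variable's standard deviation. The dictionaries `mu`, `sigma`, `arc2coef` and the function `sample` are illustrative names, not part of the library.

```python
import random

# Hypothetical toy model A -> B -> C, written as plain dictionaries.
mu = {"A": 4.0, "B": 1.0, "C": 0.0}             # mean (intercept) of each variable
sigma = {"A": 5.0, "B": 2.0, "C": 0.5}          # standard deviation of each noise term
arc2coef = {("A", "B"): 2.0, ("B", "C"): -1.0}  # (parent, child) -> coefficient
order = ["A", "B", "C"]                         # parents listed before children


def sample(rng):
    """Draw one joint sample by visiting the variables parents-first."""
    values = {}
    for child in order:
        # Conditional mean: intercept plus weighted parent values.
        mean = mu[child] + sum(
            coef * values[parent]
            for (parent, c), coef in arc2coef.items()
            if c == child
        )
        values[child] = rng.gauss(mean, sigma[child])
    return values


draw = sample(random.Random(0))
print(draw)
```

With all standard deviations set to 0 the sample is deterministic: A = 4, B = 1 + 2·4 = 9, C = -9.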
- class pyAgrum.clg.CLG(clg=None)
- CompareStructure(clg_to_compare)
We use the f-score to compare the causal structure of the two CLGs. We create two BNs with the same structure as the two CLGs and then compare the two BNs.
- Parameters:
clg_to_compare (CLG) – The CLG to compare with.
- Returns:
The f-score of the comparison.
- Return type:
float
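As a rough illustration of what an f-score over two structures measures, the sketch below compares two sets of directed arcs. It is an assumption-laden simplification: it counts only arcs that match exactly in direction, whereas the comparison of the two BNs performed by the library may score reversed arcs differently. The function name `structure_fscore` is hypothetical.

```python
def structure_fscore(reference_arcs, candidate_arcs):
    """F-score of the candidate arcs against the reference arcs (directed)."""
    reference = set(reference_arcs)
    candidate = set(candidate_arcs)
    true_positives = len(reference & candidate)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(candidate)  # share of proposed arcs that are right
    recall = true_positives / len(reference)     # share of true arcs that were found
    return 2 * precision * recall / (precision + recall)


reference = [("A", "B"), ("B", "C"), ("A", "C")]
candidate = [("A", "B"), ("C", "B")]  # one arc found, one reversed, one missed
score = structure_fscore(reference, candidate)
print(score)
```

Here precision is 1/2 and recall is 1/3, so the score is 0.4; two identical structures score 1.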
- add(var)
Add a new variable to the CLG.
- Parameters:
var (GaussianVariable) – The variable to be added to the CLG.
- Returns:
The id of the added variable.
- Return type:
NodeId
- Raises:
ValueError – if the argument is None.
NameError – if the name of the variable is empty.
NameError – if a variable with the same name already exists in the CLG.
- addArc(val1, val2, coef=1)
Add an arc val1->val2 with a coefficient coef to the CLG.
- Parameters:
val1 (NameOrId) – The name or the NodeId of the parent variable.
val2 (NameOrId) – The name or the NodeId of the child variable.
coef (float or int) – The coefficient of the arc.
- Returns:
The tuple of the NodeIds of the parent and the child variables.
- Return type:
Tuple[NodeId, NodeId]
- Raises:
gum.NotFound – if one of the names is not found in the CLG.
ValueError – if the coefficient is 0.
- arcs()
Return the list of arcs in the CLG.
- Returns:
The list of arcs in the CLG.
- Return type:
List[Tuple[NodeId, NodeId]]
- children(val)
Return the set of children ids from the name or the id of a node.
- Parameters:
val (NameOrId) – The name or the NodeId of the variable.
- Returns:
The set of children nodes’ ids.
- Return type:
Set[NodeId]
- children_names(val)
Return the list of children names from the name or the id of a node.
- Parameters:
val (NameOrId) – The name or the NodeId of the variable.
- Returns:
The list of val’s children’s names.
- Return type:
List[str]
- coefArc(val1, val2)
Return the coefficient of the arc val1->val2.
- Parameters:
val1 (NameOrId) – The name or the NodeId of the parent variable.
val2 (NameOrId) – The name or the NodeId of the child variable.
- Returns:
The coefficient of the arc.
- Return type:
float
- Raises:
pyAgrum.NotFound – if one of the names is not found in the CLG.
pyAgrum.NotFound – if the arc does not exist.
- copy(clg)
- dag()
Return the graph of the CLG (which is a DAG).
- Returns:
The graph of the CLG.
- Return type:
gum.DAG
- dag2dict()
Return a dictionary representing the DAG of the CLG.
- Returns:
C – A dictionary representing the DAG of the causal structure.
- Return type:
Dict[NodeId, Set[NodeId]]
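The shape of such a dictionary can be sketched without pyAgrum. The example below assumes that each key maps to the set of its children; the text above does not say whether the library maps nodes to children or to parents, so treat the direction as an assumption. `arcs_to_dict` is a hypothetical helper.

```python
def arcs_to_dict(nodes, arcs):
    """Map every node id to the set of its children ids (assumed direction)."""
    children = {node: set() for node in nodes}  # isolated nodes keep an empty set
    for parent, child in arcs:
        children[parent].add(child)
    return children


adjacency = arcs_to_dict([0, 1, 2], [(0, 1), (0, 2), (1, 2)])
print(adjacency)
```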
- eraseArc(val1, val2)
Erase the arc val1->val2.
- existsArc(val1, val2)
Check if an arc val1->val2 exists.
- Parameters:
val1 (NameOrId) – The name or the NodeId of the parent variable.
val2 (NameOrId) – The name or the NodeId of the child variable.
- Returns:
True if the arc exists.
- Return type:
bool
- Raises:
gum.NotFound – if one of the names is not found in the CLG.
- idFromName(name)
Return the NodeId from the name.
- Parameters:
name (str) – The name of the variable.
- Returns:
The NodeId of the variable.
- Return type:
NodeId
- Raises:
gum.NotFound – if the name is not found in the CLG.
- logLikelihood(data)
Return the log-likelihood of the data.
- Parameters:
data (csv file) – The data.
- Returns:
The log-likelihood of the data for the CLG.
- Return type:
float
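For intuition, the log-likelihood of a linear Gaussian model can be written as a sum of conditional Gaussian log-densities, one per variable and per data row. The sketch below is a standard-library illustration under that assumption and works on in-memory rows rather than a CSV file; all names (`gaussian_logpdf`, `record_log_likelihood`, `parents`) are hypothetical and not part of pyAgrum.

```python
import math


def gaussian_logpdf(x, mean, sigma):
    """Log-density of N(mean, sigma^2) evaluated at x."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (x - mean) ** 2 / (2 * sigma**2)


def record_log_likelihood(record, mu, sigma, parents):
    """Sum the conditional log-densities of one data row."""
    total = 0.0
    for name, x in record.items():
        # Conditional mean: intercept plus weighted parent values.
        mean = mu[name] + sum(coef * record[p] for p, coef in parents[name].items())
        total += gaussian_logpdf(x, mean, sigma[name])
    return total


# Hypothetical model A -> B with B = 1 + 2*A + noise, unit standard deviations.
mu = {"A": 0.0, "B": 1.0}
sigma = {"A": 1.0, "B": 1.0}
parents = {"A": {}, "B": {"A": 2.0}}
rows = [{"A": 0.0, "B": 1.0}, {"A": 1.0, "B": 3.0}]
log_likelihood = sum(record_log_likelihood(r, mu, sigma, parents) for r in rows)
print(log_likelihood)
```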
- name(node)
Return the associated name of the variable.
- Parameters:
node (NodeId) – The id of the variable.
- Returns:
The associated name of the variable.
- Return type:
str
- Raises:
gum.NotFound – if the node is not found in the CLG.
- nameOrId(val)
Return the NodeId from the name or the NodeId.
- Parameters:
val (NameOrId) – The name or the NodeId of the variable.
- Returns:
The NodeId of the variable.
- Return type:
NodeId
- names()
Return the list of names in the CLG.
- Returns:
The list of names in the CLG.
- Return type:
List[str]
- nodes()
Return the list of NodeIds in the CLG.
- Returns:
The list of NodeIds in the CLG.
- Return type:
List[NodeId]
- parent_names(val)
Return the list of parent names from the name or the id of a node.
- Parameters:
val (NameOrId) – The name or the NodeId of the variable.
- Returns:
The list of val’s parents’ names.
- Return type:
List[str]
- parents(val)
Return the set of parent ids from the name or the id of a node.
- Parameters:
val (NameOrId) – The name or the NodeId of the variable.
- Returns:
The set of parent nodes’ ids.
- Return type:
Set[NodeId]
- setCoef(val1, val2, coef)
Set the coefficient of an arc val1->val2.
- Parameters:
val1 (NameOrId) – The name or the NodeId of the parent variable.
val2 (NameOrId) – The name or the NodeId of the child variable.
coef (float or int) – The new coefficient of the arc.
- Raises:
gum.NotFound – if one of the names is not found in the CLG.
ValueError – if the coefficient is 0.
ValueError – if the arc does not exist.
- setMu(node, mu)
Set the mean of a variable.
- Parameters:
node (NodeId) – The id of the variable.
mu (float) – The new mean of the variable.
- Raises:
gum.NotFound – if the node is not found in the CLG.
- setSigma(node, sigma)
Set the standard deviation of a variable.
- Parameters:
node (NodeId) – The id of the variable.
sigma (float) – The new standard deviation of the variable.
- Raises:
gum.NotFound – if the node is not found in the CLG.
- toDot()
- topologicalOrder()
Return the topological order of the CLG.
- Returns:
The list of NodeIds in the topological order.
- Return type:
List[NodeId]
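A topological order lists every node after all of its parents. One common way to compute it is Kahn's algorithm, sketched below in plain Python; this is an illustration of the concept, not necessarily the method used by the library, and `topological_order` here is a standalone hypothetical function.

```python
from collections import deque


def topological_order(nodes, arcs):
    """Return the nodes so that every parent appears before its children."""
    children = {n: [] for n in nodes}
    indegree = {n: 0 for n in nodes}
    for parent, child in arcs:
        children[parent].append(child)
        indegree[child] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)  # nodes without parents
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:  # all parents already emitted
                ready.append(child)
    if len(order) != len(nodes):
        raise ValueError("the graph contains a cycle")
    return order


order = topological_order([0, 1, 2, 3], [(0, 2), (1, 2), (2, 3)])
print(order)
```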
- variable(val)
Return the variable from the NodeId or from the name.
- Parameters:
val (NameOrId) – The name or the NodeId of the variable.
- Returns:
The variable.
- Return type:
GaussianVariable
- Raises:
gum.NotFound – if val is not found in the CLG.
- variables()
Return the list of the variables in the CLG.
- Returns:
The list of the variables in the CLG.
- Return type:
List[GaussianVariable]
- class pyAgrum.clg.SEM
This class is used to parse a SEM into a CLG model or convert a CLG model into a SEM.
Example:
    sem = SEM('''
    # hyper parameters
    A = 4[5]
    B = 3[5]
    C = -2[5]
    # equations
    D = A[.2] # D is a noisy version of A
    E = 1 + D + 2 B[2]
    F = E + C + B + E[0.001]
    ''')
- FIND_FLOAT = '^([0-9]*\\.?[0-9]*)$'
- FIND_STDDEV = '^\\[([0-9]*\\.?[0-9]*)\\]$'
- FIND_TERM = '^([0-9]*\\.?[0-9]*)([a-zA-Z_]\\w*)$'
- FIND_VAR = '^([a-zA-Z_]\\w*)$'
- ID = '[a-zA-Z_]\\w*'
- NUMBER = '[0-9]*\\.?[0-9]*'
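The patterns above can be exercised with Python's `re` module. The sketch below parses a single equation line into a mean, a standard deviation and parent coefficients. It is only a guess at how such patterns might be combined: the function `parse_equation`, the helper pattern `TRAILING_STDDEV`, the removal of blanks, the handling of `-` and the summing of repeated variables are all assumptions, not the library's parser.

```python
import re

# Patterns copied from the class constants, written as raw strings.
FIND_FLOAT = r"^([0-9]*\.?[0-9]*)$"
FIND_TERM = r"^([0-9]*\.?[0-9]*)([a-zA-Z_]\w*)$"
FIND_VAR = r"^([a-zA-Z_]\w*)$"
# Hypothetical helper (not a class constant): a trailing "[stddev]".
TRAILING_STDDEV = r"\[([0-9]*\.?[0-9]*)\]\s*$"


def parse_equation(line):
    """Parse 'X = terms [sigma]' into (name, mu, sigma, {parent: coef})."""
    line = line.split("#", 1)[0]  # drop a trailing comment
    name, rhs = (part.strip() for part in line.split("=", 1))
    stddev = re.search(TRAILING_STDDEV, rhs)
    if stddev is None:
        raise ValueError(f"missing [stddev] in: {line!r}")
    sigma = float(stddev.group(1))
    # Remove blanks and turn every '-' into '+-' so that terms split on '+'.
    body = rhs[: stddev.start()].replace(" ", "").replace("-", "+-")
    mu, coefs = 0.0, {}
    for term in filter(None, body.split("+")):
        sign = -1.0 if term.startswith("-") else 1.0
        term = term.lstrip("-")
        if re.match(FIND_FLOAT, term):    # constant term
            mu += sign * float(term)
        elif re.match(FIND_VAR, term):    # bare variable, coefficient 1
            coefs[term] = coefs.get(term, 0.0) + sign
        else:                             # coefficient followed by a variable
            match = re.match(FIND_TERM, term)
            if match is None:
                raise ValueError(f"cannot parse term: {term!r}")
            coef, parent = match.groups()
            coefs[parent] = coefs.get(parent, 0.0) + sign * float(coef)
    return name, mu, sigma, coefs


print(parse_equation("E = 1 + D + 2 B[2]"))
```

Note that `FIND_TERM` also matches a bare variable name (with an empty coefficient group), which is why the sketch tests `FIND_VAR` first.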
- static loadCLG(filename)
Load the CLG from the file containing a SEM.
- Parameters:
filename (str) – The name of the file containing the SEM of the CLG.
- Return type:
the loaded CLG
- static saveCLG(clg, filename)
Save the CLG as a SEM to a file.
- Parameters:
clg (CLG) – The CLG model to be saved.
filename (str) – The name of the file that will contain the SEM of the CLG.
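Going from model parameters back to SEM text amounts to writing one equation per variable. The sketch below renders a single equation in the style of the SEM example shown for this class; the exact layout written by the library is not documented here, so the spacing and number formatting are assumptions, and `render_equation` is a hypothetical helper.

```python
def render_equation(name, mu, sigma, coefs):
    """Render 'name = mu + coef parent ... [sigma]' as one line of text."""
    rhs = f"{mu:g}"
    for parent, coef in coefs.items():
        sign = "-" if coef < 0 else "+"
        magnitude = abs(coef)
        # A coefficient of 1 is left implicit, as in 'E = 1 + D'.
        shown = "" if magnitude == 1 else format(magnitude, "g")
        rhs += f" {sign} {shown}{parent}"
    return f"{name} = {rhs} [{sigma:g}]"


line = render_equation("E", 1, 2, {"D": 1, "B": 2})
print(line)
```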
- static toclg(sem)
This function parses a SEM into a CLG model.
- Parameters:
sem (str) – The SEM to be parsed.
- Returns:
The CLG model corresponding to the SEM.
- Return type:
CLG
Other functions for CLG
- pyAgrum.clg.randomCLG(nb_variables, names, MuMax=5, MuMin=-5, SigmaMax=10, SigmaMin=1, ArcCoefMax=10, ArcCoefMin=5)
This function generates a random CLG with nb_variables variables.
- Parameters:
nb_variables (int) – The number of variables in the CLG.
names (str) – The list of names of the variables.
MuMax (float) – The maximum value of mu.
MuMin (float) – The minimum value of mu.
SigmaMax (float) – The maximum value of sigma.
SigmaMin (float) – The minimum value of sigma.
ArcCoefMax (float) – The maximum value of the coefficient of the arc.
ArcCoefMin (float) – The minimum value of the coefficient of the arc.
- Returns:
The random CLG.
- Return type:
CLG