The CLG model

A CLG consists of:
  • A pyAgrum.DiGraph that represents the dependencies between random variables. The model does not allow cycles.

  • A dictionary id2var to map each NodeID to a pyAgrum.clg.GaussianVariable random variable.

  • A dictionary name2id to map each variable’s name to its NodeID.

  • A dictionary arc2coef to map each arc to its coefficient.

A CLG is equivalent to a SEM (Structural Equation Model) with Gaussian variables.
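
To make the equivalence with a SEM concrete, here is a minimal sketch in plain Python that does not use pyAgrum. It assumes each variable is its mean mu plus the weighted sum of its parents plus Gaussian noise of standard deviation sigma, which is how the SEM example in this page reads. The function sample_once, the (name, mu, sigma) tuples and the parents dictionary are hypothetical stand-ins for the library's internals.

```python
import random

def sample_once(id2var, parents, arc2coef, order, rng):
    """Draw one joint sample by evaluating each structural equation in topological order."""
    values = {}
    for node in order:
        _name, mu, sigma = id2var[node]
        # Conditional mean: intercept plus the weighted values of the parents.
        mean = mu + sum(arc2coef[(p, node)] * values[p] for p in parents[node])
        values[node] = rng.gauss(mean, sigma)
    return values

# Hypothetical stand-ins for the four components listed above.
id2var = {0: ("A", 4.0, 5.0), 1: ("D", 0.0, 0.2)}   # NodeId -> (name, mu, sigma)
name2id = {name: node for node, (name, _mu, _sigma) in id2var.items()}
parents = {0: set(), 1: {0}}                         # the DAG, as child -> parents
arc2coef = {(0, 1): 1.0}                             # D = A + noise

sample = sample_once(id2var, parents, arc2coef, order=[0, 1], rng=random.Random(0))
```

Evaluating the nodes in topological order guarantees that every parent value exists before its children are drawn.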

class pyAgrum.clg.CLG(clg=None)
CompareStructure(clg_to_compare)

Compare the causal structure of this CLG with that of clg_to_compare using the F-score. Two BNs with the same structures as the two CLGs are created and then compared.

Parameters:

clg_to_compare (CLG) – The CLG to compare with.

Returns:

The f-score of the comparison.

Return type:

float
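
As an illustration of what an F-score on structures measures, the following sketch computes it directly on two sets of arcs. It is a simplified assumption: the function name structure_f_score is hypothetical, and the library's own comparison goes through BNs and may treat reversed arcs differently.

```python
def structure_f_score(reference_arcs, candidate_arcs):
    """F-score of candidate arcs against reference arcs (exact direction match only)."""
    reference, candidate = set(reference_arcs), set(candidate_arcs)
    true_positives = len(reference & candidate)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(candidate)   # share of candidate arcs that are correct
    recall = true_positives / len(reference)      # share of reference arcs that were found
    return 2 * precision * recall / (precision + recall)
```

In this simplified version a reversed arc counts both as a missing arc and as a spurious one.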

add(var)

Add a new variable to the CLG.

Parameters:

var (GaussianVariable) – The variable to be added to the CLG.

Returns:

The id of the added variable.

Return type:

NodeId

Raises:
  • ValueError – if the argument is None.

  • NameError – if the name of the variable is empty.

  • NameError – if a variable with the same name already exists in the CLG.

addArc(val1, val2, coef=1)

Add an arc val1->val2 with a coefficient coef to the CLG.

Parameters:
  • val1 (NameOrId) – The name or the NodeId of the parent variable.

  • val2 (NameOrId) – The name or the NodeId of the child variable.

  • coef (float or int) – The coefficient of the arc.

Returns:

The tuple of the NodeIds of the parent and the child variables.

Return type:

Tuple[NodeId, NodeId]

Raises:
  • gum.NotFound – if one of the names is not found in the CLG.

  • ValueError – if the coefficient is 0.
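
The checks documented for add and addArc can be sketched with a small dictionary-based class. This is not the pyAgrum implementation: TinyCLG and its methods are hypothetical, KeyError stands in for gum.NotFound, and raising ValueError on a cycle is an assumption (the documentation only states that cycles are not allowed).

```python
class TinyCLG:
    """Hypothetical stand-in that mimics the documented checks; not the pyAgrum class."""

    def __init__(self):
        self.name2id = {}
        self.arc2coef = {}

    def add(self, name):
        if not name:
            raise NameError("the name of the variable is empty")
        if name in self.name2id:
            raise NameError(f"a variable named {name!r} already exists")
        self.name2id[name] = len(self.name2id)
        return self.name2id[name]

    def _resolve(self, val):
        # Accept a name or an id; KeyError stands in for gum.NotFound.
        if isinstance(val, str):
            return self.name2id[val]
        if val not in self.name2id.values():
            raise KeyError(val)
        return val

    def _reaches(self, start, goal):
        # Depth-first search along the existing arcs.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(c for (p, c) in self.arc2coef if p == node)
        return False

    def add_arc(self, val1, val2, coef=1):
        parent, child = self._resolve(val1), self._resolve(val2)
        if coef == 0:
            raise ValueError("the coefficient of an arc cannot be 0")
        if self._reaches(child, parent):
            raise ValueError("this arc would create a directed cycle")
        self.arc2coef[(parent, child)] = float(coef)
        return parent, child
```

The cycle test asks whether the parent is already reachable from the child, which is exactly the situation in which the new arc would close a loop.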

arcs()

Return the list of arcs in the CLG.

Returns:

The list of arcs in the CLG.

Return type:

List[Tuple[NodeId, NodeId]]

children(val)

Return the set of children ids from the name or the id of a node.

Parameters:

val (NameOrId) – The name or the NodeId of the variable.

Returns:

The set of children nodes’ ids.

Return type:

Set[NodeId]

children_names(val)

Return the list of children names from the name or the id of a node.

Parameters:

val (NameOrId) – The name or the NodeId of the variable.

Returns:

The list of val’s children’s names.

Return type:

List[str]

coefArc(val1, val2)

Return the coefficient of the arc val1->val2.

Parameters:
  • val1 (NameOrId) – The name or the NodeId of the parent variable.

  • val2 (NameOrId) – The name or the NodeId of the child variable.

Returns:

The coefficient of the arc.

Return type:

float

Raises:
copy(clg)
dag()

Return the graph of the CLG (which is a DAG).

Returns:

The graph of the CLG.

Return type:

gum.DAG

dag2dict()

Return a dictionary representing the DAG of the CLG.

Returns:

C – A dictionary representing the DAG of the causal structure.

Return type:

Dict[NodeId, Set[NodeId]]

eraseArc(val1, val2)

Erase the arc val1->val2.

existsArc(val1, val2)

Check if an arc val1->val2 exists.

Parameters:
  • val1 (NameOrId) – The name or the NodeId of the parent variable.

  • val2 (NameOrId) – The name or the NodeId of the child variable.

Returns:

True if the arc exists.

Return type:

bool

Raises:

gum.NotFound – if one of the names is not found in the CLG.

idFromName(name)

Return the NodeId from the name.

Parameters:

name (str) – The name of the variable.

Returns:

The NodeId of the variable.

Return type:

NodeId

Raises:

gum.NotFound – if the name is not found in the CLG.

logLikelihood(data)

Return the log-likelihood of the data.

Parameters:

data (csv file) – The CSV file containing the data.

Returns:

The log-likelihood of the data for the CLG.

Return type:

float
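
For intuition, the log-likelihood of a linear Gaussian model is the sum, over rows and variables, of the log-density of each value given its parents. The sketch below works on in-memory rows rather than a CSV file and does not call pyAgrum; log_likelihood and its dictionary arguments are hypothetical, and the intercept-plus-weighted-parents mean is an assumption taken from the SEM reading of the model.

```python
import math

def log_likelihood(rows, id2var, parents, arc2coef):
    """Sum of Gaussian log-densities; each row maps NodeId -> observed value."""
    total = 0.0
    for row in rows:
        for node, (_name, mu, sigma) in id2var.items():
            # Conditional mean of the node given the observed values of its parents.
            mean = mu + sum(arc2coef[(p, node)] * row[p] for p in parents[node])
            z = (row[node] - mean) / sigma
            total += -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)
    return total
```

A higher (less negative) value means the data are more plausible under the model.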

name(node)

Return the associated name of the variable.

Parameters:

node (NodeId) – The id of the variable.

Returns:

The associated name of the variable.

Return type:

str

Raises:

gum.NotFound – if the node is not found in the CLG.

nameOrId(val)

Return the NodeId from the name or the NodeId.

Parameters:

val (NameOrId) – The name or the NodeId of the variable.

Returns:

The NodeId of the variable.

Return type:

NodeId

names()

Return the list of names in the CLG.

Returns:

The list of names in the CLG.

Return type:

List[str]

nodes()

Return the list of NodeIds in the CLG.

Returns:

The list of NodeIds in the CLG.

Return type:

List[NodeId]

parent_names(val)

Return the list of parents names from the name or the id of a node.

Parameters:

val (NameOrId) – The name or the NodeId of the variable.

Returns:

The list of val’s parents’ names.

Return type:

List[str]

parents(val)

Return the set of parent ids from the name or the id of a node.

Parameters:

val (NameOrId) – The name or the NodeId of the variable.

Returns:

The set of parent nodes’ ids.

Return type:

Set[NodeId]
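
The relationship between arcs(), parents() and children() can be sketched on a plain list of (parent, child) tuples. The helper names below are hypothetical, and returning names sorted by node id is an assumption; the documentation does not specify the order of the name lists.

```python
def parents_of(node, arcs):
    """Ids of the nodes with an arc pointing to `node`."""
    return {p for (p, c) in arcs if c == node}

def children_of(node, arcs):
    """Ids of the nodes that `node` points to."""
    return {c for (p, c) in arcs if p == node}

def names_of(ids, id2name):
    """Names of the given ids, sorted by id (the ordering is an assumption)."""
    return [id2name[i] for i in sorted(ids)]
```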

setCoef(val1, val2, coef)

Set the coefficient of an arc val1->val2.

Parameters:
  • val1 (NameOrId) – The name or the NodeId of the parent variable.

  • val2 (NameOrId) – The name or the NodeId of the child variable.

  • coef (float or int) – The new coefficient of the arc.

Raises:
  • gum.NotFound – if one of the names is not found in the CLG.

  • ValueError – if the coefficient is 0.

  • ValueError – if the arc does not exist.

setMu(node, mu)

Set the mean of a variable.

Parameters:
  • node (NodeId) – The id of the variable.

  • mu (float) – The new mean of the variable.

Raises:

gum.NotFound – if the node is not found in the CLG.

setSigma(node, sigma)

Set the standard deviation of a variable.

Parameters:
  • node (NodeId) – The id of the variable.

  • sigma (float) – The new standard deviation of the variable.

Raises:

gum.NotFound – if the node is not found in the CLG.

toDot()
topologicalOrder()

Return the topological order of the CLG.

Returns:

The list of NodeIds in the topological order.

Return type:

List[NodeId]
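
A topological order lists every parent before its children. One standard way to compute it is Kahn's algorithm, sketched here in plain Python; this is an illustration of the concept and not necessarily the algorithm the library uses, and the function name is hypothetical.

```python
from collections import deque

def topological_order(nodes, arcs):
    """Kahn's algorithm: repeatedly output a node with no remaining incoming arc."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in nodes}
    for parent, child in arcs:
        indegree[child] += 1
        children[parent].append(child)
    queue = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    if len(order) != len(indegree):
        raise ValueError("the graph contains a directed cycle")
    return order
```

Several valid orders may exist for the same graph; any of them is suitable for evaluating the structural equations one node at a time.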

variable(val)

Return the variable from the NodeId or from the name.

Parameters:

val (NameOrId) – The name or the NodeId of the variable.

Returns:

The variable.

Return type:

GaussianVariable

Raises:

gum.NotFound – if val is not found in the CLG.

variables()

Return the list of the variables in the CLG.

Returns:

The list of the variables in the CLG.

Return type:

List[GaussianVariable]

class pyAgrum.clg.SEM

This class is used to parse a SEM into a CLG model or convert a CLG model into a SEM.

Example:

  sem = SEM('''
  # hyper parameters
  A = 4[5]
  B = 3[5]
  C = -2[5]

  # equations
  D = A[.2]  # D is a noisy version of A
  E = 1 + D + 2 B[2]
  F = E + C + B + E[0.001]
  ''')

FIND_FLOAT = '^([0-9]*\\.?[0-9]*)$'
FIND_STDDEV = '^\\[([0-9]*\\.?[0-9]*)\\]$'
FIND_TERM = '^([0-9]*\\.?[0-9]*)([a-zA-Z_]\\w*)$'
FIND_VAR = '^([a-zA-Z_]\\w*)$'
ID = '[a-zA-Z_]\\w*'
NUMBER = '[0-9]*\\.?[0-9]*'
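
To show how these patterns fit together, the sketch below rebuilds them with Python's re module and uses them to split the right-hand side of one equation. The helper parse_rhs is hypothetical and deliberately limited: it assumes whitespace can be ignored, that terms are joined by "+", and that no term is negative. It is not the parser used by the class.

```python
import re

# Patterns rebuilt from the class constants listed above.
NUMBER = r"[0-9]*\.?[0-9]*"
ID = r"[a-zA-Z_]\w*"
FIND_FLOAT = rf"^({NUMBER})$"
FIND_TERM = rf"^({NUMBER})({ID})$"
FIND_STDDEV = rf"^\[({NUMBER})\]$"

def parse_rhs(rhs):
    """Split 'c + a1 X1 + ... [sigma]' into (mu, {name: coef}, sigma)."""
    compact = rhs.replace(" ", "")
    bracket = compact.index("[")
    body, stddev = compact[:bracket], compact[bracket:]
    sigma = float(re.match(FIND_STDDEV, stddev).group(1))
    mu, coefs = 0.0, {}
    for token in body.split("+"):
        if token and re.match(FIND_FLOAT, token):
            mu += float(token)                      # constant term
        else:
            coef, var = re.match(FIND_TERM, token).groups()
            # A missing coefficient, as in "D", means 1.
            coefs[var] = coefs.get(var, 0.0) + (float(coef) if coef else 1.0)
    return mu, coefs, sigma
```

With this helper, the three parts of an equation (constant, weighted parents, standard deviation in brackets) come out as separate values that map onto mu, the arc coefficients and sigma.
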
static loadCLG(filename)

Load the CLG from the file containing a SEM.

Parameters:

filename (str) – The name of the file containing the SEM of the CLG.

Returns:

The loaded CLG.

Return type:

CLG

static saveCLG(clg, filename)

Save the CLG as a SEM to a file.

Parameters:
  • clg (CLG) – The CLG model to be saved.

  • filename (str) – The name of the file in which the SEM of the CLG is saved.

static toclg(sem)

This function parses a SEM into a CLG model.

Parameters:

sem (str) – The SEM to be parsed.

Returns:

The CLG model corresponding to the SEM.

Return type:

CLG

static tosem(clg)

This function converts a CLG model into a SEM.

Parameters:

clg (CLG) – The CLG model to be converted.

Returns:

lines – The SEM corresponding to the CLG model.

Return type:

str

Other functions for CLG

pyAgrum.clg.randomCLG(nb_variables, names, MuMax=5, MuMin=-5, SigmaMax=10, SigmaMin=1, ArcCoefMax=10, ArcCoefMin=5)

This function generates a random CLG with nb_variables variables.

Parameters:
  • nb_variables (int) – The number of variables in the CLG.

  • names (str) – The list of names of the variables.

  • MuMax (float) – The maximum value of mu.

  • MuMin (float) – The minimum value of mu.

  • SigmaMax (float) – The maximum value of sigma.

  • SigmaMin (float) – The minimum value of sigma.

  • ArcCoefMax (float) – The maximum value of the coefficient of the arc.

  • ArcCoefMin (float) – The minimum value of the coefficient of the arc.

Returns:

The random CLG.

Return type:

CLG
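
The role of the bounds can be sketched without pyAgrum as follows. This is an assumption-heavy illustration: the function random_clg_parameters, the uniform distributions and the arc_probability parameter are hypothetical, and the library may draw the structure and the parameters differently. Only the default bounds are taken from the signature above.

```python
import random

def random_clg_parameters(names, rng, mu_range=(-5, 5), sigma_range=(1, 10),
                          coef_range=(5, 10), arc_probability=0.3):
    """Draw (name, mu, sigma) per variable and a coefficient per randomly kept arc."""
    id2var = {
        node: (name, rng.uniform(*mu_range), rng.uniform(*sigma_range))
        for node, name in enumerate(names)
    }
    arc2coef = {}
    for parent in range(len(id2var)):
        # Only arcs from a lower id to a higher id are proposed, so no cycle can appear.
        for child in range(parent + 1, len(id2var)):
            if rng.random() < arc_probability:
                arc2coef[(parent, child)] = rng.uniform(*coef_range)
    return id2var, arc2coef

id2var, arc2coef = random_clg_parameters("ABCDE", random.Random(42))
```

Restricting arcs to go from lower to higher ids is a simple way to guarantee that the random graph is a DAG.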