# 2. Huang Example

Let’s go over the Huang Bayesian Belief Network (BBN) [HD99].

## 2.1. Model

The Huang BBN is stored in a JSON file. For completeness, we show the content of the JSON file.

```
[1]:
```

```
import json
with open('../../_support/bbn/huang-bbn.json', 'r') as fp:
data = json.load(fp)
data
```

```
[1]:
```

```
{'d': {'nodes': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'],
'edges': [['A', 'B'],
['A', 'C'],
['B', 'D'],
['C', 'E'],
['C', 'G'],
['D', 'F'],
['E', 'F'],
['E', 'H'],
['G', 'H']]},
'p': {'A': {'columns': ['A', '__p__'], 'data': [['on', 0.5], ['off', 0.5]]},
'B': {'columns': ['A', 'B', '__p__'],
'data': [['on', 'on', 0.5],
['on', 'off', 0.5],
['off', 'on', 0.4],
['off', 'off', 0.6]]},
'C': {'columns': ['A', 'C', '__p__'],
'data': [['on', 'on', 0.7],
['on', 'off', 0.3],
['off', 'on', 0.2],
['off', 'off', 0.8]]},
'D': {'columns': ['B', 'D', '__p__'],
'data': [['on', 'on', 0.9],
['on', 'off', 0.1],
['off', 'on', 0.5],
['off', 'off', 0.5]]},
'E': {'columns': ['C', 'E', '__p__'],
'data': [['on', 'on', 0.3],
['on', 'off', 0.7],
['off', 'on', 0.6],
['off', 'off', 0.4]]},
'F': {'columns': ['D', 'E', 'F', '__p__'],
'data': [['on', 'on', 'on', 0.01],
['on', 'on', 'off', 0.99],
['on', 'off', 'on', 0.01],
['on', 'off', 'off', 0.99],
['off', 'on', 'on', 0.01],
['off', 'on', 'off', 0.99],
['off', 'off', 'on', 0.99],
['off', 'off', 'off', 0.01]]},
'G': {'columns': ['C', 'G', '__p__'],
'data': [['on', 'on', 0.8],
['on', 'off', 0.2],
['off', 'on', 0.1],
['off', 'off', 0.9]]},
'H': {'columns': ['E', 'G', 'H', '__p__'],
'data': [['on', 'on', 'on', 0.05],
['on', 'on', 'off', 0.95],
['on', 'off', 'on', 0.95],
['on', 'off', 'off', 0.05],
['off', 'on', 'on', 0.95],
['off', 'on', 'off', 0.05],
['off', 'off', 'on', 0.95],
['off', 'off', 'off', 0.05]]}}}
```

From this BBN-specified JSON file, we will create the capable reasoning model.

```
[2]:
```

```
from pybbn.factory import create_reasoning_model
model = create_reasoning_model(data['d'], data['p'])
```

### 2.1.1. Graph (DAG)

We can visualize the directed acyclic graph (DAG) of the Huang BBN.

```
[3]:
```

```
from help.viz import *
draw_dag(model)
```

### 2.1.2. Parameters

Here are the parameters of the Huang BBN.

```
[4]:
```

```
model.node_potentials['A']
```

```
[4]:
```

A | __p__ | |
---|---|---|

0 | on | 0.5 |

1 | off | 0.5 |

```
[5]:
```

```
model.node_potentials['B']
```

```
[5]:
```

A | B | __p__ | |
---|---|---|---|

0 | on | on | 0.5 |

1 | on | off | 0.5 |

2 | off | on | 0.4 |

3 | off | off | 0.6 |

```
[6]:
```

```
model.node_potentials['C']
```

```
[6]:
```

A | C | __p__ | |
---|---|---|---|

0 | on | on | 0.7 |

1 | on | off | 0.3 |

2 | off | on | 0.2 |

3 | off | off | 0.8 |

```
[7]:
```

```
model.node_potentials['D']
```

```
[7]:
```

B | D | __p__ | |
---|---|---|---|

0 | on | on | 0.9 |

1 | on | off | 0.1 |

2 | off | on | 0.5 |

3 | off | off | 0.5 |

```
[8]:
```

```
model.node_potentials['E']
```

```
[8]:
```

C | E | __p__ | |
---|---|---|---|

0 | on | on | 0.3 |

1 | on | off | 0.7 |

2 | off | on | 0.6 |

3 | off | off | 0.4 |

```
[9]:
```

```
model.node_potentials['F']
```

```
[9]:
```

D | E | F | __p__ | |
---|---|---|---|---|

0 | on | on | on | 0.01 |

1 | on | on | off | 0.99 |

2 | on | off | on | 0.01 |

3 | on | off | off | 0.99 |

4 | off | on | on | 0.01 |

5 | off | on | off | 0.99 |

6 | off | off | on | 0.99 |

7 | off | off | off | 0.01 |

```
[10]:
```

```
model.node_potentials['G']
```

```
[10]:
```

C | G | __p__ | |
---|---|---|---|

0 | on | on | 0.8 |

1 | on | off | 0.2 |

2 | off | on | 0.1 |

3 | off | off | 0.9 |

```
[11]:
```

```
model.node_potentials['H']
```

```
[11]:
```

E | G | H | __p__ | |
---|---|---|---|---|

0 | on | on | on | 0.05 |

1 | on | on | off | 0.95 |

2 | on | off | on | 0.95 |

3 | on | off | off | 0.05 |

4 | off | on | on | 0.95 |

5 | off | on | off | 0.05 |

6 | off | off | on | 0.95 |

7 | off | off | off | 0.05 |

## 2.2. Graphs

### 2.2.1. Intermediate graphs

It might be interesting to see the intermediate graphs created. Here’s what the graphs below represent.

DAG: the Huang DAG associated with the Huang BBN

Undirected: the undirected graph of the Huang DAG (directed edges are removed)

Moralized: the moralized graph of the Huang DAG

Triangulated: the triangulated graph of the Huang DAG

```
[12]:
```

```
draw_intermediate_graphs(model)
```

### 2.2.2. Join tree

The join tree is the main inferencing engine of the reasoning model. We can choose to visualize the join tree too.

red nodes: these are the cliques/clusters of the join tree

black nodes: these are the separation sets of the join tree

```
[13]:
```

```
draw_junction_tree(model)
```

## 2.3. Queries

Queries can be classified as follows.

Marginal query (univariate probabilities)

Associational query (conditional probabilities)

Interventional query (causal effect probabilities)

Counterfactual query (counterfactual probabilities)

The above categories of queries follow Pearl’s Causal Hierarchy (PCH) or the Causal Ladder [GDH22].

### 2.3.1. Marginal query

Let’s just follow one variable, \(H\). The marginal probability of \(H\), \(P(H)\), is easy to query.

```
[14]:
```

```
q = model.pquery()
q['H']
```

```
[14]:
```

H | __p__ | |
---|---|---|

0 | off | 0.1769 |

1 | on | 0.8231 |

### 2.3.2. Associational query

In an associational query, we want to compute the conditional probabilities

\(P(H|E=\mathrm{on})\), and

\(P(H|E=\mathrm{off})\).

```
[15]:
```

```
q = model.pquery(evidences=model.e({'E': 'on'}))
q['H']
```

```
[15]:
```

H | __p__ | |
---|---|---|

0 | off | 0.322903 |

1 | on | 0.677097 |

```
[16]:
```

```
q = model.pquery(evidences=model.e({'E': 'off'}))
q['H']
```

```
[16]:
```

H | __p__ | |
---|---|---|

0 | off | 0.05 |

1 | on | 0.95 |

### 2.3.3. Interventional query

The interventional query is estimating the causal effect one one variable on another. In this case, we want to estimate the causal effect of \(E\) on \(H\). However, \(E\) and \(H\) are confounded through \(\{C, G\}\). We need to apply the `do-operator`

on \(E\) to estimate the causal effect of \(E\) on \(H\). (The causal effect of \(E\) on \(H\) is not simply the associational query we performed above!)

\(P(H=h|\mathrm{do}(E=e))\)

We do not have to control for both counfounders \(\{C, G\}\), it is sufficient to just control for \(C\) to block backdoor paths from \(E\) to \(H\). Thus, using the backdoor adjustment formula, we can compute the causal effect as follows.

\(P(H=h|\mathrm{do}(E=e)) = \sum_C P(H=h|E=e, C)P(C)\)

Let’s compute two causal effects.

\(P(H=\mathrm{on}|\mathrm{do}(E=\mathrm{on})) = \sum_C P(H=\mathrm{on}|E=\mathrm{on}, C)P(C)\)

\(P(H=\mathrm{on}|\mathrm{do}(E=\mathrm{off})) = \sum_C P(H=\mathrm{on}|E=\mathrm{off}, C)P(C)\)

```
[17]:
```

```
p_on = model.iquery(Y=['H'], y=['on'], X=['E'], x=['on'])
p_on
```

```
[17]:
```

```
H 0.5765
dtype: float64
```

```
[18]:
```

```
p_off = model.iquery(Y=['H'], y=['on'], X=['E'], x=['off'])
p_off
```

```
[18]:
```

```
H 0.95
dtype: float64
```

The average causal effect (ACE) is defined as follows.

\(\mathrm{ACE} = P(H=\mathrm{on}|\mathrm{do}(E=\mathrm{on})) - P(H=\mathrm{on}|\mathrm{do}(E=\mathrm{off}))\)

```
[19]:
```

```
p_on['H'] - p_off['H']
```

```
[19]:
```

```
-0.37349999999999994
```

### 2.3.4. Counterfactual query

In this counterfactual query, we have the following.

query: \(H\)

factual: \(E=\mathrm{on}, G=\mathrm{on}, H=\mathrm{on}\)

counterfactual: \(E=\mathrm{off}\)

The query is against \(H\); it is the variable we are interested in. The factual is what actually happened. The counterfactual is the hypothetical. Let’s try to word this as a counterfactual statement (not so easy to do).

Given that \(E\) was on, \(G\) was on, \(H\) was on, what is the probability of \(H\) had we set \(E\) to off?

```
[20]:
```

```
Y = 'H'
e = {
'E': 'on',
'G': 'on',
'H': 'on'
}
h = {
'E': 'off'
}
model.cquery(Y, e, h)
```

```
[20]:
```

H | __p__ | |
---|---|---|

0 | off | 0.043749 |

1 | on | 0.956251 |

Let’s try another counterfactual query. We have the following.

query: \(H\)

factual: \(E=\mathrm{off}, G=\mathrm{off}, H=\mathrm{off}\)

counterfactual: \(E=\mathrm{on}\)

As a counterfactual statement, we might naturally say the following (?).

Given that \(E\) was off, \(G\) was off, \(H\) was off, what is the probability of \(H\) had we set \(E\) to on?

```
[21]:
```

```
Y = 'H'
e = {
'E': 'off',
'G': 'off',
'H': 'off'
}
h = {
'E': 'on'
}
model.cquery(Y, e, h)
```

```
[21]:
```

H | __p__ | |
---|---|---|

0 | off | 0.439756 |

1 | on | 0.560244 |

Here’s the last counterfactual query. We have the following.

query: \(H\)

factual: \(E=\mathrm{on}, G=\mathrm{on}, H=\mathrm{on}\)

counterfactual: \(E=\mathrm{off}, G=\mathrm{off}\)

As a counterfactual statement, we might naturally say the following (?).

Given that \(E\) was on, \(G\) was on, \(H\) was on, what is the probability of \(H\) had we set \(E\) and \(G\) to off?

```
[22]:
```

```
Y = 'H'
e = {
'E': 'on',
'G': 'on',
'H': 'on'
}
h = {
'E': 'off',
'G': 'off'
}
model.cquery(Y, e, h)
```

```
[22]:
```

H | __p__ | |
---|---|---|

0 | off | 0.04526 |

1 | on | 0.95474 |

### 2.3.5. Graphical query

Graphical queries revolve around the DAG. Typical graphical queries are about `d-separation`

and discovering counfounders and mediators.

From the DAG, we ask if \(F\) and \(H\) are d-separated, \(I(F, H)\). The answer is false, since there are backdoor paths between \(F\) and \(H\). Here are the backdoor paths between \(F\) and \(H\).

\(F, D, B, A, C, E, H\)

\(F, D, B, A, C, G, H\)

\(F, E, C, G, H\)

\(F, E, H\)

```
[23]:
```

```
model.is_d_separated('F', 'H')
```

```
[23]:
```

```
False
```

Perhaps we can block all backdoor paths between \(F\) and \(H\) with \(E\), \(I(F,H|E)\)? The answer is false.

```
[24]:
```

```
model.is_d_separated('F', 'H', {'E'})
```

```
[24]:
```

```
False
```

Can we block all backdoor paths between \(F\) and \(H\) with \(E\) and \(C\), \(I(F,H|E, C)\)? The answer is true.

```
[25]:
```

```
model.is_d_separated('F', 'H', {'E', 'C'})
```

```
[25]:
```

```
True
```

We can query the graph to get us the “minimal confounders” that will block all backdoor paths. There are multiple minimal sets, but with the particular approach we use, the answer is \(\{D, E\}\). The answer to \(I(F,H|D,E)\) is true.

```
[26]:
```

```
model.get_minimal_confounders('F', 'H')
```

```
[26]:
```

```
['D', 'E']
```

We can use the model itself to 1) get the minimal confounders and then 2) test for d-separation as follows.

```
[27]:
```

```
model.is_d_separated('F', 'H', model.get_minimal_confounders('F', 'H'))
```

```
[27]:
```

```
True
```

If we want a list of all counfounders that can block all backdoors between \(F\) and \(H\), then we can also get that from the DAG.

```
[28]:
```

```
model.get_all_confounders('F', 'H')
```

```
[28]:
```

```
{'A', 'B', 'C', 'D', 'E', 'G'}
```

## 2.4. Save

Finally, we can save the reasoning model that we built from the Huang BBN. Note that we are persisting the reasoning model and not the BBN that we depersisted to begin with.

```
[29]:
```

```
from pybbn.serde import model_to_dict
with open('../../_support/bbn/huang-reasoning.json', 'w') as fp:
json.dump(model_to_dict(model), fp, indent=1)
```