I am trying to fit a single latent variable in my model. When I run the EM algorithm using fit_latent_cpds, it sometimes throws seemingly random errors and sometimes produces a result.
Steps to Reproduce
I have created the following test data to try the model:
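import numpy as np
import pandas as pd

data = pd.DataFrame({'node1': np.repeat(1, 50), 'node2': np.repeat(1, 50)})
# Set the selected rows to 0; .loc avoids pandas chained assignment
data.loc[[0, 3, 5, 13, 17, 29, 30, 31, 32], 'node1'] = 0
data.loc[[4, 5, 11, 15, 17, 25, 27, 34, 41, 47], 'node2'] = 0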
The structure is very simple: a latent variable latent1 that affects node1 and node2.
However, sometimes I receive different error messages:
Traceback (most recent call last):
File "test_2.py", line 28, in <module>
bn.fit_latent_cpds(lv_name="latent1", lv_states=[0, 1], data=data[["node1", "node2"]], n_runs=30)
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/causalnex/network/network.py", line 553, in fit_latent_cpds
estimator = EMSingleLatentVariable(
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/causalnex/estimator/em.py", line 144, in __init__
self._mb_data, self._mb_partitions = self._get_markov_blanket_data(data)
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/causalnex/estimator/em.py", line 585, in _get_markov_blanket_data
mb_product = cpd_multiplication([self.cpds[node] for node in self.valid_nodes])
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/causalnex/utils/pgmpy_utils.py", line 122, in cpd_multiplication
product_pgmpy = factor_product(*cpds_pgmpy) # type: TabularCPD
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pgmpy/factors/base.py", line 76, in factor_product
return reduce(lambda phi1, phi2: phi1 * phi2, args)
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pgmpy/factors/base.py", line 76, in <lambda>
return reduce(lambda phi1, phi2: phi1 * phi2, args)
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pgmpy/factors/discrete/DiscreteFactor.py", line 930, in __mul__
return self.product(other, inplace=False)
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pgmpy/factors/discrete/DiscreteFactor.py", line 697, in product
phi = self if inplace else self.copy()
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pgmpy/factors/discrete/CPD.py", line 299, in copy
return TabularCPD(
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pgmpy/factors/discrete/CPD.py", line 142, in __init__
super(TabularCPD, self).__init__(
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/pgmpy/factors/discrete/DiscreteFactor.py", line 99, in __init__
raise ValueError("Variable names cannot be same")
ValueError: Variable names cannot be same
And sometimes I receive this error:
Traceback (most recent call last):
File "test_2.py", line 28, in <module>
bn.fit_latent_cpds(lv_name="latent1", lv_states=[0, 1], data=data[["node1", "node2"]], n_runs=30)
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/causalnex/network/network.py", line 563, in fit_latent_cpds
estimator.run(n_runs=n_runs, stopping_delta=stopping_delta)
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/causalnex/estimator/em.py", line 181, in run
self.e_step() # Expectation step
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/causalnex/estimator/em.py", line 233, in e_step
results = self._update_sufficient_stats(node_mb_data["_lookup_"])
File "/Users/user/opt/anaconda3/envs/py38/lib/python3.8/site-packages/causalnex/estimator/em.py", line 448, in _update_sufficient_stats
prob_lv_given_mb = self._mb_product[mb_cols]
KeyError: (nan, 0.0)
My original code also includes the boundaries and priors; however, I realise that these two errors just pop up randomly at different times.
Please let me know if I have done something wrong in setting up the network.
Your Environment
Include as many relevant details about the environment in which you experienced the bug:
CausalNex version used (pip show causalnex): 0.11.0
Python version used (python -V): 3.8.15 (via conda)
Operating system and version: Mac OS M1
At line 702 of DiscreteFactor.py in the pgmpy library, change
new_variables = list(set(phi.variables).union(phi1.variables))
to
new_variables = phi.variables + [var for var in phi1.variables if var not in phi.variables]
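A possible reason the failure is intermittent: Python randomizes string hashes per process unless PYTHONHASHSEED is set, so the iteration order of a set of variable names can change from one run to the next, while a list-based union always keeps first-seen order. The sketch below only illustrates that difference. The helper ordered_union and the two variable lists are hypothetical stand-ins, not pgmpy code, and I have not verified that this ordering is the only cause of the errors above.

```python
def ordered_union(first, second):
    """Union of two variable lists that keeps first-seen order.

    Mirrors the list-based expression proposed above; it is a
    hypothetical helper, not part of pgmpy.
    """
    return first + [var for var in second if var not in first]


# Hypothetical variable lists for two factors being multiplied.
phi_vars = ["node1", "latent1"]
phi1_vars = ["node2", "latent1"]

# Deterministic: order depends only on the input lists.
stable = ordered_union(phi_vars, phi1_vars)

# Same members, but the order depends on string hashing and can
# differ between interpreter runs.
unordered = list(set(phi_vars).union(phi1_vars))

print(stable)
print(sorted(unordered) == sorted(stable))
```

To check whether ordering is involved without patching the library, it may help to run the reproduction script several times with a fixed PYTHONHASHSEED and see whether the outcome stops varying.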