Documentation on conflict resolution strategies and a typo (?) #203

Open
this-is-sofia opened this issue Oct 28, 2024 · 0 comments

Hi, I am trying to understand what the alternative rules for resolving orientation conflicts in the PC algorithm do and how the procedure is justified (utils -> PCUtils).

Here are my questions:

  • Why are we sorting in ascending order? Is there a paper / source / documentation where these rules and their justifications are presented?
  • Is there a typo in p_{xy|not y}? Looking at the implementation, shouldn't it be p_{xz|not y}, i.e. the p-value of the CI test of x and z given conditioning sets that exclude y (now sorted in descending order)?

I think the author of the method is @jdramsey, thank you so much for taking a look!

In particular, I am looking at rules 3 and 4 (prioritizing stronger colliders):

if priority == 3:  # 3. Order colliders by p_{xz|y} in ascending order
    for (x, y, z) in R0:
        cond = cg_new.find_cond_sets_with_mid(x, z, y)
        UC_dict[(x, y, z)] = max([cg_new.ci_test(x, z, S) for S in cond])
    UC_dict = sort_dict_ascending(UC_dict)

else:  # 4. Order colliders by p_{xy|not y} in descending order
    for (x, y, z) in R0:
        cond = cg_new.find_cond_sets_without_mid(x, z, y)
        UC_dict[(x, y, z)] = max([cg_new.ci_test(x, z, S) for S in cond])
    UC_dict = sort_dict_ascending(UC_dict, descending=True)

Here is my understanding of what is being implemented. I am happy to be corrected!

  • The description of find_cond_sets_with_mid(self, i: int, j: int, k: int) -> List[Tuple[int]] says it "return[s] the list of conditioning sets of the neighbors of i or j in adjmat which contains k", so we are finding subsets of neighbors of x and z which contain y.
  • We then build a dictionary of the CI test results for these subsets and take the maximum p-value over the different conditioning sets. A large p-value means that we do not reject the hypothesis of conditional independence, so by taking the largest p-value we score each triple by how independent we think x and z are given a conditioning set that contains y.
  • Last, we sort the triples by p-value in ascending order. Later in the function, we iterate through this dictionary and orient edges only if they have not been oriented yet. In my understanding, this means we now prioritize low p-values, which seems to me inconsistent with maximizing over p-values in the previous step and with the idea that we want to orient colliders if x and z are independent given y (i.e. a large p-value for the test, not a small one).
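
To make the effect of the sort direction concrete, here is a minimal sketch of the three steps above on two conflicting triples. Everything in it is an assumption made for illustration: the p-values are made up, and `sort_by_value` and `orient_in_order` are hypothetical stand-ins, not the library's `sort_dict_ascending` or its orientation loop.

```python
from typing import Dict, List, Tuple

Triple = Tuple[int, int, int]  # (x, y, z): candidate collider x -> y <- z


def sort_by_value(scores: Dict[Triple, float], descending: bool = False) -> Dict[Triple, float]:
    """Hypothetical stand-in for sort_dict_ascending: order entries by score."""
    return dict(sorted(scores.items(), key=lambda kv: kv[1], reverse=descending))


def orient_in_order(scores: Dict[Triple, float]) -> List[Triple]:
    """Orient triples in dictionary order, skipping any triple that would
    need an edge that is already directed the opposite way (assumed rule)."""
    directed = set()  # directed edges (tail, head) fixed so far
    accepted = []
    for (x, y, z) in scores:
        # x -> y <- z conflicts if y -> x or y -> z has already been set.
        if (y, x) in directed or (y, z) in directed:
            continue
        directed.update({(x, y), (z, y)})
        accepted.append((x, y, z))
    return accepted


# Made-up p-values, one per conditioning set containing the middle node y.
p_values = {
    (0, 1, 2): [0.01, 0.03],  # max = 0.03
    (1, 2, 3): [0.20, 0.60],  # max = 0.60
}

# Step 2: keep the largest p-value per triple.
uc_scores = {triple: max(ps) for triple, ps in p_values.items()}

# Step 3: sort, then orient in that order. The two triples disagree on the
# direction of the edge between nodes 1 and 2, so only one can be oriented.
ascending = orient_in_order(sort_by_value(uc_scores))
descending = orient_in_order(sort_by_value(uc_scores, descending=True))

print(ascending)   # [(0, 1, 2)]
print(descending)  # [(1, 2, 3)]
```

In this toy case, ascending order lets the triple with the smaller maximum p-value (0.03) claim the shared edge, while descending order gives it to the triple with the larger one (0.60). This is the behavior the question about the sort direction is asking about.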