Potential Sires query: How do you treat founder individuals with no potential fathers? #6

Open
LStrachan opened this issue Feb 21, 2024 · 2 comments

Comments

@LStrachan

I've created a small example to better explain my problem:

I have honeybee genotype data for 3 queens, 5 worker offspring from each queen, and 3 potential grand-dams who produced the fathers. In the pedigree, the queens and grand-dams are treated as founders with both parents unknown, and the offspring have known mothers but no known fathers. We do not have a sequence file.
I've used the documentation and examples as best I could to format these files correctly, but I seem to have a problem with my potential sires file. Some questions I would like clarified:

  • Do you have to include all IDs in the potential sires list? (The documentation wasn't very clear about that.) I get the same error whether or not I include all individuals.
  • If you do need all IDs, how do you treat the founders who have no potential sires? I've tried leaving the columns for these founders blank, and filling them with 0/NA/9/-9 to be read as unknown, but I get an ErrorKey saying that the symbol isn't recognised.

Below I've attached the files for this example so it can be reproduced. The example sire file is a .list file (which I've been using), but this format couldn't be attached here, so either the file extension will need to be changed or the .sh modified.

GenoExampleGitHub.txt
PedigreeExampleGithub.txt
PotSiresAllidsExample.txt

This is the .sh being run (it also couldn't be attached):
AlphaAssign -genotypes "GenoExampleGitHub.txt" -potentialsires "PotSiresAllidsExample.list" -pedigree "PedigreeExampleGitHub.txt" -out output/out_genotypes

Hopefully this example will reproduce the same problem but I've also attached a screenshot of the error that is coming up for me.
Screenshot 2024-02-21 at 12 27 06

Any help working out this problem would be greatly appreciated :)

@janaobsteter

OK, the first problem is that your data is tab-delimited, whereas the example data is whitespace-delimited. I am not sure that's causing the problem, though.
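
To rule the delimiter out, the files could be rewritten with single spaces between fields. This is only a sketch: the helper name `normalize_whitespace` is made up for illustration and is not part of AlphaAssign, and it assumes that no field contains embedded whitespace.

```python
def normalize_whitespace(lines):
    """Return the non-empty lines with fields separated by single spaces.

    Hypothetical helper, not part of AlphaAssign. Assumes no field
    contains embedded whitespace.
    """
    # str.split() with no argument splits on any run of tabs or spaces.
    return [" ".join(line.split()) for line in lines if line.strip()]


if __name__ == "__main__":
    # Made-up rows for illustration, not taken from the attached files.
    example = ["ind1\t0\t1\t9\n", "ind2\t2\t0\t9\n"]
    for row in normalize_whitespace(example):
        print(row)
```

To apply it to a real file, read the lines with `open(path)` and write the returned rows back out joined by newlines.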

Second, you have sites with all genotypes missing in your genotype file, and I think this is what's causing the problem. For example, the fifth column (site) is all missing values, which is weird. So I would suggest you take the cleaned genotypes and also check what happened during recoding. There are suspiciously few "2"s in there, a lot of "0"s, and far too many missing values (coded as "9").
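
A check along these lines can be scripted. The sketch below assumes each row of the genotype file is an individual ID followed by one genotype code per site (0/1/2, with 9 for missing); the function name `summarise_genotypes` is hypothetical and not part of AlphaAssign.

```python
from collections import Counter

MISSING = "9"  # missing-genotype code, as used in this thread


def summarise_genotypes(lines):
    """Count genotype codes and list the sites where every individual is missing.

    Assumed layout: one individual per row, ID first, then one code per site.
    Returns (counts, all_missing), where all_missing holds 0-based site
    indices (the ID column is not counted).
    """
    rows = [line.split() for line in lines if line.strip()]
    genotypes = [row[1:] for row in rows]
    if len({len(g) for g in genotypes}) > 1:
        raise ValueError("rows have different numbers of sites")
    counts = Counter(code for g in genotypes for code in g)
    n_sites = len(genotypes[0]) if genotypes else 0
    all_missing = [
        site for site in range(n_sites)
        if all(g[site] == MISSING for g in genotypes)
    ]
    return counts, all_missing


if __name__ == "__main__":
    # Made-up rows for illustration, not taken from the attached files.
    example = ["q1 0 9 2 0", "q2 1 9 0 9", "w1 0 9 0 0"]
    counts, all_missing = summarise_genotypes(example)
    print(dict(counts))
    print(all_missing)
```

A very low count of "2" relative to "0", or a long list of all-missing sites, would point back at the recoding step rather than at the potential sires file.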

@janaobsteter

I've tested taking the all-missing sites out, and it works. Something went wrong with the data transformation.
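
For reference, removing all-missing sites might look roughly like the sketch below. It is not the exact procedure used in the test above: it assumes each row is an ID followed by one genotype code per site with 9 as the missing code, and `drop_all_missing_sites` is a hypothetical helper, not an AlphaAssign function.

```python
def drop_all_missing_sites(lines, missing="9"):
    """Return space-delimited rows with every all-missing site removed.

    Assumed layout: one individual per row, ID first, then one code per site.
    A site is dropped only if every individual has the missing code there.
    """
    rows = [line.split() for line in lines if line.strip()]
    ids = [row[0] for row in rows]
    # Transpose so that each tuple holds one site's codes across individuals.
    sites = list(zip(*(row[1:] for row in rows)))
    kept = [site for site in sites if any(code != missing for code in site)]
    return [
        " ".join([ind_id, *(site[i] for site in kept)])
        for i, ind_id in enumerate(ids)
    ]


if __name__ == "__main__":
    # Made-up rows for illustration, not taken from the attached files.
    example = ["q1 0 9 2", "q2 1 9 9"]
    for row in drop_all_missing_sites(example):
        print(row)
```

Dropping these sites only works around the symptom. The recoding that produced them still needs to be checked.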
