problem with parse_cnvs.py script #133

Open
shaghayeghsoudi opened this issue Feb 28, 2021 · 2 comments

@shaghayeghsoudi

shaghayeghsoudi commented Feb 28, 2021

Hello,
I recently started working with PhyloWGS. After installing, compiling, and testing the scripts, I ran into a problem with the parse_cnvs.py script. I have Battenberg output, so running the script should be straightforward, but the following command keeps failing:

python ./parse_cnvs.py -f battenberg -c 0.27 data.test.battenberg.txt

error:
  File "./parse_cnvs.py", line 195, in <module>
    main()
  File "./parse_cnvs.py", line 191, in main
    regions = parser.parse()
  File "./parse_cnvs.py", line 111, in parse
    end = int(fields[3 + self._field_offset])
ValueError: invalid literal for int() with base 10: '0.610923189999321'

I do not understand what is wrong. The line end = int(fields[3 + self._field_offset]) should read the end position of the CNV, and I do not know why it is picking up the BAF value in column 4 ('0.610923189999321') instead. Any ideas? I would appreciate the help; I have been stuck on this for several days.

@dancooke

@shaghayegh-flower I just ran into this error, and specifying battenberg-smchet rather than just battenberg resolved it. In your case:

$ python ./parse_cnvs.py -f battenberg-smchet -c 0.27 data.test.battenberg.txt

It looks like some Battenberg output has an ID column while others don't.
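
One rough way to check which layout a file has is to look at the fourth field of a data row. The traceback above shows the parser reading the end position from fields[3 + self._field_offset], so if that field is not an integer, the columns are probably shifted by one. The sketch below illustrates that check. The function name has_leading_id_column and the sample rows are made up for illustration and are not part of PhyloWGS, and the heuristic assumes the two layouts differ only by a single leading ID column.

```python
def has_leading_id_column(data_line):
    """Guess whether a Battenberg data row starts with an extra ID column.

    Hypothetical helper, not part of PhyloWGS. Assumes whitespace-separated
    fields and that the two layouts differ only by one leading ID column.
    """
    fields = data_line.strip().split()
    if len(fields) < 4:
        raise ValueError("expected at least 4 fields, got %d" % len(fields))
    try:
        # With an ID column, the fourth field (index 3) is the end position.
        int(fields[3])
        return True
    except ValueError:
        # Otherwise it is likely a float such as the BAF, as in the error above.
        return False


# Made-up rows, for illustration only.
with_id = "1 1 10000 2000000 0.61 1.0 0.02"
without_id = "1 10000 2000000 0.610923189999321 1.0 0.02"
print(has_leading_id_column(with_id))     # True
print(has_leading_id_column(without_id))  # False
```

If this returns False for your data rows, -f battenberg-smchet is probably the format to try, as it was in the case above. A BAF written as a bare integer (for example 1) would fool this check, so treat the result as a hint only.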

Once I got this working, however, I ran into another error:

Traceback (most recent call last):
  File "/well/gerton/dan/apps/phylowgs/parser/parse_cnvs.py", line 195, in <module>
    main()
  File "/well/gerton/dan/apps/phylowgs/parser/parse_cnvs.py", line 191, in main
    regions = parser.parse()
  File "/well/gerton/dan/apps/phylowgs/parser/parse_cnvs.py", line 117, in parse
    cnv1['major_cn'] = int(fields[8 + self._field_offset])
ValueError: invalid literal for int() with base 10: 'NA'

The parser doesn't handle invalid field values gracefully, so I had to remove these rows manually before calling the parser:

$ awk '{if ($8!="NA"&&$9!="NA") print}' data.test.battenberg.txt > data.test.battenberg_noNA.txt
$ python ./parse_cnvs.py -f battenberg-smchet -c 0.27 data.test.battenberg_noNA.txt
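
The same filtering can be done in Python if awk isn't available. This is only a sketch of the awk one-liner above, not a change to parse_cnvs.py. The function name drop_na_rows and the sample rows are assumptions for illustration, and columns 8 and 9 are taken from the awk command.

```python
def drop_na_rows(lines, columns=(8, 9)):
    """Keep only lines whose given 1-based columns are not the string 'NA'.

    Hypothetical helper mirroring the awk filter above; fields are split on
    whitespace, and a missing column counts as not 'NA', as it does in awk.
    """
    kept = []
    for line in lines:
        fields = line.split()
        values = [fields[c - 1] for c in columns if c <= len(fields)]
        if "NA" not in values:
            kept.append(line)
    return kept


# Made-up rows, for illustration only.
rows = [
    "1 100 200 0.5 1.0 0.0 2.0 1 1 1.0",
    "1 300 400 0.6 1.0 0.1 2.5 NA NA NA",
    "2 100 900 0.7 0.5 0.2 3.0 2 NA 1.0",
]
print(drop_na_rows(rows))  # keeps only the first row
```

To apply it to a file, pass the open input file as lines and write the returned list to the output file with writelines. Like the awk version, this keeps the header row as long as its 8th and 9th fields are not literally NA.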

Hope that helps!

@shaghayeghsoudi
Author

Thanks a lot, @dancooke. That helped a lot.
