Error in running diff_length.R #37

Open
doshirLV opened this issue Sep 26, 2024 · 10 comments
doshirLV commented Sep 26, 2024

Hello Maragkakis Lab,

Error in names(x) <- value :
'names' attribute [2] must be the same length as the vector [1]
Calls: nanoplen -> colnames<-
Execution halted

The Conda environment installed successfully and is active, and the Nanoplen R package runs properly with the test data. My data is in the same format, but the samples in the metadata file are in a different order than they appear in the input file.

R version 4.2.2 (2022-10-31) running on Ubuntu 20.04.5 LTS

Thank you in advance for your help,
Raj

@leetaiyi leetaiyi self-assigned this Sep 26, 2024
@leetaiyi
Collaborator

Hi Raj,

Can you send me a head of what your data looks like?

Chris

@doshirLV
Author

I appreciate your swift response, Chris.

Input file:

sample  trxID   polyA_len
1    ENST00000000233.10      33.0
1    ENST00000000233.10      33.0
1    ENST00000000233.10      41.0
1    ENST00000000233.10      42.0

Metadata file:

sample  condition
10   treatment
7    treatment
1    treatment
16   treatment

Feel free to let me know if you need anything else.

@leetaiyi
Collaborator

leetaiyi commented Sep 27, 2024

Hmm, I may need your full files to discover what's wrong. I'm trying to resolve this same issue with someone else's data as well.

Edit: The other issue is not the same as yours

@doshirLV
Author

doshirLV commented Oct 4, 2024

Hello Chris,

Any update on what the issue may be? Unfortunately, I am unable to send the full files because they contain patient data, but I could provide metrics such as the size of the files or the order of the samples.

Thank you for your diligence,
Raj
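
One way to share such metrics without sharing the data itself is sketched below. It is only an illustration: `summarize_table` is a hypothetical helper, not part of nanoplen, and it assumes a tab-delimited file with a header line and the sample ID in the first column, as in the excerpts shown earlier in the thread.

```shell
# Hypothetical helper: print a summary of a tab-delimited table
# (header, line count, rows per sample) without printing any data rows.
summarize_table() {
    # $1 = tab-delimited file with a header line
    # Header with tabs made visible as "|", so a wrong delimiter is easy to spot.
    printf 'header: %s\n' "$(head -n 1 "$1" | tr '\t' '|')"
    # Total number of lines, header included.
    printf 'lines: %s\n' "$(awk 'END { print NR }' "$1")"
    # Number of rows per value of the first column, header excluded, sorted by ID.
    tail -n +2 "$1" \
        | awk -F'\t' '{ n[$1]++ } END { for (s in n) print s "\t" n[s] }' \
        | LC_ALL=C sort
}
```

Running `summarize_table input.tab` and `summarize_table metaData.tab` would give output that can be pasted into the issue.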

@leetaiyi
Collaborator

leetaiyi commented Oct 4, 2024

It looks as simple as a delimiter issue but from what you showed me, it doesn't seem like that's the case.

@leetaiyi
Collaborator

leetaiyi commented Oct 8, 2024

At this point I am reasonably sure it is something with your input data or your command. I've just successfully run a bunch of other data without issues.

@doshirLV
Author

doshirLV commented Oct 8, 2024

I appreciate you double-checking, Chris.

Could it be that my sample order is different in the input and metadata files?
In the input file, the sample order is 1, 10, 11 .. 19, 2, 3 .. 9,
whereas the metadata file is in random order:
10 treatment
7 treatment
1 treatment
16 treatment
6 treatment
11 control
17 control
12 control
5 treatment
19 control
8 treatment
3 control
14 control
18 treatment
4 control
13 control
15 treatment
2 control

Thank you for your assistance,
Raj

@leetaiyi
Collaborator

leetaiyi commented Oct 8, 2024

No, my metadata is also pretty scrambled, and it merges with the data via your library ID anyway.
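
Since the merge is by ID rather than by row order, what matters is that both files contain the same set of sample IDs. A minimal sketch of that check follows, assuming tab-delimited files with a header line and the sample ID in the first column; `sample_ids` and `compare_sample_ids` are hypothetical helper names, not part of nanoplen.

```shell
# Hypothetical helper: list the unique values of the first tab-separated
# column, skipping the header line.
sample_ids() {
    tail -n +2 "$1" | cut -f1 | LC_ALL=C sort -u
}

# Hypothetical helper: show IDs that appear in only one of the two files.
# Unindented lines are only in the data file; tab-indented lines are only
# in the metadata file. No output means the two ID sets are identical.
compare_sample_ids() {
    # $1 = data file, $2 = metadata file
    cmp_data_ids=$(mktemp)
    cmp_meta_ids=$(mktemp)
    sample_ids "$1" > "$cmp_data_ids"
    sample_ids "$2" > "$cmp_meta_ids"
    # comm needs both inputs sorted the same way, hence LC_ALL=C here and above.
    LC_ALL=C comm -3 "$cmp_data_ids" "$cmp_meta_ids"
    rm -f "$cmp_data_ids" "$cmp_meta_ids"
}
```

For example, `compare_sample_ids input.tab metaData.tab` prints nothing when every sample in one file has a counterpart in the other. IDs that differ only by invisible characters, such as a trailing space or a carriage return, would show up as mismatches.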

@doshirLV
Author

doshirLV commented Oct 8, 2024

So weird, I am using the exact same command, and it works for the "format" test data but gives an error for mine.

Test:

./scripts/R/nanoplen/scripts/diff_length.R \
-d ./scripts/R/nanoplen/examples/nanoplen_input_format.tab \
-m ./scripts/R/nanoplen/examples/metadata_format.tab \
-b "control" \
-t m \
-o ./scripts/R/nanoplen/examples/nanoplen_output_format_LinearMixedModel_2024Oct08.tab

Error:

./scripts/R/nanoplen/scripts/diff_length.R \
-d ./projects/path/to/data/analysis/nanoplen/input.tab \
-m ./projects/path/to/data/analysis/nanoplen/metaData.tab \
-b "control" \
-t m \
-o ./projects/path/to/data/analysis/nanoplen/out_LMM_2024Oct08.tab

I will check for extra whitespace and how the tabs are formatted to confirm that it is not an issue with a delimiter. Also, file size should not be a problem, correct? I have 17+ million lines in the input file.
The only other thing I can think of is to go through your R scripts and see exactly where the error is occurring and change the corresponding code.
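
The delimiter check described above could be sketched as follows. The error message reports two names being assigned to an object of length one, which is what R prints when a table has fewer columns than the names supplied. Whether that is what happens inside nanoplen is an assumption, not something the thread confirms. `field_count_summary` and `bad_lines` are hypothetical helper names, and the files are assumed to be tab-delimited.

```shell
# Hypothetical helper: report how many lines have each number of
# tab-separated fields. A well-formed three-column file prints one line,
# e.g. "3 17000000"; any additional line points to malformed rows.
field_count_summary() {
    awk -F'\t' '{ counts[NF]++ } END { for (n in counts) print n, counts[n] }' "$1" \
        | sort -n
}

# Hypothetical helper: print the line numbers of the first five lines whose
# field count differs from the expected one.
bad_lines() {
    # $1 = file, $2 = expected number of fields
    awk -F'\t' -v want="$2" '
        NF != want {
            print NR ": " NF " fields"
            if (++shown >= 5) exit
        }' "$1"
}
```

For the files in this thread, that would be `field_count_summary input.tab` followed by `bad_lines input.tab 3`, and the same with `metaData.tab` and `2`. A line separated by spaces instead of tabs is counted as having one field.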

@leetaiyi
Collaborator

No, size shouldn't be an issue code-wise. Were you able to try debugging line by line yourself?
