Addressing Batch Effects in Datasets with SUPPA2 #182
Comments
Dear Xi Xu,
We have recently handled batch effects using a linear model with cofactors
(see https://pubmed.ncbi.nlm.nih.gov/36518527/).
In this approach, rather than performing a test between conditions, we fit
a linear model against the conditions. To that model you can add a list of
cofactors, each given as a vector with one value per patient. A cofactor
can be a numerical value (e.g. age), a nominal value (e.g. sex), another
experimental variable (source, post-mortem, …), or even a value obtained
from another method that estimates batch effects (e.g. SVA). The model
then gives you the events that best correlate with the conditions while
accounting for all those sources of batch effect.
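As a rough illustration of the model described above, the sketch below fits one splicing event at a time by ordinary least squares in Python with NumPy. It assumes the response is the per-sample PSI of the event and that the cofactors are already numeric (for example sex or batch coded as 0/1). The function name `fit_event`, the toy data, and the use of a t-statistic to rank events are assumptions made for illustration, not the code used in the paper.

```python
import numpy as np

def fit_event(psi, condition, cofactors):
    """Fit psi ~ 1 + condition + cofactors for a single event (hypothetical helper).

    psi       : (n_samples,) PSI values of one event
    condition : (n_samples,) condition labels coded 0/1
    cofactors : (n_samples, k) numeric cofactors, one column each
    Returns (coefficients, t_statistics), ordered as
    [intercept, condition, cofactor_1, ..., cofactor_k].
    """
    psi = np.asarray(psi, dtype=float)
    X = np.column_stack([np.ones(len(psi)), condition, cofactors]).astype(float)
    coef, *_ = np.linalg.lstsq(X, psi, rcond=None)

    # Classical OLS standard errors from the residual variance
    resid = psi - X @ coef
    dof = len(psi) - X.shape[1]
    sigma2 = resid @ resid / dof
    se = np.sqrt(sigma2 * np.diag(np.linalg.inv(X.T @ X)))
    return coef, coef / se

# Toy data, made up for illustration: 40 samples, two conditions, two batches
rng = np.random.default_rng(0)
n = 40
condition = np.repeat([0, 1], n // 2)
batch = np.tile([0, 1], n // 2)
age = rng.uniform(30, 80, size=n)
psi = 0.4 + 0.2 * condition + 0.1 * batch + 0.001 * age + rng.normal(0, 0.02, size=n)

coef, t_stats = fit_event(psi, condition, np.column_stack([batch, age]))
print("condition effect:", coef[1], "t-statistic:", t_stats[1])
```

In a real analysis the same fit would be repeated for every event in the PSI table, and the condition statistics would then need converting to p-values and correcting for multiple testing.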
An alternative might be to correct the read counts / TPM values for these
batch effects and then run SUPPA. We have not tried this, so I do not
know whether it is effective.
I hope this helps.
Please do not hesitate to write back with more questions.
Thanks a lot for using SUPPA.
Best,
Eduardo
|
Dear Eduardo,
Following your response, I looked into the paper you linked. It does mention a batch regression approach, but I can't find the code. Could you point me to where the code for this batch regression technique was deposited? Alternatively, could you explain exactly how the PSI values were re-modeled using linear regression? Also, does this regression approach still produce PSI values that range from 0 to 1? Thanks in advance!
-Jay
|
Hi,
Sorry for the delayed reply. It is a standard lm() call in R, correcting
for covariates. We used the covariates that we observed to have the
strongest confounding effect. There should be enough tutorials or coding
co-pilots available to help you work out the syntax. We'll try to make the
code available on the SUPPA page.
Thanks
E.
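On the question of whether adjusted values stay between 0 and 1: a linear model places no bounds on its output, so subtracting fitted covariate effects from PSI can give values slightly below 0 or above 1. The sketch below illustrates this, with NumPy least squares standing in for R's lm(). The function name `remove_covariate_effects` and the toy numbers are hypothetical, and this is not necessarily how the paper computed its values.

```python
import numpy as np

def remove_covariate_effects(psi, condition, covariates):
    """Subtract fitted covariate effects from PSI (hypothetical helper).

    Fits psi ~ 1 + condition + covariates by least squares, then removes
    only the covariate part, leaving the condition effect in place.
    """
    psi = np.asarray(psi, dtype=float)
    cov = np.asarray(covariates, dtype=float).reshape(len(psi), -1)
    cov = cov - cov.mean(axis=0)  # centre so the overall mean PSI is unchanged
    X = np.column_stack([np.ones(len(psi)), condition, cov])
    coef, *_ = np.linalg.lstsq(X, psi, rcond=None)
    return psi - cov @ coef[2:]

# Toy event, made up for illustration: batch 1 tends to have higher PSI
condition = np.array([0, 0, 0, 0, 1, 1, 1, 1])
batch = np.array([0, 1, 0, 1, 0, 1, 0, 1])
psi = np.array([0.01, 0.00, 0.02, 0.20, 0.50, 0.60, 0.51, 0.61])

adjusted = remove_covariate_effects(psi, condition, batch)
print(adjusted.round(3))
print("all within [0, 1]:", bool(((adjusted >= 0) & (adjusted <= 1)).all()))

# Clipping is one blunt way to force the range if bounded values are required
bounded = np.clip(adjusted, 0.0, 1.0)
```

Here the second sample has PSI 0 but belongs to the batch with the positive fitted effect, so its adjusted value falls below 0. If only the condition coefficient is of interest, the adjusted values are not needed at all and the range question does not arise.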
|
Dear SUPPA2 Development Team,
I am in the process of using SUPPA2 for differential splicing analysis and have encountered an issue with batch effects in my dataset, which includes sequencing samples from different batches. Upon analyzing TPM values through PCA, it became evident that batch effects are present.
My query is: Should batch effect correction be performed on the TPM values before running SUPPA2?
Thanks for your assistance.
Best regards,
Xi Xu