Releases: MangiolaLaboratory/sccomp
Major update random effect + proportion input + cmdstanr backend
We are thrilled to introduce a host of significant updates and new features in this latest release of sccomp. These enhancements are designed to provide you with more powerful tools for compositional data analysis, improve usability, and offer greater flexibility in your workflows.
1. Support for Random Effects Modeling
One of the most substantial additions is the implementation of random effects modeling within the sccomp framework. This feature allows you to incorporate hierarchical or nested data structures into your analyses, which is particularly beneficial when dealing with complex experimental designs.
Key Advantages:
- Hierarchical Data Analysis: You can now model data that has multiple levels of variability, such as measurements nested within subjects or samples collected across different time points.
- Flexibility in Model Specification: The inclusion of random effects provides greater flexibility in specifying models that accurately reflect the underlying structure of your data.
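As a minimal sketch of what a random-effects specification could look like, the example below builds a toy count table in which samples are nested within donors, then writes a composition formula with a random intercept in lme4-style `(1 | group)` notation. The toy data, the `donor` column, and the wrapper name `fit_random_intercept` are hypothetical. The `sccomp_estimate()` arguments mirror those used elsewhere in these notes (with `.count` assumed for the count column) and should be checked against the installed version.

```r
# Toy data (hypothetical): 6 samples from 3 donors, 3 cell groups.
set.seed(42)
toy_counts <- expand.grid(
  sample = paste0("s", 1:6),
  cell_group = c("B", "T", "NK"),
  stringsAsFactors = FALSE
)
sample_index <- as.integer(sub("s", "", toy_counts$sample))

# Two samples per donor: s1, s2 -> d1; s3, s4 -> d2; s5, s6 -> d3.
toy_counts$donor <- paste0("d", (sample_index + 1) %/% 2)
# One healthy and one cancer sample per donor.
toy_counts$type <- ifelse(sample_index %% 2 == 0, "cancer", "healthy")
toy_counts$count <- rpois(nrow(toy_counts), lambda = 200)

# Fixed effect of `type` plus a random intercept for each donor.
random_intercept_formula <- ~ type + (1 | donor)

# Wrapper that is defined but not called here, because fitting requires
# sccomp and a working Stan backend. Argument names are assumptions
# taken from the examples in these release notes.
fit_random_intercept <- function(data) {
  sccomp::sccomp_estimate(
    data,
    formula_composition = random_intercept_formula,
    .sample = sample,
    .cell_group = cell_group,
    .count = count,
    cores = 1
  )
}
```

With sccomp installed, `fit_random_intercept(toy_counts)` would attempt the fit. A table this small only illustrates the syntax and is unlikely to give stable estimates.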
2. Direct Input of Proportion Data
We have introduced the ability to input proportion data directly into the sccomp functions. This should not be used when counts are available. It is intended for modeling proportions when counts are not available, for example when the data are the result of deconvolution.
Key Advantages:
- Greater Data Compatibility: Allows for the integration of data from different sources that may already be in proportion form.
- Enhanced Flexibility: Facilitates the analysis of data types where counts are not available, such as percentages or fractions.
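The sketch below illustrates the kind of input this feature targets: a long-format table of per-sample proportions, such as deconvolution output, in which each sample sums to 1. The toy data and the wrapper name `fit_from_proportions` are hypothetical. The sketch assumes the proportion column is passed through the same `.count` argument that would otherwise receive counts, which may differ between versions.

```r
# Toy proportions (hypothetical): 4 samples, 3 cell groups.
set.seed(1)
samples <- paste0("s", 1:4)
cell_groups <- c("B", "T", "NK")

raw <- matrix(
  rgamma(length(samples) * length(cell_groups), shape = 2),
  nrow = length(samples),
  dimnames = list(samples, cell_groups)
)
prop <- raw / rowSums(raw)  # normalise so that each row (sample) sums to 1

# Long format: one row per sample and cell group.
# as.vector() reads the matrix column by column, hence times/each below.
toy_proportions <- data.frame(
  sample = rep(samples, times = length(cell_groups)),
  cell_group = rep(cell_groups, each = length(samples)),
  type = rep(c("healthy", "healthy", "cancer", "cancer"),
             times = length(cell_groups)),
  proportion = as.vector(prop)
)

# Wrapper that is defined but not called here, because fitting requires
# sccomp and a working Stan backend. Passing proportions via `.count`
# is an assumption about the interface.
fit_from_proportions <- function(data) {
  sccomp::sccomp_estimate(
    data,
    formula_composition = ~ type,
    .sample = sample,
    .cell_group = cell_group,
    .count = proportion,
    cores = 1
  )
}
```

If counts are available for the same samples, they should be passed instead of proportions, as noted above.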
3. Refactoring and Performance Improvements
Significant effort has been put into refactoring the codebase and optimizing performance. This includes rebasing the master branch and cleaning up the code to enhance readability and maintainability.
Key Enhancements:
- Codebase Streamlining: Multiple rebasing efforts (#45, #54, #125, etc.) have resulted in a cleaner, more efficient codebase.
- Model Function Improvements: Refactoring of model functions (#150, #152) enhances computational efficiency and eases future development.
- Nested Grouping with Cmdstanr: Integration of nested grouping capabilities using cmdstanr (#151, #153) allows for more sophisticated statistical modeling.
4. Enhanced Customization and Control
We have added features that give you more control over the analysis process and outputs.
Key Enhancements:
- Custom Output Samples for Variational Bayes: You can now specify the number of output samples when using variational Bayes methods (#137), allowing you to balance between computational speed and estimation precision.
- Deprecation of Redundant Arguments: Cleaning up the function arguments (#155) makes the functions easier to use and reduces confusion.
- Residual Calculation Updates: Changes to how residuals are calculated (#124) improve the accuracy of model diagnostics.
5. Documentation and Usability Improvements
We recognize the importance of clear documentation and have made substantial updates to enhance your user experience.
Key Enhancements:
- Updated README and Vignettes: The README file and accompanying vignettes have been thoroughly updated (#141) to reflect all new features and provide detailed guidance on how to use them.
- Attribute Passing Improvements: Modifications to how attributes are passed between functions (#140) improve the consistency and reliability of the package.
- User Messages and Warnings: Informative messages have been added (#148) to help you understand the progress of computations and alert you to potential issues.
6. Additional Features and Fixes
Several other enhancements and bug fixes have been implemented to improve the overall functionality of sccomp.
Key Enhancements:
- Proportion Difference Calculation: A new feature to calculate the difference in proportions directly (#147), aiding in the interpretation of results.
- Environment Handling in Formulas: Adjustments to formula handling (#142) prevent potential errors related to variable scope and environment.
- Instantiation and Initialization Improvements: Enhancements to how models are instantiated (#136) lead to more stable and faster model fitting.
Full Changelog: https://github.com/MangiolaLaboratory/sccomp/compare/v1.7.12...v1.9.1
For a comprehensive overview of all changes and detailed instructions on how to utilize the new features, please refer to the README.
We believe these updates will significantly enhance your data analysis capabilities using sccomp. Support for random effects modeling and direct proportion data input, in particular, opens up new avenues for sophisticated and flexible analyses. We are committed to continuous improvement and welcome any feedback you may have.
Thank you for your continued support, and we hope you find these new features valuable in your research.
What's Changed PR list
- rebase master by @stemangiola in #45
- rebase by @stemangiola in #54
- rebase by @stemangiola in #125
- rebase master by @stemangiola in #128
- rebase by @stemangiola in #131
- rebase by @stemangiola in #132
- Instantiate by @stemangiola in #136
- allow custom output samples for vb by @stemangiola in #137
- update functions to be exposed by @stemangiola in #139
- pass the attribute by @stemangiola in #140
- update README and vignette by @stemangiola in #141
- rebase by @stemangiola in #143
- drop environment from formula and quotes by @stemangiola in #142
- rebase by @stemangiola in #144
- Calculate proportion difference by @stemangiola in #147
- add message by @stemangiola in #148
- Refactor model functions by @stemangiola in #150
- Cmdstanr nested grouping by @stemangiola in #151
- Refactor model functions by @stemangiola in #152
- Cmdstanr nested grouping by @stemangiola in #153
- Cmdstanr by @stemangiola in #84
- deprecate argument by @stemangiola in #155
- Change residuals by @stemangiola in #124
- Proportions by @stemangiola in #126
Full Changelog: v1.7.12...v1.9.1
More permissive logit threshold for significance
v1.7.7 Variational + Random effects multivariate
What's Changed
- Update array syntax, fabs, dirichlet_multinomial by @andrjohns in #111
- update theme by @stemangiola in #118
- .x does not exist here by @stemangiola in #119
- When the sum of generated count is 0, the division was returning null… by @stemangiola in #120
- Add multivariate prior by @stemangiola in #114
- allow arbitrary contrasts to be plotted by @stemangiola in #121
- Variational default by @stemangiola in #127
New Contributors
- @andrjohns made their first contribution in #111
Full Changelog: v1.7.2...v1.7.7
New tidy interface
We announce the new tidy and modular interface for sccomp, which improves modularity and clarity. The main change is the modularisation of sccomp into functions that can be linked with the pipe operator |>.
Function | Description |
---|---|
Estimation: sccomp_estimate() | which is usually run once in the analysis (per model). |
Testing: sccomp_test() | which can be run multiple times, depending on how many contrasts you want to test (e.g. age, untreated vs treated). |
Outlier removal: sccomp_remove_outliers() | which is usually run once after sccomp_estimate() in case you want to produce estimates not influenced by outlier data points. |
Unwanted variation removal: sccomp_remove_unwanted_variation() | which is run after sccomp_estimate() and produces a dataset that preserves only the variability of your factor of interest. |
Data replication: sccomp_replicate() | which is run after sccomp_estimate() and produces a dataset representing the theoretical data distribution according to the model (from the posterior distribution). |
Plotting: plot() | which is run after sccomp_test() and outputs a series of summary plots. |
Deprecation of the function sccomp_glm()
The new framework
outlier_free_estimate =
  seurat_obj |>
  # Estimate
  sccomp_estimate(
    formula_composition = ~ type + continuous_covariate,
    .sample = sample,
    .cell_group = cell_group,
    cores = 1
  ) |>
  # Remove outliers
  sccomp_remove_outliers(cores = 1)

# Test
outlier_free_estimate |>
  sccomp_test(contrasts = "typehealthy")
New functionalities
Removal of unwanted variation.
For visualisation purposes, we can select the factor of interest whose effect we would like to preserve and exclude all the rest. For example, to produce a dataset with just the type effect, we can execute
outlier_free_estimate |>
  sccomp_remove_unwanted_variation(~ type)
Plotting
The plotting functionalities have been improved. Now, both discrete and continuous variables can be visualised with the theoretical data distribution from the model overlaid. This helps the user understand whether the model describes the data adequately.
For example, if the theoretical data distribution from sccomp does not overlap with the observed data distribution, this indicates that either the probability distribution used by sccomp is not suitable for the data or a different model (design matrix) should be used.
outlier_free_estimate |>
  sccomp_test(contrasts = "typehealthy") |>
  plot()
Now plotting the test against the continuous covariate
outlier_free_estimate |>
  sccomp_test(contrasts = "continuous_covariate") |>
  plot()
What's Changed
- Drop distinct random effect by @stemangiola in #79
- add attributes by @stemangiola in #80
- if continuous do not multiply by covariate by @stemangiola in #82
- rebase by @stemangiola in #83
- avoid QR decomposition for random effects by @stemangiola in #81
- add outliers tests by @stemangiola in #86
- add completion also to sccomp from counts by @stemangiola in #98
- Improve arguments by @stemangiola in #101
- Add controls on contrasts by @stemangiola in #104
- Separate outlier in two methods by @stemangiola in #87
- change github actions by @stemangiola in #105
Full Changelog: v1.3.5...v1.7.2
multilevel implementation for submission
v1.3.6.1 Update DESCRIPTION
pre-submission
v1.3.5 update README