Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bump: version 1.0.1 → 1.1.0 #374

Merged
merged 6 commits into from
Mar 13, 2024
Merged

bump: version 1.0.1 → 1.1.0 #374

merged 6 commits into from
Mar 13, 2024

Conversation

d33bs
Copy link
Member

@d33bs d33bs commented Mar 5, 2024

Description

This PR prepares a v1.1.0 release for pycytominer.

Thanks for any feedback or thoughts you may have!

What is the nature of your change?

  • Bug fix (fixes an issue).
  • Enhancement (adds functionality).
  • Breaking change (fix or feature that would cause existing functionality to not work as expected).
  • This change requires a documentation update.

Checklist

Please ensure that all boxes are checked before indicating that a pull request is ready for review.

  • I have read the CONTRIBUTING.md guidelines.
  • My code follows the style guidelines of this project.
  • I have performed a self-review of my own code.
  • I have commented my code, particularly in hard-to-understand areas.
  • I have made corresponding changes to the documentation.
  • My changes generate no new warnings.
  • New and existing unit tests pass locally with my changes.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have deleted all non-relevant text in this pull request template.

@d33bs d33bs requested review from gwaybio and kenibrewer March 5, 2024 16:39
gwaybio
gwaybio previously approved these changes Mar 5, 2024
Copy link
Member

@gwaybio gwaybio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great @d33bs - thanks for the detailed notes. I made minor suggestions for keeping capitalization consistent, fixing typos, and clarifying spherize update.

Also, after going through this bump, is there anything to adjust in https://github.com/cytomining/pycytominer/blob/main/CONTRIBUTING.md#releases?

CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
kenibrewer
kenibrewer previously approved these changes Mar 6, 2024
Copy link
Member

@kenibrewer kenibrewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@codecov-commenter
Copy link

codecov-commenter commented Mar 6, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.95%. Comparing base (4e84a04) to head (43cf984).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #374      +/-   ##
==========================================
- Coverage   94.96%   94.95%   -0.01%     
==========================================
  Files          56       56              
  Lines        3139     3133       -6     
==========================================
- Hits         2981     2975       -6     
  Misses        158      158              
Flag Coverage Δ
unittests 94.95% <100.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@d33bs
Copy link
Member Author

d33bs commented Mar 6, 2024

Thanks @gwaybio and @kenibrewer for your reviews! I made some updates based on what you mentioned.

Also, after going through this bump, is there anything to adjust in https://github.com/cytomining/pycytominer/blob/main/CONTRIBUTING.md#releases?

I was able to follow the directions and found they were accurate. In addition, I found myself wanting to add more detail to the CHANGELOG.md file beyond that of what was captured by cz bump. It looked like the change keys or change scopes filtering made it so many commits were not gathered or automatically added to the file.

I tried to explore customizing the associated with the keys or scope for the changelog for a while but I couldn't get this to work when using cz bump. As a result, for a simpler (but less reproducible) approach, I used git log --oneline v1.0.1..HEAD to find the commit messages and grouped by what I interpreted as change keys and scope (I tried to include all). My thinking here was that I wasn't sure the level of granularity / filtering a maintainer should apply when going through these (who am I to say what's important enough for a user to see or not?). Just the same, maybe I should have been more decisive here!

I candidly wasn't sure if this was something we should add to the release documentation or if it's something which should be flexible to each maintainer. Maybe it's also that with a little more time the commitizen customization could work well here. Do you have thoughts on this?

@gwaybio
Copy link
Member

gwaybio commented Mar 6, 2024

Thanks for digging into this Dave! It sounds like the CONTRIBUTING document is sufficient - baking in some flexibility will help with release agility, so I agree we can be a bit less formulaic given commitizen limitations. Super excited to see this version bump!

@d33bs
Copy link
Member Author

d33bs commented Mar 7, 2024

Thanks @gwaybio ! In thinking about this more I remembered that I wondered where the cz cli command would be accessed. My understanding is that this should come from the Poetry environment for pycytominer. I added a minor update to this effect.

After pushing this update, I noticed the Pytest tests are now failing, seemingly for unrelated reasons.

The error:

=========================== short test summary info ============================
FAILED tests/test_cyto_utils/test_cell_locations.py::test_output_shape_and_required_columns[cell_loc3] - FileNotFoundError: cellpainting-gallery/test-cpg0016-jump/source_4/workspace/load_data_csv/2021_08_23_Batch12/BR00126114/test_BR00126114_load_data_with_illum.parquet
FAILED tests/test_cyto_utils/test_cell_locations.py::test_output_value_correctness[cell_loc3] - FileNotFoundError: cellpainting-gallery/test-cpg0016-jump/source_4/workspace/load_data_csv/2021_08_23_Batch12/BR00126114/test_BR00126114_load_data_with_illum.parquet

Appears to be occurring due to missing JUMP resources (I imagine these must have been changed sometime yesterday or earlier today):

% aws s3 ls --summarize --no-sign-request s3://cellpainting-gallery/test-cpg0016-jump/source_4/workspace/load_data_csv/2021_08_23_Batch12/BR00126114/

Total Objects: 0
   Total Size: 0

Should we hold off on a merge for this PR until this can be resolved (via another PR)?

@gwaybio
Copy link
Member

gwaybio commented Mar 7, 2024

Should we hold off on a merge for this PR until this can be resolved (via another PR)?

My gut tells me to wait until resolved prior to bumping version (new versions probably should have fully functional tests)

Is it possible to remove the dependency on cell-painting gallery maintaining stable paths? Or are we explicitly testing S3 access functionality in this function as well?

@gwaybio
Copy link
Member

gwaybio commented Mar 7, 2024

@d33bs and I just had a quick in-person chat about this. Here's the action items we landed on:

  1. @shntnu @bethac07 - can you confirm how useful the S3 functionality with in the cell_locations function is? How often do you/your labs use it? If the answer is "we use it never", then we might remove S3 functionality in the short term, in favor of a more longer-term, general-cloud support with higher pycytominer functionality coverage.
  2. I pinged the cellpainting-gallery team in Stable items on S3 for software testing purposes broadinstitute/cellpainting-gallery#80 to learn if they can support a dedicated S3 path for pycytominer (and other tools) testing purposes

@kenibrewer
Copy link
Member

Should we hold off on a merge for this PR until this can be resolved (via another PR)?

I agree with holding off on deployment until the testing is resolved. My preferred method of resolving the problem would be pointing the tests at a more stable set of files for the time being. The s3 integration in pycytominer is definitely something that is something I'd like to remove in the future, but I would prefer to do so via a proper deprecation notice.

As a longer term action item, it would be good to formalize whether our commitment to our users to avoid breaking changes post 1.0 release covers only the core functions annotate, normalize, etc. or if it also all the functionality in cyto_utils. We do have a cleanup of cyto_utils on the roadmap and my recent discovery of circular dependencies in there (#372 ) means that it will be quite challenging to do the cleanup without breaking and/or removing some features.

@kenibrewer kenibrewer dismissed stale reviews from gwaybio and themself March 8, 2024 21:04

Approval dismissed pending testing fix

@shntnu
Copy link
Member

shntnu commented Mar 8, 2024

  • @shntnu @bethac07 - can you confirm how useful the S3 functionality with in the cell_locations function is? How often do you/your labs use it? If the answer is "we use it never", then we might remove S3 functionality in the short term, in favor of a more longer-term, general-cloud support with higher pycytominer functionality coverage.

We use cell_locations a lot now albeit in short bursts. It is practically unusable (for us) without S3.

AFAIK cell_locations has zero dependencies on pycytominer so one option is to bump it out of pycytominer temporarily, and bring it back when the conditions are ripe.

I'll reply on that thread. Looks like you are all set there

One option is to avoid this problem is to use moto.mock_s3

#257 (comment)

@d33bs
Copy link
Member Author

d33bs commented Mar 9, 2024

I'm working towards getting the tests to pass with new S3 paths. In the meantime I wanted to add some thoughts based on what @shntnu mentioned.

  • Pycytominer: I think it could be important to account for larger than system data storage and processing. This goes beyond just S3 and would include private cloud, SSH, and maybe other options. It could be that we lean into cell_locations and encourage this growth. I can also understand why not via the complexity involved with managing external data sources or mocked testing solutions.
  • CytoTable: CytoTable implements moto for testing and cloudpathlib for multi-cloud compatibility of data sources. Maybe it makes sense for cell_locations to have a home there?
  • Cell Locations (independent): if neither of the above are suitable, maybe an independent project and package through Cytomining would make better sense (for ex. cytomining/cell_locations)?

@d33bs
Copy link
Member Author

d33bs commented Mar 11, 2024

@gwaybio , @kenibrewer , @shntnu I believe this is ready for a re-review when there's a moment. I've made modifications to allow new S3 paths to be used and have made modifications to the CellLocations tests to help address new issues which became visible afterwards. My changes here are a bit naive (apologies) to account for how we might pivot from this point forward for the capabilities involved, don't hesitate to let me know if I should enhance the approach here and now.

@shntnu
Copy link
Member

shntnu commented Mar 11, 2024

@d33bs I didn't review the diffs specifically but the code overall looks fine to me.

Note that s3://cellpainting-gallery/cpg0016-jump/source_4/workspace/backend/2021_08_23_Batch12/BR00126114/BR00126114.sqlite is quite large but if it isn't slowing down the tests, we are fine.

aws s3 ls --human-readable s3://cellpainting-gallery/cpg0016-jump/source_4/workspace/backend/2021_08_23_Batch12/BR00126114/BR00126114.sqlite
2022-10-02 06:47:03    3.6 GiB BR00126114.sqlite

The older file was tiny (192Kb; was on S3 as well) but if we can afford to use BR00126114.sqlite (speedwise), that's a much more realistic test

@shntnu
Copy link
Member

shntnu commented Mar 11, 2024

  • Pycytominer: I think it could be important to account for larger than system data storage and processing. This goes beyond just S3 and would include private cloud, SSH, and maybe other options. It could be that we lean into cell_locations and encourage this growth. I can also understand why not via the complexity involved with managing external data sources or mocked testing solutions.

I'd defer to others, including Greg and you on this

  • CytoTable: CytoTable implements moto for testing and cloudpathlib for multi-cloud compatibility of data sources. Maybe it makes sense for cell_locations to have a home there?

Fine by me but it does feel like an odd fit

  • Cell Locations (independent): if neither of the above are suitable, maybe an independent project and package through Cytomining would make better sense (for ex. cytomining/cell_locations)?

This is also fine by me if it has to be done; it would be weird to have a microrepo but works for me if this is turning out to be a blocker for pycytominer

Copy link
Member

@gwaybio gwaybio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think @kenibrewer should give the final approval for version bump. Testing LGTM

tests/test_cyto_utils/conftest.py Show resolved Hide resolved
Copy link
Member

@kenibrewer kenibrewer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Thanks for figuring out the testing changes that needed to happen @d33bs

Let's continue the conversation about the proper home of cell_locations over here:
#376

CHANGELOG.md Show resolved Hide resolved
tests/test_cyto_utils/conftest.py Show resolved Hide resolved
tests/test_cyto_utils/conftest.py Show resolved Hide resolved
Copy link
Member

@gwaybio gwaybio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Excited for this upgrade

@d33bs
Copy link
Member Author

d33bs commented Mar 13, 2024

Thank you @gwaybio, @kenibrewer, @shntnu and @ErinWeisbart for your help with items related to this PR and reviews! Merging this in now.

@d33bs d33bs merged commit 2cd5ad5 into cytomining:main Mar 13, 2024
10 checks passed
@d33bs d33bs deleted the release-v1.1.0 branch March 13, 2024 14:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants