Added glob support for Snowpark and Streamlit #1814

Open
sfc-gh-astus wants to merge 12 commits into main from added-glob-support-for-snowpark-and-streamlit

Conversation

Contributor

@sfc-gh-astus sfc-gh-astus commented Oct 29, 2024

Pre-review checklist

  • I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
  • I've added or updated automated unit tests to verify correctness of my new code.
  • I've added or updated integration tests to verify correctness of my new code.
  • I've confirmed that my changes are working by executing CLI's commands manually on MacOS.
  • I've confirmed that my changes are working by executing CLI's commands manually on Windows.
  • I've confirmed that my changes are up-to-date with the target branch.
  • I've described my changes in the release notes.
  • I've described my changes in the section below.

Changes description

  • Introduced Artifacts type and changed models for Snowpark, Streamlit and Native App

Comment on lines 54 to 68
  for artifact in artifacts:
      if isinstance(artifact, PathMapping):
          _artifacts.append(artifact)
      else:
-         _artifacts.append(PathMapping(src=artefact))
+         _artifacts.append(PathMapping(src=artifact))
Contributor

I believe the previous name was correct.

Contributor Author

'Artefact' is the British spelling, while 'artifact' is the American spelling.

We have both versions in the code.

Contributor

Why is my_page.py an empty file?

Contributor Author

mypy had problems with the non-existent title function.

@sfc-gh-astus sfc-gh-astus force-pushed the added-glob-support-for-snowpark-and-streamlit branch 16 times, most recently from 3653ce8 to 7ef640a Compare October 31, 2024 12:08
@sfc-gh-astus
Contributor Author

Should the Snowpark output directory be per stage, or should we leave that for the diff/sync PR?

@sfc-gh-astus sfc-gh-astus force-pushed the added-glob-support-for-snowpark-and-streamlit branch from 7ef640a to 4c3ee60 Compare November 5, 2024 09:37
@sfc-gh-astus sfc-gh-astus marked this pull request as ready for review November 5, 2024 09:52
@sfc-gh-astus sfc-gh-astus requested review from a team as code owners November 5, 2024 09:52
@sfc-gh-astus sfc-gh-astus force-pushed the added-glob-support-for-snowpark-and-streamlit branch 2 times, most recently from 5a0fc07 to c8058c6 Compare November 5, 2024 10:56
@sfc-gh-astus sfc-gh-astus force-pushed the added-glob-support-for-snowpark-and-streamlit branch from eebdb30 to 19fdf46 Compare November 5, 2024 12:45
from typing import Dict, Optional, Set, Tuple

import typer
from click import ClickException, UsageError
from snowflake.cli._plugins.nativeapp.artifacts import BundleMap, symlink_or_copy
Contributor

If we're going to use BundleMap for every type of entity, maybe it's best to move it out of the native app plugin?

Contributor Author

@sfc-gh-astus sfc-gh-astus Nov 5, 2024

It is a good idea, but not in this PR.

)
from snowflake.cli.api.project.schemas.v1.native_app.path_mapping import PathMapping
Contributor

Again, maybe we could extract common objects instead of using cross-plugin imports.

Contributor Author

Definitely, but not in this PR.

@sfc-gh-astus sfc-gh-astus force-pushed the added-glob-support-for-snowpark-and-streamlit branch from 2e7278d to 6968934 Compare November 6, 2024 16:32
@@ -303,19 +303,18 @@ def _add(self, src: Path, dest: Path, map_as_child: bool) -> None:
          src=canonical_src, dest=canonical_dest, dest_is_dir=dest_is_dir
      )

- def _add_mapping(self, src: str, dest: Optional[str] = None):
+ def _add_mapping(self, src: Path, dest: Optional[str] = None):
Contributor

src was a string here because it can be a glob pattern as well, which is odd to represent as a Path. I'm not a big fan of this change (or the matching one in the model)

Contributor Author

After discussion, we decided to go back to a string.

@@ -225,8 +228,8 @@ def build_artifacts_mappings(
entities_to_imports_map[entity_id].add(artefact_dto.import_path(stage))
stages_to_artifact_map[stage].update(required_artifacts)

if project_paths.dependencies.exists():
deps_artefact = project_paths.get_dependencies_artefact()
deps_artefact = project_paths.get_dependencies_artefact()
Contributor

nit: we should pick either the British or the American spelling of artifact/artefact and use it consistently.

Contributor Author

I agree; I created a ticket for it.

deploy_root=(project_paths.project_root / "output").absolute(),
)
bundle_map.add(
PathMapping(src=str(artefact.path), dest=(artefact.dest or None))
Contributor

Ah, commented above before seeing this conversation. I think representing a glob pattern as a Path is misleading, I'm not in favor. There's a reason we were using str explicitly, as opposed to Path pretty much everywhere else when a Path is resolved. If we want to make this "might be a glob" explicit, happy to consider something like a type alias or similar.

@sfc-gh-turbaszek
Contributor

| If we want to make this "might be a glob" explicit, happy to consider something like a type alias or similar.

+1 to this idea. I think it would make sense to have PathOrGlob = str.
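For illustration, the alias suggested here might look roughly like the sketch below. Only the name PathOrGlob comes from this thread; PathMappingSketch is a hypothetical stand-in and is not the CLI's actual PathMapping model.

```python
from dataclasses import dataclass
from typing import Optional

# Alias proposed in this thread: the value is still a plain string, but the
# name signals that it may hold a glob pattern rather than a resolved path.
PathOrGlob = str


@dataclass
class PathMappingSketch:
    """Hypothetical stand-in for a path mapping model (an assumption)."""

    src: PathOrGlob
    dest: Optional[str] = None


# A glob pattern and a literal path are both valid values for src.
glob_mapping = PathMappingSketch(src="src/**/*.py", dest="app/")
file_mapping = PathMappingSketch(src="src/main.py")
```

Since the alias is only a new name for str, it would not change runtime behaviour; it documents intent for readers and type checkers.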

_artifacts.append(artefact)
for artifact in artifacts:
if (
"*" in artifact
Contributor

* isn't the only glob pattern, so that's a really weak check. What about something like file[1-5].py?

Contributor Author

@sfc-gh-astus sfc-gh-astus Nov 7, 2024

I wanted to start with the basics; I will try glob.has_magic.

Contributor Author

I tried it, and it works after small changes!
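A minimal sketch of the stronger check discussed in this thread, assuming the artifact is a plain string. glob.has_magic is an undocumented helper in CPython's glob module, so relying on it is a judgment call; looks_like_glob is a hypothetical name, and the PR's "small changes" are not shown here.

```python
import glob


def looks_like_glob(artifact: str) -> bool:
    """Return True if the string contains any glob wildcard (*, ? or [)."""
    # has_magic checks for all three wildcard characters, unlike `"*" in artifact`.
    return glob.has_magic(artifact)


# Patterns without "*" are still recognised as globs.
for candidate in ("src/*.py", "file[1-5].py", "data?.csv", "app.py"):
    print(candidate, looks_like_glob(candidate))
```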

for artifact in artifacts:
if (
"*" in artifact
and FeatureFlag.ENABLE_SNOWPARK_BUNDLE_MAP_BUILD.is_disabled()
Contributor

nit: not generally a fan of naming publicly visible concepts by implementation details like "bundle map". Is there a more user-friendly name we could use here?

Also, is it not the default because of backwards compat concerns? I wonder if the solution isn't to make it default, but then give a flag to revert to old behaviour if needed (SNOWPARK_LEGACY_GLOB_BEHAVIOUR=true ?)

Contributor Author

It cannot be the default, because it is a BCR: the new build changes the zip paths.

Contributor Author

@sfc-gh-astus sfc-gh-astus Nov 14, 2024

Maybe ENABLE_SNOWPARK_GLOB_SUPPORT?
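As a rough sketch of the opt-in gating discussed above, a flag with the proposed name could guard glob patterns as follows. This is not the CLI's FeatureFlag implementation: reading the flag from an environment variable, the accepted truthy values, and raising an error when globs are used with the flag off are all assumptions made for illustration.

```python
import glob
import os
from typing import List, Mapping, Optional

# Flag name proposed in this thread; treating it as an environment variable is an assumption.
FLAG_ENV_VAR = "ENABLE_SNOWPARK_GLOB_SUPPORT"


def glob_support_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the (hypothetical) opt-in flag is set to a truthy value."""
    env = os.environ if env is None else env
    return env.get(FLAG_ENV_VAR, "").strip().lower() in {"1", "true", "yes"}


def check_artifacts(
    artifacts: List[str], env: Optional[Mapping[str, str]] = None
) -> List[str]:
    """Reject glob patterns unless the new build behaviour was opted into."""
    if not glob_support_enabled(env):
        offending = [a for a in artifacts if glob.has_magic(a)]
        if offending:
            raise ValueError(
                f"Glob patterns {offending} require {FLAG_ENV_VAR} to be enabled"
            )
    return list(artifacts)
```

Keeping the new behaviour opt-in means existing projects keep their current zip paths until they set the flag.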

for (absolute_src, absolute_dest) in bundle_map.all_mappings(
absolute=True, expand_directories=False
):
symlink_or_copy(
Contributor

We'll need to apply processors as well; is that planned for a future PR? I can probably help migrate processors to a more general location (and the bundle map too while we're at it), if you want the help. I think there are a few things to fix with processors in the process anyway, so this is a good opportunity to make those small, under-the-hood changes.

Contributor Author

I totally forgot about processors, but I think they should be added in a new PR.

sfc-gh-bdufour previously approved these changes Nov 7, 2024
Contributor

@sfc-gh-bdufour sfc-gh-bdufour left a comment

None of my comments are blockers, so LGTM for NADE

@sfc-gh-astus sfc-gh-astus force-pushed the added-glob-support-for-snowpark-and-streamlit branch 2 times, most recently from 4e569bd to 7735817 Compare November 14, 2024 11:10
@sfc-gh-astus sfc-gh-astus force-pushed the added-glob-support-for-snowpark-and-streamlit branch from 7735817 to 4a558d8 Compare November 14, 2024 16:36
sfc-gh-mraba previously approved these changes Nov 14, 2024
Comment on lines +159 to +157
def deploy_root(self) -> Path:
    return self.project_root / "output"
Contributor

This could cause conflicts with other domains, since native apps use output/deploy by default. We make ours configurable as well through snowflake.yml. We should probably align the choice of deploy root, and make it per-entity so that multiple entities, even across different domains, don't negatively interfere with each other. We should probably do that as a follow-up PR very quickly.


@property
def deploy_root(self) -> Path:
    return self.project_root / "output"
Contributor

Native apps use output/deploy for this. We have other ephemeral directories under output/ as well, e.g. output/bundle as the bundle root. We also make all of those configurable.
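One way the per-entity deploy root suggested in this review could be sketched is shown below. The function name, the entity_id parameter, and the directory layout are hypothetical; only the output/deploy default comes from the comments above, and the real alignment is left to the follow-up PR.

```python
from pathlib import Path


def entity_deploy_root(
    project_root: Path, entity_id: str, deploy_dir: str = "output/deploy"
) -> Path:
    """Give each entity its own directory under a configurable deploy root."""
    # Namespacing by entity id keeps two entities (even from different
    # domains) from writing into the same directory.
    return project_root / deploy_dir / entity_id


root = Path("/project")
snowpark_root = entity_deploy_root(root, "my_function")
streamlit_root = entity_deploy_root(root, "my_streamlit_app")
```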

6 participants