bubriks
Contributor

@bubriks bubriks commented Sep 13, 2023

This PR adds/fixes/changes...

  • please summarize your changes to the code
  • and make sure to include all changes to user-facing APIs

JIRA Issue: -

Priority for Review: -

Related PRs: -

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Tests on VM

Checklist For The Assigned Reviewer:

- [ ] Checked if merge conflicts with master exist
- [ ] Checked if stylechecks for Java and Python pass
- [ ] Checked if all docstrings were added and/or updated appropriately
- [ ] Ran spellcheck on docstring
- [ ] Checked if guides & concepts need to be updated
- [ ] Checked if naming conventions for parameters and variables were followed
- [ ] Checked if private methods are properly declared and used
- [ ] Checked if hard-to-understand areas of code are commented
- [ ] Checked if tests are effective
- [ ] Built and deployed changes on dev VM and tested manually
- [x] (Checked if all type annotations were added and/or updated appropriately)

@bubriks bubriks requested a review from SirOibaf September 13, 2023 12:46
@SirOibaf
Contributor

Are you sure this PR is correct? It looks to me like we always end up in the first branch of the if statement, because initial_check_point is never an empty string. We set it here:
https://github.com/logicalclocks/feature-store-api/blame/25cfcd57ad792a3b6a732570943692c49b406fbc/python/hsfs/engine/python.py#L956

There doesn't seem to be a way to control the skip_offset parameter in the first branch, so the offsets are always skipped.

Additionally, the skip_offset parameter is not documented anywhere in the APIs. Please add proper documentation to the insert method.

* rename skip_offsets -> use_current_offsets
* add documentation
@bubriks
Contributor Author

bubriks commented Sep 18, 2023

I think everything should be correct.

initial_check_point will be empty if the topic doesn't exist (for example, after an upgrade).

In the first if statement (here: https://github.com/logicalclocks/feature-store-api/blame/25cfcd57ad792a3b6a732570943692c49b406fbc/python/hsfs/engine/python.py#L1016) we always run the materialization job with the initial offset set to 0 for all partitions of the topic, since the topic didn't exist and the job should start from the beginning (done here: https://github.com/logicalclocks/feature-store-api/blame/25cfcd57ad792a3b6a732570943692c49b406fbc/python/hsfs/engine/python.py#L1028).
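
A minimal sketch of the branch described above, to make the behaviour concrete. This is not the code in python.py: the function name `starting_offsets`, its arguments and its return type are hypothetical, and what happens in the non-empty case is only assumed here.

```python
from typing import Dict, List, Optional


def starting_offsets(
    initial_check_point: str, partition_ids: List[int]
) -> Optional[Dict[int, int]]:
    """Hypothetical helper illustrating the first branch discussed above.

    An empty initial_check_point is taken to mean the topic did not exist
    when the offsets were looked up (for example, after an upgrade).
    """
    if initial_check_point == "":
        # Topic did not exist: the job should read from the beginning,
        # so every partition starts at offset 0.
        return {partition_id: 0 for partition_id in partition_ids}
    # Assumption: otherwise the caller keeps using the captured checkpoint,
    # so there is nothing to override here.
    return None


if __name__ == "__main__":
    print(starting_offsets("", [0, 1, 2]))
```

With an empty check point and three partitions, the helper maps each partition id to offset 0. With any non-empty check point it returns None and leaves the decision to the caller.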

@bubriks
Contributor Author

bubriks commented Sep 18, 2023

@SirOibaf I also changed the skip_offset parameter name to use_current_offsets, as I think it's more descriptive.
