[FSTORE-1008] enable interacting with java client to hopsworks #1110

Open
wants to merge 10 commits into
base: master
from

Conversation

davitbzh
Contributor

@davitbzh davitbzh commented Sep 7, 2023

This PR adds/fixes/changes...

  • please summarize your changes to the code
  • and make sure to include all changes to user-facing APIs

JIRA Issue: -

Priority for Review: -

Related PRs: -

How Has This Been Tested?

  • Unit Tests
  • Integration Tests
  • Manual Tests on VM

Checklist For The Assigned Reviewer:

- [ ] Checked if merge conflicts with master exist
- [ ] Checked if stylechecks for Java and Python pass
- [ ] Checked if all docstrings were added and/or updated appropriately
- [ ] Ran spellcheck on docstring
- [ ] Checked if guides & concepts need to be updated
- [ ] Checked if naming conventions for parameters and variables were followed
- [ ] Checked if private methods are properly declared and used
- [ ] Checked if hard-to-understand areas of code are commented
- [ ] Checked if tests are effective
- [ ] Built and deployed changes on dev VM and tested manually
- [x] (Checked if all type annotations were added and/or updated appropriately)

@davitbzh davitbzh requested a review from kennethmhc September 7, 2023 09:25
@davitbzh davitbzh changed the title from "[FSTORE-1008] add classes java client to hsfs" to "[FSTORE-1008] enable interacting with java client to hopsworks" Sep 7, 2023
@@ -79,35 +79,35 @@ public static List<TrainingDatasetFeature> makeLabelFeatures(QueryBase query, Li
     for (Feature feat : (List<Feature>) query.getLeftFeatures()) {
       labelWithPrefixToFeature.put(feat.getName(), feat.getName());
       labelWithPrefixToFeatureGroup.put(feat.getName(),
-          (new FeatureGroupBaseForApi(null, feat.getFeatureGroupId())));
+          (new StreamFeatureGroup(null, feat.getFeatureGroupId())));
Contributor

Why change it to StreamFeatureGroup?

Contributor Author

FeatureGroupBaseForApi was used as a dummy class for API calls. In hsfs only abstract classes were present, and those cannot be instantiated. Now that we have added support for StreamFeatureGroup, it can do the job instead.

private FeatureGroupEngine featureGroupEngine;
private FeatureViewEngine featureViewEngine;

public FeatureStore() {
Contributor

General question: what methods should be included in the java client?

Contributor Author

The only things that can be done in the Java client are fetching FG and FV metadata and getting feature vectors. I don't see any other use cases.

* @throws FeatureStoreException
* @throws IOException
*/
public abstract void addTag(String name, Object value) throws FeatureStoreException, IOException;
Contributor

Why remove all the abstract methods? Are they not needed for Spark, Flink, etc.? The same goes for FeatureStoreBase and FeatureGroupBase.
I think you can implement the abstract methods as "not available" in the FeatureView.
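
To make the suggestion concrete, here is a minimal sketch of keeping an abstract method in the base class and implementing it as "not available" in a client that does not support it. The class names `FeatureViewBaseSketch` and `JavaClientFeatureViewSketch` are hypothetical and are not the hsfs classes, and throwing `UnsupportedOperationException` is an assumption; the real code might prefer its own `FeatureStoreException`.

```java
// Hypothetical stand-in for the shared abstract base class.
abstract class FeatureViewBaseSketch {
  // Stays abstract so that engines which support tagging can still implement it.
  abstract void addTag(String name, Object value);
}

// Hypothetical Java-client implementation that does not support tagging.
class JavaClientFeatureViewSketch extends FeatureViewBaseSketch {
  @Override
  void addTag(String name, Object value) {
    // Assumption: signal "not available" with a standard unchecked exception.
    throw new UnsupportedOperationException("addTag is not available in the Java client");
  }
}
```

With this shape the base class keeps its full contract, and a caller that invokes `addTag` on the Java client fails with an explicit message rather than a missing method.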

Contributor Author

My point was: who would use addTag from Flink or Beam? It doesn't make sense. But it is possible to add, since only API classes are used.

@kennethmhc
Contributor

When I was testing it, I found a bug. Can you change line 215 in VectorServer to

      String zippedTupleString =
          zipArraysToTupleString(preparedStatementParameters.get(fgId)
              .entrySet()
              .stream()
              .sorted(Comparator.comparingInt(Map.Entry::getValue))
              .map(e -> entry.get(e.getKey()))
              .collect(Collectors.toList()));

Basically, the problem is that when there are multiple primary keys, their values need to be sorted according to the parameter index stored in preparedStatementParameters.
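
The ordering issue can be reproduced in isolation. The sketch below is a standalone illustration, not the VectorServer code: `PrimaryKeyOrderingSketch`, `orderByParameterIndex`, and the key names are hypothetical, and it assumes the parameter map has the shape primary key name to prepared-statement index.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class PrimaryKeyOrderingSketch {

  // Returns the values of `entry` ordered by the parameter index of each key,
  // independent of the iteration order of either map.
  static List<Object> orderByParameterIndex(Map<String, Integer> parameterIndexes,
                                            Map<String, Object> entry) {
    return parameterIndexes.entrySet()
        .stream()
        .sorted(Comparator.comparingInt(Map.Entry::getValue))
        .map(e -> entry.get(e.getKey()))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Inserted deliberately out of index order to mimic an arbitrary map order.
    Map<String, Integer> parameterIndexes = new LinkedHashMap<>();
    parameterIndexes.put("customer_id", 2);
    parameterIndexes.put("store_id", 1);

    Map<String, Object> entry = new HashMap<>();
    entry.put("customer_id", 42);
    entry.put("store_id", "s7");

    // The store_id value comes first because its parameter index is 1.
    System.out.println(orderByParameterIndex(parameterIndexes, entry));
  }
}
```

Without the `sorted` step the values would follow the map's iteration order, which need not match the order of the placeholders in the prepared statement.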

@davitbzh davitbzh requested a review from kennethmhc February 4, 2024 12:48