Skip to content

Conversation

@mvanschellebeeck
Copy link
Contributor

@mvanschellebeeck mvanschellebeeck commented Feb 27, 2025

Add nearestNeighbor query support.

See runNearestNeighborsTest.ts for example usage

@mvanschellebeeck mvanschellebeeck marked this pull request as ready for review February 27, 2025 17:14

readonly $title: string | undefined;

readonly $score?: number | undefined;
Copy link
Contributor

@nihalbhatnagar nihalbhatnagar Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should have this belong in OSDK base, since I imagine this will be undefined 90% of the time. What if instead we passed through a generic through the ObjectSet type that determined if we were or had made a nearest neighbors query, and conditionally added it to the Osdk.Instance type?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or actually looking further, it seems we only want score to be returned when ordering by relevance? Which makes it easier I think, we don't need to pass through a generic we can just add it to the type from a fetchPage result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm having trouble implementing the types here - would it be the same approach taken with $rid in https://github.com/palantir/osdk-ts/blob/main/packages/api/src/OsdkObjectFrom.ts#L224-L228?

| {
[K in L]?: "asc" | "desc";
}
| "relevance";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we order by relevance but we haven't called nearestNeighbors yet, do we throw?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do we imagine that there will ever be any extra modifiers for relevance ordering? Trying to think if this should be a string or object option

Copy link
Contributor Author

@mvanschellebeeck mvanschellebeeck Mar 3, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we order by relevance but we haven't called nearestNeighbors yet, do we throw?

Yep, we throw InvalidOrderType: 8ac3d47

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do we imagine that there will ever be any extra modifiers for relevance ordering? Trying to think if this should be a string or object option

This is how its modeled in the internal APIs so any added modifiers there would be a breaking change

value, all objects will be returned. This value is limited to 1 ≤ numNeighbors ≥ 500.
* @param property - The property key with a defined embedding model to search over.
*
* @returns An object set containing the `numNeighbors` nearest neighbors. To return the objects ordered by relevance and each
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I nest two nearest neighbors object sets, which one takes precedence? Or do we throw

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It operates the same way as nested unions/intersects in that it executes from inner to outer and returns the outer result. Added an example in 2538015

* Finds the nearest neighbors for a given text or vector within the object set.
*
* @param query - Queries support either a vector matching the embedding model defined on the property, or text that is
automatically embedded.
Copy link
Contributor

@nihalbhatnagar nihalbhatnagar Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting it automatically embedded is cool

@@ -0,0 +1,9 @@
---
"@osdk/foundry-sdk-generator": minor
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we make these all patch

& string;

export type VectorType = Extract<BaseWirePropertyTypes, "vector">;
export type VectorPropertyKeys<O extends ObjectOrInterfaceDefinition> = keyof {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use our FilteredPropertyKeys type?

| {
[K in L]?: "asc" | "desc";
}
| "relevance";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do we imagine that there will ever be any extra modifiers for relevance ordering? Trying to think if this should be a string or object option

},
});

// Ignores "Type instantiation is excessively deep and possibly infinite."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you may need to remove the branded which may fix this error? If it fails without it as well thats fine

"skillSetEmbedding",
);

const { data: unOrdered } = await nearestNeighbors.fetchPage();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind doing a sanity check here and trying to access .$score on one of the resulting objects (and adding //ts-expect-error) or actually for most of these cases

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we have type tests for OsdkObjectFrom but if we do that might be a better place to put it

$includeRid?: R;
}

export type OrderByType<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we rename this OrderByOptions?

expect(employees).toHaveLength(numNeighbors);
// Check that no score is returned when not ordered by relevance
// @ts-ignore
employees.forEach(e => expect(e.$score).toBeUndefined());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did you have to ts-ignore here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh is it because $score isn't even a property on the object? If so, could we actually modify by adding some type tests to make sure that $score is not an available option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ts-expect-error is useful when you want something to error

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ignore my comments about adding this specific test above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update in 0b6b35a

$orderBy: "relevance",
});
expectTypeOf<typeof orderedByRelevance>().toMatchTypeOf<
Osdk.Instance<Employee, never, PropertyKeys<Employee>, {}, "relevance">[]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not a big fan of relevance being in the type when in reality it's $score that pops up. I think what we want to do is something similar to ExtractOptions so instead of passing in Z directly to Osdk.Instance you pass in ExtractOrderByOptions<Z> which will give you either never or$score that way if someone looks at their return type, its clear what they're getting back on their object.

Like Osdk.Instance<quickAndDirty, "$rid", "name" | "foo"> makes it clear to me I'm getting name, foo, and $rid back, so would be nice to keep it consistent and also throw in $score rather than relevance

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep agreed - fixed in
0b6b35a

| "relevance";

export interface OrderByArg<
Q extends ObjectOrInterfaceDefinition,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting for myself that this isn't exported, so this arg reorder is fine

readonly nearestNeighbors: (
query: string | number[],
numNeighbors: number,
property: VectorPropertyKeys<Q>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just use FilteredPropertyKeys here instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, f6eb11b

});

expectTypeOf(objectSet).branded.toEqualTypeOf<
expectTypeOf(objectSet).toEqualTypeOf<
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why'd we remove branded here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I was seeing Type instantiation is excessively deep and possibly infinite. Appears to be a failure mode for this branded check according to the docs:

branded:
     * // ❌ The following line doesn't compile, even though the types are arguably the same.
     * expectTypeOf<{ a: 1 } & { b: 2 }>().toEqualTypeOf<{ a: 1; b: 2 }>()
     * ```
     * This helper works around this problem by using
     * a more permissive but less performant check.
     *
     * __Note__: This comes at a performance cost, and can cause the compiler
     * to 'give up' if used with excessively deep types, so use sparingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants