
Add batch_size param for text_embedding processor #1298

Open
wants to merge 4 commits into main

Conversation


@YeonghyeonKO commented Nov 17, 2024

Description

The documentation for the text_embedding processor in ingest pipelines describes a batch_size parameter, but the corresponding interface in the opensearch-java client does not include it. Since OpenSearch 2.16, ingest processors that inherit from AbstractBatchingProcessor support batch inference (opensearch-project/neural-search#820). As of now, however, the opensearch-java client does not support the optional batch_size parameter when defining a text_embedding processor.
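
For illustration, a minimal sketch of what defining the processor through the client could look like with this change. The `batchSize` builder method is the addition this PR proposes; the pipeline id, model id, and the exact `fieldMap` builder signature are assumptions for the example, not the final API:

```java
import java.io.IOException;

import org.opensearch.client.opensearch.OpenSearchClient;

public class CreateEmbeddingPipeline {

    // Sketch only: defines an ingest pipeline with a text_embedding processor
    // that batches documents for inference via the new optional parameter.
    static void createPipeline(OpenSearchClient client) throws IOException {
        client.ingest().putPipeline(p -> p
            .id("nlp-ingest-pipeline")                  // hypothetical pipeline id
            .processors(proc -> proc
                .textEmbedding(te -> te
                    .modelId("my-model-id")             // hypothetical model id
                    .fieldMap("text", "text_embedding") // source field -> embedding field
                    .batchSize(5))));                   // optional batch_size this PR adds
    }
}
```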

Since @miguel-vila's contribution adding TextEmbeddingProcessor was merged, there has been another significant change in opensearch-project/neural-search (opensearch-project/neural-search#820). In line with that change, this PR modifies the text_embedding processor code in the opensearch-java client.

Issues Resolved

This PR is related to #1297


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.
