
Commit 1c88766

Add max_cache_length to PredictRequest.RequestOptions.

When set and supported by the servable, the model server will cache the prefix of the request up to this length.

PiperOrigin-RevId: 817414641

1 parent 6d2fd08

File tree: 1 file changed, 4 additions (+), 0 deletions (-)

tensorflow_serving/apis/predict.proto

Lines changed: 4 additions & 0 deletions

@@ -71,6 +71,10 @@ message PredictRequest {
     // response if the model stops at them. The model may stop at other tokens,
     // but will not return them in the response.
     repeated int64 return_stoptokens = 4;
+
+    // When set and supported by servable, the model server will cache the
+    // prefix of request up to this length.
+    optional int64 max_cache_length = 6;
   }
 
   optional RequestOptions request_options = 7;
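
For context, a minimal client-side sketch of how the new option might be set once Python bindings are regenerated from this revision of predict.proto. The servable name "my_llm", the input key "prompt", the cache length of 1024, and the server address are all illustrative assumptions, not part of the commit.

    # Sketch only: assumes bindings regenerated from this predict.proto revision
    # and a gRPC ModelServer listening on localhost:8500.
    import grpc
    import tensorflow as tf
    from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

    channel = grpc.insecure_channel("localhost:8500")
    stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

    request = predict_pb2.PredictRequest()
    request.model_spec.name = "my_llm"  # hypothetical servable name
    request.inputs["prompt"].CopyFrom(  # hypothetical input key
        tf.make_tensor_proto(["Once upon a time"]))

    # New in this commit: ask the server to cache the request prefix up to
    # 1024 tokens, if the servable supports prefix caching. Servables that do
    # not support it are expected to ignore the option.
    request.request_options.max_cache_length = 1024

    response = stub.Predict(request, timeout=10.0)

Whether caching takes effect depends on the servable; the field only expresses the client's request.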
