diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/inputs/generate.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/inputs/generate.md new file mode 100644 index 0000000000..45d91f6802 --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/inputs/generate.md @@ -0,0 +1,126 @@ +--- +title: Generate Input +description: An overview of configuring the generate input +tags: [ "Tyk Streams", "Stream Inputs", "Inputs", "Generate" ] +--- + +Generates messages at a given interval using a [Bloblang]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}) mapping executed without a context. This allows you to generate messages for testing your pipeline configs. + +```yml +# Config fields, showing default values +input: + label: "" + generate: + mapping: root = "hello world" # No default (required) + interval: 1s + count: 0 + batch_size: 1 + auto_replay_nacks: true +``` + +## Examples + +### Cron Scheduled Processing + +A common use case for the generate input is to trigger processors on a schedule so that the processors themselves can behave similarly to an input. The following configuration reads rows from a PostgreSQL table every 5 minutes. + +```yaml +input: + generate: + interval: '@every 5m' + mapping: 'root = {}' + processors: + - sql_select: + driver: postgres + dsn: postgres://foouser:foopass@localhost:5432/testdb?sslmode=disable + table: foo + columns: [ "*" ] +``` + +### Generate 100 Rows + +The generate input can be used as a convenient way to generate test data. The following example generates 100 rows of structured data by setting an explicit `count`. The `interval` field is set to an empty string, which means data is generated as fast as the downstream components can consume it. 
+ +```yaml +input: + generate: + count: 100 + interval: "" + mapping: | + root = if random_int() % 2 == 0 { + { + "type": "foo", + "foo": "is yummy" + } + } else { + { + "type": "bar", + "bar": "is gross" + } + } +``` + +## Fields + +### mapping + +A [bloblang]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}) mapping to use for generating messages. + + +Type: `string` + +```yml +# Examples + +mapping: root = "hello world" + +mapping: root = {"test":"message","id":uuid_v4()} +``` + +### interval + +The time interval at which messages should be generated, expressed either as a duration string or as a cron expression. If set to an empty string messages will be generated as fast as downstream services can process them. Cron expressions can specify a timezone by prefixing the expression with `TZ=`, where the location name corresponds to a file within the IANA Time Zone database. + + +Type: `string` +Default: `"1s"` + +```yml +# Examples + +interval: 5s + +interval: 1m + +interval: 1h + +interval: '@every 1s' + +interval: 0,30 */2 * * * * + +interval: TZ=Europe/London 30 3-6,20-23 * * * +``` + +### count + +An optional number of messages to generate, if set above 0 the specified number of messages is generated and then the input will shut down. + + +Type: `int` +Default: `0` + +### batch_size + +The number of generated messages that should be accumulated into each batch flushed at the specified interval. + + +Type: `int` +Default: `1` + +### auto_replay_nacks + +Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to `false` these messages will instead be deleted. Disabling auto replays can greatly improve memory efficiency of high throughput streams as the original shape of the data can be discarded immediately upon consumption and mutation. 
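The trade-off described above can be sketched as a short configuration. This is a minimal illustration rather than a recommended setup: the `count` value and the mapping are assumptions chosen for the example, and only fields documented on this page are used.

```yaml
input:
  generate:
    # Illustrative mapping: each message carries a random ID.
    mapping: 'root = {"id": uuid_v4()}'
    interval: ""              # generate as fast as downstream can consume
    count: 1000               # illustrative value: stop after 1000 messages
    auto_replay_nacks: false  # rejected messages are deleted, not replayed
```

With `auto_replay_nacks` left at its default of `true`, the same configuration would instead keep replaying any rejected message until it is accepted.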
+ + +Type: `bool` +Default: `true` diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/inputs/http-client.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/inputs/http-client.md new file mode 100644 index 0000000000..294d154cab --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/inputs/http-client.md @@ -0,0 +1,745 @@ +--- +title: Http Client +description: Explains an overview of configuring Http client input +tags: [ "Tyk Streams", "Stream Inputs", "Inputs", "Http Client", "http_client" ] +--- + +Connects to a server and continuously performs requests for a single message. + +## Common + +```yml +# Common config fields, showing default values +input: + label: "" + http_client: + url: "" # No default (required) + verb: GET + headers: {} + rate_limit: "" # No default (optional) + timeout: 5s + payload: "" # No default (optional) + stream: + enabled: false + reconnect: true + scanner: + lines: {} + auto_replay_nacks: true +``` + +## Advanced + +```yml +# All config fields, showing default values +input: + label: "" + http_client: + url: "" # No default (required) + verb: GET + headers: {} + metadata: + include_prefixes: [] + include_patterns: [] + dump_request_log_level: "" + oauth: + enabled: false + consumer_key: "" + consumer_secret: "" + access_token: "" + access_token_secret: "" + oauth2: + enabled: false + client_key: "" + client_secret: "" + token_url: "" + scopes: [] + endpoint_params: {} + basic_auth: + enabled: false + username: "" + password: "" + jwt: + enabled: false + private_key_file: "" + signing_method: "" + claims: {} + headers: {} + tls: + enabled: false + skip_cert_verify: false + enable_renegotiation: false + root_cas: "" + root_cas_file: "" + client_certs: [] + extract_headers: + include_prefixes: [] + include_patterns: [] + rate_limit: "" # No default (optional) + timeout: 5s + retry_period: 1s + max_retry_backoff: 300s + retries: 3 + backoff_on: + - 429 + drop_on: [] + successful_on: 
[] + proxy_url: "" # No default (optional) + payload: "" # No default (optional) + drop_empty_bodies: true + stream: + enabled: false + reconnect: true + scanner: + lines: {} + auto_replay_nacks: true +``` + +The URL and header values of this type can be dynamically set using [function interpolations]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + +### Streaming + +If you enable streaming then Tyk Streams will consume the body of the response as a continuous stream of data, breaking messages out following a chosen scanner. This allows you to consume APIs that provide long lived streamed data feeds (such as Twitter). + +### Pagination + +This input supports interpolation functions in the `url` and `headers` fields where data from the previous successfully consumed message (if there was one) can be referenced. This can be used in order to support basic levels of pagination. However, in cases where pagination depends on logic it is recommended that you use an [http processor]({{< ref "/product-stack/tyk-streaming/configuration/processors/http" >}}) instead, often combined with a [generate input]({{< ref "/product-stack/tyk-streaming/configuration/inputs/generate" >}}) in order to schedule the processor. + +## Examples + +### Basic Pagination + +Interpolation functions within the `url` and `headers` fields can be used to reference the previously consumed message, which allows simple pagination. + +```yaml +input: + http_client: + url: >- + http://api.example.com/search?query=allmyfoos&start_time=${! ( + (timestamp_unix()-300).ts_format("2006-01-02T15:04:05Z","UTC").escape_url_query() + ) }${! ("&next_token="+this.meta.next_token.not_null()) | "" } + verb: GET + rate_limit: foo_searches + +rate_limit_resources: + - label: foo_searches + local: + count: 1 + interval: 30s +``` + + + +## Fields + +### url + +The URL to connect to. 
+This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + + +Type: `string` + +### verb + +The HTTP verb to use when connecting. + + +Type: `string` +Default: `"GET"` + +```yml +# Examples + +verb: POST + +verb: GET + +verb: DELETE +``` + +### headers + +A map of headers to add to the request. +This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + +Type: `object` +Default: `{}` + +```yml +# Examples + +headers: + Content-Type: application/octet-stream + traceparent: ${! tracing_span().traceparent } +``` + +### metadata + +Specify optional matching rules to determine which metadata keys should be added to the HTTP request as headers. + + +Type: `object` + +### metadata.include_prefixes + +Provide a list of explicit metadata key prefixes to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_prefixes: + - foo_ + - bar_ + +include_prefixes: + - kafka_ + +include_prefixes: + - content- +``` + +### metadata.include_patterns + +Provide a list of explicit metadata key regular expression (RE2) patterns to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_patterns: + - .* + +include_patterns: + - _timestamp_unix$ +``` + +### dump_request_log_level + +Optionally set a level at which the request and response payload of each request made will be logged. + + +Type: `string` +Default: `""` +Options: `TRACE`, `DEBUG`, `INFO`, `WARN`, `ERROR`, `FATAL`, `""`. + +### oauth + +Allows you to specify open authentication via OAuth version 1. + + +Type: `object` + +### oauth.enabled + +Whether to use OAuth version 1 in requests. + + +Type: `bool` +Default: `false` + +### oauth.consumer_key + +A value used to identify the client to the service provider. 
+ + +Type: `string` +Default: `""` + +### oauth.consumer_secret + +A secret used to establish ownership of the consumer key. + + +Type: `string` +Default: `""` + +### oauth.access_token + +A value used to gain access to the protected resources on behalf of the user. + + +Type: `string` +Default: `""` + +### oauth.access_token_secret + +A secret provided in order to establish ownership of a given access token. + + +Type: `string` +Default: `""` + +### oauth2 + +Allows you to specify open authentication via OAuth version 2 using the client credentials token flow. + + +Type: `object` + +### oauth2.enabled + +Whether to use OAuth version 2 in requests. + + +Type: `bool` +Default: `false` + +### oauth2.client_key + +A value used to identify the client to the token provider. + + +Type: `string` +Default: `""` + +### oauth2.client_secret + +A secret used to establish ownership of the client key. + + +Type: `string` +Default: `""` + +### oauth2.token_url + +The URL of the token provider. + + +Type: `string` +Default: `""` + +### oauth2.scopes + +A list of optional requested permissions. + + +Type: `array` +Default: `[]` + +### oauth2.endpoint_params + +A list of optional endpoint parameters, values should be arrays of strings. + + +Type: `object` +Default: `{}` + +```yml +# Examples + +endpoint_params: + bar: + - woof + foo: + - meow + - quack +``` + +### basic_auth + +Allows you to specify basic authentication. + + +Type: `object` + +### basic_auth.enabled + +Whether to use basic authentication in requests. + + +Type: `bool` +Default: `false` + +### basic_auth.username + +A username to authenticate as. + + +Type: `string` +Default: `""` + +### basic_auth.password + +A password to authenticate with. + + +Type: `string` +Default: `""` + +### jwt + +Allows you to specify JWT authentication. + + +Type: `object` + +### jwt.enabled + +Whether to use JWT authentication in requests. 
+ + +Type: `bool` +Default: `false` + +### jwt.private_key_file + +A file with the PEM encoded via PKCS1 or PKCS8 as private key. + + +Type: `string` +Default: `""` + +### jwt.signing_method + +A method used to sign the token such as RS256, RS384, RS512 or EdDSA. + + +Type: `string` +Default: `""` + +### jwt.claims + +A value used to identify the claims that issued the JWT. + + +Type: `object` +Default: `{}` + +### jwt.headers + +Add optional key/value headers to the JWT. + + +Type: `object` +Default: `{}` + +### tls + +Custom TLS settings can be used to override system defaults. + + +Type: `object` + +### tls.enabled + +Whether custom TLS settings are enabled. + + +Type: `bool` +Default: `false` + +### tls.skip_cert_verify + +Whether to skip server side certificate verification. + + +Type: `bool` +Default: `false` + +### tls.enable_renegotiation + +Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you're seeing the error message `local error: tls: no renegotiation`. + + +Type: `bool` +Default: `false` + +### tls.root_cas + +An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. + + +Type: `string` +Default: `""` + +```yml +# Examples + +root_cas: |- + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- +``` + +### tls.root_cas_file + +An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. + + +Type: `string` +Default: `""` + +```yml +# Examples + +root_cas_file: ./root_cas.pem +``` + +### tls.client_certs + +A list of client certificates to use. 
For each certificate either the fields `cert` and `key`, or `cert_file` and `key_file` should be specified, but not both. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +client_certs: + - cert: foo + key: bar + +client_certs: + - cert_file: ./example.pem + key_file: ./example.key +``` + +### tls.client_certs[].cert + +A plain text certificate to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].key + +A plain text certificate key to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].cert_file + +The path of a certificate to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].key_file + +The path of a certificate key to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].password + +A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete `pbeWithMD5AndDES-CBC` algorithm is not supported for the PKCS#8 format. Warning: Since it does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext. + +Type: `string` +Default: `""` + +```yml +# Examples + +password: foo +``` + +### extract_headers + +Specify which response headers should be added to resulting messages as metadata. Header keys are lowercased before matching, so ensure that your patterns target lowercased versions of the header keys that you expect. + + +Type: `object` + +### extract_headers.include_prefixes + +Provide a list of explicit metadata key prefixes to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_prefixes: + - foo_ + - bar_ + +include_prefixes: + - kafka_ + +include_prefixes: + - content- +``` + +### extract_headers.include_patterns + +Provide a list of explicit metadata key regular expression (re2) patterns to match against. 
+ + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_patterns: + - .* + +include_patterns: + - _timestamp_unix$ +``` + +### rate_limit + +An optional [rate limit]({{< ref "/product-stack/tyk-streaming/configuration/rate-limits/overview" >}}) to throttle requests by. + + +Type: `string` + +### timeout + +A static timeout to apply to requests. + + +Type: `string` +Default: `"5s"` + +### retry_period + +The base period to wait between failed requests. + + +Type: `string` +Default: `"1s"` + +### max_retry_backoff + +The maximum period to wait between failed requests. + + +Type: `string` +Default: `"300s"` + +### retries + +The maximum number of retry attempts to make. + + +Type: `int` +Default: `3` + +### backoff_on + +A list of status codes whereby the request should be considered to have failed and retries should be attempted, but the period between them should be increased gradually. + + +Type: `array` +Default: `[429]` + +### drop_on + +A list of status codes whereby the request should be considered to have failed but retries should not be attempted. This is useful for preventing wasted retries for requests that will never succeed. Note that with these status codes the *request* is dropped, but the *message* that caused the request will not be dropped. + + +Type: `array` +Default: `[]` + +### successful_on + +A list of status codes whereby the attempt should be considered successful. This is useful for dropping requests that return non-2XX codes indicating that the message has been dealt with, such as a 303 See Other or a 409 Conflict. All 2XX codes are considered successful unless they are present within `backoff_on` or `drop_on`, regardless of this field. + + +Type: `array` +Default: `[]` + +### proxy_url + +An optional HTTP proxy URL. + + +Type: `string` + +### payload + +An optional payload to deliver for each request. 
+This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + + +Type: `string` + +### drop_empty_bodies + +Whether empty payloads received from the target server should be dropped. + + +Type: `bool` +Default: `true` + +### stream + +Allows you to set streaming mode, where requests are kept open and messages are processed line-by-line. + + +Type: `object` + +### stream.enabled + +Enables streaming mode. + + +Type: `bool` +Default: `false` + +### stream.reconnect + +Sets whether to re-establish the connection once it is lost. + + +Type: `bool` +Default: `true` + +### stream.scanner + +The [scanner]({{< ref "/product-stack/tyk-streaming/configuration/scanners/overview" >}}) by which the stream of bytes consumed will be broken out into individual messages. Scanners are useful for processing large sources of data without holding the entirety of it within memory. For example, the `csv` scanner allows you to process individual CSV rows without loading the entire CSV file in memory at once. + + +Type: `scanner` +Default: `{"lines":{}}` + +### auto_replay_nacks + +Whether messages that are rejected (nacked) at the output level should be automatically replayed indefinitely, eventually resulting in back pressure if the cause of the rejections is persistent. If set to `false` these messages will instead be deleted. Disabling auto replays can greatly improve memory efficiency of high throughput streams as the original shape of the data can be discarded immediately upon consumption and mutation. 
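As a rough sketch of how the `stream` fields and `auto_replay_nacks` fit together, the configuration below consumes a long-lived, line-delimited feed. The endpoint URL is hypothetical and the choice to disable replays is an assumption made for illustration; all field names are taken from the reference above.

```yaml
input:
  http_client:
    url: https://example.com/feed   # hypothetical streaming endpoint
    verb: GET
    stream:
      enabled: true      # keep the request open and read the body continuously
      reconnect: true    # re-establish the connection if it is lost
      scanner:
        lines: {}        # split the byte stream into one message per line
    auto_replay_nacks: false  # rejected messages are deleted, not replayed
```

Leaving `auto_replay_nacks` at `true` trades memory for delivery guarantees, since rejected messages must be retained until they are accepted.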
+ + +Type: `bool` +Default: `true` diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/http-client.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/http-client.md new file mode 100644 index 0000000000..1bdd80f9cf --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/http-client.md @@ -0,0 +1,822 @@ +--- +title: Http Client +description: Explains an overview of configuring Http client output +tags: [ "Tyk Streams", "Stream Outputs", "Outputs", "Http Client", "http_client" ] +--- + +Sends messages to an HTTP server. + + +## Common + +```yml +# Common config fields, showing default values +output: + label: "" + http_client: + url: "" # No default (required) + verb: POST + headers: {} + rate_limit: "" # No default (optional) + timeout: 5s + max_in_flight: 64 + batching: + count: 0 + byte_size: 0 + period: "" + check: "" +``` + +## Advanced + +```yml +# All config fields, showing default values +output: + label: "" + http_client: + url: "" # No default (required) + verb: POST + headers: {} + metadata: + include_prefixes: [] + include_patterns: [] + dump_request_log_level: "" + oauth: + enabled: false + consumer_key: "" + consumer_secret: "" + access_token: "" + access_token_secret: "" + oauth2: + enabled: false + client_key: "" + client_secret: "" + token_url: "" + scopes: [] + endpoint_params: {} + basic_auth: + enabled: false + username: "" + password: "" + jwt: + enabled: false + private_key_file: "" + signing_method: "" + claims: {} + headers: {} + tls: + enabled: false + skip_cert_verify: false + enable_renegotiation: false + root_cas: "" + root_cas_file: "" + client_certs: [] + extract_headers: + include_prefixes: [] + include_patterns: [] + rate_limit: "" # No default (optional) + timeout: 5s + retry_period: 1s + max_retry_backoff: 300s + retries: 3 + backoff_on: + - 429 + drop_on: [] + successful_on: [] + proxy_url: "" # No default (optional) + batch_as_multipart: false + 
propagate_response: false + max_in_flight: 64 + batching: + count: 0 + byte_size: 0 + period: "" + check: "" + processors: [] # No default (optional) + multipart: [] +``` + +When the retries are exhausted the output rejects the message. What happens next depends on the pipeline, but usually the send is simply attempted again until it succeeds, applying back pressure in the meantime. + +The URL and header values of this type can be dynamically set using [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + +The body of the HTTP request is the raw contents of the message payload. If the message has multiple parts (is a batch) the request will be sent according to [RFC1341](https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html). This behavior can be disabled by setting the field [batch_as_multipart](#batch_as_multipart) to `false`. + +### Propagating Responses + +It's possible to propagate the response from each HTTP request back to the input source by setting `propagate_response` to `true`. Only inputs that support [synchronous responses]({{< ref "/product-stack/tyk-streaming/guides/sync-responses" >}}) are able to make use of these propagated responses. + +## Performance + +This output benefits from sending multiple messages in flight in parallel for improved performance. You can tune the maximum number of in-flight messages (or message batches) with the field `max_in_flight`. + +This output benefits from sending messages as a [batch]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/batching" >}}) for improved performance. Batches can be formed at both the input and output level. + +## Fields + +### url + +The URL to connect to. +This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). 
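To illustrate interpolation in the `url` field, the sketch below builds the request path from the message being sent. The endpoint is hypothetical, and the example assumes each message is a JSON document containing an `id` field.

```yaml
output:
  http_client:
    # Hypothetical endpoint. `this.id` assumes the message payload is a
    # JSON object with an `id` field.
    url: https://example.com/items/${! this.id }
    verb: POST
    headers:
      Content-Type: application/json
```

Because the interpolation is resolved per message, each message is sent to its own path.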
+ +Type: `string` + +### verb + +The HTTP verb to use when connecting. + + +Type: `string` +Default: `"POST"` + +```yml +# Examples + +verb: POST + +verb: GET + +verb: DELETE +``` + +### headers + +A map of headers to add to the request. +This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + + +Type: `object` +Default: `{}` + +```yml +# Examples + +headers: + Content-Type: application/octet-stream + traceparent: ${! tracing_span().traceparent } +``` + +### metadata + +Specify optional matching rules to determine which metadata keys should be added to the HTTP request as headers. + + +Type: `object` + +### metadata.include_prefixes + +Provide a list of explicit metadata key prefixes to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_prefixes: + - foo_ + - bar_ + +include_prefixes: + - kafka_ + +include_prefixes: + - content- +``` + +### metadata.include_patterns + +Provide a list of explicit metadata key regular expression (RE2) patterns to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_patterns: + - .* + +include_patterns: + - _timestamp_unix$ +``` + +### dump_request_log_level + +Optionally set a level at which the request and response payload of each request made will be logged. + + +Type: `string` +Default: `""` +Options: `TRACE`, `DEBUG`, `INFO`, `WARN`, `ERROR`, `FATAL`, `""`. + +### oauth + +Allows you to specify open authentication via OAuth version 1. + + +Type: `object` + +### oauth.enabled + +Whether to use OAuth version 1 in requests. + + +Type: `bool` +Default: `false` + +### oauth.consumer_key + +A value used to identify the client to the service provider. + + +Type: `string` +Default: `""` + +### oauth.consumer_secret + +A secret used to establish ownership of the consumer key. 
+ + +Type: `string` +Default: `""` + +### oauth.access_token + +A value used to gain access to the protected resources on behalf of the user. + + +Type: `string` +Default: `""` + +### oauth.access_token_secret + +A secret provided in order to establish ownership of a given access token. + + +Type: `string` +Default: `""` + +### oauth2 + +Allows you to specify open authentication via OAuth version 2 using the client credentials token flow. + + +Type: `object` + +### oauth2.enabled + +Whether to use OAuth version 2 in requests. + + +Type: `bool` +Default: `false` + +### oauth2.client_key + +A value used to identify the client to the token provider. + + +Type: `string` +Default: `""` + +### oauth2.client_secret + +A secret used to establish ownership of the client key. + + +Type: `string` +Default: `""` + +### oauth2.token_url + +The URL of the token provider. + + +Type: `string` +Default: `""` + +### oauth2.scopes + +A list of optional requested permissions. + + +Type: `array` +Default: `[]` + +### oauth2.endpoint_params + +A list of optional endpoint parameters, values should be arrays of strings. + + +Type: `object` +Default: `{}` + +```yml +# Examples + +endpoint_params: + bar: + - woof + foo: + - meow + - quack +``` + +### basic_auth + +Allows you to specify basic authentication. + + +Type: `object` + +### basic_auth.enabled + +Whether to use basic authentication in requests. + + +Type: `bool` +Default: `false` + +### basic_auth.username + +A username to authenticate as. + + +Type: `string` +Default: `""` + +### basic_auth.password + +A password to authenticate with. + + +Type: `string` +Default: `""` + +### jwt + +Allows you to specify JWT authentication. + + +Type: `object` + +### jwt.enabled + +Whether to use JWT authentication in requests. + + +Type: `bool` +Default: `false` + +### jwt.private_key_file + +A file with the PEM encoded via PKCS1 or PKCS8 as private key. 
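The `jwt` fields can be combined roughly as follows. This is a sketch under stated assumptions: the endpoint, key path, claim, and header values are all hypothetical placeholders, and only the field names come from this reference.

```yaml
output:
  http_client:
    url: https://example.com/ingest         # hypothetical endpoint
    jwt:
      enabled: true
      private_key_file: ./keys/private.pem  # hypothetical path to a PEM key
      signing_method: RS256
      claims:
        sub: my-service                     # illustrative claim
      headers:
        kid: my-key-id                      # illustrative JWT header
```

The signing method must match the type of key in `private_key_file`, for example an RSA key for `RS256`.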
+ + +Type: `string` +Default: `""` + +### jwt.signing_method + +A method used to sign the token such as RS256, RS384, RS512 or EdDSA. + + +Type: `string` +Default: `""` + +### jwt.claims + +A value used to identify the claims that issued the JWT. + + +Type: `object` +Default: `{}` + +### jwt.headers + +Add optional key/value headers to the JWT. + + +Type: `object` +Default: `{}` + +### tls + +Custom TLS settings can be used to override system defaults. + + +Type: `object` + +### tls.enabled + +Whether custom TLS settings are enabled. + + +Type: `bool` +Default: `false` + +### tls.skip_cert_verify + +Whether to skip server side certificate verification. + + +Type: `bool` +Default: `false` + +### tls.enable_renegotiation + +Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you're seeing the error message `local error: tls: no renegotiation`. + + +Type: `bool` +Default: `false` + +### tls.root_cas + +An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. + + +Type: `string` +Default: `""` + +```yml +# Examples + +root_cas: |- + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- +``` + +### tls.root_cas_file + +An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. + + +Type: `string` +Default: `""` + +```yml +# Examples + +root_cas_file: ./root_cas.pem +``` + +### tls.client_certs + +A list of client certificates to use. For each certificate either the fields `cert` and `key`, or `cert_file` and `key_file` should be specified, but not both. 
+ + +Type: `array` +Default: `[]` + +```yml +# Examples + +client_certs: + - cert: foo + key: bar + +client_certs: + - cert_file: ./example.pem + key_file: ./example.key +``` + +### tls.client_certs[].cert + +A plain text certificate to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].key + +A plain text certificate key to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].cert_file + +The path of a certificate to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].key_file + +The path of a certificate key to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].password + +A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete `pbeWithMD5AndDES-CBC` algorithm is not supported for the PKCS#8 format. Warning: Since it does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext. + + +Type: `string` +Default: `""` + +```yml +# Examples + +password: foo +``` + +### extract_headers + +Specify which response headers should be added to resulting synchronous response messages as metadata. Header keys are lowercased before matching, so ensure that your patterns target lowercased versions of the header keys that you expect. This field is not applicable unless `propagate_response` is set to `true`. + + +Type: `object` + +### extract_headers.include_prefixes + +Provide a list of explicit metadata key prefixes to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_prefixes: + - foo_ + - bar_ + +include_prefixes: + - kafka_ + +include_prefixes: + - content- +``` + +### extract_headers.include_patterns + +Provide a list of explicit metadata key regular expression (re2) patterns to match against. 
+ + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_patterns: + - .* + +include_patterns: + - _timestamp_unix$ +``` + +### rate_limit + +An optional [rate limit]({{< ref "/product-stack/tyk-streaming/configuration/rate-limits/overview" >}}) to throttle requests by. + + +Type: `string` + +### timeout + +A static timeout to apply to requests. + + +Type: `string` +Default: `"5s"` + +### retry_period + +The base period to wait between failed requests. + + +Type: `string` +Default: `"1s"` + +### max_retry_backoff + +The maximum period to wait between failed requests. + + +Type: `string` +Default: `"300s"` + +### retries + +The maximum number of retry attempts to make. + + +Type: `int` +Default: `3` + +### backoff_on + +A list of status codes whereby the request should be considered to have failed and retries should be attempted, but the period between them should be increased gradually. + + +Type: `array` +Default: `[429]` + +### drop_on + +A list of status codes whereby the request should be considered to have failed but retries should not be attempted. This is useful for preventing wasted retries for requests that will never succeed. Note that with these status codes the _request_ is dropped, but the _message_ that caused the request will not be dropped. + + +Type: `array` +Default: `[]` + +### successful_on + +A list of status codes whereby the attempt should be considered successful. This is useful for dropping requests that return non-2XX codes indicating that the message has been dealt with, such as a 303 See Other or a 409 Conflict. All 2XX codes are considered successful unless they are present within `backoff_on` or `drop_on`, regardless of this field. + + +Type: `array` +Default: `[]` + +### proxy_url + +An optional HTTP proxy URL. + + +Type: `string` + +### batch_as_multipart + +Send message batches as a single request using [RFC1341](https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html). 
If disabled messages in batches will be sent as individual requests. + + +Type: `bool` +Default: `false` + +### propagate_response + +Whether responses from the server should be [propagated back]({{< ref "/product-stack/tyk-streaming/guides/sync-responses" >}}) to the input. + + +Type: `bool` +Default: `false` + +### max_in_flight + +The maximum number of parallel message batches to have in flight at any given time. + + +Type: `int` +Default: `64` + +### batching + +Allows you to configure a [batching policy]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/batching" >}}). + + +Type: `object` + +```yml +# Examples + +batching: + byte_size: 5000 + count: 0 + period: 1s + +batching: + count: 10 + period: 1s + +batching: + check: this.contains("END BATCH") + count: 0 + period: 1m +``` + +### batching.count + +A number of messages at which the batch should be flushed. If `0` disables count based batching. + + +Type: `int` +Default: `0` + +### batching.byte_size + +An amount of bytes at which the batch should be flushed. If `0` disables size based batching. + + +Type: `int` +Default: `0` + +### batching.period + +A period in which an incomplete batch should be flushed regardless of its size. + + +Type: `string` +Default: `""` + +```yml +# Examples + +period: 1s + +period: 1m + +period: 500ms +``` + +### batching.check + +A [Bloblang query]({{< ref "/product-stack/tyk-streaming/guides/bloblang/overview" >}}) that should return a boolean value indicating whether a message should end a batch. + + +Type: `string` +Default: `""` + +```yml +# Examples + +check: this.type == "end_of_transaction" +``` + +### batching.processors + +A list of processors to apply to a batch as it is flushed. This allows you to aggregate and archive the batch however you see fit. Please note that all resulting messages are flushed as a single batch, therefore splitting the batch into smaller batches using these processors is a no-op. 
+ + +Type: `array` + +```yml +# Examples + +processors: + - archive: + format: concatenate + +processors: + - archive: + format: lines + +processors: + - archive: + format: json_array +``` + +### multipart + +Create explicit multipart HTTP requests by specifying an array of parts to add to the request, each part specified consists of content headers and a data field that can be populated dynamically. If this field is populated it will override the default request creation behavior. + + +Type: `array` +Default: `[]` + +### multipart[].content_type + +The content type of the individual message part. +This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}) + + +Type: `string` +Default: `""` + +```yml +# Examples + +content_type: application/bin +``` + +### multipart[].content_disposition + +The content disposition of the individual message part. +This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}) + + +Type: `string` +Default: `""` + +```yml +# Examples + +content_disposition: form-data; name="bin"; filename='${! @AttachmentName } +``` + +### multipart[].body + +The body of the individual message part. +This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + + +Type: `string` +Default: `""` + +```yml +# Examples + +body: ${! 
this.data.part1 } +``` diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/stdout.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/stdout.md index 41a8be1e78..ee56405b76 100644 --- a/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/stdout.md +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/stdout.md @@ -1,7 +1,7 @@ --- title: stdout description: Explains an overview of configuring stdout output -tags: [ "Tyk Streams", "Stream Outputs", "Outputs" ] +tags: [ "Tyk Streams", "Stream Outputs", "Outputs", "stdout" ] --- Prints messages to stdout as a continuous stream of data. diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/sync-response.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/sync-response.md new file mode 100644 index 0000000000..759c6b1866 --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/outputs/sync-response.md @@ -0,0 +1,38 @@ +--- +title: Sync Response +description: Explains an overview of configuring sync_response output +tags: [ "Tyk Streams", "Stream Outputs", "Outputs", "sync_response", "Sync Response" ] +--- + +Returns the final message payload back to the input origin of the message, where it is dealt with according to that specific input type. + +```yml +# Config fields, showing default values +output: + label: "" + sync_response: {} +``` + +For most inputs this mechanism is ignored entirely, in which case the sync response is dropped without penalty. It is therefore safe to use this output even when combining input types that might not have support for sync responses. An example of an input able to utilise this is [http_server]({{< ref "/product-stack/tyk-streaming/configuration/inputs/http-server" >}}). + +It is safe to combine this output with others using broker types. 
For example, with the [http_server]({{< ref "/product-stack/tyk-streaming/configuration/inputs/http-server" >}}) input we could send the payload to a Kafka topic and also send a modified payload back with: + +```yaml +input: + http_server: + path: /post +output: + broker: + pattern: fan_out + outputs: + - kafka: + addresses: [ TODO:9092 ] + topic: foo_topic + - sync_response: {} + processors: + - mapping: 'root = content().uppercase()' +``` + +Using the above example, POSTing the message *hello world* to the endpoint `/post` causes Tyk Streams to send it unchanged to the topic `foo_topic` and to respond with *HELLO WORLD*. + +For more information please read the [Synchronous Responses]({{< ref "/product-stack/tyk-streaming/guides/sync-responses" >}}) guide. diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/processors/http.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/processors/http.md new file mode 100644 index 0000000000..63092354fd --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/processors/http.md @@ -0,0 +1,679 @@ +--- +title: Http +description: Explains an overview of configuring Http processor +tags: [ "Tyk Streams", "Stream Processors", "Processors", "Http", "http_client" ] +--- + +Performs an HTTP request using a message batch as the request body, and replaces the original message parts with the body of the response.
+ +## Common + +```yml +# Common config fields, showing default values +label: "" +http: + url: "" # No default (required) + verb: POST + headers: {} + rate_limit: "" # No default (optional) + timeout: 5s + parallel: false +``` + +## Advanced + +```yml +# All config fields, showing default values +label: "" +http: + url: "" # No default (required) + verb: POST + headers: {} + metadata: + include_prefixes: [] + include_patterns: [] + dump_request_log_level: "" + oauth: + enabled: false + consumer_key: "" + consumer_secret: "" + access_token: "" + access_token_secret: "" + oauth2: + enabled: false + client_key: "" + client_secret: "" + token_url: "" + scopes: [] + endpoint_params: {} + basic_auth: + enabled: false + username: "" + password: "" + jwt: + enabled: false + private_key_file: "" + signing_method: "" + claims: {} + headers: {} + tls: + enabled: false + skip_cert_verify: false + enable_renegotiation: false + root_cas: "" + root_cas_file: "" + client_certs: [] + extract_headers: + include_prefixes: [] + include_patterns: [] + rate_limit: "" # No default (optional) + timeout: 5s + retry_period: 1s + max_retry_backoff: 300s + retries: 3 + backoff_on: + - 429 + drop_on: [] + successful_on: [] + proxy_url: "" # No default (optional) + batch_as_multipart: false + parallel: false +``` + +The `rate_limit` field can be used to specify a [rate limit]({{< ref "/product-stack/tyk-streaming/configuration/rate-limits/overview" >}}) to cap the rate of requests across all parallel components service wide. + +The URL and header values of this type can be dynamically set using [function interpolations]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + +In order to map or encode the payload to a specific request body, and map the response back into the original payload instead of replacing it entirely, you can use the [branch]({{< ref "/product-stack/tyk-streaming/configuration/processors/branch" >}}) processor. 
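+
+The interpolation support described above can be sketched as follows. This is a minimal, illustrative configuration rather than a recommended setup: the host `example.com`, the message field `id`, and the header `X-Request-Source` are assumptions made for the example and do not come from this reference.
+
+```yaml
+pipeline:
+  processors:
+    - http:
+        # Hypothetical endpoint; the `id` field is assumed to exist on each message
+        url: 'https://example.com/items/${! this.id }'
+        verb: POST
+        headers:
+          Content-Type: application/json
+          # Hypothetical static header, shown alongside the interpolated URL
+          X-Request-Source: tyk-streams
+        timeout: 5s
+```
+
+Because the URL is resolved per message, each message in this sketch would be sent to its own item path. Messages that lack the assumed `id` field would need to be handled separately.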
+ +## Response Codes + +Tyk Streams considers any response code between 200 and 299 inclusive to indicate a successful response, you can add more success status codes with the field `successful_on`. + +When a request returns a response code within the `backoff_on` field it will be retried after increasing intervals. + +When a request returns a response code within the `drop_on` field it will not be reattempted and is immediately considered a failed request. + +## Adding Metadata + +If the request returns an error response code this processor sets a metadata field `http_status_code` on the resulting message. + +Use the field `extract_headers` to specify rules for which other headers should be copied into the resulting message from the response. + +## Error Handling + +When all retry attempts for a message are exhausted the processor cancels the attempt. These failed messages will continue through the pipeline unchanged, but can be dropped or placed in a dead letter queue according to your config, you can read about these patterns [here]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/error-handling" >}}). + +## Examples + +### Branched Request + +This example uses a [branch processor]({{< ref "/product-stack/tyk-streaming/configuration/processors/branch" >}}) to strip the request message into an empty body, grab an HTTP payload, and place the result back into the original message at the path `repo.status`: + +```yaml +pipeline: + processors: + - branch: + request_map: 'root = ""' + processors: + - http: + url: https://hub.docker.com/v2/repositories/tykio/tyk-gateway + verb: GET + headers: + Content-Type: application/json + result_map: 'root.repo.status = this' +``` + +## Fields + +### url + +The URL to connect to. +This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). 
+ + +Type: `string` + +### verb + +A verb to connect with + + +Type: `string` +Default: `"POST"` + +```yml +# Examples + +verb: POST + +verb: GET + +verb: DELETE +``` + +### headers + +A map of headers to add to the request. +This field supports [interpolation functions]({{< ref "/product-stack/tyk-streaming/configuration/common-configuration/interpolation#bloblang-queries" >}}). + + +Type: `object` +Default: `{}` + +```yml +# Examples + +headers: + Content-Type: application/octet-stream + traceparent: ${! tracing_span().traceparent } +``` + +### metadata + +Specify optional matching rules to determine which metadata keys should be added to the HTTP request as headers. + + +Type: `object` + +### metadata.include_prefixes + +Provide a list of explicit metadata key prefixes to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_prefixes: + - foo_ + - bar_ + +include_prefixes: + - kafka_ + +include_prefixes: + - content- +``` + +### metadata.include_patterns + +Provide a list of explicit metadata key regular expression (re2) patterns to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_patterns: + - .* + +include_patterns: + - _timestamp_unix$ +``` + +### dump_request_log_level + +Optionally set a level at which the request and response payload of each request made will be logged. + + +Type: `string` +Default: `""` +Options: `TRACE`, `DEBUG`, `INFO`, `WARN`, `ERROR`, `FATAL`, ``. + +### oauth + +Allows you to specify open authentication via OAuth version 1. + + +Type: `object` + +### oauth.enabled + +Whether to use OAuth version 1 in requests. + + +Type: `bool` +Default: `false` + +### oauth.consumer_key + +A value used to identify the client to the service provider. + + +Type: `string` +Default: `""` + +### oauth.consumer_secret + +A secret used to establish ownership of the consumer key. 
+ + +Type: `string` +Default: `""` + +### oauth.access_token + +A value used to gain access to the protected resources on behalf of the user. + + +Type: `string` +Default: `""` + +### oauth.access_token_secret + +A secret provided in order to establish ownership of a given access token. + + +Type: `string` +Default: `""` + +### oauth2 + +Allows you to specify open authentication via OAuth version 2 using the client credentials token flow. + + +Type: `object` + +### oauth2.enabled + +Whether to use OAuth version 2 in requests. + + +Type: `bool` +Default: `false` + +### oauth2.client_key + +A value used to identify the client to the token provider. + + +Type: `string` +Default: `""` + +### oauth2.client_secret + +A secret used to establish ownership of the client key. + + +Type: `string` +Default: `""` + +### oauth2.token_url + +The URL of the token provider. + + +Type: `string` +Default: `""` + +### oauth2.scopes + +A list of optional requested permissions. + + +Type: `array` +Default: `[]` + +### oauth2.endpoint_params + +A list of optional endpoint parameters, values should be arrays of strings. + + +Type: `object` +Default: `{}` + +```yml +# Examples + +endpoint_params: + bar: + - woof + foo: + - meow + - quack +``` + +### basic_auth + +Allows you to specify basic authentication. + + +Type: `object` + +### basic_auth.enabled + +Whether to use basic authentication in requests. + + +Type: `bool` +Default: `false` + +### basic_auth.username + +A username to authenticate as. + + +Type: `string` +Default: `""` + +### basic_auth.password + +A password to authenticate with. + + +Type: `string` +Default: `""` + +### jwt + +Allows you to specify JWT authentication. + + +Type: `object` + +### jwt.enabled + +Whether to use JWT authentication in requests. + + +Type: `bool` +Default: `false` + +### jwt.private_key_file + +A file with the PEM encoded via PKCS1 or PKCS8 as private key. 
+ + +Type: `string` +Default: `""` + +### jwt.signing_method + +A method used to sign the token such as RS256, RS384, RS512 or EdDSA. + + +Type: `string` +Default: `""` + +### jwt.claims + +A value used to identify the claims that issued the JWT. + + +Type: `object` +Default: `{}` + +### jwt.headers + +Add optional key/value headers to the JWT. + + +Type: `object` +Default: `{}` + +### tls + +Custom TLS settings can be used to override system defaults. + + +Type: `object` + +### tls.enabled + +Whether custom TLS settings are enabled. + + +Type: `bool` +Default: `false` + +### tls.skip_cert_verify + +Whether to skip server side certificate verification. + + +Type: `bool` +Default: `false` + +### tls.enable_renegotiation + +Whether to allow the remote server to repeatedly request renegotiation. Enable this option if you're seeing the error message `local error: tls: no renegotiation`. + + +Type: `bool` +Default: `false` + +### tls.root_cas + +An optional root certificate authority to use. This is a string, representing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. + + +Type: `string` +Default: `""` + +```yml +# Examples + +root_cas: |- + -----BEGIN CERTIFICATE----- + ... + -----END CERTIFICATE----- +``` + +### tls.root_cas_file + +An optional path of a root certificate authority file to use. This is a file, often with a .pem extension, containing a certificate chain from the parent trusted root certificate, to possible intermediate signing certificates, to the host certificate. + + +Type: `string` +Default: `""` + +```yml +# Examples + +root_cas_file: ./root_cas.pem +``` + +### tls.client_certs + +A list of client certificates to use. For each certificate either the fields `cert` and `key`, or `cert_file` and `key_file` should be specified, but not both. 
+ + +Type: `array` +Default: `[]` + +```yml +# Examples + +client_certs: + - cert: foo + key: bar + +client_certs: + - cert_file: ./example.pem + key_file: ./example.key +``` + +### tls.client_certs[].cert + +A plain text certificate to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].key + +A plain text certificate key to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].cert_file + +The path of a certificate to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].key_file + +The path of a certificate key to use. + + +Type: `string` +Default: `""` + +### tls.client_certs[].password + +A plain text password for when the private key is password encrypted in PKCS#1 or PKCS#8 format. The obsolete `pbeWithMD5AndDES-CBC` algorithm is not supported for the PKCS#8 format. Warning: Since it does not authenticate the ciphertext, it is vulnerable to padding oracle attacks that can let an attacker recover the plaintext. + + +Type: `string` +Default: `""` + +```yml +# Examples + +password: foo +``` + +### extract_headers + +Specify which response headers should be added to resulting messages as metadata. Header keys are lowercased before matching, so ensure that your patterns target lowercased versions of the header keys that you expect. + + +Type: `object` + +### extract_headers.include_prefixes + +Provide a list of explicit metadata key prefixes to match against. + + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_prefixes: + - foo_ + - bar_ + +include_prefixes: + - kafka_ + +include_prefixes: + - content- +``` + +### extract_headers.include_patterns + +Provide a list of explicit metadata key regular expression (re2) patterns to match against. 
+ + +Type: `array` +Default: `[]` + +```yml +# Examples + +include_patterns: + - .* + +include_patterns: + - _timestamp_unix$ +``` + +### rate_limit + +An optional [rate limit]({{< ref "/product-stack/tyk-streaming/configuration/rate-limits/overview" >}}) to throttle requests by. + + +Type: `string` + +### timeout + +A static timeout to apply to requests. + + +Type: `string` +Default: `"5s"` + +### retry_period + +The base period to wait between failed requests. + + +Type: `string` +Default: `"1s"` + +### max_retry_backoff + +The maximum period to wait between failed requests. + + +Type: `string` +Default: `"300s"` + +### retries + +The maximum number of retry attempts to make. + + +Type: `int` +Default: `3` + +### backoff_on + +A list of status codes whereby the request should be considered to have failed and retries should be attempted, but the period between them should be increased gradually. + + +Type: `array` +Default: `[429]` + +### drop_on + +A list of status codes whereby the request should be considered to have failed but retries should not be attempted. This is useful for preventing wasted retries for requests that will never succeed. Note that with these status codes the _request_ is dropped, but _message_ that caused the request will not be dropped. + + +Type: `array` +Default: `[]` + +### successful_on + +A list of status codes whereby the attempt should be considered successful, this is useful for dropping requests that return non-2XX codes indicating that the message has been dealt with, such as a 303 See Other or a 409 Conflict. All 2XX codes are considered successful unless they are present within `backoff_on` or `drop_on`, regardless of this field. + + +Type: `array` +Default: `[]` + +### proxy_url + +An optional HTTP proxy URL. + + +Type: `string` + +### batch_as_multipart + +Send message batches as a single request using [RFC1341](https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html). 
+ + +Type: `bool` +Default: `false` + +### parallel + +When processing batched messages, whether to send messages of the batch in parallel, otherwise they are sent serially. + + +Type: `bool` +Default: `false` + diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/processors/sync-response.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/processors/sync-response.md new file mode 100644 index 0000000000..67cf99ba00 --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/processors/sync-response.md @@ -0,0 +1,17 @@ +--- +title: Sync Response +description: Explains an overview of sync response processor +tags: [ "Tyk Streams", "Stream Processors", "Processors", "Sync Response", "sync_response" ] +--- + +Adds the payload in its current state as a synchronous response to the input source, where it is dealt with according to that specific input type. + +```yml +# Config fields, showing default values +label: "" +sync_response: {} +``` + +For most inputs this mechanism is ignored entirely, in which case the sync response is dropped without penalty. It is therefore safe to use this processor even when combining input types that might not have support for sync responses. An example of an input able to utilise this is [http_server]({{< ref "/product-stack/tyk-streaming/configuration/inputs/http-server" >}}). + +Further information is available in the [Synchronous Responses]({{< ref "/product-stack/tyk-streaming/guides/sync-responses" >}}) guide. 
diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/csv.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/csv.md new file mode 100644 index 0000000000..9716924fc5 --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/csv.md @@ -0,0 +1,57 @@ +--- +title: CSV +description: Explains an overview of CSV scanner in Tyk Streams +tags: [ "Tyk Streams", "Scanners", "CSV", "CSV Scanner" ] +--- + +Consume comma-separated values row by row, including support for custom delimiters. + +```yml +# Config fields, showing default values +csv: + custom_delimiter: "" # No default (optional) + parse_header_row: true + lazy_quotes: false + continue_on_error: false +``` + +### Metadata + +This scanner adds the following metadata to each message: + +- `csv_row` The index of each row, beginning at 0. + + + +## Fields + +### custom_delimiter + +Use a provided custom delimiter instead of the default comma. + + +Type: `string` + +### parse_header_row + +Whether to reference the first row as a header row. If set to true the output structure for messages will be an object where field keys are determined by the header row. Otherwise, each message will consist of an array of values from the corresponding CSV row. + + +Type: `bool` +Default: `true` + +### lazy_quotes + +If set to `true`, a quote may appear in an unquoted field and a non-doubled quote may appear in a quoted field. + + +Type: `bool` +Default: `false` + +### continue_on_error + +If a row fails to parse due to any error emit an empty message marked with the error and then continue consuming subsequent rows when possible. This can sometimes be useful in situations where input data contains individual rows which are malformed. However, when a row encounters a parsing error it is impossible to guarantee that following rows are valid, as this indicates that the input data is unreliable and could potentially emit misaligned rows. 
+ + +Type: `bool` +Default: `false` diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/lines.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/lines.md new file mode 100644 index 0000000000..a4f29493d1 --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/lines.md @@ -0,0 +1,32 @@ + +--- +title: Lines +description: Explains an overview of line scanner in Tyk Streams +tags: [ "Tyk Streams", "Scanners", "Line", "Line Scanner" ] +--- + +Split an input stream into a message per line of data. + +```yml +# Config fields, showing default values +lines: + custom_delimiter: "" # No default (optional) + max_buffer_size: 65536 +``` + +## Fields + +### custom_delimiter + +Use a provided custom delimiter for detecting the end of a line rather than a single line break. + + +Type: `string` + +### max_buffer_size + +Set the maximum buffer size for storing line data, this limits the maximum size that a line can be without causing an error. + + +Type: `int` +Default: `65536` diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/overview.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/overview.md new file mode 100644 index 0000000000..5844f8caf5 --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/overview.md @@ -0,0 +1,36 @@ +--- +title: Scanners +description: Explains an overview of scanners in Tyk Streams +tags: [ "Tyk Streams", "Scanners" ] +--- + +For most Tyk Streams inputs the data consumed comes pre-partitioned into discrete messages which can be comfortably held and processed in memory. However, some inputs such as the socket don't have a concept of consuming the data "entirely". 
+ +For such inputs it's necessary to define a mechanism by which the stream of source bytes can be chopped into smaller logical messages, processed and outputted as a continuous process whilst the stream is being read, as this dramatically reduces the memory usage of Tyk Streams as a whole and results in a more fluid flow of data. + +The way in which we define this chopping mechanism is through scanners, configured as a field on each input that requires one. For example, if we wished to consume files line-by-line, with each individual line being processed as a discrete message, we could use the [lines scanner]({{< ref "/product-stack/tyk-streaming/configuration/scanners/lines" >}}) with a file input: + +## Common + +```yaml +input: + file: + paths: [ "./*.txt" ] + scanner: + lines: {} +``` + +## Advanced + +```yaml +# Instead of newlines, use a custom delimiter: +input: + file: + paths: [ "./*.txt" ] + scanner: + lines: + custom_delimiter: "---END---" + max_buffer_size: 100_000_000 # 100MB line buffer +``` + +A scanner is a plugin similar to any other core Tyk Streams component (inputs, processors, outputs, etc), which means it's possible to define your own scanners that can be utilised by inputs that need them. diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/re-match.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/re-match.md new file mode 100644 index 0000000000..12bc783fb3 --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/re-match.md @@ -0,0 +1,37 @@ +--- +title: Regular Expression Match +description: Explains an overview of regular expression matching in Tyk Streams +tags: [ "Tyk Streams", "Scanners", "re_match" ] +--- + +Split an input stream into segments matching against a regular expression.
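+
+As a rough sketch of how this scanner might be attached to an input, the following mirrors the file input layout used on the scanners overview page. The `./*.log` path is a hypothetical placeholder, and the pattern assumes log lines that begin with an `HH:MM:SS` timestamp.
+
+```yaml
+input:
+  file:
+    # Hypothetical path, for illustration only
+    paths: [ "./*.log" ]
+    scanner:
+      re_match:
+        # Assumes each log entry starts with a timestamp such as 12:34:56
+        pattern: '(?m)^\d\d:\d\d:\d\d'
+        max_buffer_size: 65536
+```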
+ +```yml +# Config fields, showing default values +re_match: + pattern: (?m)^\d\d:\d\d:\d\d # No default (required) + max_buffer_size: 65536 +``` + +## Fields + +### pattern + +The pattern to match against. + + +Type: `string` + +```yml +# Examples + +pattern: (?m)^\d\d:\d\d:\d\d +``` + +### max_buffer_size + +Set the maximum buffer size for storing line data, this limits the maximum size that a message can be without causing an error. + + +Type: `int` +Default: `65536` diff --git a/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/switch.md b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/switch.md new file mode 100644 index 0000000000..062b71d505 --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/configuration/scanners/switch.md @@ -0,0 +1,68 @@ +--- +title: Switch +description: Explains an overview of switch scanner in Tyk Streams +tags: [ "Tyk Streams", "Scanners", "switch" ] +--- + +Select a child scanner dynamically for source data based on factors such as the filename. + +```yml +# Config fields, showing default values +switch: [] # No default (required) +``` + +This scanner outlines a list of potential child scanner candidates to be chosen, and for each source of data the first candidate to pass will be selected. A candidate without any conditions acts as a catch-all and will pass for every source, it is recommended to always have a catch-all scanner at the end of your list. If a given source of data does not pass a candidate an error is returned and the data is rejected. + +## Fields + +### [].re_match_name + +A regular expression to test against the name of each source of data fed into the scanner (filename or equivalent). If this pattern matches the child scanner is selected. + + +Type: `string` + +### [].scanner + +The scanner to activate if this candidate passes. 
+ + +Type: `scanner` + +## Examples + +### Switch based on file name + +In this example a file input chooses a scanner based on the extension of each file: + +```yaml +input: + file: + paths: [ ./data/* ] + scanner: + switch: + - re_match_name: '\.avro$' + scanner: { avro: {} } + + - re_match_name: '\.csv$' + scanner: { csv: {} } + + - re_match_name: '\.csv\.gz$' + scanner: + decompress: + algorithm: gzip + into: + csv: {} + + - re_match_name: '\.tar$' + scanner: { tar: {} } + + - re_match_name: '\.tar\.gz$' + scanner: + decompress: + algorithm: gzip + into: + tar: {} + + - scanner: { to_the_end: {} } +``` diff --git a/tyk-docs/content/product-stack/tyk-streaming/guides/sync-responses.md b/tyk-docs/content/product-stack/tyk-streaming/guides/sync-responses.md new file mode 100644 index 0000000000..a31a1b413c --- /dev/null +++ b/tyk-docs/content/product-stack/tyk-streaming/guides/sync-responses.md @@ -0,0 +1,133 @@ +--- +title: Synchronous Responses +description: Explains synchronous responses +tags: [ "Tyk Streams", "Synchronous Responses" ] +--- + +In a regular Tyk Streams pipeline, messages will flow in one direction and acknowledgements flow in the other: + +```text + ----------- Message -------------> + +Input (AMQP) -> Processors -> Output (AMQP) + + <------- Acknowledgement --------- +``` + +However, Tyk Streams has support for a number of protocols where this limitation does not apply. + +For example, HTTP is a request/response protocol, and so our [http_server]({{< ref "/product-stack/tyk-streaming/configuration/inputs/http-server" >}}) input is capable of returning a response payload after consuming a message from a request.
+ +When using these protocols it's possible to configure Tyk Streams pipelines that allow messages to pass in the opposite direction, resulting in response messages at the input level: + +```text + --------- Request Body --------> + +Input (HTTP Server) -> Processors -> Output (Sync Response) + + <--- Response Body (and ack) --- +``` + +## Routing Processed Messages Back + +It's possible to route the result of any Tyk Streams processing pipeline directly back to an input with a [sync_response]({{< ref "/product-stack/tyk-streaming/configuration/outputs/sync-response" >}}) output: + +```yaml +input: + http_server: + path: /post +pipeline: + processors: + - mapping: root = content().uppercase() +output: + sync_response: {} +``` + +Using the above example, POSTing *foo bar* to the path `/post` returns the response *FOO BAR*. + +It's also possible to combine a [sync_response]({{< ref "/product-stack/tyk-streaming/configuration/outputs/sync-response" >}}) output with other outputs using a [broker]({{< ref "/product-stack/tyk-streaming/configuration/outputs/broker" >}}): + +```yaml +input: + http_server: + path: /post +output: + broker: + pattern: fan_out + outputs: + - kafka: + addresses: [ TODO:9092 ] + topic: foo_topic + - sync_response: {} + processors: + - mapping: root = content().uppercase() +``` + +Using the above example, POSTing *foo bar* to the path `/post` passes the message unchanged to the Kafka topic `foo_topic` and also returns the response *FOO BAR*. + +{{< note >}} +**Note** + +It's safe to use these mechanisms even when combining multiple inputs with a broker: a response payload will always be routed back to the original source of the message. +{{< /note>}} + +## Returning Partially Processed Messages + +It's possible to set the state of a message to be the synchronous response before processing is finished by using the [sync_response]({{< ref "/product-stack/tyk-streaming/configuration/processors/sync-response" >}}) processor.
This allows you to further mutate the payload without changing the response returned to the input: + +```yaml +input: + http_server: + path: /post + +pipeline: + processors: + - mapping: root = "%v baz".format(content().string()) + - sync_response: {} + - mapping: root = content().uppercase() + +output: + kafka: + addresses: [ TODO:9092 ] + topic: foo_topic +``` + +Using the above example, POSTing the request body *foo bar* to the path `/post` passes the message *FOO BAR BAZ* to the Kafka topic `foo_topic` and also returns the response *foo bar baz*. + +However, keep in mind that because of Tyk Streams' strict delivery guarantees, the response message is not actually returned until the message has reached its output destination and can be acknowledged. + +## Routing Output Responses Back + +Some outputs, such as [http_client]({{< ref "/product-stack/tyk-streaming/configuration/outputs/http-client" >}}), can propagate the payload returned by their destination back to the input after sending a message: + +```yaml +input: + http_server: + path: /post +output: + http_client: + url: http://localhost:4196/post + verb: POST + propagate_response: true +``` + +With the above example, a message received at the endpoint `/post` is sent unchanged to the address `http://localhost:4196/post`, and the response to that request is then returned to the original caller. This effectively turns Tyk Streams into a proxy server with the potential to mutate payloads between requests.
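As a rough sketch of what "mutating payloads between requests" could look like, the config below combines the proxy setup above with the `mapping` processor used earlier in this guide. It only reuses fields that already appear in the examples above, the upstream URL is the same placeholder address, and the combination has not been verified against a running Tyk Streams instance:

```yaml
input:
  http_server:
    path: /post

pipeline:
  processors:
    # Assumption: mutate the request body before it is proxied,
    # reusing the uppercase mapping from the earlier examples.
    - mapping: root = content().uppercase()

output:
  http_client:
    # Placeholder upstream address, same as in the example above.
    url: http://localhost:4196/post
    verb: POST
    # Return the upstream response to the original caller.
    propagate_response: true
```

Under these assumptions, POSTing *foo bar* to `/post` would forward *FOO BAR* upstream, and whatever the upstream service returns would be passed back as the response.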
+ +The following config turns Tyk Streams into an HTTP proxy server that also sends all request payloads to a Kafka topic: + +```yaml +input: + http_server: + path: /post +output: + broker: + pattern: fan_out + outputs: + - kafka: + addresses: [ TODO:9092 ] + topic: foo_topic + - http_client: + url: http://localhost:4196/post + verb: POST + propagate_response: true +``` diff --git a/tyk-docs/data/menu.yaml b/tyk-docs/data/menu.yaml index 7d50fc3971..1dae55f2ed 100644 --- a/tyk-docs/data/menu.yaml +++ b/tyk-docs/data/menu.yaml @@ -3974,6 +3974,14 @@ menu: path: /product-stack/tyk-streaming/configuration/inputs/broker category: Page show: True + - title: "Generate" + path: /product-stack/tyk-streaming/configuration/inputs/generate + category: Page + show: True + - title: "Http Client" + path: /product-stack/tyk-streaming/configuration/inputs/http-client + category: Page + show: True - title: "Http Server" path: /product-stack/tyk-streaming/configuration/inputs/http-server category: Page @@ -4026,6 +4034,10 @@ menu: path: /product-stack/tyk-streaming/configuration/outputs/fallback category: Page show: True + - title: "Http Client" + path: /product-stack/tyk-streaming/configuration/outputs/http-client + category: Page + show: True - title: "Http Server" path: /product-stack/tyk-streaming/configuration/outputs/http-server category: Page @@ -4066,6 +4078,10 @@ menu: path: /product-stack/tyk-streaming/configuration/outputs/stdout category: Page show: True + - title: "Sync Response" + path: /product-stack/tyk-streaming/configuration/outputs/sync-response + category: Page + show: True - title: "Processors" category: "Directory" show: True @@ -4118,6 +4134,10 @@ menu: path: /product-stack/tyk-streaming/configuration/processors/group-by-value category: Page show: True + - title: "Http" + path: /product-stack/tyk-streaming/configuration/processors/http + category: Page + show: True - title: "Insert Part" path: /product-stack/tyk-streaming/configuration/processors/insert-part 
category: Page @@ -4202,6 +4222,10 @@ menu: path: /product-stack/tyk-streaming/configuration/processors/switch category: Page show: True + - title: "Sync Response" + path: /product-stack/tyk-streaming/configuration/processors/sync-response + category: Page + show: True - title: "Try" path: /product-stack/tyk-streaming/configuration/processors/try category: Page @@ -4210,6 +4234,30 @@ menu: path: /product-stack/tyk-streaming/configuration/processors/while catgeory: Page show: True + - title: "Scanners" + category: Directory + show: True + menu: + - title: "Overview" + path: /product-stack/tyk-streaming/configuration/scanners/overview + category: Page + show: true + - title: "CSV" + path: /product-stack/tyk-streaming/configuration/scanners/csv + category: Page + show: True + - title: "Line" + path: /product-stack/tyk-streaming/configuration/scanners/lines + category: Page + show: True + - title: "Regular Expression Matching" + path: /product-stack/tyk-streaming/configuration/scanners/re-match + category: Page + show: True + - title: "Switch" + path: /product-stack/tyk-streaming/configuration/scanners/switch + category: Page + show: True - title: "Common Configurations" category: Directory show: True @@ -4310,6 +4358,10 @@ menu: path: "/product-stack/tyk-streaming/guides/bloblang/methods/type-coercion" category: Page show: True + - title: "Sync Response" + path: "/product-stack/tyk-streaming/guides/sync-responses" + category: Page + show: True - title: "Rate Limits" category: Directory show: true