PPL queries use timestamp() resulting in scripted plan #632

Open
@jonnangle

Description

What happened:

PPL queries use timestamp() to construct timestamp objects when searching, which results in a scripted plan. This is very slow compared to allowing OpenSearch to interpret the raw timestamp directly.

For example, executing this query takes ~10 seconds on our data set, because the plan uses a script to filter:

POST /_plugins/_ppl/_explain
{
  "query": "index=myindex | where `@timestamp` >= timestamp('2025-05-07 15:04:05') and `@timestamp` <= timestamp('2025-05-07 15:38:30')"
}

{
  "root": {
    "name": "ProjectOperator",
    "description": {
      "fields": "[cloud, http, @timestamp, event, log]"
    },
    "children": [
      {
        "name": "OpenSearchIndexScan",
        "description": {
          "request": """OpenSearchQueryRequest(indexName=myindex, sourceBuilder={"from":0,"size":200,"timeout":"1m","query":{"bool":{"filter":[{"script":{"script":{"source":"<elided>","lang":"opensearch_query_expression"},"boost":1.0}},{"script":{"script":{"source":"<elided>","lang":"opensearch_query_expression"},"boost":1.0}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":["log","@timestamp","http","cloud","event"],"excludes":[]},"sort":[{"_doc":{"order":"asc"}}]}, needClean=true, searchDone=false, pitId=null, cursorKeepAlive=null, searchAfter=null, searchResponse=null)"""
        },
        "children": []
      }
    ]
  }
}

By contrast, the same query without timestamp() takes ~200 ms, because it can use a range query directly:

POST /_plugins/_ppl/_explain
{
  "query": "index=myindex | where `@timestamp` >= '2025-05-07 15:04:05' and `@timestamp` <= '2025-05-07 15:38:30'"
}

{
  "root": {
    "name": "ProjectOperator",
    "description": {
      "fields": "[cloud, http, @timestamp, event, log]"
    },
    "children": [
      {
        "name": "OpenSearchIndexScan",
        "description": {
          "request": """OpenSearchQueryRequest(indexName=myindex, sourceBuilder={"from":0,"size":200,"timeout":"1m","query":{"bool":{"filter":[{"range":{"@timestamp":{"from":1746630245000,"to":null,"include_lower":true,"include_upper":true,"boost":1.0}}},{"range":{"@timestamp":{"from":null,"to":1746632310000,"include_lower":true,"include_upper":true,"boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":["@timestamp","event","log","cloud","http"],"excludes":[]},"sort":[{"_doc":{"order":"asc"}}]}, needClean=true, searchDone=false, pitId=null, cursorKeepAlive=null, searchAfter=null, searchResponse=null)"""
        },
        "children": []
      }
    ]
  }
}
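
As a cross-check on the two plans above, here is a minimal Python sketch of the literal-only form of the filter. It assumes the timestamp literals are interpreted as UTC, which is consistent with the epoch-millisecond bounds in the range plan. The helper names `to_epoch_millis` and `build_ppl_time_filter` are hypothetical and are not part of the plugin.

```python
from datetime import datetime, timezone


def to_epoch_millis(literal: str) -> int:
    """Parse a 'YYYY-MM-DD HH:MM:SS' literal as UTC (assumption) and return epoch millis."""
    parsed = datetime.strptime(literal, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp()) * 1000


def build_ppl_time_filter(index: str, field: str, start: str, end: str) -> str:
    """Build a PPL query that compares the field against raw string literals.

    Hypothetical helper: it deliberately omits timestamp() so the comparison
    has the same shape as the query that produced the range plan.
    """
    return (
        f"index={index} | where `{field}` >= '{start}' "
        f"and `{field}` <= '{end}'"
    )


start, end = "2025-05-07 15:04:05", "2025-05-07 15:38:30"
print(to_epoch_millis(start), to_epoch_millis(end))
print(build_ppl_time_filter("myindex", "@timestamp", start, end))
```

Under the UTC assumption, the two literals convert to the same `from` and `to` values that appear in the range plan, which suggests the literal form carries the same time window as the timestamp() form.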

What you expected to happen:

The plan should avoid expensive script queries where possible, for example by turning timestamp() comparisons into a range query.
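
One way to check for this regression is to inspect the `_explain` output for script filters. The following is a rough sketch, assuming the explain response has already been parsed into a Python dict with the `root` / `description` / `children` shape shown above. The function names are hypothetical, and the request strings in the example are abbreviated from the plans in this report.

```python
from typing import Any, Dict, List


def scan_requests(node: Dict[str, Any]) -> List[str]:
    """Collect the 'request' description of every node in an explain plan tree."""
    found: List[str] = []
    request = node.get("description", {}).get("request")
    if isinstance(request, str):
        found.append(request)
    for child in node.get("children", []):
        found.extend(scan_requests(child))
    return found


def uses_script_filter(plan: Dict[str, Any]) -> bool:
    """Return True if any pushed-down request contains a script query clause."""
    # Plain substring match on the serialized request; enough for a smoke test.
    return any('{"script":' in request for request in scan_requests(plan["root"]))


def make_plan(request: str) -> Dict[str, Any]:
    """Wrap an abbreviated request string in the plan shape from this report."""
    scan = {"name": "OpenSearchIndexScan", "description": {"request": request}, "children": []}
    return {"root": {"name": "ProjectOperator", "description": {}, "children": [scan]}}


# Abbreviated versions of the two requests shown in this report.
scripted_plan = make_plan('sourceBuilder={"query":{"bool":{"filter":[{"script":{"script":{"source":"<elided>"}}}]}}}')
range_plan = make_plan('sourceBuilder={"query":{"bool":{"filter":[{"range":{"@timestamp":{"from":1746630245000,"to":null}}}]}}}')

print(uses_script_filter(scripted_plan), uses_script_filter(range_plan))
```

The substring check is intentionally crude: the `request` field is a serialized Java object description rather than pure JSON, so parsing it fully would need more care than a smoke test warrants.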

How to reproduce it (as minimally and precisely as possible):

See above.

Anything else we need to know?:

Environment:

  • Grafana version: Grafana v11.6.0+security-01 (bb93ae3e12)
  • OpenSearch version: AWS Serverless
  • Plugin version: 2.25.0
