Consider removing unnecessary custom date format from index mapping #11

Open
lukas-vlcek opened this issue Sep 22, 2016 · 3 comments


@lukas-vlcek
Member

Currently, we are specifying custom date formats for some date fields in index templates, like:

"format": "yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ||yyyy-MM-dd'T'HH:mm:ssZ||dateOptionalTime",
"type": "date"

The following simple script tests whether Elasticsearch can index values formatted with those patterns out of the box (the script is also available here).

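#!/bin/bash

# Base URL of the Elasticsearch node under test.
ES=http://localhost:9200

# Drop the test index so that each value starts from an empty mapping.
function delete_index() {
  curl -X DELETE "${ES}/test"
}

function refresh() {
  curl -X POST "${ES}/_refresh"
}

# Index a single document whose "date" field holds the given value.
function index_document() {
  echo "Testing $1"
  curl -X POST "${ES}/test/1" -d "{
    \"date\": \"$1\"
  }"
}

# Print the mapping that Elasticsearch derived for the test index.
function mapping() {
  curl -X GET "${ES}/test/_mapping?pretty"
}

# test formats: "yyyy-MM-dd'T'HH:mm:ss.SSSSSSZ||yyyy-MM-dd'T'HH:mm:ssZ"
for i in 2014-01-17T15:57:22.123456Z 2014-01-17T15:57:22Z
do
  delete_index
  index_document "$i"
  mapping
  refresh
done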

As the results below show, both values are indexed correctly. I think it makes sense to declare only those custom formats that Elasticsearch does not index by default, so that a declared format signals that we actually expect values in it. Declaring formats that already work out of the box may cause confusion later ("Did we have issues indexing those values? What kind of issues?").

Results for Elasticsearch:

  • v1.7.2
# Testing 2014-01-17T15:57:22.123456Z
{"_index":"test","_type":"1","_id":"AVdRJMoWGkau0j2E_IYd","_version":1,"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "dateOptionalTime"
}}}}}}

# Testing 2014-01-17T15:57:22Z
{"_index":"test","_type":"1","_id":"AVdRJMtNGkau0j2E_IYe","_version":1,"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "dateOptionalTime"
}}}}}}
  • v2.3.5
# Testing 2014-01-17T15:57:22.123456Z
{"_index":"test","_type":"1","_id":"AVdRJlQxVUFkKwD07My_","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
}}}}}}

# Testing 2014-01-17T15:57:22Z
{"_index":"test","_type":"1","_id":"AVdRJlVtVUFkKwD07MzA","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
}}}}}}
  • v2.4.0
# Testing 2014-01-17T15:57:22.123456Z
{"_index":"test","_type":"1","_id":"AVdRJxF4bHsb53sfaNXO","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
}}}}}}

# Testing 2014-01-17T15:57:22Z
{"_index":"test","_type":"1","_id":"AVdRJxLKbHsb53sfaNXP","_version":1,"_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date",
            "format" : "strict_date_optional_time||epoch_millis"
}}}}}}
  • v5.0.0-alpha5
# Testing 2014-01-17T15:57:22.123456Z
{"_index":"test","_type":"1","_id":"AVdRKG2bWRbH4ugGaVvq","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date"
}}}}}}

# Testing 2014-01-17T15:57:22Z
{"_index":"test","_type":"1","_id":"AVdRKG7tWRbH4ugGaVvr","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}{
  "test" : {
    "mappings" : {
      "1" : {
        "properties" : {
          "date" : {
            "type" : "date"
}}}}}}
@portante
Member

I finally took a look at this, sorry it took me so long to get to it.

I was not able to make this fail in our environment, 1.7.5:

{
  status: 200,
  name: "staging-perf34",
  cluster_name: "elasticsearch.perf-dept",
  version: {
    number: "1.7.5",
    build_hash: "00f95f4ffca6de89d68b7ccaf80d148f1f70e4d4",
    build_timestamp: "2016-02-02T09:55:30Z",
    build_snapshot: false,
    lucene_version: "4.10.4"
  },
  tagline: "You Know, for Search"
}

I am going to guess that this was required for either 1.5 or an earlier version and is no longer needed with the latest versions of ES.

@lukas-vlcek
Member Author

Thanks @portante

Yeah, it is possible that it was needed in some earlier ES versions. It is not needed now [1.7.x - 5.0.x], though that may change again in the future.

Maybe the best thing to do would be to remove these specific date formats from the index templates and, at the same time, make sure we have tests in place that verify Elasticsearch can correctly index a couple of such date strings out of the box. I am not sure where to put such tests at the moment (somewhere in openshift/origin-aggregated-logging?), but I think they should be part of the Common Data Model introduction effort.
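
A minimal sketch of what such a test could look like, reusing the approach of the script at the top of this issue. The function names (`fetch_mapping`, `mapping_has_date`, `check_date`) are made up for illustration. It assumes dynamic date detection is left at its default and that the `?pretty` mapping output uses the `"type" : "date"` spacing seen in the results above. It inspects the mapping rather than the HTTP status because, with a dynamic mapping, a value that is not recognized as a date would most likely still be accepted, just mapped as a string.

```shell
#!/bin/bash

# Base URL of the Elasticsearch node under test (assumed; override via the environment).
ES=${ES:-http://localhost:9200}

# Print the mapping of the test index in the ?pretty layout.
fetch_mapping() {
  curl -s -X GET "${ES}/test/_mapping?pretty"
}

# Succeed when the mapping JSON on stdin declares a field of type "date".
mapping_has_date() {
  grep -q '"type" : "date"'
}

# Recreate the test index with one document holding the given date string,
# then report whether Elasticsearch mapped the field as a date.
check_date() {
  curl -s -X DELETE "${ES}/test" > /dev/null
  curl -s -X POST "${ES}/test/1" -d "{\"date\": \"$1\"}" > /dev/null
  if fetch_mapping | mapping_has_date; then
    echo "OK $1"
  else
    echo "FAIL $1"
    return 1
  fi
}
```

A test run would call `check_date` once per date string we care about (for example the two values used in the script above) and fail the build if any call returns non-zero.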

@portante
Member

Hopefully we can do this as part of ViaQ here.
