upgrade to current ElasticSearch #63

Open
svigerske opened this issue Aug 12, 2024 · 2 comments

Comments

@svigerske
Member

I spent a few days getting ElasticSearch 8.15 up and running and started migrating data from an ElasticSearch 2.x server.
But now I noticed that the README here says that Rubberband only works with ElasticSearch 2.x (yes, I should have seen that earlier).

I see that there is a branch upgrade-elasticsearch that hasn't received updates for 6 years. @fschloesser Do you remember what the state of this is? What is the difficulty in using a more recent ElasticSearch?

@fschloesser
Collaborator

If I remember correctly, it wasn't a trivial change to migrate the database objects from the old version to the new version. Some fundamental change was introduced after version 2.x of elasticsearch and I didn't find the time to look into this in more depth. The comment that you're quoting applies to the object structure that is currently used in rubberband.
What I imagine needs to happen is writing a migration script that translates the objects from the current database into a new format and loads them into the new database. The way that rubberband interacts with the elasticsearch database probably needs some adjustments as well.

@svigerske
Member Author

Simply starting ES 8.15 on the data from ES 2.x indeed does not work. Upgrading one major release at a time may work, but the reindex-from-remote feature of ES also seemed promising. I got a migration started with it; I'll just put the script here so I can find it again later:

# create the index and set dynamic mapping to runtime to get double as default for numerics
# https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-field-mapping.html
# https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic.html
curl -X PUT "localhost:9200/solver-results" -H 'Content-Type: application/json' -d '
{
  "mappings": {
    "dynamic": "runtime"
  }
}'

# raise the total-fields limit on all indices because we have many fields (default is 1000)
curl -X PUT "localhost:9200/_all/_settings?preserve_existing=false" -H 'Content-Type: application/json' -d '{ "index.mapping.total_fields.limit" : "20000" }'

# get data from previous ES server, but tunnel through system outside ZIB because old server is not reachable from new one
# https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html#reindex-from-remote
curl -X POST "localhost:9200/_reindex?pretty" -H 'Content-Type: application/json' -d'
{
  "source": {
    "remote": {
      "host": "https://example.gams.com:443",
      "username": "of the auth for the es server",
      "password": "of the auth for the es server"
    },
    "index": "solver-results",
    "size": 1
  },
  "dest": {
    "index": "solver-results"
  }
}
'
# "query": { "range" : { "upload_timestamp": { "gte" : "2020-01-01" } } },

However, there is a size limit of 100MB, which cannot be adjusted (it is not the http.max_content_length), and even with size:1 (that is, one document per batch), this process eventually failed. Probably some out file is that large. Also, it was very slow, partly due to setting size:1 and partly due to having to tunnel through a machine outside ZIB so that the new server can reach the old one.
And that was before even attempting to get Rubberband to talk to ES 8.15.
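
One way to narrow down which documents hit the limit might be to reindex in date windows, using the range query from the commented-out line in the script above. A minimal sketch, assuming upload_timestamp is a date field on the documents; the helper name reindex_body and the window boundaries are made up for illustration, and host and credentials are the same placeholders as above:

```shell
#!/usr/bin/env bash
# Sketch only: build a reindex-from-remote request body restricted to one
# date window. reindex_body is a hypothetical helper, not part of Rubberband.
# Assumes upload_timestamp is a date field on the source documents.
reindex_body() {
  local from="$1" to="$2"
  cat <<EOF
{
  "source": {
    "remote": {
      "host": "https://example.gams.com:443",
      "username": "of the auth for the es server",
      "password": "of the auth for the es server"
    },
    "index": "solver-results",
    "size": 1,
    "query": { "range": { "upload_timestamp": { "gte": "$from", "lt": "$to" } } }
  },
  "dest": {
    "index": "solver-results"
  }
}
EOF
}

# Print the body for one window. To actually run it, pipe it to curl:
#   reindex_body 2020-01-01 2021-01-01 | \
#     curl -X POST "localhost:9200/_reindex?pretty" -H 'Content-Type: application/json' -d @-
reindex_body 2020-01-01 2021-01-01
```

If a window fails, it can be split into smaller windows until the offending documents are isolated. This does not remove the 100MB limit itself, it only helps to find out where it bites.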
