Skip to content
Open
Show file tree
Hide file tree
Changes from 43 commits
Commits
Show all changes
48 commits
Select commit Hold shift + click to select a range
88fcce6
CMR-10691: Update CMR local dev to spin up new embedded non-granule c…
jmaeng72 Aug 22, 2025
ca785c7
CMR-10761: Update Access Control App to work with split-cluster (loca…
jmaeng72 Aug 28, 2025
5d62130
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Sep 2, 2025
601191d
Merge branch 'CMR-10600-split-cluster-feature' of https://github.com/…
jmaeng72 Sep 2, 2025
c391ac8
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Sep 4, 2025
62d0a13
CMR-10751: Add Indexer index-set and rebalance API changes (#2288)
jmaeng72 Sep 5, 2025
759a710
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Sep 9, 2025
2553fd5
CMR-10759: Update bulk index funcs in bootstrap, fix some indexer bug…
jmaeng72 Sep 10, 2025
cc7ccee
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Sep 10, 2025
5226fef
CMR-10762: Update Search App to work with split-cluster (locally) (#2…
jmaeng72 Sep 10, 2025
cb0a5cb
CMR-10764: Update sys-tests to work with split cluster
jmaeng72 Sep 15, 2025
673c3c5
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Sep 23, 2025
3c96a95
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Sep 24, 2025
12b5474
CMR-10765: Update split cluster to account for alias (#2310)
jmaeng72 Sep 29, 2025
404025d
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Oct 6, 2025
b728aa5
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Oct 15, 2025
94e500f
CMR-10940: Update reshard funcs to use split cluster (#2324)
jmaeng72 Oct 20, 2025
f881cc9
CMR-10941: Update db-migrate to update index-set with granule indexes…
jmaeng72 Oct 27, 2025
07ac650
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Oct 27, 2025
dd1a571
CMR-10600: Update reshard funcs to use split cluster (#2338)
jmaeng72 Nov 3, 2025
22baed1
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Nov 3, 2025
c4e83c9
update test
jmaeng72 Nov 3, 2025
50d1e6b
update reshard readme
jmaeng72 Nov 4, 2025
73bd39f
CMR-11011: Fix split cluster to work with new reshard index names (#2…
jmaeng72 Nov 10, 2025
4a13a91
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Nov 10, 2025
1dd3809
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Nov 11, 2025
6185eb9
CMR-11014: Update reshard fixes with split cluster features (#2345)
jmaeng72 Nov 14, 2025
ec9f062
update errors
jmaeng72 Nov 19, 2025
5cf21aa
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Nov 19, 2025
752cc41
update broken funcs
jmaeng72 Nov 19, 2025
870bd1e
fix critical errors on funcs
jmaeng72 Nov 20, 2025
8d7c1ac
Update func names, add unit tests, guard against nils
jmaeng72 Nov 20, 2025
1b57707
fix guard in get-es-cluster-name-by-index-or-alias
jmaeng72 Nov 20, 2025
c1d4473
fix format of require
jmaeng72 Nov 20, 2025
3f53bd2
update int tests
jmaeng72 Nov 20, 2025
7c1619e
fix test issues
jmaeng72 Nov 20, 2025
89e729c
fix unit test
jmaeng72 Nov 21, 2025
2c46b5e
fix valid data crud tests
jmaeng72 Nov 21, 2025
78d6c40
minor clean up of tests
jmaeng72 Nov 21, 2025
1658e1b
fix test again
jmaeng72 Nov 21, 2025
f07e5dd
fix int test again
jmaeng72 Nov 21, 2025
753c9b3
update comments and func descriptions
jmaeng72 Nov 25, 2025
99a08db
fix issues by coderabbit
jmaeng72 Nov 25, 2025
384e7bc
add minor comments
jmaeng72 Nov 26, 2025
fd3b583
combine elastic server setup funcs, add log toggle
jmaeng72 Nov 26, 2025
a8ae351
update comments and index set id
jmaeng72 Nov 26, 2025
cb9544d
Merge branch 'master' of https://github.com/nasa/Common-Metadata-Repo…
jmaeng72 Dec 4, 2025
25132aa
update docstring
jmaeng72 Dec 4, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 4 additions & 4 deletions access-control-app/src/cmr/access_control/data/bulk_index.clj
Original file line number Diff line number Diff line change
Expand Up @@ -40,13 +40,13 @@
* :force-version? - true indicates that we should overwrite whatever is in elasticsearch with the
latest regardless of whether the version in the database is older than the _version in elastic.
Returns a map with keys of :num-indexed and :max-revision-date."
([context concept-batches]
(bulk-index-with-revision-date context concept-batches nil))
([context concept-batches options]
([context concept-batches es-cluster-name]
(bulk-index-with-revision-date context concept-batches es-cluster-name nil))
([context concept-batches es-cluster-name options]
(reduce (fn [{:keys [num-indexed max-revision-date]} batch]
(let [max-revision-date (get-max-revision-date batch max-revision-date)
batch (prepare-batch context batch options)]
(es/bulk-index-documents context batch)
(es/bulk-index-documents context batch es-cluster-name)
{:num-indexed (+ num-indexed (count batch))
:max-revision-date max-revision-date}))
{:num-indexed 0 :max-revision-date nil}
Expand Down
3 changes: 2 additions & 1 deletion access-control-app/src/cmr/access_control/system.clj
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@
[cmr.common.log :as log :refer [info]]
[cmr.common.nrepl :as nrepl]
[cmr.common.system :as common-sys]
[cmr.elastic-utils.config :as es-config]
[cmr.elastic-utils.search.es-index :as search-index]
[cmr.message-queue.queue.queue-broker :as queue-broker]
[cmr.transmit.config :as transmit-config]
Expand Down Expand Up @@ -96,7 +97,7 @@
[]
(let [sys {:instance-name (common-sys/instance-name "access-control")
:log (log/create-logger-with-log-level (log-level))
:search-index (search-index/create-elastic-search-index)
:search-index (search-index/create-elastic-search-index es-config/elastic-name)
:web (web-server/create-web-server (transmit-config/access-control-port) routes/handlers)
:nrepl (nrepl/create-nrepl-if-configured (access-control-nrepl-port))
:queue-broker (queue-broker/create-queue-broker (config/queue-config))
Expand Down
3 changes: 2 additions & 1 deletion access-control-app/src/cmr/db.clj
Original file line number Diff line number Diff line change
Expand Up @@ -2,14 +2,15 @@
"Entry point for the db (elasticsearch) related operations. Defines a main method that accepts arguments."
(:require
[cmr.common.log :refer (info error)]
[cmr.elastic-utils.config :as es-config]
[cmr.elastic-utils.search.access-control-index :as ac-index]
[cmr.elastic-utils.search.es-index :as search-index]
[cmr.common.lifecycle :as l])
(:gen-class))

(defn migrate
[]
(let [elastic-store (l/start (search-index/create-elastic-search-index) nil)]
(let [elastic-store (l/start (search-index/create-elastic-search-index es-config/elastic-name) nil)]
(ac-index/create-index-or-update-mappings elastic-store)))

(defn -main
Expand Down
24 changes: 21 additions & 3 deletions bootstrap-app/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -94,10 +94,16 @@ Starts the resharding process to create a new index with the given number of sha
the data from the old index to the new index. Both indexes will be used for ingest until
the resharding is finalized.

You MUST give the elastic_name parameter to tell CMR which cluster your index is in that is going to be resharded.

- Required params:
- elastic_name (str)
- options: gran-elastic and elastic

```
curl -i \
-X POST \
"http://localhost:3005/reshard/1_small_collections/start?num_shards=50"
"http://localhost:3005/reshard/1_small_collections/start?num_shards=50&elastic_name=gran-elastic"

HTTP/1.1 200 OK
{"message": "Resharding started for index 1_small_collections"}
Expand All @@ -107,10 +113,16 @@ HTTP/1.1 200 OK

Retrieves the resharding status for an index, including the original index name, target index name, and current resharding status. Returns a 404 status code if the specified index is not currently undergoing resharding.

You MUST give the elastic_name parameter to tell CMR which cluster your index is in that is going to be resharded.

- Required params:
- elastic_name (str)
- options: gran-elastic and elastic

```
curl -i \
-H "Accept: application/json" \
http://localhost:3006/reshard/1_c1234_prov1/status
http://localhost:3006/reshard/1_c1234_prov1/status?elastic_name=gran-elastic

HTTP/1.1 200 OK
{"original-index":"1_c1234_prov1","reshard-index":"1_c1234_prov1_75_shards", "reshard-status": "COMPLETE"}
Expand All @@ -120,10 +132,16 @@ HTTP/1.1 200 OK
Finalizes the resharding process to move the ElasticSearch alias to point to the newly resharded
index. Returns a 400 error if the index resharding is not complete.

You MUST give the elastic_name parameter to tell CMR which cluster your index is in that is going to be resharded.

- Required params:
- elastic_name (str)
- options: gran-elastic and elastic

```
curl -i \
-X POST \
"http://localhost:3005/reshard/1_small_collections/finalize"
"http://localhost:3005/reshard/1_small_collections/finalize?elastic_name=gran-elastic"

HTTP/1.1 200 OK
{"message": "Resharding completed for index 1_small_collections"}
Expand Down
33 changes: 23 additions & 10 deletions bootstrap-app/src/cmr/bootstrap/api/resharding.clj
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
(ns cmr.bootstrap.api.resharding
"Defines the resharding functions for the bootstrap API."
(:require
[clojure.string :as string]
[cmr.bootstrap.api.messages :as msg]
[cmr.bootstrap.api.util :as api-util]
[cmr.bootstrap.services.bootstrap-service :as service]
Expand All @@ -21,24 +22,36 @@
[(format "Invalid num_shards [%s]. Only integers greater than zero are allowed."
num-shards-str)]))))

(defn- validate-es-cluster-name-not-blank
[es-cluster-name]
(when (string/blank? es-cluster-name)
(errors/throw-service-error :bad-request "Empty elastic cluster name is not allowed.")))

(defn start
"Kicks off resharding of an index."
[context index params]
(let [dispatcher (api-util/get-dispatcher context params :migrate-index)
num-shards-str (:num_shards params)]
(validate-num-shards num-shards-str)
(service/start-reshard-index context dispatcher index (parse-long num-shards-str))
(let [es-cluster-name (:elastic_name params)
_ (validate-es-cluster-name-not-blank es-cluster-name)
num-shards-str (:num_shards params)
_ (validate-num-shards num-shards-str)
dispatcher (api-util/get-dispatcher context params :migrate-index)]

(service/start-reshard-index context dispatcher index (parse-long num-shards-str) es-cluster-name)
{:status 200
:body {:message (msg/resharding-started index)}}))

(defn get-status
"Gets the status of resharding an index."
[context index]
(service/reshard-status context index))
[context index params]
(let [es-cluster-name (:elastic_name params)]
(validate-es-cluster-name-not-blank es-cluster-name)
(service/reshard-status context index es-cluster-name)))

(defn finalize
"Completes resharding the index"
[context index]
(service/finalize-reshard-index context index)
{:status 200
:body {:message (msg/resharding-completed index)}})
[context index params]
(let [es-cluster-name (:elastic_name params)]
(validate-es-cluster-name-not-blank es-cluster-name)
(service/finalize-reshard-index context index es-cluster-name)
{:status 200
:body {:message (msg/resharding-completed index)}}))
8 changes: 4 additions & 4 deletions bootstrap-app/src/cmr/bootstrap/api/routes.clj
Original file line number Diff line number Diff line change
Expand Up @@ -120,12 +120,12 @@
(POST "/start" {:keys [request-context params]}
(acl/verify-ingest-management-permission request-context :update)
(resharding/start request-context index params))
(POST "/finalize" {:keys [request-context]}
(POST "/finalize" {:keys [request-context params]}
(acl/verify-ingest-management-permission request-context :update)
(resharding/finalize request-context index))
(GET "/status" {:keys [request-context]}
(resharding/finalize request-context index params))
(GET "/status" {:keys [request-context params]}
(acl/verify-ingest-management-permission request-context :update)
(resharding/get-status request-context index)))
(resharding/get-status request-context index params)))
(context "/virtual_products" []
(POST "/" {:keys [request-context params]}
(virtual-products/bootstrap request-context params)))
Expand Down
72 changes: 46 additions & 26 deletions bootstrap-app/src/cmr/bootstrap/data/bulk_index.clj
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,7 @@
[cmr.common.concepts :as cc]
[cmr.common.log :refer (info warn error)]
[cmr.common.util :as util]
[cmr.elastic-utils.config :as es-config]
[cmr.elastic-utils.es-helper :as es-helper]
[cmr.indexer.indexer-util :as indexer-util]
[cmr.indexer.data.index-set :as index-set]
Expand Down Expand Up @@ -64,19 +65,19 @@
(db/get-concept (helper/get-metadata-db-db (:system context)) :collection provider collection-id))

(defn migrate-index
"Copy the contents of one index to another."
[system source-index target-index]
(info (format "Migrating from index [%s] to index [%s]" source-index target-index))
"Copy the contents of one index to another. The target index is assumed to be in the same elastic cluster as the source index."
[system source-index target-index elastic-name]
(info (format "Migrating from index [%s] to index [%s] in es cluster [%s]" source-index target-index elastic-name))
(let [indexer-context {:system (helper/get-indexer system)}
conn (indexer-util/context->conn indexer-context)]
conn (indexer-util/context->conn indexer-context elastic-name)]
(try
(let [result (es-helper/migrate-index conn source-index target-index)]
(when (:error result)
(throw (ex-info "Migration failed" {:source source-index :target target-index :error result}))))
(index-set-service/update-resharding-status indexer-context index-set/index-set-id source-index "COMPLETE")
(index-set-service/update-resharding-status indexer-context index-set/index-set-id source-index "COMPLETE" elastic-name)
(catch Throwable e
(error e (format "Migration from [%s] to [%s] failed: %s" source-index target-index (.getMessage e)))
(index-set-service/update-resharding-status indexer-context index-set/index-set-id source-index "FAILED")
(index-set-service/update-resharding-status indexer-context index-set/index-set-id source-index "FAILED" elastic-name)
(throw e)))))

(defn index-granules-for-collection
Expand All @@ -92,6 +93,7 @@
concept-batches (db/find-concepts-in-batches db provider params (:db-batch-size system) start-index)
num-granules (index/bulk-index {:system (helper/get-indexer system)}
concept-batches
es-config/gran-elastic-name
{:target-index-key target-index-key})]
(info "Indexed" num-granules "granule(s) for provider" provider-id "collection" collection-id)
(when completion-message
Expand All @@ -113,7 +115,10 @@
params {:concept-type :granule
:provider-id provider-id}
concept-batches (db/find-concepts-in-batches db provider params (:db-batch-size system) start-index)
num-granules (index/bulk-index {:system (helper/get-indexer system)} concept-batches {})]
num-granules (index/bulk-index {:system (helper/get-indexer system)}
concept-batches
es-config/gran-elastic-name
{})]
(info "Indexed" num-granules "granule(s) for provider" provider-id)
num-granules))

Expand All @@ -125,7 +130,10 @@
params {:concept-type :collection
:provider-id provider-id}
concept-batches (db/find-concepts-in-batches db provider params (:db-batch-size system))
num-collections (index/bulk-index {:system (helper/get-indexer system)} concept-batches {})]
num-collections (index/bulk-index {:system (helper/get-indexer system)}
concept-batches
es-config/elastic-name
{})]
(info "Indexed" num-collections "collection(s) for provider" provider-id)
num-collections))

Expand All @@ -145,10 +153,10 @@

(defn- bulk-index-concept-batches
"Bulk index the given concept batches in both regular index and all revisions index."
[system concept-batches]
[system concept-batches es-cluster-name]
(let [indexer-context {:system (helper/get-indexer system)}]
(index/bulk-index indexer-context concept-batches {:all-revisions-index? true})
(index/bulk-index indexer-context concept-batches {})))
(index/bulk-index indexer-context concept-batches es-cluster-name {:all-revisions-index? true})
(index/bulk-index indexer-context concept-batches es-cluster-name {})))

(defn- index-concepts-by-provider
"Bulk index concepts for the given provider and concept-type."
Expand All @@ -165,7 +173,10 @@
db provider
params
(:db-batch-size system))
num-concepts (bulk-index-concept-batches system concept-batches)
es-cluster-name (if (= concept-type :granule)
es-config/gran-elastic-name
es-config/elastic-name)
num-concepts (bulk-index-concept-batches system concept-batches es-cluster-name)
msg (format "Indexing of %s %s revisions for provider %s completed."
num-concepts
(name concept-type)
Expand All @@ -190,14 +201,14 @@
(defn- index-access-control-concepts
"Bulk index ACLs or access groups"
[system concept-batches]
(info "Indexing concepts")
(ac-bulk-index/bulk-index-with-revision-date {:system (helper/get-indexer system)} concept-batches))
(info "Indexing access control concepts")
(ac-bulk-index/bulk-index-with-revision-date {:system (helper/get-indexer system)} concept-batches es-config/elastic-name))

(defn- index-concepts
"Bulk index the given concepts using the indexer-app"
[system concept-batches]
[system concept-batches es-cluster-name]
(info "Indexing concepts")
(index/bulk-index-with-revision-date {:system (helper/get-indexer system)} concept-batches))
(index/bulk-index-with-revision-date {:system (helper/get-indexer system)} concept-batches es-cluster-name))

(defn- fetch-and-index-new-concepts
"Get batches of concepts for a given provider/concept type that have a revision-date
Expand All @@ -212,10 +223,13 @@
(dissoc params :provider-id)
params)
concept-batches (db/find-concepts-in-batches
db provider params (:db-batch-size system))
db provider params (:db-batch-size system))
es-cluster-name (if (= concept-type :granule)
es-config/gran-elastic-name
es-config/elastic-name)
{:keys [max-revision-date num-indexed]} (if (contains? #{:acl :access-group} concept-type)
(index-access-control-concepts system concept-batches)
(index-concepts system concept-batches))]
(index-access-control-concepts system concept-batches)
(index-concepts system concept-batches es-cluster-name))]

(info (format (str "Indexed %d %s(s) for provider %s with revision-date later than %s and max "
"revision date was %s.")
Expand All @@ -241,7 +255,7 @@
(:db-batch-size system)
start-index)]]
(:num-indexed (if (= concept-type :tag)
(index-concepts system concept-batches)
(index-concepts system concept-batches es-config/elastic-name)
(index-access-control-concepts system concept-batches)))))]
(info "Indexed" total "system concepts.")
total))
Expand All @@ -261,12 +275,15 @@
{:concept-type concept-type :concept-id batch}
(:db-batch-size system))]
concept-batch)
total (index/bulk-index {:system (helper/get-indexer system)} concept-batches)]
es-cluster-name (if (= :granule concept-type)
es-config/gran-elastic-name
es-config/elastic-name)
total (index/bulk-index {:system (helper/get-indexer system)} concept-batches es-cluster-name)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume the first parameter is a context, and it is now repeated (and I see it below as well). Can you make this a local function in case we have to add something to all of them latter we can do it once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this warrants a separate func since it just calls another func to return a var


;; for concept types that have all revisions index, also index the all revisions index
(when-not (#{:tag :granule} concept-type)
(index/bulk-index
{:system (helper/get-indexer system)} concept-batches {:all-revisions-index? true}))
{:system (helper/get-indexer system)} concept-batches es-config/elastic-name {:all-revisions-index? true}))

(info "Indexed " total " concepts.")
total))
Expand All @@ -279,7 +296,7 @@
[system _ _ concept-ids]
(let [query {:terms {:concept-id concept-ids}}
indexer-context {:system (helper/get-indexer system)}]
(es-helper/delete-by-query (indexer-util/context->conn indexer-context) "_all" "granule" query)))
(es-helper/delete-by-query (indexer-util/context->conn indexer-context es-config/gran-elastic-name) "_all" "granule" query)))

(defmethod delete-concepts-by-id :default
[system provider-id concept-type concept-ids]
Expand All @@ -295,7 +312,10 @@
{:concept-type concept-type :concept-id batch}
(:db-batch-size system))]
(map #(assoc % :deleted true) concept-batch))
total (index/bulk-index {:system (helper/get-indexer system)} concept-batches)]
es-cluster-name (if (= concept-type :granule)
es-config/gran-elastic-name
es-config/elastic-name)
total (index/bulk-index {:system (helper/get-indexer system)} concept-batches es-cluster-name)]
(info "Deleted " total " concepts")
total))

Expand Down Expand Up @@ -409,7 +429,7 @@
(let [channel (:migrate-index-channel core-async-dispatcher)]
(async/thread (while true
(try ; log errors but keep the thread alive)
(let [{:keys [source-index target-index]} (<!! channel)]
(migrate-index system source-index target-index))
(let [{:keys [source-index target-index elastic-name]} (<!! channel)]
(migrate-index system source-index target-index elastic-name))
(catch Throwable e
(error e (.getMessage e)))))))))
5 changes: 3 additions & 2 deletions bootstrap-app/src/cmr/bootstrap/data/rebalance_util.clj
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
[clojurewerkz.elastisch.query :as q]
[cmr.bootstrap.embedded-system-helper :as helper]
[cmr.elastic-utils.es-helper :as es-helper]
[cmr.elastic-utils.config :as es-config]
[cmr.indexer.indexer-util :as indexer-util]
[cmr.indexer.data.index-set :as index-set]))

Expand All @@ -20,7 +21,7 @@
(defn- granule-count-for-collection
"Gets the granule count for the collection in the elastic index."
[indexer-context index-name concept-id]
(let [conn (indexer-util/context->conn indexer-context)
(let [conn (indexer-util/context->conn indexer-context es-config/gran-elastic-name)
query (es-query-for-collection-concept-id concept-id)]
(:count (es-helper/count-query conn index-name granule-mapping-type-name query))))

Expand Down Expand Up @@ -59,5 +60,5 @@
small-coll-index (get-in index-names [:index-names :granule :small_collections])
granule-mapping-type-name (-> index-set/granule-mapping keys first name)]
(es-helper/delete-by-query
(indexer-util/context->conn indexer-context) small-coll-index granule-mapping-type-name
(indexer-util/context->conn indexer-context es-config/gran-elastic-name) small-coll-index granule-mapping-type-name
(es-query-for-collection-concept-id concept-id))))
Loading