
Renaming the node role search to warm #17573


Merged
6 commits merged into opensearch-project:main on Mar 18, 2025

Conversation

vinaykpud
Contributor

@vinaykpud vinaykpud commented Mar 11, 2025

In this PR we rename the existing node "search" role to "warm".

Description

This is done based on the decision taken as part of the discussion in this thread: #17422 (comment)
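
Concretely, the rename shows up in a node's `opensearch.yml` role configuration. An illustrative before/after (the `node.roles` setting is the standard way roles are assigned):

```yaml
# Before this change (2.x): the role is configured as "search"
node.roles: [ search ]

# After this change (3.0): the same role is configured as "warm"
node.roles: [ warm ]
```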

Related Issues

Related to #15306
Related to #17422

Check List

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
@github-actions bot added the "bug" and "Search:Performance" labels on Mar 11, 2025
@vinaykpud changed the title from "Renaming search node role to warm" to "Renaming the node role search to warm" on Mar 11, 2025
@mch2
Member

mch2 commented Mar 11, 2025

This is a straight rename, correct? It does not resolve #17422?


❌ Gradle check result for 835fd60: FAILURE

Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?

@vinaykpud
Contributor Author

This is a straight rename correct? It does not resolve #17422 ?

Yes. Do you want me to open a new issue for this?


❌ Gradle check result for b0d026a: FAILURE


@vinaykpud vinaykpud closed this Mar 11, 2025
@vinaykpud vinaykpud reopened this Mar 11, 2025

✅ Gradle check result for b0d026a: SUCCESS


codecov bot commented Mar 11, 2025

Codecov Report

Attention: Patch coverage is 92.85714% with 2 lines in your changes missing coverage. Please review.

Project coverage is 72.49%. Comparing base (1166998) to head (d05b81a).
Report is 2 commits behind head on main.

Files with missing lines Patch % Lines
.../java/org/opensearch/env/NodeRepurposeCommand.java 81.81% 0 Missing and 2 partials ⚠️
Additional details and impacted files
@@             Coverage Diff              @@
##               main   #17573      +/-   ##
============================================
+ Coverage     72.48%   72.49%   +0.01%     
+ Complexity    65771    65752      -19     
============================================
  Files          5311     5311              
  Lines        304973   304973              
  Branches      44229    44229              
============================================
+ Hits         221045   221085      +40     
+ Misses        65830    65760      -70     
- Partials      18098    18128      +30     


@andrross
Member

Let's make sure this aligns with what @gbbafna mentioned here before merging.

We'd generally want to do a deprecation cycle before just replacing a term like this. However, it does appear that "search" was a bit of a misnomer and is not in wide use beyond searchable snapshots, so I'm not totally opposed to doing this to unblock the use of the "search" role where it might actually make more sense.
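
For context, a deprecation cycle for a role name typically keeps the old name working while emitting a warning, so operators can migrate before the name is removed. A minimal, hypothetical sketch of that pattern — the class and method names here are illustrative stand-ins, not OpenSearch's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the deprecation-cycle alternative discussed above:
// keep resolving the legacy "search" role name, but record a deprecation
// warning so operators can migrate configs before the old name is removed.
public final class DeprecatedRoleName {
    static final List<String> warnings = new ArrayList<>();

    static String resolveRoleName(String configured) {
        if ("search".equals(configured)) {
            warnings.add("node role [search] is deprecated; use [warm] instead");
            return "warm"; // the old name still resolves to the renamed role
        }
        return configured;
    }

    public static void main(String[] args) {
        System.out.println(resolveRoleName("search")); // -> warm (plus a warning)
        System.out.println(resolveRoleName("warm"));   // -> warm (no warning)
        System.out.println(warnings.size());           // -> 1
    }
}
```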

@gbbafna
Contributor

gbbafna commented Mar 12, 2025

@vinaykpud : Can we verify that if we upgrade a cluster from 2.x to 3.0 with this change, searchable snapshots continue to work? There might be a few nuances here, like upgrading the cluster managers first. I want to make sure that rolling restarts work with this change.

@vinaykpud
Contributor Author

vinaykpud commented Mar 14, 2025

@vinaykpud : Can we verify if we upgrade the cluster from 2.x to 3.0 with this change, searchable snapshots continue to work ? There might be few nuances here like changing the cluster-manager first . Want to make sure that rolling restarts works with this change

Sure @gbbafna

Performed a test to check the cluster upgrade scenario. Using opensearch-cluster-cdk I set up a cluster with 3 cluster managers and 2 data nodes.
Here is a summary of the steps performed:

  1. Created an index with 1 primary and 1 replica and indexed a few documents
  2. Registered an S3 repository for snapshots
  3. Took a snapshot
  4. Deleted the index
  5. Updated the node role from data to search by changing the YAML file, then restarted the process on both data nodes to make them snapshot search nodes
  6. Restored the searchable snapshot and made sure shards were assigned
  7. Created a tarball of OpenSearch with the changes in this PR and copied it to all the nodes
  8. Restarted the OpenSearch process with the latest binaries on each cluster manager, one by one; the cluster stayed healthy
  9. Then restarted the OpenSearch process with the latest binaries on each search node, one after another
  10. During this process I had the _cat/nodes, _cat/shards, and health APIs running in a loop to monitor the cluster
  11. The cluster didn't go red and was upgraded to OpenSearch 3.0.0
  12. Also verified by querying _search on the searchable snapshot index; it works

Let me know if we missed anything here.

cc @mch2 @andrross

@mch2
Member

mch2 commented Mar 17, 2025

Performed a test to check the cluster upgrade scenario. [...]
Let me know if we missed anything here.

So to summarize, you did a successful rolling upgrade from 2.19 to 3.0, but it required the CMs to migrate first. Did you not have to update the routing pool logic for BWC? I don't see this in the PR.

    public static RoutingPool getNodePool(DiscoveryNode node) {
        if (node.isWarmNode() || (node.isSearchNode() && (node.getVersion().before(Version.V_3_0_0)))) {
            return REMOTE_CAPABLE;
        }
        return LOCAL_ONLY;
    }

@vinaykpud
Contributor Author

vinaykpud commented Mar 17, 2025

So to summarize, you did a successful rolling upgrade from 2.19 to 3.0, but it required the CMs to migrate first. Did you not have to update the routing pool logic for BWC? I don't see this in the PR.

@mch2, no, I haven't updated the routing pool logic; it worked without adding any new logic.
But I haven't tried the other direction, i.e. upgrading the search/warm nodes first and the cluster managers later. We might need to add this logic for that case.

@mch2
Member

mch2 commented Mar 17, 2025

First Upgrading Search/Warm Nodes and Upgrading CM's later

@vinaykpud No, this logic would be on 3.0 cluster managers, not 2.x. I was thinking it would provide added safety for mixed-cluster cases where we need to make allocation decisions and assign searchable-snapshot shards in that mixed state. In the normal upgrade case it isn't required, though, so I think we can go ahead without it.
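
The BWC condition quoted earlier in the thread can be sketched in isolation. Below is a self-contained toy version in which `Node`, the version check, and `RoutingPool` are simplified stand-ins for the real OpenSearch types, just to illustrate the mixed-cluster idea:

```java
// Sketch of the mixed-cluster routing-pool check discussed above.
// A 3.0 cluster manager may still see 2.x nodes that advertise the old
// "search" role; those should be treated as remote-capable until upgraded.
public final class RoutingPoolSketch {
    enum RoutingPool { REMOTE_CAPABLE, LOCAL_ONLY }

    // Simplified stand-in for DiscoveryNode: role flags plus major version.
    record Node(boolean warm, boolean search, int majorVersion) {}

    static RoutingPool getNodePool(Node node) {
        if (node.warm() || (node.search() && node.majorVersion() < 3)) {
            return RoutingPool.REMOTE_CAPABLE;
        }
        return RoutingPool.LOCAL_ONLY;
    }

    public static void main(String[] args) {
        System.out.println(getNodePool(new Node(true, false, 3)));  // warm 3.0 node: REMOTE_CAPABLE
        System.out.println(getNodePool(new Node(false, true, 2)));  // legacy 2.x search node: REMOTE_CAPABLE
        System.out.println(getNodePool(new Node(false, false, 3))); // regular 3.0 data node: LOCAL_ONLY
    }
}
```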


❕ Gradle check result for c6d46d2: UNSTABLE

  • TEST FAILURES:
      1 org.opensearch.snapshots.DedicatedClusterSnapshotRestoreIT.testSnapshotWithStuckNode
      1 org.opensearch.cluster.MinimumClusterManagerNodesIT.testThreeNodesNoClusterManagerBlock

Please review all flaky tests that succeeded after retry and create an issue if one does not already exist to track the flaky failure.


✅ Gradle check result for d05b81a: SUCCESS

@mch2 mch2 merged commit 1c86dd1 into opensearch-project:main Mar 18, 2025
31 checks passed
vinaykpud added a commit to vinaykpud/OpenSearch that referenced this pull request Mar 18, 2025
* Renaming search node role to warm

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>

* Added Changelog

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>

* fixed failing tests

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>

* fixed PR comments

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>

---------

Signed-off-by: Vinay Krishna Pudyodu <[email protected]>
rayshrey pushed a commit to rayshrey/OpenSearch that referenced this pull request Jun 26, 2025
6 participants