Missing metrics on Cluster with Down Node #48

tmysl · 2017-12-28T20:12:19Z

Hello, we recently lost a node in our cluster and have been seeing some odd behavior since then.

Can no longer see:

- cluster CPU usage on graphs
- protocol based metrics
- job engine
- cluster network

I'm not sure if the node loss is related, but I am seeing this in the logs:

2017-12-28 19:56:46,043:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.siq.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:56:46,043:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.jobd.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:56:46,043:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.smb2.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:56:46,043:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.nfs4.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:56:46,043:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.irp.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:56:46,044:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.lsass_in.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:56:46,044:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.lsass_out.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:56:46,044:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.papi.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:56:46,044:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.protostats.hdfs.total' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:57:16,889:isi_data_insights_daemon:WARNING: Query for stat: 'ifs.bytes.in.rate' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:57:16,889:isi_data_insights_daemon:WARNING: Query for stat: 'ifs.bytes.out.rate' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:57:16,889:isi_data_insights_daemon:WARNING: Query for stat: 'ifs.ops.in.rate' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:57:16,889:isi_data_insights_daemon:WARNING: Query for stat: 'ifs.ops.out.rate' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:57:16,889:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.cpu.idle.avg' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:57:16,889:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.cpu.intr.avg' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:57:16,889:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.cpu.user.avg' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'.
2017-12-28 19:57:16,890:isi_data_insights_daemon:WARNING: Query for stat: 'cluster.cpu.sys.avg' on 'MY-CLUSTER', returned error: 'Degraded result, some node input missing'
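
For anyone triaging a similar log, here is a minimal sketch of how the warnings above could be grouped by cluster and error message to see exactly which stat keys are failing. It assumes the log line format shown in this excerpt; the function name `summarize_warnings` and the regular expression are illustrative and are not part of the connector itself.

```python
import re
from collections import defaultdict

# Matches the WARNING format seen in the excerpt above (assumed stable).
WARNING_RE = re.compile(
    r"^(?P<ts>\S+ \S+):isi_data_insights_daemon:WARNING: "
    r"Query for stat: '(?P<stat>[^']+)' on '(?P<cluster>[^']+)', "
    r"returned error: '(?P<error>[^']+)'"
)


def summarize_warnings(lines):
    """Group failing stat keys by (cluster, error message).

    Lines that do not match the warning format are ignored.
    """
    summary = defaultdict(set)
    for line in lines:
        match = WARNING_RE.match(line.strip())
        if match:
            key = (match["cluster"], match["error"])
            summary[key].add(match["stat"])
    # Sort the stat keys so the output is stable and easy to diff.
    return {key: sorted(stats) for key, stats in summary.items()}


if __name__ == "__main__":
    sample = [
        "2017-12-28 19:56:46,043:isi_data_insights_daemon:WARNING: "
        "Query for stat: 'cluster.protostats.smb2.total' on 'MY-CLUSTER', "
        "returned error: 'Degraded result, some node input missing'.",
        "2017-12-28 19:57:16,889:isi_data_insights_daemon:WARNING: "
        "Query for stat: 'cluster.cpu.idle.avg' on 'MY-CLUSTER', "
        "returned error: 'Degraded result, some node input missing'.",
    ]
    for (cluster, error), stats in summarize_warnings(sample).items():
        print(f"{cluster}: {error}")
        for stat in stats:
            print(f"  {stat}")
```

Feeding the full daemon log through this would show whether every failing key reports the same degraded-result error or whether other errors are mixed in.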

I've ruled out configuration and network issues, and even tried a different management IP address. We have a second cluster that is healthy and reporting metrics just fine. Is this expected for a degraded cluster?


tenortim · 2018-03-23T16:18:10Z

No, it's not expected that you would be missing the data like that. I'll need to set up a test to repro but this is definitely a real bug.

tenortim · 2021-11-03T19:23:28Z

There seems to be an issue with the degraded functionality not working correctly in the SDK. Still looking into it.

tenortim added the bug label Nov 6, 2019

tenortim self-assigned this Nov 6, 2019
