Unable to read partitions info on MSK cluster #208

Open
ncbrown1 opened this issue Sep 13, 2024 · 2 comments

Comments

@ncbrown1

When attempting to fetch consumer lags via topicctl get lags, I am getting the following error:

[2024-09-13 17:57:53]  WARN Error reading result: Error getting offsets for partition 0: write tcp 172.17.0.2:55188-><broker-public-ip>:9196: i/o timeout

Other commands, such as topicctl get members, work fine. I know that the SASL user I am using has the right permissions to read this information, since I was able to read consumer offsets both with the regular kafka-consumer-groups.sh script and with a custom script based on the sarama library. I can also use topicctl tail to read data from the topic in question.

The Kafka cluster I am interacting with is hosted in AWS eu-central-1 and runs Kafka version 3.6.0. I am using topicctl version v1.18.0 (ref:dev).

Debug logs:

root@7e641a34dd16:/workspaces/obs-kafka-management# topicctl --debug get lags --cluster-config=topicctl/cluster.yml data.euc.sre_observability-elasticsearch_audit logstash_1
[2024-09-13 17:57:44] DEBUG No ZK addresses provided, using broker admin client
[2024-09-13 17:57:44] DEBUG Setting SASL username from override value
[2024-09-13 17:57:44] DEBUG Setting SASL password from override value
[2024-09-13 17:57:44] DEBUG Connecting to cluster on address <redacted>:9196 with TLS enabled=true, SASL enabled=true
[2024-09-13 17:57:44] DEBUG Getting supported API versions
...
[2024-09-13 17:57:47] DEBUG Supported features: {Reads:true Applies:true Locks:false DynamicBrokerConfigs:true ACLs:true Users:true}
[2024-09-13 17:57:47] DEBUG Metadata request: {Addr:<nil> Topics:[data.euc.sre_observability-elasticsearch_audit]}
[2024-09-13 17:57:47] DEBUG Metadata response: &{Throttle:0s ClusterID:... Controller:{Host:b-1-public...kafka.eu-central-1.amazonaws.com Port:9196 ID:1 Rack:euc1-az3} Brokers:[{Host:b-1-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:1 Rack:euc1-az3} {Host:b-2-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:2 Rack:euc1-az1} {Host:b-3-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:3 Rack:euc1-az1} {Host:b-4-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:4 Rack:euc1-az2} {Host:b-5-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:5 Rack:euc1-az3} {Host:b-6-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:6 Rack:euc1-az2}] Topics:[{Name:data.euc.sre_observability-elasticsearch_audit Internal:false Partitions:[{Topic:data.euc.sre_observability-elasticsearch_audit ID:0 Leader:{Host:b-4-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:4 Rack:euc1-az2} Replicas:[{Host:b-4-public.o..c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:4 Rack:euc1-az2} {Host:b-1-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:1 Rack:euc1-az3}] Isr:[{Host:b-4-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:4 Rack:euc1-az2} {Host:b-1-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:1 Rack:euc1-az3}] OfflineReplicas:[] Error:<nil>} {Topic:data.euc.sre_observability-elasticsearch_audit ID:1 Leader:{Host:b-1-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:1 Rack:euc1-az3} Replicas:[{Host:b-1-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:1 Rack:euc1-az3} {Host:b-3-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:3 Rack:euc1-az1}] Isr:[{Host:b-1-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:1 Rack:euc1-az3} {Host:b-3-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:3 Rack:euc1-az1}] OfflineReplicas:[] Error:<nil>} {Topic:data.euc.sre_observability-elasticsearch_audit ID:2 Leader:{Host:b-3-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:3 
Rack:euc1-az1} Replicas:[{Host:b-3-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:3 Rack:euc1-az1} {Host:b-6-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:6 Rack:euc1-az2}] Isr:[{Host:b-3-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:3 Rack:euc1-az1} {Host:b-6-public...c6.kafka.eu-central-1.amazonaws.com Port:9196 ID:6 Rack:euc1-az2}] OfflineReplicas:[] Error:<nil>}] Error:<nil>}]} (<nil>)
[2024-09-13 17:57:47] DEBUG DescribeConfigs request: {Addr:<nil> Resources:[{ResourceType:Topic ResourceName:data.euc.sre_observability-elasticsearch_audit ConfigNames:[]}] IncludeSynonyms:false IncludeDocumentation:false}
Loading: [========>           ][2024-09-13 17:57:48] DEBUG DescribeConfigs response: &{Throttle:0s Resources:[{ResourceType:2 ResourceName:data.euc.sre_observability-elasticsearch_audit Error:<nil> ConfigEntries:[{ConfigName:compression.type ConfigValue:gzip ReadOnly:false IsDefault:false ConfigSource:1 IsSensitive:false ConfigSynonyms:[] ConfigType:2 ConfigDocumentation:} {ConfigName:leader.replication.throttled.replicas ConfigValue: ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:7 ConfigDocumentation:} {ConfigName:remote.storage.enable ConfigValue:false ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:1 ConfigDocumentation:} {ConfigName:message.downconversion.enable ConfigValue:true ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:1 ConfigDocumentation:} {ConfigName:min.insync.replicas ConfigValue:2 ReadOnly:false IsDefault:false ConfigSource:1 IsSensitive:false ConfigSynonyms:[] ConfigType:3 ConfigDocumentation:} {ConfigName:segment.jitter.ms ConfigValue:0 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:local.retention.ms ConfigValue:-2 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:cleanup.policy ConfigValue:delete ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:7 ConfigDocumentation:} {ConfigName:flush.ms ConfigValue:9223372036854775807 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:follower.replication.throttled.replicas ConfigValue: ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:7 ConfigDocumentation:} {ConfigName:segment.bytes ConfigValue:134217728 ReadOnly:false IsDefault:false ConfigSource:4 
IsSensitive:false ConfigSynonyms:[] ConfigType:3 ConfigDocumentation:} {ConfigName:retention.ms ConfigValue:86400000 ReadOnly:false IsDefault:false ConfigSource:1 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:flush.messages ConfigValue:9223372036854775807 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:remote.log.msk.disable.policy ConfigValue:None ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:2 ConfigDocumentation:} {ConfigName:message.format.version ConfigValue:3.0-IV1 ReadOnly:false IsDefault:false ConfigSource:4 IsSensitive:false ConfigSynonyms:[] ConfigType:2 ConfigDocumentation:} {ConfigName:max.compaction.lag.ms ConfigValue:9223372036854775807 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:file.delete.delay.ms ConfigValue:60000 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:max.message.bytes ConfigValue:1048576 ReadOnly:false IsDefault:false ConfigSource:1 IsSensitive:false ConfigSynonyms:[] ConfigType:3 ConfigDocumentation:} {ConfigName:min.compaction.lag.ms ConfigValue:0 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:message.timestamp.type ConfigValue:CreateTime ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:2 ConfigDocumentation:} {ConfigName:local.retention.bytes ConfigValue:-2 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:preallocate ConfigValue:false ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:1 ConfigDocumentation:} {ConfigName:min.cleanable.dirty.ratio ConfigValue:0.5 
ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:6 ConfigDocumentation:} {ConfigName:index.interval.bytes ConfigValue:4096 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:3 ConfigDocumentation:} {ConfigName:unclean.leader.election.enable ConfigValue:false ReadOnly:false IsDefault:false ConfigSource:4 IsSensitive:false ConfigSynonyms:[] ConfigType:1 ConfigDocumentation:} {ConfigName:retention.bytes ConfigValue:549755813888 ReadOnly:false IsDefault:false ConfigSource:1 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:delete.retention.ms ConfigValue:86400000 ReadOnly:false IsDefault:false ConfigSource:1 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:message.timestamp.after.max.ms ConfigValue:86400000 ReadOnly:false IsDefault:false ConfigSource:4 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:message.timestamp.before.max.ms ConfigValue:86400000 ReadOnly:false IsDefault:false ConfigSource:4 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:segment.ms ConfigValue:604800000 ReadOnly:false IsDefault:false ConfigSource:1 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:message.timestamp.difference.max.ms ConfigValue:9223372036854775807 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:5 ConfigDocumentation:} {ConfigName:segment.index.bytes ConfigValue:10485760 ReadOnly:false IsDefault:false ConfigSource:5 IsSensitive:false ConfigSynonyms:[] ConfigType:3 ConfigDocumentation:}]}]} (<nil>)
[2024-09-13 17:57:48] DEBUG DescribeGroups request: {Addr:<nil> GroupIDs:[logstash_1]}
Loading: [===================>][2024-09-13 17:57:49] DEBUG DescribeGroups response: &{Groups:[{Error:<nil> GroupID:logstash_1 GroupState:Stable Members:[{MemberID:euc156.logstash_1-0-ddbafca1-9350-405a-b918-d8da8d09bc9a ClientID:euc156.logstash_1-0 ClientHost:/3.123.222.15 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc147.logstash_1-0-36614bab-4faa-42af-ad2d-9d06dc68bc4e ClientID:euc147.logstash_1-0 ClientHost:/52.58.58.34 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc151.logstash_1-0-0cd1dbca-baac-4078-8b6d-040d8b9c4c76 ClientID:euc151.logstash_1-0 ClientHost:/52.59.144.98 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc149.logstash_1-0-a8f167e1-09a9-42eb-b6f5-a2b8819642ba ClientID:euc149.logstash_1-0 ClientHost:/52.58.101.119 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc157.logstash_1-0-d1dfe838-fea1-425e-87d3-51d9fd4ab28d ClientID:euc157.logstash_1-0 ClientHost:/3.76.135.106 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc150.logstash_1-0-21469321-6d1d-4454-9232-6606d732a4df ClientID:euc150.logstash_1-0 ClientHost:/3.74.75.242 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc154.logstash_1-0-395afce3-abdd-400d-a504-af4d2644cfed ClientID:euc154.logstash_1-0 
ClientHost:/18.158.110.34 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc158.logstash_1-0-01842480-2f15-49a2-8c91-06f27218c15a ClientID:euc158.logstash_1-0 ClientHost:/35.156.244.45 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc148.logstash_1-0-0f75a4f7-f804-4c1c-852f-1e99979deb37 ClientID:euc148.logstash_1-0 ClientHost:/35.158.2.184 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc155.logstash_1-0-a1f0abc6-9892-4654-a854-84231d6dcb15 ClientID:euc155.logstash_1-0 ClientHost:/3.76.153.219 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc160.logstash_1-0-0bcc4580-d9c6-42e2-a0e7-2821cc1c5682 ClientID:euc160.logstash_1-0 ClientHost:/52.29.131.1 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc153.logstash_1-0-15103949-69a8-4b3b-b901-11ad877e1f3b ClientID:euc153.logstash_1-0 ClientHost:/18.197.237.94 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc144.logstash_1-0-17a7cf44-1294-427b-902d-9a5f4207d9db ClientID:euc144.logstash_1-0 ClientHost:/52.28.241.246 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[{Topic:data.euc.sre_observability-elasticsearch_audit Partitions:[0]}] UserData:[]}} 
{MemberID:euc146.logstash_1-0-0b3da8a4-06a0-403b-9917-8550b6b7b8ec ClientID:euc146.logstash_1-0 ClientHost:/52.58.155.186 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[{Topic:data.euc.sre_observability-elasticsearch_audit Partitions:[2]}] UserData:[]}} {MemberID:euc161.logstash_1-0-581d5b7a-35f6-4392-928d-fdc676ab9b24 ClientID:euc161.logstash_1-0 ClientHost:/3.76.181.96 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc152.logstash_1-0-34f9b9a0-2558-4bd7-b597-3da351a21c0e ClientID:euc152.logstash_1-0 ClientHost:/18.156.113.251 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}} {MemberID:euc145.logstash_1-0-2249da15-23ef-4587-90a9-c6265a47c49b ClientID:euc145.logstash_1-0 ClientHost:/3.73.203.146 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[{Topic:data.euc.sre_observability-elasticsearch_audit Partitions:[1]}] UserData:[]}} {MemberID:euc159.logstash_1-0-b7a02336-a657-4635-8874-27b1f7d6e1a2 ClientID:euc159.logstash_1-0 ClientHost:/18.197.252.176 MemberMetadata:{Version:1 Topics:[data.euc.sre_observability-elasticsearch_audit] UserData:[] OwnedPartitions:[]} MemberAssignments:{Version:1 Topics:[] UserData:[]}}]}]}
Loading: [=>                  ][2024-09-13 17:57:49] DEBUG Received consumerOffsets: map[0:1860713 1:1899930 2:1913661]
Loading: [==========>         ][2024-09-13 17:57:51] DEBUG Getting bounds for topic data.euc.sre_observability-elasticsearch_audit, partition 0 with minOffset 1860713
[2024-09-13 17:57:51] DEBUG Getting bounds for topic data.euc.sre_observability-elasticsearch_audit, partition 2 with minOffset 1913661
[2024-09-13 17:57:51] DEBUG Getting bounds for topic data.euc.sre_observability-elasticsearch_audit, partition 1 with minOffset 1899930
Loading: [========>           ][2024-09-13 17:57:53]  WARN Error reading result: Error getting offsets for partition 2: write tcp 172.17.0.2:50604->1...:9196: i/o timeout
[2024-09-13 17:57:53]  WARN Error reading result: Error getting offsets for partition 1: write tcp 172.17.0.2:53884->2...:9196: i/o timeout
[2024-09-13 17:57:53]  WARN Error reading result: Error getting offsets for partition 0: write tcp 172.17.0.2:55188->3...:9196: i/o timeout
[2024-09-13 17:57:53]  INFO Group member lags:
------------+-----------+---------------+-------------+---------------+-------------+------------+-----------
  PARTITION | MEMBER ID | MEMBER OFFSET | MEMBER TIME | LATEST OFFSET | LATEST TIME | OFFSET LAG | TIME LAG  
------------+-----------+---------------+-------------+---------------+-------------+------------+-----------
------------+-----------+---------------+-------------+---------------+-------------+------------+-----------
@petedannemann
Contributor

Can you try providing --conn-timeout with a larger value? get lags fetches messages, which can easily time out, especially when running against t3.small MSK instances.
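
To help separate plain network reachability from anything specific to get lags, a minimal sketch like the following could be run from the same container to time a TCP connection to each broker on port 9196. It is a standalone diagnostic, not part of topicctl: `probe_tcp` and `probe_all` are hypothetical helper names, the host in the comment is a placeholder because the real broker hostnames are redacted in the log above, and the check covers only the TCP connect, not TLS, SASL, or the Kafka protocol itself.

```python
import socket
import time


def probe_tcp(host: str, port: int, timeout_s: float = 10.0) -> float:
    """Return the seconds taken to open a TCP connection to host:port.

    Raises OSError (which includes timeouts and DNS failures) on error.
    """
    start = time.monotonic()
    # create_connection resolves the host and applies the timeout to connect().
    with socket.create_connection((host, port), timeout=timeout_s):
        return time.monotonic() - start


def probe_all(brokers, timeout_s: float = 10.0) -> dict:
    """Probe each (host, port) pair; map it to elapsed seconds or an error string."""
    results = {}
    for host, port in brokers:
        try:
            results[(host, port)] = probe_tcp(host, port, timeout_s)
        except OSError as exc:
            results[(host, port)] = f"failed: {exc}"
    return results


# Example call with a placeholder host (replace with the real broker names):
# probe_all([("b-1-public.example.invalid", 9196)], timeout_s=5.0)
```

If every broker connects quickly here while get lags still times out, the problem is less likely to be basic connectivity from the container.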

@camgraff

camgraff commented Nov 6, 2024

I'm seeing the same issue, and increasing the timeout didn't change anything.
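
One way to check whether the configured timeout is the limit actually being hit is to measure, from the --debug output, how long each partition request ran before it failed. The sketch below is a rough illustration under stated assumptions: it relies on the "Getting bounds for topic" and "Error getting offsets for partition" messages as they appear in the log in this issue, and `seconds_until_failure` and `SAMPLE` are hypothetical names introduced for the example.

```python
import re
from datetime import datetime

# Matches the "[YYYY-MM-DD HH:MM:SS] LEVEL message" part of a line, even when
# a "Loading: [...]" progress bar precedes it on the same line.
LINE_RE = re.compile(
    r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]\s+(DEBUG|WARN|INFO)\s+(.*)"
)
START_RE = re.compile(r"Getting bounds for topic \S+, partition (\d+)")
FAIL_RE = re.compile(r"Error getting offsets for partition (\d+)")


def seconds_until_failure(log_text: str) -> dict:
    """Map partition ID to seconds between its bounds request and its error."""
    starts = {}
    elapsed = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match is None:
            continue
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)
        started = START_RE.search(message)
        failed = FAIL_RE.search(message)
        if started:
            starts[int(started.group(1))] = stamp
        elif failed and int(failed.group(1)) in starts:
            partition = int(failed.group(1))
            elapsed[partition] = (stamp - starts[partition]).total_seconds()
    return elapsed


# Lines copied from the debug log in this issue.
SAMPLE = """\
Loading: [==========>         ][2024-09-13 17:57:51] DEBUG Getting bounds for topic data.euc.sre_observability-elasticsearch_audit, partition 0 with minOffset 1860713
[2024-09-13 17:57:51] DEBUG Getting bounds for topic data.euc.sre_observability-elasticsearch_audit, partition 2 with minOffset 1913661
[2024-09-13 17:57:51] DEBUG Getting bounds for topic data.euc.sre_observability-elasticsearch_audit, partition 1 with minOffset 1899930
Loading: [========>           ][2024-09-13 17:57:53]  WARN Error reading result: Error getting offsets for partition 2: write tcp 172.17.0.2:50604->1...:9196: i/o timeout
[2024-09-13 17:57:53]  WARN Error reading result: Error getting offsets for partition 1: write tcp 172.17.0.2:53884->2...:9196: i/o timeout
[2024-09-13 17:57:53]  WARN Error reading result: Error getting offsets for partition 0: write tcp 172.17.0.2:55188->3...:9196: i/o timeout
"""
```

Calling `seconds_until_failure(SAMPLE)` on the lines from the original report gives 2.0 seconds for each of the three partitions (the log timestamps only have one-second resolution). Comparing that figure against the value passed to --conn-timeout may hint at whether a different deadline is involved, though that is only an inference and not something confirmed in this thread.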
