Defining multiple KAFKA topics that our consumer VM will listen to #212
Yes, that's correct. I'll update the wording of the main comment.
#213 updates the requirements needed to start the Kafka -> Pub/Sub connector for our consumer VM. Previously, when a single topic was defined, the consumer's startup script would search for the topic in a file called In order to accommodate multiple topics, we now parse the string containing the comma-separated Kafka topics and append each topic name to an array called If all of the topics are present in
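The parsing step described above can be sketched in bash. This is a minimal illustration, not the actual startup script: the variable names (`TOPIC_LIST`, `KAFKA_TOPICS_ARRAY`) are placeholders, since the real file and array names are elided in the comment.

```shell
#!/bin/bash
# Hypothetical sketch of splitting a comma-separated Kafka topic string
# into an array before starting the Kafka -> Pub/Sub connector.
# TOPIC_LIST and KAFKA_TOPICS_ARRAY are assumed names, not the real ones.

TOPIC_LIST="elasticc2-st1-wfd,elasticc2-st1-ddf-full"

# Split on commas; each element becomes one topic name.
IFS=',' read -r -a KAFKA_TOPICS_ARRAY <<< "$TOPIC_LIST"

# The startup script would then check each topic is configured
# before launching the connector; here we just echo them.
for topic in "${KAFKA_TOPICS_ARRAY[@]}"; do
    echo "Will subscribe to topic: $topic"
done
```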
The ELAsTiCC2 Challenge restarted last Friday, November 10th. The streaming and database statistics were just updated on the TOM Toolkit, and I wanted to give a brief update. We began this challenge by listening to the following topics: Below is a screenshot of the metrics provided by the TOM Toolkit:

I was surprised to see that we were only processing about 50% of the alerts streamed every night. From this, I inferred that we were likely only consuming alerts from one of the two topics specified above. To learn more about this issue, I ssh'd into our consumer's VM and ran the startup script manually to see if there were any issues with subscribing to the topics. As you can see in the screenshot below, there seem to be no issues subscribing to the two topics.

As of right now, I've updated the consumer's metadata to subscribe to all three Kafka topics (
Are the "WARN" messages in the log expected? If so, would it be appropriate to suppress them so that someone reading the log doesn't have to re-check why those configs aren't "known config"s?
At least some of those "WARN" messages about unknown configs are normal in the sense that they've always occurred for us but don't seem to cause problems. It would be great to suppress them, but I don't know how offhand. The logs are generated by the Kafka -> Pub/Sub Connector, which is a Java application, and I'm not very familiar with that language. (We never touch Java explicitly ourselves; we just start up the application and pass in the configs.)

A little history: back when I set up this consumer, I tried to track these down and "fix" them, but I wasn't successful. Strangely, for at least one of those configs, removing it actually broke the consumer -- yet the connector complains about not knowing what it is. It's possible that I just didn't know enough about what I was doing; I wish I had documented my trials better. We've made config changes since then (perhaps especially in #205), so some of these warnings may also be new and potentially causing problems. It's worth putting more work into, though I don't know if/when it'll be high enough on the priority list.
Ah, thanks for the details and history. Agreed that it doesn't look like it's going to rise in the priority list to fix.
In order to ingest alerts, we (or a user) must specify the KAFKA_TOPIC a consumer VM in our pipeline must listen to. PR #205 notes that this may be defined as a comma-separated list of topics, which allows a single consumer VM to listen to multiple topics simultaneously.

Currently, this is relevant in the context of the ELAsTiCC2 Challenge. The size of the alerts has increased significantly since the original ELAsTiCC Challenge, and as a result, DESC has decided to create three independent alert streams. For the recent test streams, these are:

elasticc2-stN-wfd -- WFD objects, including up to 365 days of previous sources and forced sources
elasticc2-stN-ddf-full -- DDF objects, including up to 365 days of previous sources and forced sources
elasticc2-stN-ddf-limited -- DDF objects, including up to 30 days of previous sources and forced sources

(Replace N with the numbers 1, 2, 3, ..., etc. These numbers represent the Nth test stream before the re-start of the ELAsTiCC2 Challenge.)

This issue serves to document my experience and findings associated with having our consumer VM listen to multiple Kafka topics.
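Since a later comment notes that the topic list is passed to the consumer VM as instance metadata, the comma-separated form can be set in one place. Below is a hedged sketch, assuming a Google Compute Engine VM: the instance name (`elasticc-consumer`), zone, and metadata key (`KAFKA_TOPIC`) are illustrative assumptions, not the pipeline's actual values.

```shell
# Hypothetical config fragment: instance name, zone, and metadata key
# are assumptions. Sets all three ELAsTiCC2 test streams (here N=1) as
# one comma-separated KAFKA_TOPIC value on the consumer VM.
gcloud compute instances add-metadata elasticc-consumer \
    --zone us-central1-a \
    --metadata KAFKA_TOPIC=elasticc2-st1-wfd,elasticc2-st1-ddf-full,elasticc2-st1-ddf-limited
```

On the next run of the startup script, the consumer would then split this value on commas and subscribe to each topic.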
So far, I would like to note: