why force users to use an annotation DB? #122

Open
jensenja opened this issue Jul 10, 2019 · 8 comments · May be fixed by #372

Comments

@jensenja

jensenja commented Jul 10, 2019

I was attempting to train my first model when I was surprised to discover that loudml was trying to create a new TSDB called chronograf:

requests.exceptions.SSLError: HTTPSConnectionPool(host='10.0.1.207', port=8086): Max retries exceeded with url: /query?db=chronograf&q=CREATE+DATABASE+%22chronograf%22 (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:720)'),))

Setting the SSL error aside (I know what causes it), I had to dig into the source code to find out why LoudML was trying to create a chronograf database on my InfluxDB node. It seems that having an "annotation DB" is a requirement. Why? I tried setting the following in my config.yml:

annotation_db: null

But the train command didn't like that:

root@9c8421ac3474:~# loudml train test_model -f now-7d -t now
ERROR:root:invalid field annotation_db: expected str
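
The error suggests the config schema requires `annotation_db` to be a string. A minimal sketch of how a validator could treat the field as optional instead, accepting either a string or `null`. `validate_annotation_db` is a hypothetical helper for illustration, not LoudML's actual schema code:

```python
from typing import Optional


def validate_annotation_db(value: object) -> Optional[str]:
    """Accept a datasource name, or None meaning annotations are disabled.

    Hypothetical helper for illustration; not part of LoudML.
    """
    if value is None:
        return None  # `annotation_db: null` would disable annotations
    if isinstance(value, str) and value:
        return value
    raise ValueError("invalid field annotation_db: expected str or null")


print(validate_annotation_db(None))           # annotations disabled
print(validate_annotation_db("annotations"))  # named annotation source
```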

So I have to create another TSDB for annotation purposes, when I won't ever be looking at or caring about annotations because I don't use Chronograf...

Are there plans to add a config knob that just disables annotation functionality altogether?

@regel
Owner

regel commented Jul 11, 2019

Can you please dump your config.yml file?

@regel
Owner

regel commented Jul 11, 2019

The annotation_db's main purpose is to log anomalies. The loudml train command is trying to query this data in order to exclude abnormal data points from the training set. This should be optional. Let me check the settings.
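
A rough sketch of what "optional" could look like at training time, assuming anomalies are logged as time windows: with no annotation source, every point is kept, otherwise points inside a logged anomaly are dropped. The function and type names are hypothetical and do not reflect LoudML's internals:

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]   # (timestamp, value)
Window = Tuple[float, float]  # (start, end) of a logged anomaly


def training_points(
    points: List[Point],
    anomalies: Optional[List[Window]] = None,
) -> List[Point]:
    """Drop points inside logged anomaly windows.

    When no annotation source is configured (anomalies is None or empty),
    all points are kept.
    """
    if not anomalies:
        return list(points)
    return [
        (ts, val)
        for ts, val in points
        if not any(start <= ts <= end for start, end in anomalies)
    ]


data = [(0.0, 1.0), (60.0, 9.5), (120.0, 1.1)]
print(training_points(data))                  # no annotation source
print(training_points(data, [(30.0, 90.0)]))  # point at t=60 excluded
```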

@regel
Owner

regel commented Jul 11, 2019

Here is a YAML config example that references an annotations source and specifies that the database should not be created (create_database: false) if it doesn't exist.

---
datasources:
 - name: influx
   type: influxdb
   addr: localhost
   database: mydb
   create_database: true
   retention_policy: autogen
   max_series_per_request: 2000
   annotation_db: annotations
 - name: annotations
   type: influxdb
   addr: localhost
   database: annodb
   create_database: false
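
To illustrate how this example reads, here is a small sketch that resolves the reference, assuming `annotation_db` is meant to name another entry under `datasources`. The lookup function is hypothetical and LoudML's actual resolution may differ:

```python
from typing import Dict, List, Optional

# Mirrors the YAML example as plain Python data.
datasources: List[Dict[str, object]] = [
    {"name": "influx", "type": "influxdb", "addr": "localhost",
     "database": "mydb", "create_database": True,
     "annotation_db": "annotations"},
    {"name": "annotations", "type": "influxdb", "addr": "localhost",
     "database": "annodb", "create_database": False},
]


def resolve_annotation_source(
    sources: List[Dict[str, object]], name: str
) -> Optional[Dict[str, object]]:
    """Return the datasource that `name`'s annotation_db points to, if any."""
    by_name = {src["name"]: src for src in sources}
    ref = by_name[name].get("annotation_db")
    return by_name.get(ref) if ref is not None else None


target = resolve_annotation_source(datasources, "influx")
print(target["database"], target["create_database"])
```

Under this reading, annotations would be written to the `annodb` database and no `CREATE DATABASE` would be attempted for it.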

@jensenja
Author

jensenja commented Jul 11, 2019

hi @regel - thanks for the reply. Here's my config.yml with some bits redacted:

---
datasources:
  - name: influx_[redacted]
    type: influxdb
    addr: [redacted]:8086
    database: snmp_[redacted]
    dbuser: admin
    retention_policy: 30d
    dbuser_password: [redacted]
    use_ssl: true
    verify_ssl: false

  - name: kapacitor
    type: influxdb
    addr: [redacted]:9092
    use_ssl: true
    verify_ssl: false
    database: from_loudml
    retention_policy: autogen

storage:
  path: /var/lib/loudml

server:
  listen: 0.0.0.0:8077

I'll use the suggested config.yml approach you've provided - thank you for that. I had to resort to setting an admin account in my config because LoudML always tried to create the chronograf database even though I had created one manually, and I was getting permission failures. I also have an SSL verification problem that I'll be posting a new issue for. Are you guys also accepting pull requests for documentation updates? I had a real hard time figuring out how to adapt bits and pieces from blog posts and documentation to my production setup. Thanks again.

@jensenja
Author

@regel - your config suggestion isn't working. loudml is still trying to create the database. My revised config.yml:

---
datasources:
  - name: influx_[redacted]
    type: influxdb
    addr: host:8086
    database: snmp_[redacted]
    dbuser: snmpstats
    dbuser_password: password
    retention_policy: 30d
    use_ssl: true
    verify_ssl: true
    annotation_db: annotations

  - name: annotations
    type: influxdb
    addr: host:8086
    database: annodb
    use_ssl: true
    verify_ssl: true
    create_database: false

  - name: kapacitor
    type: influxdb
    addr: host:9092
    use_ssl: true
    verify_ssl: true
    database: from_loudml
    retention_policy: autogen

storage:
  path: /var/lib/loudml

server:
  listen: 0.0.0.0:8077

error in the container log:

INFO:root:job[a4eef69b-0034-4713-82a2-3f1780c1e04a] starting, nice=5
ERROR:root:403: {"error":"error authorizing query: snmpstats not authorized to execute statement 'CREATE DATABASE annotations', requires admin privilege"}
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/loudml/worker.py", line 57, in run
    res = getattr(self, func_name)(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/loudml/worker.py", line 92, in train
    tags={ 'model': model_name },
  File "/usr/lib/python3/dist-packages/loudml/influx.py", line 845, in list_anomalies
    result = self.annotationdb.query(query)
  File "/usr/lib/python3/dist-packages/loudml/influx.py", line 388, in annotationdb
    self._annotationdb.create_database(db)
  File "/usr/lib/loudml/vendor/influxdb/client.py", line 579, in create_database
    method="POST")
  File "/usr/lib/loudml/vendor/influxdb/client.py", line 416, in query
    expected_response_code=expected_response_code
  File "/usr/lib/loudml/vendor/influxdb/client.py", line 286, in request
    raise InfluxDBClientError(response.content, response.status_code)
influxdb.exceptions.InfluxDBClientError: 403: {"error":"error authorizing query: snmpstats not authorized to execute statement 'CREATE DATABASE annotations', requires admin privilege"}
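
The traceback shows the `annotationdb` property in influx.py calling `create_database` on access, and the rejected statement names `annotations` (the `annotation_db` value) rather than `annodb`. A minimal sketch of a lazy property that skips creation when `create_database: false` is set. The classes below are hypothetical stand-ins, not LoudML's real code, and the fake client only records statements instead of sending them:

```python
class FakeInfluxClient:
    """Stand-in for an InfluxDB client; records statements instead of sending them."""

    def __init__(self) -> None:
        self.statements = []

    def create_database(self, name: str) -> None:
        self.statements.append(f'CREATE DATABASE "{name}"')


class AnnotationSource:
    """Hypothetical wrapper; names do not match LoudML's real classes."""

    def __init__(self, client, database: str, create_database: bool = True) -> None:
        self._client = client
        self._database = database
        self._create_database = create_database
        self._ready = False

    @property
    def annotationdb(self):
        # Only issue CREATE DATABASE when the config asks for it, so a
        # non-admin user can work against a pre-created database.
        if not self._ready:
            if self._create_database:
                self._client.create_database(self._database)
            self._ready = True
        return self._client


client = FakeInfluxClient()
AnnotationSource(client, "annodb", create_database=False).annotationdb
print(client.statements)  # no CREATE DATABASE was recorded
```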

@regel
Owner

regel commented Jul 14, 2019

@jensenja

> Are you guys also accepting pull requests for documentation updates?

Hi John, yes, 100%. Pull requests are much appreciated; I should add a badge to the main GitHub page.

Documentation is located in the docs folder.

@regel
Owner

regel commented Jul 25, 2019

The documentation is now deployed continuously by Netlify. We welcome contributions.

Thank you for the extra log information. We can reproduce and fix this.

@KarstenB

KarstenB commented Dec 2, 2019

I am stuck in the same position and currently have no idea how to train on real-world production data if loudml requires admin privileges. I have no problem creating a READ/WRITE user for a specific database, with read privileges for the production data. But allowing that user to drop stuff?!
Or am I missing something here?
