Github blocks downloading the dbt schema #410

Closed
ekini opened this issue Nov 14, 2024 · 4 comments · Fixed by #412
Labels: bug (Something isn't working), good first issue (Good for newcomers)

Comments

@ekini
Contributor

ekini commented Nov 14, 2024

Under some conditions, the tap fails consistently with the following traceback:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/tap_dbt/client.py", line 29, in load_openapi
    return yaml.safe_load(response.text)
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/__init__.py", line 125, in safe_load
    return load(stream, SafeLoader)
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/constructor.py", line 49, in get_single_data
    node = self.get_single_node()
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/composer.py", line 58, in compose_document
    self.get_event()
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/parser.py", line 118, in get_event
    self.current_event = self.state()
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/parser.py", line 193, in parse_document_end
    token = self.peek_token()
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/scanner.py", line 129, in peek_token
    self.fetch_more_tokens()
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/scanner.py", line 223, in fetch_more_tokens
    return self.fetch_value()
  File "/meltano/.meltano/extractors/tap-dbt/venv/lib/python3.10/site-packages/yaml/scanner.py", line 577, in fetch_value
    raise ScannerError(None, None,
yaml.scanner.ScannerError: mapping values are not allowed here
  in "<unicode string>", line 9, column 25:
            background-color: #f1f1f1;

If we look closely, this is what it receives:

>>> requests.get(OPENAPI_URL, timeout=10)
<Response [403]>
>>> requests.get(OPENAPI_URL, timeout=10).text
'\r\n<!DOCTYPE html>\r\n<html>\r\n  <head>\r\n    <meta content="origin" name="referrer">\r\n    <title>Forbidden &middot; GitHub</title>\r\n    <style type="text/css" media="screen">\r\n      body {\r\n        background-color: #f1f1f1;\r\n        margin: 0;\r\n      }\r\n      body,\r\n      input,\r\n      button {\r\n        font-family: "Helvetica Neue", Helvetica, Arial, sans-serif;\r\n      }\r\n      .container { margin: 30px auto 40px auto; width: 800px; text-align: center; }\r\n      a { color: #4183c4; text-decoration: none; font-weight: bold; }\r\n      a:hover { text-decoration: underline; }\r\n      h1, h2, h3 { color: #666; }\r\n      ul { list-style: none; padding: 25px 0; }\r\n      li {\r\n        display: inline;\r\n        margin: 10px 50px 10px 0px;\r\n      }\r\n      .logo { display: inline-block; margin-top: 35px; }\r\n      .logo-img-2x { display: none; }\r\n      @media\r\n      only screen and (-webkit-min-device-pixel-ratio: 2),\r\n      only screen and (   min--moz-device-pixel-ratio: 2),\r\n  only screen and (     -o-min-device-pixel-ratio: 2/1),\r\n      only screen and (        min-device-pixel-ratio: 2),\r\n      only screen and (                min-resolution: 192dpi),\r\n      only screen and (                min-resolution: 2dppx) {\r\n    .logo-img-1x { display: none; }\r\n        .logo-img-2x { display: inline-block; }\r\n      }\r\n    </style>\r\n  </head>\r\n  <body>\r\n\r\n    <div class="container">\r\n      <h1>Access to this site has been restricted.</h1>\r\n\r\n      <p>\r\n <br>\r\n        If you believe this is an error,\r\n        please contact <a href="https://support.github.com/">Support</a>.\r\n     </p>\r\n\r\n      <div id="s">\r\n        <a href="https://githubstatus.com/">GitHub Status</a> &mdash;\r\n        <a href="https://twitter.com/githubstatus">@githubstatus</a>\r\n      </div>\r\n    </div>\r\n  </body>\r\n</html>\r\n'

This is definitely not a YAML file. Also note the 403 response code.
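The failure above, where an HTML error page reaches the YAML parser, could be surfaced earlier by validating the response before parsing it. The following is only a minimal sketch: `SchemaDownloadError` and `check_schema_response` are hypothetical names introduced for illustration and are not part of tap-dbt.

```python
class SchemaDownloadError(RuntimeError):
    """Hypothetical error for a schema download that returned unusable content."""


def check_schema_response(status_code: int, text: str) -> str:
    """Return `text` unchanged if it plausibly is the schema, otherwise raise.

    This is a hypothetical helper: it rejects non-200 responses and bodies
    that start like an HTML page, so the caller never feeds them to a parser.
    """
    if status_code != 200:
        raise SchemaDownloadError(
            f"schema download failed with HTTP status {status_code}"
        )
    # Leading whitespace (including "\r\n") is ignored before the check.
    if text.lstrip().lower().startswith(("<!doctype html", "<html")):
        raise SchemaDownloadError("schema download returned an HTML page, not YAML")
    return text
```

With `requests`, this might be used as `yaml.safe_load(check_schema_response(response.status_code, response.text))`. Calling `response.raise_for_status()` before parsing is a built-in alternative that covers the status-code check alone.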

This most likely happens because the schema is downloaded with the default User-Agent header set by the requests library: https://github.com/MeltanoLabs/tap-dbt/blob/main/tap_dbt/client.py#L28

Maybe it should at least use the user-agent set in the tap config. Or, even better, the schema should be committed to the repo, as this dependency on GitHub availability reduces the reliability of the whole system.
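Both suggestions can be sketched roughly as follows. This is an illustration under stated assumptions, not the fix that was merged: the `user_agent` config key, the `DEFAULT_USER_AGENT` value, and the helper names `request_headers` and `read_bundled_schema` are all hypothetical.

```python
from pathlib import Path
from typing import Dict

# Hypothetical fallback used when the tap config sets no user agent.
DEFAULT_USER_AGENT = "tap-dbt"


def request_headers(config: dict) -> Dict[str, str]:
    """Build request headers from the tap config.

    Assumes the config stores the user agent under a `user_agent` key.
    """
    return {"User-Agent": config.get("user_agent") or DEFAULT_USER_AGENT}


def read_bundled_schema(path: Path) -> str:
    """Read a schema file committed to the repo, with no network access."""
    return path.read_text(encoding="utf-8")
```

The first helper would be passed to the existing download call, for example `requests.get(OPENAPI_URL, headers=request_headers(config), timeout=10)`. The second removes the download entirely, so a 403 from GitHub can no longer break the tap; the cost is that the committed copy has to be refreshed when the upstream schema changes.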

@edgarrmondragon
Member

> Or, even better, the schema should be committed to the repo, as this dependency on GitHub availability reduces the reliability of the whole system.

Agreed. PRs welcome!

@ekini
Contributor Author

ekini commented Nov 14, 2024

PR added :)

edgarrmondragon added a commit that referenced this issue Nov 14, 2024
Fixes #410

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Edgar Ramírez Mondragón <[email protected]>
@github-project-automation github-project-automation bot moved this from Todo to Done in MeltanoLabs Overview Nov 14, 2024
@edgarrmondragon
Member

@ekini
Contributor Author

ekini commented Nov 15, 2024

thank you!
