
'NoneType' object is not iterable #359

@soham-dasgupta

Description

Describe the bug

I am trying to use stage_external_sources to materialize an S3 location as an external table, using Spark 3.5.0 on emr-7.0.0.

The first query, show table extended in external_tables like '*', runs fine.
The second query, create schema if not exists external_tables, also runs, but produces no output and the dbt run-operation then errors out.
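For reference, the DB-API contract is that fetching after a statement with no result set (such as DDL) yields an empty list rather than raising. A minimal sketch with the stdlib sqlite3 driver (used here only as a stand-in; the driver at fault in this issue is pyhive):

```python
import sqlite3

# DDL like "create schema/table if not exists ..." produces no result set.
# A well-behaved DB-API cursor returns an empty list from fetchall() here,
# which is what dbt's get_result_from_cursor expects.
con = sqlite3.connect(":memory:")
cur = con.execute("create table if not exists external_tables (weekend_day text)")
rows = cur.fetchall()  # no result set -> empty list, no exception
print(rows)  # → []
```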

16:38:29  Running with dbt=1.8.9
16:38:29  Registered adapter: spark=1.8.0
16:38:29  Found 76 models, 23 data tests, 52 sources, 2 exposures, 247 metrics, 700 macros, 16 semantic models
16:38:29  1 of 1 START external source external_tables.tommy
16:38:32  1 of 1 (1) create schema if not exists external_tables
16:38:34  Encountered an error while running operation: Compilation Error
  'NoneType' object is not iterable
  
  > in macro stage_external_sources (macros/common/stage_external_sources.sql)
  > called by <Unknown>

Steps to reproduce

  1. Add sources.yml
version: 2

sources:
  - name: external_tables
    description: Contains data in s3
    tables:
      - name: tommy
        external:
          location: s3://xxxx-xxx/xxx
          using: parquet
        columns:
          - name: weekend_day
            data_type: date
          - name: region_id
            data_type: smallint

  2. Run dbt run-operation stage_external_sources --args "select: external_tables.tommy" --profile spark

Expected results

Expect the operation to complete successfully.

Actual results

'NoneType' object is not iterable

Screenshots and log output

22:08:29.249944 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'start', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1075f6090>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x107633a10>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1076200d0>]}


============================== 22:08:29.252813 | 0c457ee0-660f-4a43-85a0-7947603a8375 ==============================
22:08:29.252813 [info ] [MainThread]: Running with dbt=1.8.9
22:08:29.253090 [debug] [MainThread]: running dbt with arguments {'printer_width': '80', 'indirect_selection': 'eager', 'log_cache_events': 'False', 'write_json': 'True', 'partial_parse': 'True', 'cache_selected_only': 'False', 'warn_error': 'None', 'debug': 'False', 'version_check': 'True', 'log_path': 'logs', 'fail_fast': 'False', 'profiles_dir': 'HerculesAmex', 'use_colors': 'True', 'use_experimental_parser': 'False', 'no_print': 'None', 'quiet': 'False', 'empty': 'None', 'warn_error_options': 'WarnErrorOptions(include=[], exclude=[])', 'static_parser': 'True', 'log_format': 'default', 'introspect': 'True', 'target_path': 'None', 'invocation_command': 'dbt run-operation stage_external_sources --args select: external_tables.tommy --profile spark', 'send_anonymous_usage_stats': 'True'}
22:08:29.355830 [debug] [MainThread]: Spark adapter: Setting pyhive.hive logging to ERROR
22:08:29.356097 [debug] [MainThread]: Spark adapter: Setting thrift.transport logging to ERROR
22:08:29.356243 [debug] [MainThread]: Spark adapter: Setting thrift.protocol logging to ERROR
22:08:29.418957 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'project_id', 'label': '0c457ee0-660f-4a43-85a0-7947603a8375', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1075e1ed0>]}
22:08:29.440477 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'adapter_info', 'label': '0c457ee0-660f-4a43-85a0-7947603a8375', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1207ae510>]}
22:08:29.440878 [info ] [MainThread]: Registered adapter: spark=1.8.0
22:08:29.504317 [debug] [MainThread]: checksum: 85b61b60f3abb79f042182ca285d45490368bec2b59770b399e796a03cafaa67, vars: {}, profile: spark, target: , version: 1.8.9
22:08:29.653291 [debug] [MainThread]: Partial parsing enabled: 0 files deleted, 0 files added, 0 files changed.
22:08:29.653554 [debug] [MainThread]: Partial parsing enabled, no changes found, skipping parsing
22:08:29.707335 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'load_project', 'label': '0c457ee0-660f-4a43-85a0-7947603a8375', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x121961d10>]}
22:08:29.838657 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'resource_counts', 'label': '0c457ee0-660f-4a43-85a0-7947603a8375', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x1075e9b10>]}
22:08:29.838955 [info ] [MainThread]: Found 76 models, 23 data tests, 52 sources, 2 exposures, 247 metrics, 700 macros, 16 semantic models
22:08:29.839114 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'runnable_timing', 'label': '0c457ee0-660f-4a43-85a0-7947603a8375', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x121357350>]}
22:08:29.839354 [debug] [MainThread]: Acquiring new spark connection 'macro_stage_external_sources'
22:08:29.839488 [debug] [MainThread]: Spark adapter: NotImplemented: add_begin_query
22:08:29.839583 [debug] [MainThread]: Spark adapter: NotImplemented: commit
22:08:29.846846 [info ] [MainThread]: 1 of 1 START external source external_tables.tommy
22:08:29.849240 [debug] [MainThread]: On "macro_stage_external_sources": cache miss for schema ".external_tables", this is inefficient
22:08:29.853775 [debug] [MainThread]: Using spark connection "macro_stage_external_sources"
22:08:29.853990 [debug] [MainThread]: On macro_stage_external_sources: /* {"app": "dbt", "dbt_version": "1.8.9", "profile_name": "spark", "target_name": "dev", "connection_name": "macro_stage_external_sources"} */
show table extended in external_tables like '*'
  
22:08:29.854107 [debug] [MainThread]: Opening a new connection, currently in state init
22:08:31.888206 [debug] [MainThread]: Spark adapter: Poll status: 2, query complete
22:08:31.889978 [debug] [MainThread]: SQL status: OK in 2.036 seconds
22:08:32.378575 [debug] [MainThread]: While listing relations in database=, schema=external_tables, found: 
22:08:32.402418 [info ] [MainThread]: 1 of 1 (1) create schema if not exists external_tables
22:08:32.404740 [debug] [MainThread]: Using spark connection "macro_stage_external_sources"
22:08:32.405017 [debug] [MainThread]: On macro_stage_external_sources: /* {"app": "dbt", "dbt_version": "1.8.9", "profile_name": "spark", "target_name": "dev", "connection_name": "macro_stage_external_sources"} */

                 create schema if not exists external_tables
            
22:08:32.868860 [debug] [MainThread]: Spark adapter: Poll status: 2, query complete
22:08:32.869427 [debug] [MainThread]: SQL status: OK in 0.464 seconds
22:08:33.350666 [debug] [MainThread]: Spark adapter: Error while running:
macro stage_external_sources
22:08:33.351951 [debug] [MainThread]: Spark adapter: Compilation Error
  'NoneType' object is not iterable
  
  > in macro stage_external_sources (macros/common/stage_external_sources.sql)
  > called by <Unknown>
22:08:33.352762 [debug] [MainThread]: On macro_stage_external_sources: ROLLBACK
22:08:33.353346 [debug] [MainThread]: Spark adapter: NotImplemented: rollback
22:08:33.353929 [debug] [MainThread]: On macro_stage_external_sources: Close
22:08:34.027395 [error] [MainThread]: Encountered an error while running operation: Compilation Error
  'NoneType' object is not iterable
  
  > in macro stage_external_sources (macros/common/stage_external_sources.sql)
  > called by <Unknown>
22:08:34.032499 [debug] [MainThread]: Traceback (most recent call last):
  File ".venv/amex/lib/python3.11/site-packages/dbt_common/clients/jinja.py", line 346, in exception_handler
    yield
  File ".venv/amex/lib/python3.11/site-packages/dbt_common/clients/jinja.py", line 323, in call_macro
    return macro(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/jinja2/runtime.py", line 770, in __call__
    return self._invoke(arguments, autoescape)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/jinja2/runtime.py", line 784, in _invoke
    rv = self._func(*arguments)
         ^^^^^^^^^^^^^^^^^^^^^^
  File "<template>", line 52, in macro
  File ".venv/amex/lib/python3.11/site-packages/jinja2/sandbox.py", line 401, in call
    return __context.call(__obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/jinja2/runtime.py", line 303, in call
    return __obj(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt/adapters/base/impl.py", line 399, in execute
    return self.connections.execute(sql=sql, auto_begin=auto_begin, fetch=fetch, limit=limit)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt/adapters/sql/connections.py", line 221, in execute
    table = self.get_result_from_cursor(cursor, limit)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt/adapters/sql/connections.py", line 203, in get_result_from_cursor
    rows = cursor.fetchall()
           ^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt/adapters/spark/connections.py", line 251, in fetchall
    return self._cursor.fetchall()
           ^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/pyhive/common.py", line 142, in fetchall
    return list(iter(self.fetchone, None))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/pyhive/common.py", line 111, in fetchone
    self._fetch_while(lambda: not self._data and self._state != self._STATE_FINISHED)
  File ".venv/amex/lib/python3.11/site-packages/pyhive/common.py", line 51, in _fetch_while
    self._fetch_more()
  File ".venv/amex/lib/python3.11/site-packages/pyhive/hive.py", line 507, in _fetch_more
    zip(response.results.columns, schema)]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: 'NoneType' object is not iterable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File ".venv/amex/lib/python3.11/site-packages/dbt/task/run_operation.py", line 62, in run
    self._run_unsafe(package_name, macro_name)
  File ".venv/amex/lib/python3.11/site-packages/dbt/task/run_operation.py", line 47, in _run_unsafe
    res = adapter.execute_macro(
          ^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt/adapters/base/impl.py", line 1193, in execute_macro
    result = macro_function(**kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt_common/clients/jinja.py", line 355, in __call__
    return self.call_macro(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt_common/clients/jinja.py", line 323, in call_macro
    return macro(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/jinja2/runtime.py", line 770, in __call__
    return self._invoke(arguments, autoescape)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/jinja2/runtime.py", line 784, in _invoke
    rv = self._func(*arguments)
         ^^^^^^^^^^^^^^^^^^^^^^
  File "<template>", line 220, in macro
  File ".venv/amex/lib/python3.11/site-packages/jinja2/sandbox.py", line 401, in call
    return __context.call(__obj, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/jinja2/runtime.py", line 303, in call
    return __obj(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt/clients/jinja.py", line 84, in __call__
    return self.call_macro(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/amex/lib/python3.11/site-packages/dbt_common/clients/jinja.py", line 321, in call_macro
    with self.exception_handler():
  File ".local/share/mise/installs/python/3.11.11/lib/python3.11/contextlib.py", line 158, in __exit__
    self.gen.throw(typ, value, traceback)
  File ".venv/amex/lib/python3.11/site-packages/dbt_common/clients/jinja.py", line 348, in exception_handler
    raise CaughtMacroErrorWithNodeError(exc=e, node=self.macro)
dbt_common.exceptions.macros.CaughtMacroErrorWithNodeError: Compilation Error
  'NoneType' object is not iterable
  
  > in macro stage_external_sources (macros/common/stage_external_sources.sql)
  > called by <Unknown>

22:08:34.037470 [debug] [MainThread]: Resource report: {"command_name": "run-operation", "command_success": false, "command_wall_clock_time": 4.824697, "process_in_blocks": "0", "process_kernel_time": 0.261603, "process_mem_max_rss": "134938624", "process_out_blocks": "0", "process_user_time": 0.939263}
22:08:34.037897 [debug] [MainThread]: Command `dbt run-operation` failed at 22:08:34.037835 after 4.83 seconds
22:08:34.038124 [debug] [MainThread]: Connection 'macro_stage_external_sources' was properly closed.
22:08:34.038358 [debug] [MainThread]: Sending event: {'category': 'dbt', 'action': 'invocation', 'label': 'end', 'context': [<snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x100661b90>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x121e12450>, <snowplow_tracker.self_describing_json.SelfDescribingJson object at 0x121e05a50>]}
22:08:34.038625 [debug] [MainThread]: Flushing usage events
22:08:35.236076 [debug] [MainThread]: An error was encountered while trying to flush usage events
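The trigger is visible at the bottom of the first traceback: pyhive's _fetch_more zips response.results.columns with the schema, and for a statement that returns no result set the column data comes back as None, so zip() raises TypeError. A minimal sketch of that failure mode (fetch_more below is an illustrative stand-in, not pyhive's actual function):

```python
def fetch_more(results_columns, schema):
    # Mirrors the shape of pyhive/hive.py's _fetch_more: pair each column
    # blob from the Thrift response with its schema entry. If the server
    # returned no result set, results_columns is None and zip() fails.
    return [pair for pair in zip(results_columns, schema)]

# Normal case: a result set with column data.
print(fetch_more([["2024-01-06"], [12]], ["weekend_day", "region_id"]))

# Failure case: DDL such as "create schema if not exists ..." yields no
# result set, so the column data is None.
caught = None
try:
    fetch_more(None, ["weekend_day", "region_id"])
except TypeError as err:
    caught = err
print(caught)  # 'NoneType' object is not iterable (observed on Python 3.11)
```

This matches the reported behavior: the create schema statement itself succeeds (SQL status: OK), and the error only appears afterwards when dbt tries to fetch rows from the cursor.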

System information

The contents of your packages.yml file:

packages:
  - package: dbt-labs/redshift
    version: 0.9.0
  - package: dbt-labs/codegen
    version: 0.13.1
  - package: dbt-labs/dbt_utils
    version: 1.3.0
  - package: starburstdata/trino_utils
    version: 0.6.0
  - package: dbt-labs/dbt_external_tables
    version: 0.11.1

Which database are you using dbt with?

  • redshift
  • snowflake
  • other (specify: Spark on EMR)

The output of dbt --version:

1.8.9 

The operating system you're using:

macOS

The output of python --version:

Python 3.11.11

Additional context

None

Metadata

Labels: bug (Something isn't working), triage