
problem with bind variables #187

Closed
jaroslawjarek opened this issue May 17, 2023 · 4 comments
Labels
question Further information is requested

Comments

@jaroslawjarek

I am running on:
oracledb.version -> 1.3.1
oracle client: instantclient-basic-linux.x64-21.9.0.0.0dbru.zip
os: Red Hat Enterprise Linux 8.7 (Ootpa)
source: oracle: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.10.0.0.0
destination: Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production Version 19.16.0.0.0

What is my case:

  1. I am connecting to the source system, which is the source of a large set of data,
  2. I am executing my extraction query just to receive metadata - it is the same query as for extraction, but wrapped in an outer query with the condition 1=0 (see the sketch after this list),
  3. based on this query I am preparing a list of the source column names and the datatypes of these columns
    e.g.
    v_records_column_names = tuple([col[0] for col in v_source_meta_cursor.description])
    v_records_column_types = tuple([col[1] for col in v_source_meta_cursor.description])

  4. Based on my column names I query the empty destination table to receive the destination column datatypes - the same way as in point 2.

  5. Based on the source query column names I am preparing an insert statement with bind variables, e.g.:
INSERT INTO SDH_STAGING.B16735660_MATERIAL (INVENTORY_ITEM_ID,SEGMENT1,CREATION_DATE,CREATED_BY,ITEM_CATALOG_GROUP_ID,ATTRIBUTE16,INVENTORY_ITEM_STATUS_CODE,DESCRIPTION,ITEM_TYPE,PRIMARY_UOM_CODE,DIMENSION_UOM_CODE,UNIT_HEIGHT,UNIT_LENGTH,UNIT_VOLUME,UNIT_WEIGHT,UNIT_WIDTH,VOLUME_UOM_CODE,WEIGHT_UOM_CODE,ATTRIBUTE3,LAST_UPDATE_DATE,CUSTOMER_ORDER_ENABLED_FLAG,FULL_LEAD_TIME,SAFETY_STOCK_BUCKET_DAYS,PREPROCESSING_LEAD_TIME,POSTPROCESSING_LEAD_TIME,MINIMUM_ORDER_QUANTITY,PRODUCT_FAMILY_ITEM_ID,BRAND,STARS_CODE,CATEGORY_SET_ID,LOCAL_HTTS_CODE,SEGMENT5,GDP,GLOBAL_ATTRIBUTE3) values (:INVENTORY_ITEM_ID,:SEGMENT1,:CREATION_DATE,:CREATED_BY,:ITEM_CATALOG_GROUP_ID,:ATTRIBUTE16,:INVENTORY_ITEM_STATUS_CODE,:DESCRIPTION,:ITEM_TYPE,:PRIMARY_UOM_CODE,:DIMENSION_UOM_CODE,:UNIT_HEIGHT,:UNIT_LENGTH,:UNIT_VOLUME,:UNIT_WEIGHT,:UNIT_WIDTH,:VOLUME_UOM_CODE,:WEIGHT_UOM_CODE,:ATTRIBUTE3,:LAST_UPDATE_DATE,:CUSTOMER_ORDER_ENABLED_FLAG,:FULL_LEAD_TIME,:SAFETY_STOCK_BUCKET_DAYS,:PREPROCESSING_LEAD_TIME,:POSTPROCESSING_LEAD_TIME,:MINIMUM_ORDER_QUANTITY,:PRODUCT_FAMILY_ITEM_ID,:BRAND,:STARS_CODE,:CATEGORY_SET_ID,:LOCAL_HTTS_CODE,:SEGMENT5,:GDP,:GLOBAL_ATTRIBUTE3)

  6. Using the connection and cursor to the source system I am extracting data in chunks in the following way:
v_source_cursor=self.source_db_connection.cursor('server_side_cursor')
v_source_cursor.itersize = 2*v_fetch_size
#v_tsql - extraction query
v_source_cursor.execute(v_tsql)


v_total_cnt:int=0
logger.info(f"Extraction from source {self.source_system_id} has started")
#fetch data and insert data in loops
v_sdh_cursor_open=True
while True:

    v_tsql_record_set = v_source_cursor.fetchmany(v_fetch_size)

    if  len(v_tsql_record_set)==0:
        break
    
    else:
        
        # we take a new cursor before we insert new chunk of data
        v_total_cnt+=len(v_tsql_record_set)
        logger.debug(f"v_total_cnt: {v_total_cnt}")
        #v_sdh_cursor=self.sdh_connection.cursor()
        v_sdh_cursor.executemany(v_sql_staging_table_insert_template,v_tsql_record_set)
        #v_sdh_cursor.close()
        #v_sdh_cursor_open=False
        self.sdh_connection.commit()
  7. Randomly, for different records, I was receiving this message:

    v_sdh_cursor.executemany(v_sql_staging_table_insert_template,v_tsql_record_set)
    File "/home/py_elt/.local/lib/python3.8/site-packages/oracledb/cursor.py", line 438, in executemany
    self._impl.bind_many(self, parameters)
    File "src/oracledb/impl/base/cursor.pyx", line 339, in oracledb.base_impl.BaseCursorImpl.bind_many
    File "src/oracledb/impl/base/cursor.pyx", line 61, in oracledb.base_impl.BaseCursorImpl._bind_values
    File "src/oracledb/impl/base/cursor.pyx", line 122, in oracledb.base_impl.BaseCursorImpl._bind_values_by_position
    File "src/oracledb/impl/base/bind_var.pyx", line 129, in oracledb.base_impl.BindVar._set_by_value
    File "src/oracledb/impl/base/var.pyx", line 86, in oracledb.base_impl.BaseVarImpl._check_and_set_value
    File "src/oracledb/impl/base/var.pyx", line 59, in oracledb.base_impl.BaseVarImpl._check_and_set_scalar_value
    File "src/oracledb/impl/base/connection.pyx", line 147, in oracledb.base_impl.BaseConnImpl._check_value
    File "/home/py_elt/.local/lib/python3.8/site-packages/oracledb/errors.py", line 118, in _raise_err
    raise exc_type(_Error(message)) from cause
    oracledb.exceptions.NotSupportedError: DPY-3013: unsupported Python type int for database type DB_TYPE_VARCHAR
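
For reference, the metadata-only query from steps 2-3 looks roughly like this:

v_source_meta_cursor = self.source_db_connection.cursor()
# wrap the extraction query (v_tsql) in an outer query with a 1=0 predicate,
# so no rows come back and only the column metadata is populated
v_source_meta_cursor.execute(f"select * from ({v_tsql}) where 1=0")
v_records_column_names = tuple([col[0] for col in v_source_meta_cursor.description])
v_records_column_types = tuple([col[1] for col in v_source_meta_cursor.description])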

If I took just the records that caused the problem and processed them in chunks of two records, I was able to insert them into the database. I used this code:

v_insert_stm="""INSERT INTO SDH_STAGING.B16735660_MATERIAL (INVENTORY_ITEM_ID,SEGMENT1,CREATION_DATE,CREATED_BY,ITEM_CATALOG_GROUP_ID,ATTRIBUTE16,INVENTORY_ITEM_STATUS_CODE,DESCRIPTION,ITEM_TYPE,PRIMARY_UOM_CODE,DIMENSION_UOM_CODE,UNIT_HEIGHT,UNIT_LENGTH,UNIT_VOLUME,UNIT_WEIGHT,UNIT_WIDTH,VOLUME_UOM_CODE,WEIGHT_UOM_CODE,ATTRIBUTE3,LAST_UPDATE_DATE,CUSTOMER_ORDER_ENABLED_FLAG,FULL_LEAD_TIME,SAFETY_STOCK_BUCKET_DAYS,PREPROCESSING_LEAD_TIME,POSTPROCESSING_LEAD_TIME,MINIMUM_ORDER_QUANTITY,PRODUCT_FAMILY_ITEM_ID,BRAND,STARS_CODE,CATEGORY_SET_ID,LOCAL_HTTS_CODE,SEGMENT5,GDP,GLOBAL_ATTRIBUTE3) values (:INVENTORY_ITEM_ID,:SEGMENT1,:CREATION_DATE,:CREATED_BY,:ITEM_CATALOG_GROUP_ID,:ATTRIBUTE16,:INVENTORY_ITEM_STATUS_CODE,:DESCRIPTION,:ITEM_TYPE,:PRIMARY_UOM_CODE,:DIMENSION_UOM_CODE,:UNIT_HEIGHT,:UNIT_LENGTH,:UNIT_VOLUME,:UNIT_WEIGHT,:UNIT_WIDTH,:VOLUME_UOM_CODE,:WEIGHT_UOM_CODE,:ATTRIBUTE3,:LAST_UPDATE_DATE,:CUSTOMER_ORDER_ENABLED_FLAG,:FULL_LEAD_TIME,:SAFETY_STOCK_BUCKET_DAYS,:PREPROCESSING_LEAD_TIME,:POSTPROCESSING_LEAD_TIME,:MINIMUM_ORDER_QUANTITY,:PRODUCT_FAMILY_ITEM_ID,:BRAND,:STARS_CODE,:CATEGORY_SET_ID,:LOCAL_HTTS_CODE,:SEGMENT5,:GDP,:GLOBAL_ATTRIBUTE3)"""
v_test_data=[(77440536, 'QWPRJ-QMCA32919-00', datetime.datetime(2023, 5, 2, 14, 10, 13), '227173', None, 'QWPRJ', 'Relsd SKU', 'PRODUCT PROFESSIONAL SERVICES OTHER APC CONTINGENCY FUNDS', 'ETO', 'EA', None, None, None, None, None, None, None, None, None, datetime.datetime(2023, 5, 2, 14, 10, 13), 'Y', None, None, None, 0, None, None, 'THIRD PARTY PROCUREMENT', 'NA', 3, None, '501', None, None), (763774, 'ACECCP100', datetime.datetime(2010, 9, 15, 14, 29, 52), '32740', None, None, 'Inactive', 'EcoBreeze Remote Display', 'FG', 'EA', 'IN', 15, 9.0157, 1.0171, 13.002, 12.9921, 'FT3', 'LBS', '731304283973', datetime.datetime(2023, 5, 4, 17, 21, 31), 'N', None, None, None, 0, None, None, 'ECOBREEZE', 'NA', 3, None, '337', '16S4', 'PIM')]

v_sdh_cursor.executemany(v_insert_stm,v_test_data)

In the end I decided to close the cursor after processing each chunk of data, and then the issue was gone. However, since we are talking about a production system, I would like some help on the issue: what could be the potential root cause? I am sure the order of columns between the select and insert statements is correct. The datatypes between the databases match, and the Python datatypes match as well.

I found a similar issue:
oracle/python-cx_Oracle#586

and in my case the columns sometimes have values and sometimes are NULL.
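
Based on that issue, a minimal script that should reproduce the pattern (hypothetical table name and credentials):

import oracledb

# hypothetical repro table: create table demo_tab (c1 varchar2(10), c2 number)
connection = oracledb.connect(user="user", password="pw", dsn="host/service")
cursor = connection.cursor()

# first chunk: the second bind column is entirely NULL
cursor.executemany("insert into demo_tab values (:1, :2)",
                   [("a", None), ("b", None)])
# second chunk on the same cursor and statement: the column now carries an int,
# but if the bind type was fixed by the all-NULL first chunk this raises
# DPY-3013: unsupported Python type int for database type DB_TYPE_VARCHAR
cursor.executemany("insert into demo_tab values (:1, :2)",
                   [("c", 1)])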

jaroslawjarek added the question label May 17, 2023
@anthony-tuininga
Member

I suspect your problem is that in the first chunk of data all of the data (in one column) is NULL and so python-oracledb simply assumes that you are inserting data of type DB_TYPE_VARCHAR with a length of 1. In your next chunk of data you have some numbers (or dates or strings longer than 1 byte in length) and then you get an error. The solution is to use cursor.setinputsizes() to tell the driver up front what type of data you intend to insert -- that way it won't make a bad guess based on input data.
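
For example, a sketch with hypothetical table, credentials and column types:

import datetime
import oracledb

connection = oracledb.connect(user="user", password="pw", dsn="host/service")
cursor = connection.cursor()
# one entry per bind variable, in bind order: a number, a string with a
# maximum length of 4000 bytes, and a date
cursor.setinputsizes(oracledb.DB_TYPE_NUMBER, 4000, oracledb.DB_TYPE_DATE)
cursor.executemany(
    "insert into some_table (id, name, created) values (:1, :2, :3)",
    [(1, None, None),                              # an all-NULL column no longer
     (2, "abc", datetime.datetime(2023, 5, 17))])  # locks in a bad type guess

With the named binds used in your insert statement you would pass keyword arguments instead, e.g. cursor.setinputsizes(INVENTORY_ITEM_ID=oracledb.DB_TYPE_NUMBER, ...).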

@jaroslawjarek
Author

I did that, but the behavior was exactly the same.
I did it in the following way:

db_types = (d[1] for d in db1_cursor.description)
db2_cursor.setinputsizes(*db_types)

I used my destination table as the source for this operation, since the destination has a margin on column lengths.

As I said, only changing the cursor after each chunk (4000 records) helped.

@anthony-tuininga
Member

That way doesn't cover all scenarios. The method shown in a tool I built a while back called CopyData is what you need to do if you want to be certain of success. Switching the cursor after each chunk will also "solve" the problem -- at the cost of having to determine the types and sizes of the data for each chunk.
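
The rough idea (a sketch, not the actual CopyData code; source_cursor, dest_cursor, insert_sql and rows are placeholders) is to derive both a type and, for string columns, a maximum size from the source cursor's metadata:

sizes = []
for name, db_type, display_size, internal_size, *_ in source_cursor.description:
    if db_type in (oracledb.DB_TYPE_VARCHAR, oracledb.DB_TYPE_NVARCHAR,
                   oracledb.DB_TYPE_CHAR):
        sizes.append(internal_size or display_size)  # string binds need a length
    else:
        sizes.append(db_type)     # other types can be passed through directly
dest_cursor.setinputsizes(*sizes)
dest_cursor.executemany(insert_sql, rows)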

@anthony-tuininga
Member

Closing - no activity.
