Description
I'm a contributor to python-mysql-replication and you might remember me from the issue. I would like your opinion on eliminating table_map in python-mysql-replication.
Background:
The current approach in python-mysql-replication for gathering the column schema is to SELECT from information_schema.columns. pg_chameleon seems to rely on column names specifically:
pg_chameleon/pg_chameleon/lib/mysql_lib.py
Lines 1428 to 1441 in 3e431ec
    for column_name in event_after:
        try:
            column_type=column_map[column_name]
        except KeyError:
            self.logger.debug("Detected inconsistent structure for the table %s. The replay may fail. " % (table_name))
            column_type = 'text'
        if column_type in self.hexify and event_after[column_name]:
            event_after[column_name]=binascii.hexlify(event_after[column_name]).decode()
        elif column_type in self.hexify and isinstance(event_after[column_name], bytes):
            event_after[column_name] = ''
        elif column_type == 'json':
            event_after[column_name] = self.__decode_dic_keys(event_after[column_name])
        elif column_type in self.spatial_datatypes and event_after[column_name]:
            event_after[column_name] = self.__get_text_spatial(event_after[column_name])
I have used python-mysql-replication against a MySQL server that serves 2,500+ qps. When there is a replication gap, the result of the SELECT represents the column schema at the time the SELECT is executed rather than at the time the event was generated. This can yield wrong column names, columns in the wrong order, or dummy column names that do not exist in MySQL.
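To make the failure mode concrete, here is a minimal sketch in plain Python. It is not python-mysql-replication code: `name_row`, the sample schemas, and the `UNKNOWN_COL` placeholder naming are assumptions made purely for illustration. It pairs the positional values of one row event with two different column lists, the one valid when the event was written and the one a later SELECT would return after an intervening DDL.

```python
# Hypothetical illustration of stale-schema lookups; not library code.

def name_row(values, column_names):
    """Pair positional row values from a row event with column names."""
    names = list(column_names)
    # If the schema lists fewer columns than the row carries, pad with
    # placeholder names (assumed naming scheme, for illustration only).
    names += [f"UNKNOWN_COL{i}" for i in range(len(names), len(values))]
    return dict(zip(names, values))


# Schema when the row event was written to the binlog.
schema_at_event_time = ["id", "email", "created_at"]

# Schema returned by a SELECT on information_schema.columns issued later,
# after a DDL dropped two columns and added "nickname".
schema_at_select_time = ["id", "nickname"]

# Row events carry values by position only.
row_values = (1, "a@example.com", "2023-01-01")

correct = name_row(row_values, schema_at_event_time)
stale = name_row(row_values, schema_at_select_time)

print(correct)
print(stale)
```

With the stale schema, the e-mail value ends up under `nickname` and the timestamp under a placeholder name, which is the kind of mismatch described above.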
Concern:
Given that pg_chameleon depends on python-mysql-replication, I wanted to get your input on potential disruptions. The old approach (SELECTing from information_schema) could have had its own set of issues or limitations that users of pg_chameleon might have faced.
Proposed Solution:
Drop support for the old approach and parse optional_metadata, which holds the column names. This could lead to a cleaner codebase but might introduce breaking changes for those who rely on the old behavior.
julien-duponchelle/python-mysql-replication#477
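For reference, here is a rough sketch of what reading column names out of optional_metadata could look like. It is not the implementation in the linked PR. It assumes the optional metadata is a sequence of type-length-value fields, that lengths are MySQL length-encoded integers, and that field type 4 carries the column names; please check these assumptions against the MySQL source before relying on them. As far as I know, the server only writes column names when `binlog_row_metadata=FULL` is set. `read_lenenc_int` and `parse_column_names` are hypothetical helper names.

```python
# Sketch only: assumes a TLV layout and field type 4 for column names.
COLUMN_NAME = 4  # assumed field-type code


def read_lenenc_int(buf, pos):
    """Decode a MySQL length-encoded integer; return (value, next_pos)."""
    first = buf[pos]
    if first < 0xFB:
        return first, pos + 1
    if first == 0xFC:
        return int.from_bytes(buf[pos + 1:pos + 3], "little"), pos + 3
    if first == 0xFD:
        return int.from_bytes(buf[pos + 1:pos + 4], "little"), pos + 4
    if first == 0xFE:
        return int.from_bytes(buf[pos + 1:pos + 9], "little"), pos + 9
    raise ValueError(f"unexpected length prefix 0x{first:02x}")


def parse_column_names(optional_metadata):
    """Return the column names found in the metadata blob, or [] if absent."""
    pos = 0
    while pos < len(optional_metadata):
        field_type = optional_metadata[pos]
        length, pos = read_lenenc_int(optional_metadata, pos + 1)
        end = pos + length
        if field_type == COLUMN_NAME:
            names = []
            while pos < end:
                size, pos = read_lenenc_int(optional_metadata, pos)
                names.append(optional_metadata[pos:pos + size].decode("utf-8"))
                pos += size
            return names
        pos = end  # skip fields this sketch does not interpret
    return []


# Hand-built sample blob: one unrelated field, then the column-name field.
names_body = bytes([2]) + b"id" + bytes([5]) + b"email"
sample = bytes([1, 1, 0x80]) + bytes([COLUMN_NAME, len(names_body)]) + names_body
print(parse_column_names(sample))
```

Because the names travel with the table map event itself, they reflect the schema at the time the event was generated, with no extra query against information_schema.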
I would like to know what you think about this change. I will assist you with anything I can.