@W-20921403: Smart Lookup Resolution for Mixed Salesforce IDs and Local References #3946

Merged
aditya-balachander merged 2 commits into main from W-20921403/snowfakery-lookup-issue
Jan 28, 2026
Conversation

@aditya-balachander
Contributor

Problem

The failure occurs when a Snowfakery recipe contains multiple instances of the same Salesforce object where:

  • One instance uses SalesforceQuery.find_record for a lookup field (returns a real Salesforce ID)
  • Another instance uses reference: for the same field (returns a local ID)

In that case, the --generate-cci-mapping-file command creates a single mapping step with that field marked as a lookup. The CumulusCI data load then fails: the pre-resolved Salesforce ID from find_record has no entry in the ID table, so the lookup resolves to NULL and the load reports REQUIRED_FIELD_MISSING errors.

Example Recipe

- plugin: snowfakery.standard_plugins.Salesforce.SalesforceQuery

- object: Pricebook2
  nickname: CustomPricebook
  fields:
    Name: Custom Pricebook

# PricebookEntry #1 - uses find_record (returns real SF ID like "01sSG00000Dsd89YAB")
- object: PricebookEntry
  fields:
    Pricebook2Id:
      SalesforceQuery.find_record:
        from: Pricebook2
        where: IsStandard=true
    UnitPrice: 100

# PricebookEntry #2 - uses reference (returns local ID "1")
- object: PricebookEntry
  fields:
    Pricebook2Id:
      reference: Pricebook2
    UnitPrice: 200

Generated SQL

INSERT INTO "PricebookEntry" VALUES(1, '01sSG00000Dsd89YAB', '100');  -- SF ID from find_record
INSERT INTO "PricebookEntry" VALUES(2, '1', '200');                   -- Local reference

What Happened Before This Fix

The lookup resolution performed a LEFT OUTER JOIN on the ID table:

SELECT PricebookEntry.*, cumulusci_id_table.sf_id
FROM PricebookEntry
LEFT OUTER JOIN cumulusci_id_table
  ON cumulusci_id_table.id = PricebookEntry.Pricebook2Id

  • Row 2 (Pricebook2Id = '1'): Match found → correct SF ID returned ✓
  • Row 1 (Pricebook2Id = '01sSG00000Dsd89YAB'): No match → returns NULL

Result: REQUIRED_FIELD_MISSING: Required fields are missing: [Pricebook2Id]
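
The NULL result can be reproduced outside CumulusCI with Python's standard sqlite3 module. This is only a sketch under assumptions: the two table schemas are simplified, the join condition is the simplified one shown above, and FAKE_SF_ID is a made-up value standing in for whatever ID Salesforce would assign to the locally created Pricebook2.

```python
import sqlite3

# Made-up 18-character ID for the locally created Pricebook2 (illustration only).
FAKE_SF_ID = "01s" + "0" * 11 + "1" + "AAA"

conn = sqlite3.connect(":memory:")
# Simplified, assumed schemas; the real load tables have more columns.
conn.executescript(
    """
    CREATE TABLE PricebookEntry (id INTEGER PRIMARY KEY, Pricebook2Id TEXT, UnitPrice TEXT);
    CREATE TABLE cumulusci_id_table (id TEXT PRIMARY KEY, sf_id TEXT);
    INSERT INTO PricebookEntry VALUES (1, '01sSG00000Dsd89YAB', '100');
    INSERT INTO PricebookEntry VALUES (2, '1', '200');
    """
)
conn.execute("INSERT INTO cumulusci_id_table VALUES (?, ?)", ("1", FAKE_SF_ID))

# Plain LEFT OUTER JOIN, as in the pre-fix lookup resolution.
rows = conn.execute(
    """
    SELECT PricebookEntry.id, cumulusci_id_table.sf_id
    FROM PricebookEntry
    LEFT OUTER JOIN cumulusci_id_table
      ON cumulusci_id_table.id = PricebookEntry.Pricebook2Id
    ORDER BY PricebookEntry.id
    """
).fetchall()

for row_id, resolved in rows:
    print(row_id, resolved)
```

Row 1 comes back with no resolved value because its Pricebook2Id is a real Salesforce ID that was never written to the ID table.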


Solution

Implemented "smart lookup" resolution: when the ID table has a match, that Salesforce ID is used; when it does not, a value that already looks like a Salesforce ID (15 or 18 alphanumeric characters) is used directly instead of resolving to NULL.

Changes

1. cumulusci/tasks/bulkdata/query_transformers.py

Added helper functions to detect valid Salesforce IDs:

import re

# Salesforce ID pattern: 15 or 18 alphanumeric characters
SF_ID_PATTERN = re.compile(r"^[a-zA-Z0-9]{15}$|^[a-zA-Z0-9]{18}$")

def is_salesforce_id(value):
    """Check if a value looks like a valid Salesforce ID."""
    if value is None:
        return False
    return bool(SF_ID_PATTERN.match(str(value)))

def _is_salesforce_id_sqlite(value):
    """SQLite UDF wrapper for is_salesforce_id."""
    return 1 if is_salesforce_id(value) else 0

def register_sqlite_functions(connection):
    """Register custom SQLite functions on a database connection."""
    dbapi_connection = connection.connection.dbapi_connection
    dbapi_connection.create_function("is_salesforce_id", 1, _is_salesforce_id_sqlite)
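
A quick way to see what the detection accepts and rejects is to run it on a few sample values. The sketch below repeats the pattern and helper from above so that it is self-contained; all sample values except the one from the example recipe are made up. Because the check is purely shape-based, any 15- or 18-character alphanumeric string passes, including ones that are not real Salesforce IDs.

```python
import re

# Same pattern as above: exactly 15 or exactly 18 alphanumeric characters.
SF_ID_PATTERN = re.compile(r"^[a-zA-Z0-9]{15}$|^[a-zA-Z0-9]{18}$")


def is_salesforce_id(value):
    """Check if a value looks like a valid Salesforce ID."""
    if value is None:
        return False
    return bool(SF_ID_PATTERN.match(str(value)))


samples = {
    "01sSG00000Dsd89YAB": "18-character ID from the example recipe",
    "001" + "0" * 11 + "1": "made-up 15-character ID",
    "1": "local reference",
    "A" * 15: "15 letters, not a real ID but the right shape",
    "01sSG00000Dsd89YA!": "18 characters with punctuation",
}

results = {value: is_salesforce_id(value) for value in samples}
for value, description in samples.items():
    print(f"{value!r:24} {results[value]!s:6} {description}")
```

In the CASE expression further down, an ID table match is checked first, so the shape-based check only decides the outcome for values that have no ID table entry.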

Modified AddLookupsToQuery.columns_to_add to use a CASE expression:

@cached_property
def columns_to_add(self):
    """Build column expressions for lookup fields with smart ID resolution."""
    columns = []
    for lookup in self.lookups:
        lookup.aliased_table = aliased(self.metadata.tables[ID_TABLE_NAME])
        key_field = lookup.get_lookup_key_field(self.model)
        value_column = getattr(self.model, key_field)

        sf_id_from_table = lookup.aliased_table.columns.sf_id
        
        smart_lookup = case(
            # If we found a match in the ID table, use that
            (sf_id_from_table.isnot(None), sf_id_from_table),
            # If the original value is already a SF ID, use it directly
            (func.is_salesforce_id(value_column) == 1, value_column),
            # Otherwise return NULL (lookup not found)
            else_=None,
        )
        columns.append(smart_lookup)
    return columns

2. cumulusci/tasks/bulkdata/load.py

Register the custom SQLite function when initializing the database:

from cumulusci.tasks.bulkdata.query_transformers import register_sqlite_functions

@contextmanager
def _init_db(self):
    with self._database_url() as database_url:
        parent_engine = create_engine(database_url)
        with parent_engine.connect() as connection:
            # Register custom SQLite functions for smart lookup resolution
            register_sqlite_functions(connection)
            # ... rest of initialization

How It Works Now

The generated SQL query now uses a CASE expression:

SELECT PricebookEntry.*,
       CASE 
           WHEN cumulusci_id_table.sf_id IS NOT NULL THEN cumulusci_id_table.sf_id
           WHEN is_salesforce_id(PricebookEntry.Pricebook2Id) = 1 THEN PricebookEntry.Pricebook2Id
           ELSE NULL
       END AS resolved_lookup
FROM PricebookEntry
LEFT OUTER JOIN cumulusci_id_table
  ON cumulusci_id_table.id = PricebookEntry.Pricebook2Id

  • Row 2 (Pricebook2Id = '1'): ID table match found → uses resolved SF ID ✓
  • Row 1 (Pricebook2Id = '01sSG00000Dsd89YAB'): No ID table match, but value is valid SF ID → uses original value ✓
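
The CASE-based resolution can also be tried in isolation with the standard sqlite3 module. This is a minimal sketch, not the CumulusCI code path: the schemas are simplified, FAKE_SF_ID is a made-up ID for the locally created Pricebook2, and the UDF is registered with sqlite3's own create_function rather than through the SQLAlchemy connection used in load.py.

```python
import re
import sqlite3

SF_ID_PATTERN = re.compile(r"^[a-zA-Z0-9]{15}$|^[a-zA-Z0-9]{18}$")

# Made-up 18-character ID for the locally created Pricebook2 (illustration only).
FAKE_SF_ID = "01s" + "0" * 11 + "1" + "AAA"


def _is_salesforce_id_sqlite(value):
    """Return 1 if the value has the shape of a Salesforce ID, else 0."""
    if value is None:
        return 0
    return 1 if SF_ID_PATTERN.match(str(value)) else 0


conn = sqlite3.connect(":memory:")
# Register the UDF under the name the query expects, taking one argument.
conn.create_function("is_salesforce_id", 1, _is_salesforce_id_sqlite)

# Simplified, assumed schemas; the real load tables have more columns.
conn.executescript(
    """
    CREATE TABLE PricebookEntry (id INTEGER PRIMARY KEY, Pricebook2Id TEXT, UnitPrice TEXT);
    CREATE TABLE cumulusci_id_table (id TEXT PRIMARY KEY, sf_id TEXT);
    INSERT INTO PricebookEntry VALUES (1, '01sSG00000Dsd89YAB', '100');
    INSERT INTO PricebookEntry VALUES (2, '1', '200');
    """
)
conn.execute("INSERT INTO cumulusci_id_table VALUES (?, ?)", ("1", FAKE_SF_ID))

rows = conn.execute(
    """
    SELECT PricebookEntry.id,
           CASE
               WHEN cumulusci_id_table.sf_id IS NOT NULL THEN cumulusci_id_table.sf_id
               WHEN is_salesforce_id(PricebookEntry.Pricebook2Id) = 1 THEN PricebookEntry.Pricebook2Id
               ELSE NULL
           END AS resolved_lookup
    FROM PricebookEntry
    LEFT OUTER JOIN cumulusci_id_table
      ON cumulusci_id_table.id = PricebookEntry.Pricebook2Id
    ORDER BY PricebookEntry.id
    """
).fetchall()

for row_id, resolved in rows:
    print(row_id, resolved)
```

Both rows now resolve to a non-NULL value: row 1 keeps its original Salesforce ID and row 2 receives the ID stored in the ID table.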

jkasturi-sf
jkasturi-sf previously approved these changes Jan 27, 2026
Contributor

@jkasturi-sf jkasturi-sf left a comment

Looks good

@aditya-balachander aditya-balachander merged commit 4cf2cfd into main Jan 28, 2026
23 of 26 checks passed
@aditya-balachander aditya-balachander deleted the W-20921403/snowfakery-lookup-issue branch January 28, 2026 11:12
dipakparmar pushed a commit to ClaritiSoftware/CumulusCI that referenced this pull request Feb 12, 2026