Description
Overview
Hundreds of records in the Vendor and Master Blanket PO SharePoint lists are being updated even when no information on the record has actually changed. This is happening because of the way the list of locations related to each Master Blanket PO and Vendor is aggregated from the Purchase Order Releases.
Steps to Reproduce
Steps to reproduce the behavior:
- Run dgs_fiscal contract_management from the command line to trigger the Contract Management ETL workflow
- After completing that update, immediately run dgs_fiscal contract_management again to re-trigger the workflow
Expected Behavior
Two back-to-back runs of the Contract Management workflow should produce few, if any, updates during the second run, because all of the changes made to CitiBuy should have been captured during the first run.
Additional Context
I believe this is happening because the ContractManagement.get_citibuy_data() method uses the following code to aggregate the PO Release locations at the level of the Master Blanket PO and the Vendor. Because a Python set has no guaranteed iteration order, the joined string can come out in a different order each time the code is run, whether or not there have been changes to the list of locations:
# aggregates the location field for contracts and vendors
# excluding all non DGS locations
dgs_po = df["agency"] == "DGS"  # excludes non-DGS POs
df["con_loc"] = df["po_nbr"].map(
    df[dgs_po].groupby("po_nbr")["unit"].agg(set).str.join(", ")
)
df["ven_loc"] = df["vendor_id"].map(
    df[dgs_po].groupby("vendor_id")["unit"].agg(set).str.join(", ")
)
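The ordering problem can be reproduced outside the workflow. The sketch below is only an illustration, not code from the repository: it joins the same set of made-up unit names in several fresh Python interpreters, each started with a different PYTHONHASHSEED. CPython randomizes string hashes per process by default, so separate runs can iterate over an identical set in different orders. The unit names and the joined_under_seed helper are hypothetical.

```python
import os
import subprocess
import sys

# Hypothetical unit names, used only for illustration.
SNIPPET = 'print(", ".join({"Fleet", "Facilities", "Fiscal", "Procurement", "Admin"}))'


def joined_under_seed(seed: int) -> str:
    """Run SNIPPET in a fresh interpreter with the given string-hash seed."""
    env = {**os.environ, "PYTHONHASHSEED": str(seed)}
    result = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


outputs = [joined_under_seed(seed) for seed in range(5)]

# Every run contains exactly the same five units...
same_members = len({frozenset(o.split(", ")) for o in outputs}) == 1
# ...but the joined strings will usually not all be identical.
distinct_strings = len(set(outputs))
print(same_members, distinct_strings)
```

If distinct_strings is greater than 1, the same data produced different text, which a string comparison against the value stored in SharePoint would presumably treat as a change.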
The solution to this issue will likely involve sorting the set of PO locations before joining them as a string.
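A minimal sketch of that idea, assuming a small helper is acceptable: join_sorted_unique is a hypothetical name, not something that exists in the codebase, and the unit names are made up. Skipping non-string entries such as None or NaN is also an assumption about how missing units should be handled.

```python
def join_sorted_unique(values, sep=", "):
    """Join the distinct string values in sorted order.

    Non-string entries (e.g. None or NaN for a missing unit) are skipped,
    which is an assumption about how missing units should be treated.
    """
    unique = {v for v in values if isinstance(v, str)}
    return sep.join(sorted(unique))


# Hypothetical unit names: same members, different arrival order.
first_run = ["Fleet", "Facilities", "Fleet", "Fiscal"]
second_run = ["Fiscal", "Fleet", "Facilities"]

print(join_sorted_unique(first_run))   # Facilities, Fiscal, Fleet
print(join_sorted_unique(second_run))  # Facilities, Fiscal, Fleet
```

In the aggregation itself, the helper could presumably be passed straight to the groupby, for example .agg(join_sorted_unique) in place of .agg(set).str.join(", "), since pandas accepts a callable there and hands it each group's values. Rows with a missing unit may then behave differently from the current code, so that case would be worth checking against real data.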