Replies: 5 comments 12 replies
-
@sf-dcp I like where this is going! My immediate thought is to orient this around a product/dataset stage in our lifecycle, and anything in dcpy.lifecycle that changes the status of a product's lifecycle should also add a record to the logging table. So the columns would look like:
I love the idea of being able to track builds etc., but maybe we just chuck those specifics in a JSON field. I'm thinking we'll probably always do things manually sometimes... so maybe someone distributes to Socrata from their local machine. I'd want that to be logged, but there wouldn't be an action to track back to. And maybe we set up some automation around Draft Review issues in GitHub, where changes to the issue trigger something to add to your table.
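A rough sketch of the "specifics in a JSON field" idea, in case it helps the discussion. Everything here is an assumption for illustration: the table name, the column names (these are not the columns proposed above), the `log_event` helper, and the example URL. It uses an in-memory SQLite database only so the snippet runs on its own.

```python
import json
import sqlite3
from datetime import datetime, timezone


def log_event(conn, product, version, stage, details=None):
    """Insert one lifecycle event; `details` is free-form and stored as JSON text."""
    conn.execute(
        "INSERT INTO lifecycle_events (product, version, stage, logged_at, details) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            product,
            version,
            stage,
            datetime.now(timezone.utc).isoformat(),
            json.dumps(details or {}),
        ),
    )
    conn.commit()


# Hypothetical schema: fixed columns for the stage, one JSON column for specifics.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lifecycle_events "
    "(product TEXT, version TEXT, stage TEXT, logged_at TEXT, details TEXT)"
)

# An automated run can carry a link back to its action in the JSON field...
log_event(conn, "pluto", "24v1", "build",
          {"source": "github_action", "run_url": "https://example.com/run/1"})
# ...while a manual run from a local machine simply has no action to point to.
log_event(conn, "pluto", "24v1", "distribute", {"source": "local_machine"})

rows = conn.execute(
    "SELECT stage, details FROM lifecycle_events ORDER BY rowid"
).fetchall()
```

Keeping the build or action specifics in one JSON column means the manual case needs no schema change: the row just omits the keys it doesn't have.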
-
Not hugely related to this issue but just noting that my "s3" scraping was really
The source data page in the QA app is aimed at this db table, not s3. The scraping only happened once! Back to actual discussion. I like this a lot. A couple random notes
-
seems like we do use python for all the events we'd wanna log
so maybe it'd be cleanest to just call a new python function for logging during those existing python steps? @fvankrieken you said "right now a "build" doesn't really happen in python" and I agreed at first. but looking at things now it doesn't seem like we need a new GHA or CLI to log relevant events
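one way this could look, as a minimal sketch: a decorator that wraps an existing python step and appends an event once the step returns. `logs_event`, `EVENT_LOG`, and the `publish` function below are all made up for illustration (the list stands in for the db table); none of them are existing dcpy code.

```python
import functools

EVENT_LOG = []  # hypothetical stand-in for the logging table


def logs_event(stage):
    """Wrap a lifecycle step so a successful call also records an event."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(product, version, *args, **kwargs):
            # If the step raises, we never reach the append, so nothing is logged.
            result = func(product, version, *args, **kwargs)
            EVENT_LOG.append(
                {"product": product, "version": version, "stage": stage}
            )
            return result
        return wrapper
    return decorator


@logs_event("publish")
def publish(product, version):
    # Placeholder body; a real step would move files, update metadata, etc.
    return f"{product}/{version} published"


message = publish("pluto", "24v1")
```

with something like this, the existing steps stay as they are and no new GHA or CLI is needed just for logging.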
-
I don't think so. part of me likes the idea of including nightly QA, but we never promote those builds to draft so we might as well treat them like test artifacts. might be nice to have a "blacklist" of build names to ignore in the logging function. that way we don't have to add any logic to any GHAs
I don't think so. since we'll only care about things that happened (e.g. files in DO changed), logs of failed actions in this type of table don't seem worth it
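a small sketch of both points together, assuming the logging function gets the build name and a success flag. the names here (`IGNORED_BUILD_NAMES`, `should_log`, `log_if_relevant`, and the "nightly_qa" build name) are hypothetical, and a plain list stands in for the table.

```python
# Hypothetical ignore-list of build names that should never reach the table.
IGNORED_BUILD_NAMES = {"nightly_qa"}


def should_log(build_name, succeeded):
    """Log only actions that actually happened and are not on the ignore-list."""
    return succeeded and build_name not in IGNORED_BUILD_NAMES


def log_if_relevant(events, product, build_name, stage, succeeded=True):
    """Append an event to `events` when relevant; return whether it was logged."""
    if not should_log(build_name, succeeded):
        return False
    events.append({"product": product, "build_name": build_name, "stage": stage})
    return True


events = []
log_if_relevant(events, "pluto", "nightly_qa", "build")               # ignored build
log_if_relevant(events, "pluto", "24v1", "build", succeeded=False)    # failed action
log_if_relevant(events, "pluto", "24v1", "build")                     # logged
```

the filtering lives in the logging function, so the GHAs themselves don't need any extra logic.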
-
Thank you all for the feedback. I think I have enough info to keep in mind to start coding :)
-
Motivation
A need for a promotions logging table came up in the PLUTO DE QA project to monitor how long it takes from product build to publish. And overall, we would like to have an easy way to get a product status (though our whiteboard is nice too 😊).
@fvankrieken created a feature a while back to scrape s3, which lets us build a feature on top of it to estimate a given product's lifecycle. A limitation of this approach: if an s3 file gets automatically or manually overwritten, deleted, or modified, we won't see it unless we store historical information from each scrape. Additionally, there may be costs from making too many calls to s3 (I couldn't confirm our plan details because I have no access to billing info in DO). From what I'm reading about DO s3 plans, there is a set amount of outbound traffic per plan; any traffic beyond that limit is charged extra.
Proposal
Some questions I would like to answer with the future logging table are:
❓How long did it take from build to publish for product X?
❓What is the current status for product X version V?
❓How long did QA take for product X? What's the average QA timeline?
Sample table
The table below represents what it would look like if we published a product and then later patched the published version. The questions above could be answered by creating custom views in the DB. For example, one view could be for product status and one for the whole lifecycle timeline per product, or we could have one view per product.
[Sample table not preserved: seven event rows, each including a link and a timestamp.]
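To make the "custom views" idea concrete, here is a minimal sketch of how two of the questions above could be answered. The schema, stage names, view name, and sample rows are all assumptions made up for illustration (they are not the rows from the sample table), and SQLite is used in memory only so the snippet is self-contained; the real table would live in our database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE lifecycle_events "
    "(product TEXT, version TEXT, stage TEXT, logged_at TEXT)"
)
# Made-up events for one product version, with ISO 8601 timestamps.
conn.executemany(
    "INSERT INTO lifecycle_events VALUES (?, ?, ?, ?)",
    [
        ("pluto", "24v1", "build", "2024-01-01T00:00:00"),
        ("pluto", "24v1", "promote_to_draft", "2024-01-03T00:00:00"),
        ("pluto", "24v1", "publish", "2024-01-11T12:00:00"),
    ],
)

# One view per question: first build and last publish per product version.
conn.execute(
    """
    CREATE VIEW build_to_publish AS
    SELECT product,
           version,
           MIN(CASE WHEN stage = 'build' THEN logged_at END) AS built_at,
           MAX(CASE WHEN stage = 'publish' THEN logged_at END) AS published_at
    FROM lifecycle_events
    GROUP BY product, version
    """
)

# How long did it take from build to publish? (difference in days)
days = conn.execute(
    "SELECT julianday(published_at) - julianday(built_at) "
    "FROM build_to_publish WHERE product = ? AND version = ?",
    ("pluto", "24v1"),
).fetchone()[0]

# What is the current status? The most recent event wins; ISO 8601
# timestamps in the same format sort correctly as plain text.
status = conn.execute(
    "SELECT stage FROM lifecycle_events WHERE product = ? AND version = ? "
    "ORDER BY logged_at DESC LIMIT 1",
    ("pluto", "24v1"),
).fetchone()[0]
```

The QA duration question would follow the same pattern, taking the difference between whichever two stages we decide mark the start and end of QA.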
Implementation
📌 Log what: build, promote to draft, publish, and distribute GH actions. Later we could add packaging when it's automated.
📌 Where: in our database. Either create a separate db or use default-db, public schema. We could also occasionally dump a copy of the table somewhere in S3 as a backup.
📌 How:
Add a build-name attribute to product metadata. Having this attribute helps sort out any draft-draft builds like nightly qa. It will be added to metadata during the build action.
Feedback needed
It would be nice to get feedback asap for this to be implemented in this sprint.