-
Notifications
You must be signed in to change notification settings - Fork 0
Update the Subscription class #117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Coverage summary from CodacySee diff coverage on Codacy
Coverage variation details
Coverage variation is the difference between the coverage for the head and common ancestor commits of the pull request branch: Diff coverage details
Diff coverage is the percentage of lines that are covered by tests out of the coverable lines that the pull request added or modified: See your quality gate settings Change summary preferences |
|
@troyraen requesting your review and noting the error caused when building the documentation. Have you encountered that error before? |
|
I've seen similar things. They can be a little tricky. But I would think this should be using our poetry.lock file, which we haven't changed in awhile. The most recent change I see that I would suspect could be causing this is #113 which upgraded setuptools for actions/setup-python, but the Build Documentation completed successfully twice after that merged. I would start by trying to build the documentation locally, if you haven't already. That may or may not provide insight since this could be related to the underlying os environment which will be different for you locally. But it's the easiest thing to do to start looking. If the local build works for you, I'd suggest upgrading our dependencies in the poetry.lock file. There is a more recent version of pyarrow, so just upgrading that might fix the problem. (But go ahead and upgrade everything.) I think the command is something like |
d766522 to
7407e55
Compare
|
@troyraen thanks for your help! |
troyraen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Try to write at least one unit test to check some part of the code you're adding here. (Maybe the lines that parse the function name out of the javascript string?)
Might want to copy the validation piece of #116 into a new issue to be dealt with later.
| import attrs.validators | ||
| import google.api_core.exceptions | ||
| import google.cloud.pubsub_v1 | ||
| from google.pubsub_v1.types import JavaScriptUDF, MessageTransform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do import google.pubsub_v1.types instead of from ....
Also, is this a new dependency that needs to be added to the toml file?
| attribute_filter (str, optional): | ||
| Specify a filter to only receive the messages whose attributes match the filter. The filter is an immutable | ||
| property of a subscription. After you create a subscription, you cannot update the subscription to modify | ||
| the filter. | ||
| udf (str, optional): | ||
| Specify a JavaScript User-Defined Function (UDF). UDFs attached to a subscription can enable a wide range | ||
| of use cases, including: message filtering (based on the message payload and/or attributes), simple data | ||
| transformations, data masking and redaction, and data format conversions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to go into the details here, but do give the user a heads up that both of these strings must conform to very specific requirements and then provide links to the google documentation where they can learn what those requirements are.
In the udf docstring, incorporate "Pub/Sub Single Message Transforms (SMTs)" so it's clear what this is used for.
Check our other docstrings and examples to see what words we typically use to describe message attributes and payloads. New users probably won't immediately understand what they mean and I may have intentionally used different words elsewhere that I thought would be more recognizable. Would be good to be consistent throughout our docs. (Maybe we do use "attributes" and "payload", I just literally don't remember right now.)
| Example: | ||
| Create a subscription to Pitt-Google's 'ztf-loop' topic and pull messages: | ||
| Create a subscription to Pitt-Google's 'lsst-loop' topic and pull messages: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make it clear that this is simulated data. (Ideally, the topic itself would be called something like lsstsims-loop, but also ok just to make a note right here instead of changing the topic name.)
| keepDiaObjects = "attributes:diaObject_diaObjectId" | ||
| filterByNPrevDetections = ''' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally, I would name these two variables attribute_filter and udf to match the kwarg names so it's clear from the start where each is destined to be applied. But whatever you prefer.
Add a comment for each explaining what kind of messages will pass the filter. (Does the first one actually do anything with just a field name?)
| Specify a filter to only receive the messages whose attributes match the filter. The filter is an immutable | ||
| property of a subscription. After you create a subscription, you cannot update the subscription to modify | ||
| the filter. | ||
| udf (str, optional): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call this variable smt_udf to include the info about what the string is used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now I see that google calls this variable javascript_udf, so that would be a good name for it here as well. It's helpful to know that this is javascript. Could also add in the SMT info and call it smt_javascript_udf.
| raise TypeError("The subscription needs to be created but no topic was provided.") | ||
|
|
||
| if self.udf: | ||
| import re |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this library is quite small, so it's fine to just import it at the top of this file.
| _function_name = match.group(1) if match else "filter" | ||
| if not match: | ||
| LOGGER.warning("Could not parse function name from UDF; using default 'filter'.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't necessarily a filter though, right? It could be a transformation that doesn't filter.
| attribute_filter: Optional[str] = attrs.field(default=None) | ||
| udf: Optional[str] = attrs.field(default=None) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these should just be kwargs in the touch() function. These are only used once when creating the subscription. And when a user creates a Subscription object for a gcp subscription that already exists, our code doesn't make the filter or udf available in the Subscription object even if they exist on the actual subscription. So I don't think they should be class attributes.
And if the user passes one in but the subscription already exists, should probably raise a warning saying the subscription exists so the kwarg wasn't used.
Summary of Changes
Added
pittgoogle/pubsub.pySubscriptionclass now supports the arguments:attribute_filterandudf_create()function for theSubscriptionclass now supports the creation of subscriptions that use Pub/Sub's built-in filters (i.e., filter based on message attributes) and/or single message transforms through user-defined functionsCloses #116.