-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updates to V4 protocol support #102
base: master
Are you sure you want to change the base?
Conversation
It looks like it was meant to be moved by 20fd9a8, but only the copy half of the rename was committed.
Add `api_secret` and `client_id` requirement to Tracker. When they are nil, NoopTracker will be used instead. Require `client_id`. This is a big change, but I believe it the most appropriate approach to Google's v4 API. Because access to standard analytics `client_id` is at least partially a matter of end-user consent, I believe it would be wrong for this library to automatically generate a missing `client_id`. Users of the gem may know that it is reasonable and permissible to generate a missing `client_id`, but they should manage that themselves. Other general updates: * Add many specs for V4::Tracker and V4::Event. * Some changes to event method names to use nomenclature more similar to the Measurement API v4 documentation. * Added `uri` methods to the Net::HTTP and Validation adapters, for both testing and debugging. * Experimental support for auto-generating `timestamp_micros`. * Moved all of the default_adapter related methods from V4 and V4::Tracker into V4::AdapterDefaults, and then extended V4 and V4::Tracker with them. For V4::Tracker, this moves the methods up to the class level, but this also enables memoization at the class level. * Updated `Staccato::Adapter::Validation` to accept a custom uri. Updates to Event: * Adds `Event.param` DSL, backward-compatible with `FIELDS`. This is a step towards changing the `Event` module to a generic superclass. * Return custom parameters from `#as_json`, in addition to the standard recommended parameters. * Recursively encode `#as_json` parameter values, eg: `items: [...]`. * Correctly encode Item, without requiring ActiveSupport. * Add `#engagement_time_msec` and `#session_id` as recommended parameters. * Add `#update`, for adding and updating parameters. * Add `#apply_defaults`, for adding missing parameters to an event. Updates to Tracker * Add a bang-method version of each dynamically generated event type method, to immediately post all enqueued events. * Add `tracker << event` as an alternative form of `#add`. * Update `#add` to accept simple String, Symbol, and Hash events. * Update `#add` to accept already constructed Event objects. * Update `#add` to reverse merge with `#default_event_params`. * Update `#event` to return a frozen duplicate array. This protects against events being added directly. `#add`, `#<<`, `#clear`, and `#trim` should provide adequate support for generic event data, so direct mutation of `#events` shouldn't be necessary. * Update `#post` visibility to be public. * Update `#post` to send in batches of 25 events, returning an array. * Update `#post` to clear `#events` as they are posted. * Add `#default_event_params`, which are applied to each event. * Add basic support for `#user_id` and `#user_properties`. * Convert `NoopTracker` to a subclass of `Tracker`. * Clear `#events` after successfully calling `#track`. * Update `#adapters` to memoize the default adapter on the first access. This simplifies adding a new adapter without losing the default. * Document that `#validate!` sends only the first 25 events. * Document that `#validate!` doesn't clear the #events array. * Update `#validate!` to use the class-memoized validation adapter. * Update `#add_adapter` to accept `:net_http`, `:validation`, or `:logger` with optional logger arguments. * Add `tracker.clear` to clear the events array * Add `tracker.trim` to drop the first events from the events array
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments, have to comment here for them to not be pending? 🙃
# adds convert_booleans() | ||
# The custom event parameters to be sent for this event | ||
# @return (see params) | ||
def custom_params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't appear to be used yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's for the end-user to use as they see fit, and for documentation. All event params (both "recommended" and "custom") are stored on params
and encoded by as_json
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to hold off on this for now until there's a use for it before we add it to the public API.
.to_sym | ||
end | ||
|
||
def param(name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to discuss this DSL as a separate PR. Sorry if that's a pain …
I suspect if we do this now, we should just switch to this and get rid of FIELDS. I'd also like to see it documented in the README in a section on custom even classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to have a DSL … maybe field
AND fields
? So we could replace the FIELDS
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I like field
and fields
and that matches the existing FIELDS
constant, but the Measurement protocol API docs all use params
, so I stuck with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect if we do this now, we should just switch to this and get rid of FIELDS. I'd also like to see it documented in the README in a section on custom even classes.
Yeah, I haven't documented anything in the README. 🙁
end | ||
|
||
private | ||
def ==(other) other.class == self.class && other.as_json == as_json end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this used? I'm not seeing it, but I could be missing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used implicitly by the specs expect(actual).to eq(expected)
. It's just nice behavior to have on data container objects (whenever object identity matters less than values).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like as_json
is an "expensive" method, but it's only really for the tests, cool.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: expensive, I should probably add a short-circuit at the front of the boolean expression: super || ...
to check object identity before checking the contents.
In this case, as_json
is really just a way to iterate through all of the contents while also compacting away nil
s. And that's basically what's needed for checking value equality.
def self.validation_adapter | ||
require 'staccato/adapter/validate' | ||
Staccato::Adapter::Validate | ||
def self.encode_params_json(params) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? Is it just to remove empty hash/array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC, this was needed to recursively encode items: [Item.new]
while also avoiding a dependency on ActiveSupport.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh … I figured as_json
would recursively call as_json
when it was being encoded by the JSON library. Maybe I didn't put as_json
on Item
? 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like struct has as_json https://ruby-doc.org/stdlib-2.5.1/libdoc/json/rdoc/Struct.html so hopefully that would simplify things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's more that stdlib's as_json
extensions behave very differently from ActiveSupport's as_json
. In particular, stdlib's Struct#as_json
won't work for Item
. Also, I wanted to compact away any nil
values.
Stdlib's as_json
feels more appropriate for serializing data similarly to how me might YAML or Marshal. ActiveSupport's is generally more appropriate for APIs like this one. This allows us to not care whether either extension is loaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I didn't put as_json on Item? 🤔
I might have been able to get away with just adding as_json
on Item
. But this removes nil
values, allows custom structs, and just seemed more generically useful.
self.options = Staccato::OptionSet.new(options) | ||
module ClassMethods | ||
|
||
attr_reader :recommended_params |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this meant to be an attr_writer
? I see def recommended_params
below, and a usage in the param
DSL that maybe was expecting a writer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh … this is in ClassMethods
… the other is the instance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this should be recommended_fields
… since it'll just be the field keys?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the ClassMethods
half of the DSL. It could be renamed to recommended_param_names
to avoid confusion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah … GA used to call it "fields" now events have "parameters" … so yeah, recommended_param_names
(though I do often misspell recommended 🙃 )
def with_defaults!(defaults) | ||
defaults.to_hash.each_pair do |key, value| | ||
key = key.to_sym | ||
options[key] = value unless options.to_h.key?(key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would ||=
work here?
If not, should options
have a has_key?
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't remember, but I might have copied this implementation from ActiveSupport. They need to do it this way because they need to distinguish between nil
values and no matching key. We don't need to make that distinction, so ||=
should suffice.
default_event_params: {}) | ||
raise ArgumentError, 'measurement_id is required' unless measurement_id | ||
raise ArgumentError, 'api_secret is required' unless api_secret | ||
raise ArgumentError, 'client_id is required' unless client_id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think letting client_id be a random UUID as it does in V3 is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I puzzled over this one.
Requiring client_id
is a big change, but I believe it is necessary for to truly satisfy Google's v4 API. As far as I can tell, there is no documentation anywhere for using anything other than the gtag
or gtm
generated client_id
. The big push to v4 seems to be at motivated by regulatory compliance concerns. And my interpretation of the docs is that access to standard analytics client_id
could potentially be a matter of end-user consent (depending on what you're tracking).
In other words, I believe it might be wrong for the library to automatically generate a missing client_id
. If users of the gem think that it is reasonable and permissible to generate a missing client_id, they should manage that themselves.
Also, using a randomly generated UUID can result in tremendously inflated session counts and your boss telling you to try something else. Ask me how I know. 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how a random UUID would be a matter of end-user consent.
I could see it being a session issue, as it's going to make a new session for each post.
Is there a way we could split the difference? Maybe make it a configuration somewhere? require_client_id: true
by default, and a user could disable that with false to get the random UUID back?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how a random UUID would be a matter of end-user consent.
Being forced get the real client_id
also forces the developer into a position where it should be easy for them to also check for consent. That might encourage the developer to consider whether they're allowed to send everything they send.
Not knowing the real client_id
suggests that we haven't even attempted to check whether we have end-user consent. We might have both on the frontend but never communicated it to the backend. Or we might not have a client_id
because we don't have consent.
Either way, that doesn't mean we can't send any data, but (as I understand it) it does mean we shouldn't send certain private or user identifying data. It might be that your app is fine sending everything you intended to send. In my case, we will send certain user_properties
and custom parameters and custom events if we have consent, and we won't if we don't.
I'd prefer a config setting to an always on default. But personally, I think it's better to force the developer to make their own decision about how to handle it:
- the above reason: force the developer to consider the consequences.
- Google's docs never give any valid alternatives to the official
client_id
. - Unique client IDs for each tracker instance causes an explosion in session count.
- The app probably has internal session identifiers of its own that are more stable.
IMO it would be fine to document that SecureRandom.uuid
seems to more-or-less work. But, assuming you're using rails, the following would work much better and we'd be better off recommending something like it:
class Current < ActiveSupport::CurrentAttributes
attribute :ga_real_cid
attribute :ga_fake_cid
attribute :app_session_id
def ga_cid = ga_real_cid || self.ga_fake_cid ||= app_session_id || SecureRandom.uuid
end
...using before_filters
in ApplicationController
to get and set these values appropriately. They could even be forwarded into background jobs and used there as well, using Current.set(ga_real_cid:, ga_fake_cid:, app_session_id:) { the bg job code }
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the info. As you know I haven't been working with this for awhile, so this is all very useful. I appreciate the time you've taken to explain it.
Okay, as long as we document what they'll have to do in case they just need to get it working ASAP, I'm fine moving forward with this being required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tpitale thanks! and there's some other bits of this PR that could use more docs too. not least of which is adding something to the README.md. 😉
# @param logger [Logger] used when `adapter` is `:logger` | ||
# @param formatter [Proc] used when `adapter` is `:logger` | ||
def add_adapter(adapter, logger = nil, formatter = nil) | ||
case adapter |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accepting a symbol here is interesting. I'm not sure it's a necessary feature for this PR. I haven't had a feature request for this, and I'd like to consider it more before adding it to the API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It really isn't necessary for this PR.
But logging is such a useful and basic feature that I wanted to remove barriers to adding it. Also... it simplified my app's code. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it's a nice feature. I'm just trying to break down the PR. There's a LOT here 😄
|
||
# The required API Secret that is generated through the Google Analytics UI | ||
# @return [String] | ||
attr_reader :api_secret |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I appreciate having this documented. But I don't know that we need attr_reader
for all of these used internally within the tracker. Again, this adds to the public API we would have to support going forward. Maybe not a huge deal since it matches what GA does … but still.
end | ||
end | ||
|
||
# A tracker which does no tracking | ||
# Useful in testing | ||
class NoopTracker | ||
class NoopTracker < Tracker | ||
attr_accessor :hit_defaults |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I probably should have removed this as it's not a hit anymore, just an event. Oops!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I meant to delete that. Oops.
Thank you again @nevans. I left a bunch of comments. Appreciate all the work. You can address the comments here, or split the PR up. I'll do my best to review and merge this. |
@tpitale thanks for your time! I don't know if I'll split it up this week, but there are some obvious chunks to break off as separate PRs. |
when :logger | ||
require 'staccato/adapter/logger' | ||
uri = self.class.collection_uri | ||
adapter = Staccato::Adapter::Logger.new(uri, logger, formatter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this follow the same pattern as the other adapters in the defaults module? self.class.logging_adapter(uri, logger, formatter)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered that. I decided against it because those were effectively accessors for shared singleton-ish objects, and this would need to generate a new logger adapter every time.
But, on reconsidering the matter, it can still be useful to delegate the decision of how to generate that logger to the class.
My laptop is dying and I need to get back to my vacation. I'll try to check back later. |
I probably won't get to this again before Monday, so please: enjoy your vacation! |
I'm happy to move forward with this if @nevans commits to helping me do README/docs updates after this 😄 |
@grega @tpitale sorry for the slow follow up. Barring any unforseen circumstances, I'll definitely get around to cleaning up this PR eventually--both splitting it into smaller PRs and adding documentation. I'll try to work on it a little bit more this week (maybe even today?), but I probably can't commit to completing it before July 1st! (sorry) I can say that I've been running my branch in production for for over six weeks, and I only have a few issues to report:
|
@nevans As long as you feel you can help with docs at some point, I'll go ahead and merge this. |
@tpitale Sorry, I missed this... and I never got back to finishing this otherwise. I've been in the middle of a completely different project and it sucked away all of my attention. This does need more documentation and I should eventually get to that (ideally before I completely forget what I was trying to do with this code!). I can't say whether merging this before or after I start on docs PR(s) is a good idea. Breaking it into separate PRs is a good idea and it isn't a burden... to the contrary, splitting off smaller focused chunks might help the documentation project. And I did want to implement your suggestions as we discussed. On the other hand, if you simply merged this behemoth and released a new version, and I could stop using my fork... I certainly won't complain. ;) |
Recreating #101