
FEAT: Add query caching pipeline behavior #75

Open · wants to merge 2 commits into main

Conversation

matthewtoghill
Collaborator

Description

Implement a query caching pipeline behavior.
Adds cached versions of each of the existing query base classes:
CachedQueryBase
CachedQueryPagedBase
CachedQueryByIdBase
CachedQueryByKeyBase

Any query definition inheriting from a cached query base class must override the CacheKey property, defining it so that the resulting cache key is specific to that query's parameters. Where a query includes a QueryString, the base class QueryBase has been updated with a GetQueryStringHashCode method to help generate an appropriate cache key.
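
For illustration, a consumer-side sketch of what a cached query definition might look like (the generic parameter, constructor, and DTO here are hypothetical; only CachedQueryBase, CacheKey, and GetQueryStringHashCode come from this PR):

```csharp
// Hypothetical query definition; assumes CachedQueryBase<T> exposes an abstract
// CacheKey property and that QueryBase provides GetQueryStringHashCode().
public sealed class GetProductsQuery : CachedQueryBase<List<ProductDto>>
{
    public GetProductsQuery(string queryString) : base(queryString) { }

    // The key must be unique per parameter combination so that different
    // query strings never collide in the cache.
    public override string CacheKey => $"products:{GetQueryStringHashCode()}";
}
```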

Related Issue

#8

Motivation and Context

Allows specific queries to have their results cached for a given time period (defaulting to 5 minutes, but configurable on each query definition as needed). Subsequent calls to that query endpoint that match the cache key return the cached result without requiring a round trip to the database.
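
For context, here is a minimal sketch of the general shape of such a behavior using MediatR and IMemoryCache; the ICachedQuery interface and its members are stand-ins for illustration, not necessarily the exact types in this PR:

```csharp
using MediatR;
using Microsoft.Extensions.Caching.Memory;

// Stand-in for whatever contract the cached query base classes expose.
public interface ICachedQuery
{
    string CacheKey { get; }
    TimeSpan CacheDuration { get; } // e.g. defaults to 5 minutes
}

public sealed class QueryCachingBehavior<TRequest, TResponse> : IPipelineBehavior<TRequest, TResponse>
    where TRequest : notnull, ICachedQuery
{
    private readonly IMemoryCache _cache;

    public QueryCachingBehavior(IMemoryCache cache) => _cache = cache;

    public async Task<TResponse> Handle(TRequest request,
                                        RequestHandlerDelegate<TResponse> next,
                                        CancellationToken cancellationToken)
    {
        // Serve the cached result when the key is already present.
        if (_cache.TryGetValue(request.CacheKey, out TResponse? cached) && cached is not null)
            return cached;

        // Otherwise run the actual query handler and cache its result.
        var response = await next();
        _cache.Set(request.CacheKey, response, request.CacheDuration);
        return response;
    }
}
```

Registered as an open generic pipeline behavior, it only runs for requests that satisfy the marker constraint.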

How Has This Been Tested?

Tested queries inheriting from the base classes CachedQueryByIdBase, CachedQueryBase, and CachedQueryPagedBase with SQL Server Profiler to confirm that only the first query within the specified cache expiration window resulted in a database query to retrieve the results; subsequent queries matching the cache key returned the results without hitting the database.

Additionally, tested that changing the query parameters sent to the same endpoint within the cache expiration window produced a different cache key and a database query to fetch new data. Subsequent calls to the endpoint with previously used query parameters within the cache expiration window returned the cached results.

Once the cache entry had expired, the next call to the endpoint successfully queried the database and cached the new results.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

@CesarD
Collaborator

CesarD commented Jun 24, 2024

Thanks for submitting it, Matt!
I'll review it later today 😉

Please don't merge it until I've had a chance to review it, guys.

Collaborator

@CesarD left a comment


@matthewtoghill I've been taking a look at it.
Any reason for not implementing the cache abstraction with Polly? Since we already have it, I was expecting to keep leveraging it for this as well: it makes it easier to plug in either a Memory Cache or a Distributed Cache and to wrap pieces of code so it automatically detects when to execute the code and when to fall back to the cached result.
Also, it could be a good chance to evaluate whether to apply some Output Caching to endpoints directly.
It's worth taking a fresh look at the documentation on this: https://learn.microsoft.com/en-us/aspnet/core/performance/caching/overview?view=aspnetcore-8.0

@CesarD
Collaborator

CesarD commented Aug 13, 2024

Hey @matthewtoghill I've been going over this again because I came across a library that seems pretty interesting for implementing the caching: it's called FusionCache, and it has a lot of features that could be very useful to have at our disposal compared to other libraries.

My idea is that we can use both levels of cache it offers (L1 Memory and L2 Distributed with a Redis backplane) and apply it through the OutputCache I mentioned before, offering cached endpoints directly without having to go too deep into the Application layer to perform caching. At least for a basic implementation sample, which we might do with the Countries endpoints.
Then, perhaps we can use it to perform caching of Products and trigger some messages to evict the cache when a product is created/updated.

Also, perhaps we can refactor the implementation a bit, from additional query classes to an attribute-based approach, so we can reuse the same queries we have right now and apply the QueryCachingBehavior based on whether the query has been decorated with a "CachedAttribute". Or have you considered scenarios where this might fall short?
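
For illustration, the attribute-based variant could look roughly like this (the query, DTO, and default duration below are just placeholders):

```csharp
using System.Reflection;
using MediatR;

// Hypothetical marker attribute for queries whose results should be cached.
[AttributeUsage(AttributeTargets.Class)]
public sealed class CachedAttribute : Attribute
{
    public CachedAttribute(int durationSeconds = 300) => Duration = TimeSpan.FromSeconds(durationSeconds);

    public TimeSpan Duration { get; }
}

// Existing queries stay exactly as they are, just decorated:
[Cached(durationSeconds: 600)]
public sealed record GetCountriesQuery : IRequest<List<CountryDto>>;

public sealed record CountryDto(int Id, string Name);

// Inside QueryCachingBehavior<TRequest, TResponse>.Handle, the behavior would
// only cache when the request type carries the attribute, e.g.:
//   var cached = typeof(TRequest).GetCustomAttribute<CachedAttribute>();
//   if (cached is null) return await next();
```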

Let me know your thoughts! :)

@matthewtoghill
Collaborator Author

Hey! Yeah, I agree that FusionCache looks like a really interesting library.

Initially I was thinking the implementation in this PR would get refactored to use the HybridCache that is coming with .NET 9 later this year, and that it would be okay for an initial working version to use the Memory Cache only. HybridCache would then give us the same L1 Memory and L2 Distributed support (if optionally set up), if I understand it correctly?
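
For reference, this is the shape of the HybridCache API I mean (from the preview Microsoft.Extensions.Caching.Hybrid package; the repository, entity, and key below are made-up placeholders):

```csharp
using Microsoft.Extensions.Caching.Hybrid;

// Registration (Program.cs). If an IDistributedCache is also registered,
// HybridCache uses it as the L2 layer; otherwise it stays L1 memory only.
//   builder.Services.AddHybridCache();

public sealed record Product(int Id, string Name);

public interface IProductRepository
{
    Task<Product?> GetByIdAsync(int id, CancellationToken ct);
}

public sealed class CachedProductReader(HybridCache cache, IProductRepository repository)
{
    public async Task<Product?> GetAsync(int id, CancellationToken ct = default)
    {
        // Checks L1 first, then L2 (if configured), and only calls the factory on a miss.
        return await cache.GetOrCreateAsync<Product?>(
            $"product:{id}",
            async token => await repository.GetByIdAsync(id, token),
            new HybridCacheEntryOptions { Expiration = TimeSpan.FromMinutes(5) },
            cancellationToken: ct);
    }
}
```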

I had some doubts about using Polly for the caching; it wasn't clear to me whether Polly v8 properly supports caching right now. Polly issue 1127 has a good discussion on this, and I can see you've commented on it recently too, so you're aware of it, and the developer of FusionCache is also contributing, which is cool.

The implementation in this PR was also not so focused on resilience as on avoiding unnecessary calls to the database for endpoints where the data won't change and we don't need to worry about cache invalidation, for example. I appreciate, though, that this may not have been your original intent for the feature proposed in the issue.

How would we define the cache keys if we used the Attribute approach? That's the core purpose of the new cached query classes.

@CesarD
Collaborator

CesarD commented Aug 15, 2024

Well, if it's just for the sake of reducing calls to the DB, I would focus on implementing OutputCache for endpoints like Countries and such, where we can cache their content once and forget about it until we need to redeploy something because the seed changed due to an addition/edit, or cache it just for a short period of time (say, 5 minutes, for an initial implementation in, for example, Companies).

Now, regarding the rest, based on what I've been reading and what you mentioned, I'm wondering: do we really need to cache information at the Application level? Perhaps we can just leverage OutputCache with FusionCache in the background, which will keep the memory cache and the Redis backplane in sync, and then evict/invalidate via some integration messages (for example in Products or Companies, when we insert/update/delete a record). It would also allow us to play with the simulator and configure auto-recovery, things that HybridCache won't cover since it's not an implementation itself, just something that can work side by side with other libraries like FusionCache (discussion here)
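
For reference, these are the built-in OutputCache pieces this would sit on top of (the policy name, tag, duration, and endpoint below are just placeholders):

```csharp
var builder = WebApplication.CreateBuilder(args);

// Register an output-cache policy for rarely-changing reference data.
builder.Services.AddOutputCache(options =>
{
    options.AddPolicy("Countries", policy =>
        policy.Expire(TimeSpan.FromMinutes(5)).Tag("countries"));
});

var app = builder.Build();
app.UseOutputCache();

// Responses are cached at the endpoint level until they expire or the tag is evicted.
app.MapGet("/countries", () => Results.Ok(new[] { "Argentina", "United Kingdom" }))
   .CacheOutput("Countries");

app.Run();

// From an integration-message handler, evict everything tagged "countries"
// when the underlying data changes:
//   await outputCacheStore.EvictByTagAsync("countries", cancellationToken);
```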

What do you think on that?

It would also be interesting to implement it at the API Gateway level (where we can configure it in combination with YARP), since that will be another point in Monaco where we could implement caching, so internal APIs wouldn't get hit if there's already cached content that the GW can return very early on.

I know this goes beyond the initial starting point you mentioned, but it's kind of the scope I've been looking at more recently.

@matthewtoghill
Collaborator Author

Okay, I think I'm catching up to where you're coming from as I learn more about output versus response caching. I expect you've read it, but this article is also interesting: https://learn.microsoft.com/en-us/aspnet/core/performance/caching/output?view=aspnetcore-8.0

I like your suggestion to explore the output caching approach instead. Perhaps we create a new branch, implement output caching, and see how it develops?

@CesarD
Collaborator

CesarD commented Aug 17, 2024

Yeah, sorry for the confusion and the shift in the direction of this issue.
I originally expected to do it as described, and mostly as you implemented it in this PR, but the more I think about it, the more I doubt the original plan 😅

With the arrival of OutputCache (I already read the article you pointed to) and other options at the API level, things can be implemented better there without having to resort to my original idea (a MediatR behavior), and that approach can scale better if we later decide to take it to the API GW level as well (which would add way more resilience in turn, since the internal API could be down but the GW could still return cached data).

Adding another branch and starting work there on the output caching looks great to me 😃 Let's also leverage FusionCache for all of this. The backplane synchronization and monitoring seem like really great features we can bake into the whole implementation to make everything more flexible.
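
For reference, a rough sketch of the kind of FusionCache registration I'm describing (the connection strings and durations are placeholders, and the exact builder calls may differ by version):

```csharp
using Microsoft.Extensions.Caching.StackExchangeRedis;
using Microsoft.Extensions.Options;
using ZiggyCreatures.Caching.Fusion;
using ZiggyCreatures.Caching.Fusion.Backplane.StackExchangeRedis;
using ZiggyCreatures.Caching.Fusion.Serialization.SystemTextJson;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddFusionCache()
    // L1: in-memory cache with a default entry duration.
    .WithDefaultEntryOptions(options => options.Duration = TimeSpan.FromMinutes(5))
    // L2: Redis distributed cache shared by all instances.
    .WithSerializer(new FusionCacheSystemTextJsonSerializer())
    .WithDistributedCache(new RedisCache(Options.Create(
        new RedisCacheOptions { Configuration = "localhost:6379" })))
    // Backplane: notifies the other instances so their L1 entries stay in sync.
    .WithBackplane(new RedisBackplane(Options.Create(
        new RedisBackplaneOptions { Configuration = "localhost:6379" })));
```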

I'll edit the issue later in order to make it match all this.
