IConfigurationProvider? #136
Qualifier: I know practically nothing about FoundationDB or whether it is really suited for this sort of use case. I suspect it's overkill to use it as a distributed config provider, but I just thought I'd ask a probing question.
Yes, if the complete list of key/values that make up your configuration is "small enough" (ideally less than 1 MB), then you could use FoundationDB as a repository for such a configuration, and be able to mutate/publish a new set of settings with ACID guarantees. There has even been a "port" of ZooKeeper using FoundationDB (see https://github.com/pH14/fdb-zk). FoundationDB supports the concept of "watches", which allows your process to be automatically notified as soon as some keys have changed in the cluster; this could also be used to automatically reload a configuration at runtime. The only real potential problem to solve for such an IConfigurationProvider is how to deal with the asynchronous reads during the synchronous build/DI stage of the application, which is discussed below.
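(To illustrate the watch mechanism, a minimal sketch using the binding's `Watch` API; the sentinel key `("config", "version")` is a made-up convention, and the exact shape of `FdbWatch` awaiting may differ between versions of the binding:)

```csharp
// Sketch: reload configuration whenever a sentinel key changes.
// The key ("config", "version") is hypothetical; it would be bumped by
// whoever publishes a new set of settings.
public async Task WatchConfigurationLoopAsync(IFdbDatabase db, Func<Task> reload, CancellationToken ct)
{
    while (!ct.IsCancellationRequested)
    {
        // Register a watch on the sentinel key inside a transaction.
        var watch = await db.ReadWriteAsync(
            tr => Task.FromResult(tr.Watch(TuPack.EncodeKey("config", "version"), ct)), ct);

        // Completes as soon as the key is changed by anyone in the cluster.
        await watch;

        // Re-read all the settings and push them into the IConfiguration.
        await reload();
    }
}
```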
I'm not sure how you are supposed to deal with async configuration providers in .NET during the build/DI stage of the application, but if you have a way to await the read of the settings from the cluster, then you could write a simple wrapper that reads the keys from fdb and exposes them to the application. Though, in my experience, potentially blocking the application during startup can be difficult to troubleshoot: logging may not be enabled yet, or even the OpenTelemetry provider may not be started, so you would not see any logs coming from a process that is blocking on the database. As for your second question, this repository already includes several additional packages for generic and experimental layers, as well as support for Aspire. I could see the utility of such an IConfigurationProvider implementation, which could live in this repo as well.
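(For illustration, a minimal sketch of such a wrapper, assuming the settings were already read asynchronously into a dictionary before the host is built; the class names here are made up, but `ConfigurationProvider` and its `Data` dictionary are the standard Microsoft.Extensions.Configuration extension points:)

```csharp
using Microsoft.Extensions.Configuration;

// Sketch: a configuration source primed with values that were read
// asynchronously from fdb *before* the host is built.
public sealed class FdbConfigurationSource : IConfigurationSource
{
    private readonly IReadOnlyDictionary<string, string?> _values;

    public FdbConfigurationSource(IReadOnlyDictionary<string, string?> values) => _values = values;

    public IConfigurationProvider Build(IConfigurationBuilder builder)
        => new FdbConfigurationProvider(_values);

    private sealed class FdbConfigurationProvider : ConfigurationProvider
    {
        public FdbConfigurationProvider(IReadOnlyDictionary<string, string?> values)
        {
            // Load() is synchronous, so the async read must happen beforehand.
            foreach (var kv in values) Data[kv.Key] = kv.Value;
        }
    }
}
```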
If you are using fdb only for this, it would probably be overkill. Though if you need to set up a system specifically for such a task, then FoundationDB could be a good fit, depending on the scale and criticality of the system. That said, once you have access to a FoundationDB cluster, you can use it for a lot of things, like row/document stores, indexing, pub sub, distributed queues, etc... You would even be able to combine all of these different shapes of data in the same transaction, which is impossible or very difficult to do when you are combining multiple different systems to achieve all of this.
Thanks @KrzysFR
I have the shell of an async config provider over here: https://github.com/dazinator/Dazinator.Extensions.Configuration
Yes, the dotnet IConfigurationProvider picture is all synchronous. However, as per the approach I have laid out above, to use an async provider you need to prime it at application startup with its initial values, before building the host. This happens in your application entry point, so you can wrap it with resilience (Polly) policies, terminate the application after X retries, and so on. I think this would be acceptable to most who are relying on a non-local config store.
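(A sketch of what that priming could look like; `LoadSettingsFromFdbAsync` is a placeholder for your own async read of the keys, while the Polly retry API is the real one:)

```csharp
using Polly;

// Entry point: prime the provider with its initial values before the host
// is built, retrying with exponential backoff and giving up after 5 tries.
var retry = Policy
    .Handle<Exception>()
    .WaitAndRetryAsync(retryCount: 5, attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));

IReadOnlyDictionary<string, string?> initialSettings;
try
{
    // LoadSettingsFromFdbAsync is hypothetical: it would read all the keys
    // from the cluster and return them as a dictionary.
    initialSettings = await retry.ExecuteAsync(() => LoadSettingsFromFdbAsync(CancellationToken.None));
}
catch (Exception ex)
{
    Console.Error.WriteLine($"Could not load initial configuration: {ex.Message}");
    return 1; // terminate the application once the retries are exhausted
}
```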
There is a potential issue with this approach when using fdb: if you want to easily integrate FoundationDB in a modern .NET application, you will probably inject an IFdbDatabaseProvider, and all of this needs a fully built IServiceProvider. If you have to use FoundationDB "before" all of this, it means that you would need to replicate all this logic and call Fdb.Start(...) yourself.

note: all FoundationDB bindings (Java, Rust, Python, Go, .NET, ...) are "just" wrappers around the native client library that handles all the work. In the case of .NET, the binding will P/Invoke into the native C lib and basically wrap all of this into Tasks.

A possible workaround to "asyncify" the IServiceProvider

I've had many design issues with the limitation of "no async" in the DI. The pattern I've ended up using, that works "good enough", is having a "FooProvider" whose only job is to asynchronously produce a fully initialized Foo. The "FooProvider" will handle all the async loading of options/settings, as well as async initialization of the service and pre-loading of caches. The resulting instance - once fully initialized - is cached, so that all subsequent calls get it with almost no overhead.

That's the approach I've taken with IFdbDatabaseProvider. This would be more cumbersome to use, since you would always have to call GetDatabase(...), await it, and then call the ReadAsync(..) or ReadWriteAsync(...) method. To help fix this, I've added extension methods on the IFdbDatabaseProvider itself. So, instead of:
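(The original snippets were lost in extraction; the two blocks below are reconstructions, and the `("settings", "foo")` key is made up:)

```csharp
// Verbose form: resolve the database handle first, then run the transaction.
var db = await dbProvider.GetDatabase(ct);
var value = await db.ReadAsync(tr => tr.GetAsync(TuPack.EncodeKey("settings", "foo")), ct);
```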
you can do:
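```csharp
// Reconstructed sketch (not the original snippet): the extension method on
// the provider awaits the database handle internally before running the
// transaction, so the call site collapses to a single line.
var value = await dbProvider.ReadAsync(tr => tr.GetAsync(TuPack.EncodeKey("settings", "foo")), ct);
```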
Using the same approach, you would need an IConfigurationProvider that is primed asynchronously, before the host is built.

Another solution that leverages FoundationDB

A last word on another solution to this problem, specific to FoundationDB: when you create Layers with fdb and the .NET binding, you can use the mechanism of Deferred Value Checks in order to cache any metadata that is stored in the database itself.

For example, if you have a layer that emulates SQL tables or document collections, each table or collection has a schema that is stored as metadata in the db, alongside the data. This schema can change at any time, but ALL servers in the cluster MUST observe the schema change at the same time (if not, one server lagging behind would silently keep inserting data using the old schema, or fail to update a new index).

The solution is to use Deferred Value Checks as a mechanism to build a distributed cache: any new transaction will attempt to use the cached metadata from the previous call. If there is none, then it will have to read the metadata from the db (using any async reads as required). If there is already a cached value (in memory), it may still be valid, but it is possible that a transaction that committed a microsecond before just changed the metadata. To be robust, the transaction would need to re-check the metadata (using a random token that changes on every edit) by reading the key BEFORE being able to read any other data. This means that you need at least one round-trip to the db before being able to do any work.

Deferred Value Checks are a way for a transaction to "protect" itself against this, without incurring the initial latency cost. When you reuse the cache, you can issue the "check read", but don't need to await it. You can immediately start reading from the db using the cached schema, while the check read is still pending. When the transaction has to commit, any pending check will be awaited. All "check reads" expect a specific value in the db to still be equal to the value it observed previously. If any check fails (the value changed), then the whole transaction is retried, simulating a Read Conflict. On the next try, the cache mechanism will recognize that its data is stale, and will drop the cached data, prompting a full schema reload.

Once you have such a mechanism, it means that it is very easy to use "infrequently changing" data in your code, which is still guaranteed to be up to date, with a way to automatically reload it as soon as it has changed. You are also guaranteed that all servers in the cluster will observe the change at the same time, and there will not be any server that lags behind.
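(To make the idea concrete, a simplified hand-rolled emulation of the pattern; the binding has native support for deferred value checks in its transaction context, but here the "check read" is issued and verified manually for illustration. The key names, the cached token, and the retry-via-NotCommitted trick are all assumptions of this sketch:)

```csharp
// Simplified emulation of a deferred value check. The check read is issued
// first but NOT awaited, so the data read proceeds immediately using the
// cached metadata; the check is only verified at the end.
public static async Task<Slice> ReadDocumentCheckedAsync(IFdbDatabase db, Slice cachedToken, CancellationToken ct)
{
    return await db.ReadAsync(async tr =>
    {
        // Issue the "check read" without awaiting it yet.
        Task<Slice> checkRead = tr.GetAsync(TuPack.EncodeKey("meta", "schema_token"));

        // ... read the actual data, decoded with the cached schema ...
        var data = await tr.GetAsync(TuPack.EncodeKey("docs", "some_doc"));

        // Verify the token before the result is used: if it changed, the
        // whole transaction is retried, and the cache is reloaded on retry.
        var actualToken = await checkRead;
        if (!actualToken.Equals(cachedToken))
        {
            throw new FdbException(FdbError.NotCommitted); // retryable error
        }
        return data;
    }, ct);
}
```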
I think this could be solved by keeping the FoundationDB client / provider in its own DI container. So, in the entry point of the application:
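(The original snippet was lost in extraction; a minimal sketch of the idea, assuming the binding's `AddFoundationDb` DI extension; the API version, option names, and cluster file path are illustrative and may differ between versions:)

```csharp
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;

// A small, dedicated container that owns the fdb client for the whole
// process lifetime, built before (and independently of) the host's container.
var fdbServices = new ServiceCollection();
fdbServices.AddLogging(logging => logging.AddConsole());
fdbServices.AddFoundationDb(720, options =>
{
    options.ConnectionOptions.ClusterFile = "/etc/foundationdb/fdb.cluster"; // made-up path
});
await using var fdbRoot = fdbServices.BuildServiceProvider();

var dbProvider = fdbRoot.GetRequiredService<IFdbDatabaseProvider>();
// ... use dbProvider to prime the configuration, then build the real host ...
```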
If you only intend to use FDB as a distributed IOptions<...> provider and nothing else, then you could simply define it on the initial service provider, and make sure its lifetime is as long as the application itself (or else the DisposeAsync will call Fdb.Stop, which nukes the native client library handle).

But in my case, and probably for most people using FDB, you will also need it for the rest of the application. Since the lifetime of the fdb client is a singleton, and since the instance requires other injected services, like ILogger, IClock, the OpenTelemetry tracing/metrics context and so on, it would be difficult to reuse the singleton created from the initial IServiceProvider, since it would probably have to use its own instances, which are not the same and most probably don't use the same settings (especially logging). I really don't like the fact that the out-of-the-box DI container forces you into a corner like this. It has clearly been designed for static, pre-defined configuration (coming from env variables or .json files).

There is maybe another way to work around this. The main issue is that, if you have singleton services that are injected into other types like API controllers, they need an options value at construction time. Since you will need to support reloading of options at runtime anyway, it means that the values that were true in the constructor could change at any point, so the singleton would either need to defer looking at the options until an actual runtime method is called, OR have a way to reload its internal state from some signal. The trick would be to replace the one-shot options read with a source whose current value can change over time, and which starts in a "None" state until the first async load completes. This could work like this:
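(The original snippet was lost in extraction; a sketch of the shape this could take. `FdbOptionsSource<T>` is a hypothetical name, loosely modeled on the standard options-monitor pattern:)

```csharp
// Hypothetical wrapper: starts with no value ("None"), then is updated by
// the initial async load and by every subsequent change notification.
public sealed class FdbOptionsSource<TOptions> where TOptions : class
{
    private volatile TOptions? _current; // null == "None" (not loaded yet)
    private event Action<TOptions>? _onChange;

    public bool HasValue => _current is not null;

    // Consumers must handle the "None" state, or wait for the first value.
    public TOptions CurrentValue
        => _current ?? throw new InvalidOperationException("Options not loaded yet");

    public IDisposable OnChange(Action<TOptions> listener)
    {
        _onChange += listener;
        return new Unsubscriber(this, listener);
    }

    // Called by the async loader: both the initial load and any later
    // fdb watch notification go through the same method.
    public void Publish(TOptions newValue)
    {
        _current = newValue;
        _onChange?.Invoke(newValue);
    }

    private sealed class Unsubscriber(FdbOptionsSource<TOptions> owner, Action<TOptions> listener) : IDisposable
    {
        public void Dispose() => owner._onChange -= listener;
    }
}
```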
That way, the same config update mechanism would be used for the initial async load, as well as any subsequent updates. The only thing that the implementers would have to take care of is the initial "None" state. This is very similar to using IOptionsMonitor<TOptions>.
My understanding is that this is the flow of events:
So solving the async reload of the IConfiguration, which I think sounds like it's possible, would fix those issues.
This is a common issue in my experience. When configuring the host, you can add your pre-initialised ILoggerProvider instead of allowing the host to create one for you on the fly, which you can't access until after the DI container is built. Applying these kinds of principles, my entry point resembles the steps sketched below:
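(The original steps were lost in extraction; below is a reconstruction of the shape being described, under the assumption that a bootstrap IConfiguration and logging come first. The step contents are illustrative, not the author's exact code:)

```csharp
// Reconstruction of the entry-point shape (not the original code):
// 1) build a minimal bootstrap IConfiguration (local json, env vars)
var bootstrapConfig = new ConfigurationBuilder()
    .AddJsonFile("appsettings.json", optional: true)
    .AddEnvironmentVariables()
    .Build();

// 2) establish logging first, so every later step can log
using var loggerFactory = LoggerFactory.Create(logging => logging.AddConsole());
var logger = loggerFactory.CreateLogger("Startup");

// 3) establish other critical subsystems (e.g. fetch one-time config values)
// ... async work happens here, with logging already available ...

// 4) build the host, handing it the pre-initialised pieces
var builder = WebApplication.CreateBuilder(args);
builder.Configuration.AddConfiguration(bootstrapConfig);
var app = builder.Build();
await app.RunAsync();
```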
If we need to augment our application's configuration with one-time values fetched from a database, we can do that as an async task that fetches some IConfiguration between steps 3) and 4). Note however that this fetch will want to log. Luckily we have logging initialised, so this is no issue. It may also require its own IConfiguration so it knows the db connection string etc. We can build that IConfiguration explicitly.

Basically this approach establishes some subsystems, putting logging as the first subsystem to be established, and overlaying other critical or supporting subsystems; each can have its own IConfiguration in the mix. Interestingly, we need configuration in order to build the subsystem that provides additional configuration! I think this is ok.

Anyway, these are just ideas and my personal opinions, but in this model, if we also needed to share the same fdb client between the bootstrap phase and the rest of the application, it could be owned by one of these subsystems.
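(For concreteness, a small sketch of building that dedicated IConfiguration explicitly; the prefix and key names are made up:)

```csharp
// A tiny, explicit IConfiguration just for the config-fetching subsystem,
// so it knows how to reach the database. In-memory defaults come first so
// that environment variables can override them.
var fetcherConfig = new ConfigurationBuilder()
    .AddInMemoryCollection(new Dictionary<string, string?>
    {
        ["Fdb:ClusterFile"] = "/etc/foundationdb/fdb.cluster", // made-up default
    })
    .AddEnvironmentVariables(prefix: "CONFIGSOURCE_")
    .Build();
```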
How do you deal with third-party libraries that only accept an options value once, in their constructor? If all your code uses a reloadable options source you are fine, but my issue is that I am forced to add more and more external types that themselves require a ton of builders, options, providers, sinks, etc., and the probability that at least one will not play nicely with dynamic options approaches 1 (if it isn't already the case).

It almost looks like the most compatible and safe way would be to move the async option provisioning outside of the process: a bootloader process would query the database, populate an appsettings.json file, and spawn the actual server process (which would read this json file to populate the IConfiguration). The bootloader process would be the one in charge of watching the database for changes, and would simply write a new JSON file with the new configuration on disk. The child process would be configured to automatically reload the config whenever the JSON file changes. Or, if this is not possible (at least one third-party dependency does not support this), the bootloader would kill/restart the process.

This really looks like re-inventing the wheel, and doing what Kubernetes would already do for you...
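(A sketch of the child-process side of that idea, using the standard reloadOnChange support of the JSON configuration provider; the file name is arbitrary:)

```csharp
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Primitives;

// Child process: read the JSON file produced by the bootloader, and
// automatically reload whenever the bootloader rewrites it on disk.
var config = new ConfigurationBuilder()
    .AddJsonFile("generated-settings.json", optional: false, reloadOnChange: true)
    .Build();

// Fires every time the file changes.
ChangeToken.OnChange(config.GetReloadToken, () => Console.WriteLine("Configuration reloaded"));
```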
If they accept the options only once, in their constructor, then they simply won't observe later changes. If this was causing me issues, I'd have to comment on a concrete use case, but for example an extreme solution might be that I need to build this library dependency in its own rebuildable DI container, so I can tear it down and rebuild it whenever the relevant configuration changes.
I see and appreciate the thinking here. It removes the need to use dotnet at all for the FoundationDB integration, and is more akin to a microservices approach. It still requires that we relay the changed state via the file system, and this can be unreliable due to things like locks on files being accessed, or non-atomic writes (if the process dies halfway through flushing to disk).
It is clear that due to some limitations in the .NET way of doing DI, we have to choose which flavor of poison to drink, each with its own set of issues and limitations.

Most if not all of my processes are almost always stateless, with everything (including most of the configuration) stored in a FoundationDB keyspace, and using the caching mechanisms described above for the very "hot" data like schemas and config snapshots. I had to solve this issue well before the introduction of IServiceProvider and IOptions, so I already have something that works. The rest of the settings that are required during startup end up being only things for the infrastructure part (where to log, where to send OTEL data, any credentials/secrets for these), which are usually handled on the hosting side (for ex Kubernetes), which is already responsible for the lifetime of the process anyway. That's probably why I can live with a static IConfiguration that will not change during a process lifetime, and rely on the fact that restarts are very fast and have a limited global impact (if you have a pool of servers, of course; different story if you have a single node!)

Anyway, if you are interested in how you could store a set of key/values in FDB, this can be done very easily with something similar to a "Map" layer, cf https://github.com/Doxense/foundationdb-dotnet-client/blob/master/FoundationDB.Layers.Common/Collections/FdbMap%602.cs. If you look at the implementation, writing keys or reading a single key is trivial; the only trick is reading all the keys in the same transaction (so that you end up with a coherent view of all the settings):

```csharp
public async Task<TValue> GetAsync(IFdbReadOnlyTransaction trans, TKey id)
{
// ...
var data = await trans.GetAsync(this.Subspace[id]).ConfigureAwait(false);
if (data.IsNull) throw new KeyNotFoundException("The given id was not present in the map.");
return this.ValueEncoder.DecodeValue(data)!;
}
public void Set(IFdbTransaction trans, TKey id, TValue value)
{
// ...
trans.Set(this.Subspace[id], this.ValueEncoder.EncodeValue(value));
}
public IAsyncEnumerable<KeyValuePair<TKey, TValue?>> All(IFdbReadOnlyTransaction trans, FdbRangeOptions? options = null)
{
// ...
return trans
.GetRange(this.Subspace.ToRange(), options)
.Select(kv => DecodeItem(this.Subspace, this.ValueEncoder, kv));
}
```

The `All(...)` method uses a `GetRange(...)` to stream all the key/value pairs of the subspace inside a single transaction, so that you get a coherent snapshot of all the settings.

There is a limit of 100 kB maximum value size, so if you have a single value that is more than 100 kB you have to split it. If the value is a JSON object, it could be best to explode it and store each field separately ("obj.foo", "obj.bar", "obj.baz", "obj.foo[2].bar.baz", ...). Some people even explode objects into individual leaves (one entry per field, with the key being the full JSON path in the object), which may or may not be overkill.

This works fine, but there is a limitation of 5 seconds per transaction in FoundationDB, which indirectly limits the total number of bytes that you can read (this depends on the network speed as well as the global load on the cluster). In theory, 5 seconds over a 1 Gbps pipe is already > 500 MB of data, and I don't think that you'd ever need a configuration that big (or we are using a different definition of "configuration"! :) )

If you would ever need to store multiple configurations, each for a different tenant, or maybe for different pools of servers (prod1, prod2, staging, ...), you could simply have multiple different subspaces, each with its own set of key/value pairs, and the GetRange would only stream the keys from that specific subspace. The Directory Layer (standard in all fdb bindings) allows you to split the keyspace into a hierarchy of "subspaces", very similar to how a disk volume is split into folders and subfolders. You could have a subspace location use the tenant id or server pool id as part of the "path" to the subspace that holds all the keys that are part of the same configuration. This way, you don't have to encode the tenant or pool id into every key yourself.
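(To connect this back to the original question: a sketch of snapshotting a whole subspace into a dictionary that could back an in-memory IConfiguration source, reusing the `All(...)` pattern above. The `map` instance is assumed to be an `FdbMap<string, string>` opened elsewhere, and `ReadAsync` on the provider is the extension method discussed earlier in this thread:)

```csharp
// Sketch: read every key/value pair of the configuration subspace in one
// transaction, producing a dictionary suitable for an IConfiguration source.
public static Task<Dictionary<string, string?>> SnapshotConfigurationAsync(
    IFdbDatabaseProvider dbProvider, FdbMap<string, string> map, CancellationToken ct)
{
    return dbProvider.ReadAsync(async tr =>
    {
        var snapshot = new Dictionary<string, string?>(StringComparer.OrdinalIgnoreCase);
        await foreach (var (key, value) in map.All(tr).WithCancellation(ct))
        {
            // Keys would follow the "Section:Key" convention of IConfiguration.
            snapshot[key] = value;
        }
        return snapshot;
    }, ct);
}
```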
Given FoundationDB is known for its ACID transactions across the distributed cluster, and for its key-value nature, I thought perhaps it would naturally fit into a dotnet application as an IConfiguration source.
For example, the application could update the configuration at runtime in FoundationDB using a transaction; that config would be highly available, and the new configuration could be pulled into all instances of the application as a reloaded IConfiguration.
Has anyone thought much about developing an IConfiguration provider that leverages FoundationDB - and would this be something that would or could fit within the confines of this project? Or would it be best reflected as a separate GitHub project that leverages this one as a dependency?