
docs(adr): unified flagd evaluator#1848

Open
aepfli wants to merge 6 commits into `main` from `1842-feature-unified-flagd-evaluator-poc-javapython-reduced-maintenance-poc-and-reference`

Conversation

@aepfli (Member) commented Jan 20, 2026

This PR

This pull request introduces a comprehensive Architecture Decision Record (ADR) detailing the strategic move towards a unified flag evaluation engine within the OpenFeature ecosystem. The core objective is to consolidate the currently fragmented, language-specific evaluation implementations into a single, robust solution. By leveraging Rust for its performance and cross-compilation capabilities (WASM and native bindings), the proposal aims to significantly reduce maintenance burden, ensure consistent behavioral outcomes across all SDKs, and accelerate feature development. The ADR meticulously outlines the scope, technical considerations, and a phased adoption plan, emphasizing a gradual, opt-in migration to ensure stability and community validation.

  • Introduction of Unified Evaluator ADR: This pull request adds a new Architecture Decision Record (ADR) proposing a 'Unified Evaluator' for the OpenFeature ecosystem. This evaluator aims to centralize flag evaluation logic, currently fragmented across various language-specific SDKs.
  • Problem Statement and Proposed Solution: The ADR identifies issues like maintenance overhead, inconsistent behavior, and duplicated effort due to disparate SDK implementations. It proposes a single, canonical evaluator implemented in Rust, designed to compile to WebAssembly (WASM) for portability and native bindings (PyO3, napi-rs) for performance-critical languages.
  • Scope and Responsibilities: The unified evaluator will be responsible for evaluation logic, in-memory flag storage, and JSON schema validation (maximal scope). SDKs will become thinner adapters, focusing on OpenFeature API compliance, language-specific concerns, and sync protocol implementation.
  • Migration Strategy: A phased, experimental opt-in rollout strategy is outlined to minimize risk and gather feedback, eventually leading to the unified evaluator becoming the default implementation across all SDKs.
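As a rough illustration of the division of responsibility the ADR proposes - a shared core that resolves flags, with SDKs as thin adapters - the core's boundary might look something like this (a hypothetical sketch; names and types are illustrative, not the actual flagd-evaluator API):

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Resolution {
    pub variant: String,
    pub reason: &'static str, // e.g. "STATIC" or "TARGETING_MATCH"
}

pub struct Flag {
    pub default_variant: String,
    pub variants: HashMap<String, String>,
    // A real flag would also carry state and JsonLogic targeting rules.
}

/// Resolve a flag against an evaluation context. This sketch only handles
/// the no-targeting case: fall back to the default variant.
pub fn evaluate(flag: &Flag, _context: &HashMap<String, String>) -> Resolution {
    Resolution {
        variant: flag.default_variant.clone(),
        reason: "STATIC",
    }
}
```

Each SDK would then wrap such a function behind the OpenFeature provider interface - via WASM or native bindings - rather than reimplementing the resolution logic itself.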

Related Issues

Fixes #1842

@aepfli aepfli requested review from a team as code owners January 20, 2026 10:00
@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Jan 20, 2026
@netlify netlify bot commented Jan 20, 2026
Deploy Preview for polite-licorice-3db33c ready!

| Name | Link |
|------|------|
| 🔨 Latest commit | d619e81 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/polite-licorice-3db33c/deploys/697bcc2113d81900089ed72a |
| 😎 Deploy Preview | https://deploy-preview-1848--polite-licorice-3db33c.netlify.app |


@gemini-code-assist (bot) left a comment:

Code Review

This pull request introduces a new Architecture Decision Record (ADR) for a unified flag evaluator. The document thoroughly outlines the problem statement, requirements, considered options, and a detailed proposal for a Rust-based evaluator with multiple compilation targets (WASM, native bindings). It also covers performance considerations, migration strategies, and potential consequences.

My review focused on clarity, consistency, and accuracy within the document. I've identified several minor typos and formatting inconsistencies, particularly in technical terms and code examples, which could improve the document's professionalism and readability. Additionally, some strong claims have been softened to reflect more realistic expectations. Overall, this is a well-structured and comprehensive ADR, and addressing these minor points will further enhance its quality.


> **Pros**:
>
> - Native performance
Generated code might not always be as efficient and fast as it could be with real native code

> | Scenario | WASM overhead | Assessment |
> |----------|---------------|------------|
> | Simple flags (no targeting) | ~3-4µs baseline | Negligible |
> | Moderate targeting | 3-5x slower | Acceptable for most use cases |
> | Complex targeting (large context) | 4-6x slower | Serialization dominates |
I think we might be able to reduce the serialization overhead with some caching inside the WASM module
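The "serialization dominates" row matches what a simple thought experiment shows: the evaluation itself is cheap, so flattening a large context to a wire format and re-parsing it on the other side of the boundary can easily cost more than the lookup it wraps. A pure-Rust sketch of that round-trip (the `key=value;` wire format here is only a stand-in for whatever encoding actually crosses the WASM boundary):

```rust
use std::collections::HashMap;

// Stand-in for the actual evaluation: a single context lookup.
pub fn evaluate(ctx: &HashMap<String, String>) -> bool {
    ctx.get("tier").map(|t| t == "beta").unwrap_or(false)
}

// Stand-in for a WASM-style boundary call: the context is flattened to a
// string and re-parsed on the "guest" side before evaluating. (Naive
// encoding for illustration; it assumes keys/values contain no '=' or ';'.)
pub fn evaluate_via_boundary(ctx: &HashMap<String, String>) -> bool {
    let wire: String = ctx.iter().map(|(k, v)| format!("{k}={v};")).collect();
    let parsed: HashMap<String, String> = wire
        .split_terminator(';')
        .filter_map(|kv| kv.split_once('='))
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();
    evaluate(&parsed)
}
```

Timing both paths over a few thousand iterations with a 100+-attribute context makes a multiple-x gap plausible; caching the serialized context inside the module attacks exactly the expensive half.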

@tangenti (Contributor) commented:

Could you share the prompts that you used to generate this doc instead?

I would suggest having a proposal to support bring-your-own JSONLogic engine / evaluator.

Having a mandatory WASM-based evaluation engine is NOT an option for our 1P use cases. We'll have to drop flagd or maintain a significantly different branch of flagd internally if we move forward with this proposal.

CC @toddbaert @cupofcat

@aepfli (Member, Author) commented Jan 20, 2026

my prompt was based on: #1842

> I have this PoC - and i need to formalize this a little in the form of an adr. I do not want you to create a pullrequest, i want you to create the markdown file, for the adr. what do i need to care about, what should i not miss.

And I tweaked it a little with Copilot to be more generic and not too implementation-focused, so I do not think that this will be helpful.


This is an exploration with native bindings, and WASM only where WASM is needed. E.g. Java is precompiled to Java bytecode with Chicory, and Python uses PyO3 to generate bindings. It is more of a showcase to trigger the discussion and reduce the efforts.

- JsonLogic evaluation
- Targeting rule processing
- Fractional evaluation algorithms
- Default variant resolution
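Fractional evaluation is a good example of why this list benefits from a single implementation: every SDK must bucket the same targeting key into the same variant, and any difference in hashing or rounding silently changes assignments. A sketch of the bucketing idea in Rust, using std's `DefaultHasher` as a stand-in (flagd's actual fractional operator specifies its own hash, so real bucket boundaries differ):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Deterministically assign a targeting key to one of several weighted
/// variants. Illustrative only: the hash function and seed here are not
/// the ones flagd specifies, so assignments won't match flagd's.
pub fn fractional<'a>(
    flag_key: &str,
    targeting_key: &str,
    variants: &[(&'a str, u32)],
) -> &'a str {
    let mut hasher = DefaultHasher::new();
    format!("{flag_key}{targeting_key}").hash(&mut hasher);
    let total: u32 = variants.iter().map(|(_, w)| w).sum();
    let bucket = (hasher.finish() % u64::from(total)) as u32;
    let mut acc = 0;
    for &(name, weight) in variants {
        acc += weight;
        if bucket < acc {
            return name;
        }
    }
    variants.last().expect("at least one variant").0
}
```

The same key always lands in the same bucket, which is exactly the property that must hold identically across every SDK.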

There is perhaps a case to be made that some evaluation logic may be left to the SDK, for instance default variant resolution when there is no targeting logic.
It does contradict some of the core principles laid out below, but this would allow for language-specific performance optimizations for specific cases and ease migration cost.

This relates to one big application we maintain where some code paths can sometimes evaluate many flags and paying the cost of ffi overhead is not an option.

The document does mention bulk evaluation, although not in the target API. Perhaps including it in this proposal could be a possible mitigation, and it could be considered as an evolution of the OpenFeature specification?

@aepfli (Member, Author) commented Jan 20, 2026

We are actually also tinkering with the idea of returning a set of feature flags which are pre-evaluated, due to the fact that they do not have a targeting rule. So when populating the store, we would already return those pre-evaluated ones, and not need to jump the barrier again. But this is an additional optimization, which I think is not needed for the initial draft.

open-feature-forking/flagd-evaluator#60

//EDIT: adapted for clarity, and removed wrong term
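The optimization described above - resolving targeting-free flags once at store-population time, so those evaluations never cross the boundary again - could be sketched like this (hypothetical; not the actual flagd-evaluator API from the linked issue):

```rust
use std::collections::HashMap;

pub struct Flag {
    pub default_variant: String,
    pub has_targeting: bool,
}

/// When the store is (re)populated, split flags into results that can be
/// resolved once up front (no targeting rule) and keys that still need a
/// per-evaluation call into the shared engine.
pub fn partition(flags: &HashMap<String, Flag>) -> (HashMap<String, String>, Vec<String>) {
    let mut pre_evaluated = HashMap::new();
    let mut needs_engine = Vec::new();
    for (key, flag) in flags {
        if flag.has_targeting {
            needs_engine.push(key.clone());
        } else {
            pre_evaluated.insert(key.clone(), flag.default_variant.clone());
        }
    }
    (pre_evaluated, needs_engine)
}
```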

@toddbaert (Member) commented:

> Could you share the prompts that you used to generate this doc instead?
>
> I would suggest having a proposal to support bring-your-own JSONLogic engine / evaluator.
>
> Having a mandatory WASM-based evaluation engine is NOT an option for our 1P use cases. We'll have to drop flagd or maintain a significantly different branch of flagd internally if we move forward with this proposal.
>
> CC @toddbaert @cupofcat

@tangenti can you share why? Is it something fundamental to WASM or just the performance / serialization latency?

Personally my main goal here is to write the evaluation engine once instead of building the same thing 6-7 times. I think this is something you've advocated for in the past as well. With Rust, we can also produce native binary targets which we could interface with using things like JNI (in Java) to make native C calls. This avoids the serialization latency, though it creates some portability challenges.

@aepfli (Member, Author) commented Jan 21, 2026

> Could you share the prompts that you used to generate this doc instead?
>
> I would suggest having a proposal to support bring-your-own JSONLogic engine / evaluator.
>
> Having a mandatory WASM-based evaluation engine is NOT an option for our 1P use cases. We'll have to drop flagd or maintain a significantly different branch of flagd internally if we move forward with this proposal.
>
> CC @toddbaert @cupofcat

@tangenti I can understand that this might not fit your use case, and your suggested approach sounds interesting.
But your proposal does not immediately invalidate this one. This one reduces the effort for the maintainers across all languages. On the other hand, having an option to use another engine can co-exist and could even work with this evaluator. Sounds like a nice topic for a separate ADR.

@tangenti (Contributor) commented:

> > Could you share the prompts that you used to generate this doc instead?
> >
> > I would suggest having a proposal to support bring-your-own JSONLogic engine / evaluator.
> >
> > Having a mandatory WASM-based evaluation engine is NOT an option for our 1P use cases. We'll have to drop flagd or maintain a significantly different branch of flagd internally if we move forward with this proposal.
> >
> > CC @toddbaert @cupofcat
>
> @tangenti can you share why? Is it something fundamental to WASM or just the performance / serialization latency?
>
> Personally my main goal here is to write the evaluation engine once instead of building the same thing 6-7 times. I think this is something you've advocated for in the past as well. With Rust, we can also produce native binary targets which we could interface with using things like JNI (in Java) to make native C calls. This avoids the serialization latency, though it creates some portability challenges.

Google does not allow WASM sandboxes/engines for most of the languages. This is a hard NO that we have no influence to change.

I'm fully supportive of a unified evaluation engine, and from what I understand from Simon's proposal and the discussions, having a mandatory WASM-based engine is not necessary here - native bindings from the Rust core are one approach, and bring-your-own evaluator is another option we could explore here.

@aepfli (Member, Author) commented Jan 21, 2026

> I'm fully supportive of a unified evaluation engine, and from what I understand from Simon's proposal and the discussions, having a mandatory WASM-based engine is not necessary here - native bindings from the Rust core are one approach, and bring-your-own evaluator is another option we could explore here.

I am not sure if bring-your-own evaluator needs to be part of this. This ADR focuses on reducing the effort for the maintainers. We still should support a default evaluator, and the current state is less than ideal: slightly different behaviour, slightly different JsonLogic evaluation. This ADR focuses on unifying this, and I do not think we should add more to it.

The discussion around a byoe (bring your own evaluator) might influence this ADR, but it is not the solution to the problem I want to tackle with this ADR: simplifying the effort of maintaining the flagd providers, making it easy to roll out changes to all of the SDKs with one basic change in one repository.

The byoe focuses on flexibility - but instead of reducing the maintenance effort it increases it, hence I would propose a separate ADR for that.

@JamieSinn commented:
DevCycle was heavily aligned to our WASM core, and we found it was generally very handy/helpful, but we had some challenges from doing it the way we did.

Maintaining and building this core was one of the things I was responsible for along with @jonathannorris.

  • We used AssemblyScript instead of a more discrete/shared language, so our target was only WASM, with no shared-library option - this caused a lot of weirdness and drift due to the version we were using being so old/not consistent with the actual current WASM spec. ASC also deliberately did not align to the spec.
  • We had to wrap all access to the WASM core in mutexes due to the shared memory across all accessing threads. This was easier in some languages than others (i.e. Java's synchronized/static).
  • Implementing the language-specific WASM bindings (console.log, error/exception handling, function params, etc.) took up a lot more of our time initially to build out the actual spec (specifically serialization of strings, as AssemblyScript uses a weird variant of UTF-16 called "WTF-16").
  • We found data serialization in a string format was much more expensive, overhead-wise, than just sending a byte array (strings required the buffer allocation and byte shifting to be UTF-16) - we moved to Protobuf as the medium to transfer the variable evaluation requests (DevCycleHQ/js-sdks@77a1335)

Overall - we've become less and less enthused with a WASM core for the following reasons:

  • Single threaded access: Despite WASM internally supporting threads in the spec, access to the memory object shared between the host and the WASM guest is required to be done behind a mutex (depending on the language this is done for you by Wasmtime, our engine of choice). This means that any requests into or out of the core are single thread locked.
  • AssemblyScript cannot target all our destinations of choice. We're limited in the WASM features based on what it implements. It doesn't help that it's a bit of a sunk cost fallacy too.
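The single-threaded-access point is worth making concrete: because the engine owns mutable guest memory, host SDKs end up funnelling every evaluation through one lock, so adding threads adds contention rather than throughput. A minimal Rust sketch of that host-side pattern (the `Engine` here is just a stand-in for a WASM instance):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Stand-in for a WASM engine instance: it owns mutable guest state, so the
// host must serialize all access to it.
pub struct Engine {
    pub evaluations: u64,
}

impl Engine {
    pub fn evaluate(&mut self, _flag_key: &str) -> bool {
        self.evaluations += 1; // mutates shared guest state
        true
    }
}

/// Run `per_thread` evaluations on each of `threads` threads against one
/// shared engine. Every call takes the same lock, so the evaluations are
/// effectively single-file no matter how many threads there are.
pub fn run_parallel(threads: usize, per_thread: u64) -> u64 {
    let engine = Arc::new(Mutex::new(Engine { evaluations: 0 }));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let engine = Arc::clone(&engine);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    engine.lock().unwrap().evaluate("my-flag");
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = engine.lock().unwrap().evaluations;
    total
}
```

Pooling multiple engine instances widens this to n-wide concurrency, but at the cost of duplicating the engine's memory per pool member.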

As a result - I heavily investigated using a rust core instead of our assemblyscript base. I initially started scoping this trying to use WASM as a single output from a rust core library; using as many of the new Rust/rust-bindgen WASM features as possible; such as using multiple memories to ideally parallelize the access to the core based on the frequency of access.
(Small side tangent - the main reason I felt that this was needed is that any config update to the project/environment would require basically a deadlock on all evaluations until the loading and initializing of the config was complete. Any attempts to evaluate variables during this time would have a latency in the 10s of milliseconds, which to me was unacceptable.)

I've made my investigation doc public here: https://www.notion.so/taplytics/Core-Bucketing-ReWrite-Scoping-278931ab956180c7b65efe292df9d4f5 - but my TL;DR of this is that I would suggest, instead of a WASM core, using a C library as the default target - and selecting WASM for languages where that would be a better fit (i.e. NodeJS, due to its native support and the fact the language itself is still "single threaded").

One key thing to note - in my opinion - the difficulty of implementing WASM in a given language is very comparable to the difficulty of implementing it via C Bindings.
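For comparison, the C-bindings surface of a Rust core can stay small: a couple of `extern "C"` entry points plus an explicit free function. A hypothetical sketch (names and the JSON-string convention are illustrative, not the actual bucketing-rs interface; a real build would also mark these `#[no_mangle]` so the symbols export unmangled):

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// C-ABI entry point an FFI consumer (.NET P/Invoke, Java FFM, etc.) would
/// bind against. A real core would look the flag up and run targeting; this
/// sketch just echoes a fixed resolution for the given key.
pub extern "C" fn evaluate_flag(flag_key: *const c_char) -> *mut c_char {
    let key = unsafe { CStr::from_ptr(flag_key) }.to_string_lossy();
    let result = format!("{{\"key\":\"{key}\",\"variant\":\"on\"}}");
    CString::new(result).unwrap().into_raw()
}

/// Strings allocated by Rust must be handed back to Rust to be freed;
/// forgetting this is the classic FFI leak.
pub extern "C" fn free_string(s: *mut c_char) {
    if !s.is_null() {
        unsafe { drop(CString::from_raw(s)) };
    }
}
```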

.NET WASM:
https://github.com/DevCycleHQ/dotnet-server-sdk/blob/main/DevCycle.SDK.Server.Local/Api/WASMLocalBucketing.cs

.NET C/FFI:
https://github.com/DevCycleHQ/bucketing-rs/blob/main/examples/csharp/DevCycleFFI/NativeMethods.cs
https://github.com/DevCycleHQ/bucketing-rs/blob/main/examples/csharp/DevCycleFFI/DevCycleClient.cs

The rust core is open source here: https://github.com/devcyclehq/bucketing-rs - including some of the example implementations.

TL;DR - Rust core is a great idea, outputting to WASM as a default would be an option but you will have a significant performance overhead when running in concurrent workloads.

@erenatas commented:
> One key thing to note - in my opinion - the difficulty of implementing WASM in a given language is very comparable to the difficulty of implementing it via C Bindings.

I share the same sentiment with Jamie, and I'm going to play devil's advocate a bit because I have doubts whether WASM will work well at scale, due to the tradeoff of the single-thread limitation.

I have been thinking and researching about this for a while, and I personally think the priority should be on providing FFI support. For Java, for example, the FFM API provides improvements to call bindings. Technically speaking, you can take the existing Rust flagd provider and extend it to other languages through FFI bindings (I tested this for Perl with sync mode; I can confirm it works).

Also I would expect FFI has less overhead compared to WASM. A feature flag client would be the last thing I'd expect to be a bottleneck of a system.

@erka (Member) commented Jan 24, 2026

@erenatas As a note. WASM is one library, FFI is few libraries (OS * arch == 10?)

@erenatas commented:
> @erenatas As a note. WASM is one library, FFI is few libraries (OS * arch == 10?)

Yes you are right, afaik GitHub actions provide runners for distribution, correct me if I'm wrong.

@aepfli (Member, Author) commented Jan 24, 2026

And e.g. for Java we are still supporting 11. In Java we have Chicory, which allows for build-time compilation, which reduces the costs. The benefit of WASM/Chicory is that we are staying within the JVM boundaries for security.

@erenatas commented Jan 24, 2026

> And e.g. for Java we are still supporting 11. In Java we have Chicory, which allows for build-time compilation, which reduces the costs. The benefit of WASM/Chicory is that we are staying within the JVM boundaries for security.

jni-rs uses the Invocation API. I understand the convenience WASM provides, as interpreters make it agnostic, but for sync mode being single-threaded sounds like an important limitation, considering the potential consumers of such a feature flag client. We have cases that will serve 100k+ RPS, for example, which will use these clients.

If you have the chance to validate this is not an issue, it would be very valuable in my opinion. And I want to state I see value in WASM for open-feature (also in providing feature flag capabilities for frontend).

@erka (Member) commented Jan 24, 2026

> Yes you are right, afaik GitHub actions provide runners for distribution, correct me if I'm wrong.

There are ways to build this for sure, but FFI has its own limitations. Some folks aren’t happy with the library size once all FFI assets are included, and FFI won’t run in a browser. WASM isn’t a silver bullet either - browser vs. non-browser environments and TLS add their own challenges. @erenatas

@beeme1mr (Member) commented:

> browser vs. non-browser environments and TLS add their own challenges.

Browser won't be an issue in this case since we recommended bulk evaluation on the edge to avoid shipping the ruleset to the browser. We could (and probably should) ensure the shared engine can work on edge functions/workers.

I agree with the ADR's overall goal of aligning on a single, unified evaluation engine. Whether that's FFI, WASM, or something else may depend on the language we're integrating with. That's why Rust is such a compelling option to me.

@JamieSinn commented:
> Also I would expect FFI has less overhead compared to WASM. A feature flag client would be the last thing I'd expect to be a bottleneck of a system.

@erenatas - I found it heavily depends on your memory layout and how you're serializing and deserializing the data. In some cases WASM/FFI had very comparable performance, simply due to my ineptitude (and laziness) in building out a rough hack for a test. I think there's something that should be heavily considered around the data model and serialization in between.

Something I've wanted to do as a test is to use Cap'n Proto as the serialization handler for this (https://capnproto.org/) - as we saw benefits with the move from JSON -> Protobuf, I would imagine that we'd gain similarly for Protobuf -> Cap'n Proto, but that's just a hunch.

I will heavily caveat that all my testing was done with Wasmtime as the client-language runtime across all languages simply because that's what I was working with from building out all the other systems. It also is as close to the WASM spec as you can get in my opinion.

> There are ways to build this for sure, but FFI has its own limitations. Some folks aren’t happy with the library size once all FFI assets are included, and FFI won’t run in a browser.

I would personally never really consider the use of this kind of centralized implementation for a client-side library. Using something like OFREP with an edge worker (Cloudflare Workers, AWS Lambda, etc) has worked out extremely well for us with DevCycle - mostly because like you said, library size is critical for front-end deployments. Using this as the backend evaluator from some HTTP request could be entirely viable with something like CF Workers.

> I am not sure if bring-your-own evaluator needs to be part of this. This ADR focuses on reducing the effort for the maintainers. We still should support a default evaluator, and the current state is less than ideal: slightly different behaviour, slightly different JsonLogic evaluation. This ADR focuses on unifying this, and I do not think we should add more to it.

@aepfli - I'm interested in the concept of just implementing a standard interface across all the languages. In theory, this interface could allow for injection regardless of the backing engine. FFI would output a header file anyway, which is the primary consumption interface, ideally allowing for swapping out the library (LGPL-esque). I wouldn't specifically design for it though, and would allow a user to implement the function interface on their own, very similar to how the OF provider spec works for FF vendors.

@JamieSinn commented Jan 25, 2026

> I share the same sentiment with Jamie, and I'm going to play devil's advocate a bit because I have doubts whether WASM will work well at scale, due to the tradeoff of the single-thread limitation.

Commenting on this separately since it's a bit more nuanced, and heavily matters based on what you prioritize.

This is a bit of a story about how we ended up with 3 implementations of our bucketing logic. One in WASM, one in Go, and another in JS (CF Workers mostly - but that's not relevant to this)

I will be the first to admit that WASM is extremely limited in the performance you can squeeze from it when using it within a guest language.
Throughout our entire design and development of the WASM core for DevCycle, I was generally opposed due to the performance limitations, but I couldn't deny that the efficiency of write-once-deploy-everywhere was massively valuable, especially when we had limited resources and time as a startup - and frankly, we didn't really have any performance-sensitive customers (this changed later on).

About a year or so after we built the WASM engine we had some issues with how WASM scaled in languages that are natively concurrent (Go specifically).
This came to a head when one customer of ours needed more performance out of it as part of a migration from a different provider.

Our first attempt was to mess around with how the WASM allocations were set up (using a pool concept):

DevCycleHQ/go-server-sdk#102 (this is a stacked PR with a few that you can click through)
DevCycleHQ/go-server-sdk#106

This kinda worked? It improved performance quite a bit, but it still had some issues with memory performance and usage (as you now have your full alloc * number of pool members).

We messed around with this quite a bit more and started to hit the point of diminishing returns - yet we still hadn't hit the performance goal that was required.
We had some back and forth internally that resulted in my internal hack-branch of a Go-native implementation, which I wrote as a proof of concept, being fully fleshed out and eventually becoming the final output/current implementation today.
DevCycleHQ/go-server-sdk#138
DevCycleHQ/go-server-sdk@474414b

This made our per-variable evaluation timing drop from several hundreds of microseconds to 100-200 nanoseconds. A literal 1000x improvement. This also had the knock-on effect of removing the CGO dependency from our build process - allowing for a much cleaner and easier build for downstream consumers who can't/don't want to have CGO enabled. (Wasmtime required CGO to be enabled.)

Now - this isn't meant to dunk on WASM. Key note here - we only made this change for Go, and kept WASM in all our other language implementations for one major reason: that degree of performance wasn't needed, and we rarely if ever heard complaints from customers about the performance of the WASM engine (ironically, because of WASM, our Ruby SDK was faster than a natively implemented Ruby evaluation engine, same with Python). However - it was entirely single-thread limited, simply due to the nature of the IO between the host memory and the guest memory (which is why most of our benchmarks are set up in two ways: serial and parallel evaluations).

TLDR:

WASM can be set up and implemented in a way that makes it less single-threaded - but I would argue that the complexity of doing so is even higher than implementing an FFI properly and natively (this is from experience), where you benefit from the efficiency of the evaluator flow actually being concurrent, not limited to n-wide concurrency.

@toddbaert (Member) commented Jan 26, 2026

I've made some small amendments in this PR to this branch.

In summary:

  • Migrated Java and .NET primary targets to ?; I don't think we have enough information yet (@aepfli and @JamieSinn are doing some research).
  • Migrated Go primary target to "Go"... So far I haven't identified a reasonable proposal for Go. I don't think calling native code will be practical considering the way that CGO will transitively force our consumers to use it (again, @aepfli and @JamieSinn are doing some research on this).

@erenatas I agree with your takes here.

@erka what are your thoughts, most specifically for Go? Do you think finding a flexible and performant target that will work with Go is realistic? Is there a way to leverage native code without CGO at all? I'm OK with maintaining a Go implementation alongside a Rust one, long term if necessary; it would still be an improvement over the current situation.

@tangenti This proposal is not conclusive, but I'd like to agree on goals. I understand WASM is a "hard no" for you folks - is that true in all runtimes, including JS? Or does this rule mostly apply to Go artifacts?

@erka (Member) commented Jan 26, 2026

I haven’t followed the WASM world for a while, so take what I’m saying here with a grain of salt. Is it possible to run WASM in Go without CGO? Yes - Wazero is there, and I’m sure someone is already pushing an LLM to add WIT support as I write this. Performance-wise, it would likely be the slowest option compared to native code and FFI. One note about FFI, it introduces distribution challenges for Go: when building a Go (single) binary, OS- and architecture-specific FFI artifacts must be shipped (copied) alongside it, which some users find frustrating as it adds complexity to their pipeline. @toddbaert

@toddbaert (Member) commented:

> I haven’t followed the WASM world for a while, so take what I’m saying here with a grain of salt. Is it possible to run WASM in Go without CGO? Yes - Wazero is there, and I’m sure someone is already pushing an LLM to add WIT support as I write this. Performance-wise, it would likely be the slowest option compared to native code and FFI. One note about FFI, it introduces distribution challenges for Go: when building a Go (single) binary, OS- and architecture-specific FFI artifacts must be shipped (copied) alongside it, which some users find frustrating as it adds complexity to their pipeline. @toddbaert

@erka it was actually the FFI that made me worry about CGO. We could hypothetically ship multiple artifacts for various architectures, but to use FFI we would generally have to enable CGO, which would transitively force it on our consumers, if I understand correctly.

@erka
Member

erka commented Jan 27, 2026

Usually, Go + FFI means CGO. There is the purego project trying to work around this, but I've never tried it.

@aepfli
Member Author

aepfli commented Jan 28, 2026

My biggest reason for pushing this is actually not WASM, FFI, or anything like that. My big goal is to have one repository for the evaluator across all languages, so that when we implement a new feature, we do it for all of them at once, ideally with the same version. This way we can at least determine which features each SDK supports, in each language. Starting with WASM makes the migration easy, but it does not limit us from later improving on that, using FFI where it is really feasible, or going language-native.

The easiest starting point would still be WASM, set up as an experimental feature activated via a flag during flagd initialization. This allows for experimentation and would lay the groundwork for a unified monorepo for the evaluator: a foundation to iterate and improve. Ideally it would be distributed via artifacts uploaded to the respective package managers for each language, versioned and ready to be updated with Renovate etc.
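As a rough sketch of that opt-in idea (all names below are hypothetical; the real flagd provider options differ per SDK), the experimental evaluator could be selected via an initialization flag that defaults to the existing in-process implementation:

```go
package main

import "fmt"

// ProviderOptions is a hypothetical flagd initialization config; the field
// name UseWasmEvaluator is illustrative, not an existing flagd option.
type ProviderOptions struct {
	Host             string
	Port             int
	UseWasmEvaluator bool // experimental, off by default
}

// selectEvaluator picks the evaluation engine based on the opt-in flag.
func selectEvaluator(opts ProviderOptions) string {
	if opts.UseWasmEvaluator {
		return "unified-wasm" // shared Rust-built engine, loaded as WASM
	}
	return "builtin" // current language-native evaluator, the safe default
}

func main() {
	fmt.Println(selectEvaluator(ProviderOptions{Host: "localhost", Port: 8013}))
	fmt.Println(selectEvaluator(ProviderOptions{UseWasmEvaluator: true}))
}
```

Keeping the built-in evaluator as the default means nothing changes for existing users until the unified engine has been validated.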

Member

@toddbaert toddbaert left a comment


I'm approving, and leaving this in proposed state, since it doesn't have a conclusion or implementation tickets.

This ADR is, at this point, just a transparent statement of a goal, and not an implementation proposal.

I would like to merge this. Merging it does not necessarily mean any particular implementation has been decided on, just that some community members will be working and researching in this direction.

cc @aepfli @tangenti @JamieSinn @erenatas

aepfli and others added 4 commits January 28, 2026 08:04
Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
Amending current ADR.

Signed-off-by: Todd Baert <todd.baert@dynatrace.com>
@toddbaert toddbaert force-pushed the 1842-feature-unified-flagd-evaluator-poc-javapython-reduced-maintenance-poc-and-reference branch from 13ea4e8 to 6e433d8 Compare January 28, 2026 13:05
…apython-reduced-maintenance-poc-and-reference
@erenatas

Usually, Go + FFI means CGO. There is the purego project trying to work around this, but I've never tried it.

I'm not familiar with the Go landscape; could you share any docs/articles I can read to get better context? 🙏

I also want to thank everyone for their efforts and input; it's invaluable to be able to share opinions freely.

@tangenti
Contributor

@toddbaert

Re #1848 (comment):

AFAIK, it's only allowed for the frontend usage of TypeScript.

Re #1848 (review):

Should we instead check in a shorter version of the ADR, one that emphasizes the agreed goal of unifying the evaluation engine rather than mandating the use of WASM? If not, let's add a warning at the beginning of the doc that this is not an approved design.

@toddbaert
Member

toddbaert commented Jan 29, 2026

@toddbaert

Re #1848 (comment):

AFAIK, it's only allowed for the frontend usage of TypeScript.

Re #1848 (review):

Should we instead check in a shorter version of the ADR, one that emphasizes the agreed goal of unifying the evaluation engine rather than mandating the use of WASM? If not, let's add a warning at the beginning of the doc that this is not an approved design.

I'll add something at the top of the doc. I still think the proposals/discussion are worth keeping, so I don't want to remove anything. I'll make it clear we haven't decided on a solution yet, and that we will seek to find alternatives (such as "Bring Your Own Evaluator") if we cannot come to consensus on targets/runtimes.

@tangenti done. See: d619e81

Signed-off-by: Todd Baert <todd.baert@dynatrace.com>

@toddbaert toddbaert requested review from alxckn and chrfwow January 30, 2026 20:10

Labels

size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Unified flagd-evaluator PoC (Java/Python/...) – reduced maintenance, PoC, and reference

9 participants