Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NEW] Disable LUA integration during build #1204

Open
hpatro opened this issue Oct 21, 2024 · 21 comments · May be fixed by #1245
Open

[NEW] Disable LUA integration during build #1204

hpatro opened this issue Oct 21, 2024 · 21 comments · May be fixed by #1245
Labels
major-decision-pending Major decision pending by TSC team

Comments

@hpatro
Copy link
Contributor

hpatro commented Oct 21, 2024

Based on my observation, there are multiple CVEs occurring due to bugs/issues found in Lua engine/debugger. If a user doesn't have any use case of using Lua scripts and if they can remove it from the Valkey binary, they don't need to address the patching of security vulnerabilities in their system.
I found a similar request on Stack Overflow. Currently, a user could disable LUA usage via ACL rules by restricting EVAL/SCRIPT but I think they would still need to patch their servers within a certain period of time.

Could we make LUA optional ? This could also help us building right set of API(s) for users who rely on LUA to address functionalities not supported by server like CAS.

CVE(s): https://docs.aws.amazon.com/AmazonElastiCache/latest/dg/cve.html https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=Lua+Redis

@hpatro
Copy link
Contributor Author

hpatro commented Oct 21, 2024

@valkey-io/core-team Thoughts?

@madolson
Copy link
Member

You could just use ACLs with -@scripting right?

@hpatro
Copy link
Contributor Author

hpatro commented Oct 21, 2024

You could just use ACLs with -@scripting right?

Like I already mentioned, it could be avoided with ACL rules but the vulnerability is still part of the binary and the administrator has to be careful about the default permission of each new user. I think it would be much nicer security default if they don't build LUA into the binaries.

@madolson
Copy link
Member

I think it would be much nicer security default if they don't build LUA into the binaries.

@dmitrypol You mentioned once that you had some thoughts around building without LUA.

I guess my concern is just that it sort of bifurcates the entire experience. Some people won't have lua scripts and they'll be annoyed. There are a lot of time we say "no" to a feature because you can implement it with lua.

@hpatro
Copy link
Contributor Author

hpatro commented Oct 21, 2024

I guess my concern is just that it sort of bifurcates the entire experience. Some people won't have lua scripts and they'll be annoyed. There are a lot of time we say "no" to a feature because you can implement it with lua.

This is what I also want to understand more about. Do we add friction to the dev experience by shoo-ing them away by saying "Use LUA". LUA usage is also a double edged sword, some users abuse it to an extent that you don't know if it was actually intended the way it is. It also becomes an operational nightmare with long running scripts.

@hpatro
Copy link
Contributor Author

hpatro commented Oct 23, 2024

Would be nice if folks from the community can add why/how they use LUA and if they find some API missing in Valkey forcing them to build LUA scripts.

@hpatro
Copy link
Contributor Author

hpatro commented Oct 23, 2024

Based on my research, the use cases I've found where users have to rely on LUA #1215 #1216. Will update the comment as I find more.

@PingXie
Copy link
Member

PingXie commented Oct 24, 2024

  1. Do we have data on how Lua-related CVEs compare to other types of CVEs historically? Specifically, does Lua contribute a significantly higher number of CVEs, or is it comparable to other components?

  2. I'm open to making Lua support optional, but I don't think we should start by building it as a separate module or creating a Lua-free Docker image. This IMO is a regression and could lead to fragmentation, which echoes @madolson's concern. However, allowing users to remove Lua at build time is worth discussing, provided we have data (from question 1 above) that shows Lua is a significant contributor to security vulnerabilities.

@zuiderkwast
Copy link
Contributor

Agree with @PingXie that we can do add some make BUILD_LUA=no to skip Lua. We will need some ifdef so to make EVAL returns -ERR Lua not included in this build. If it's not too many ifdefs, then I think it can be OK, but if's a lot of them, then maybe it's a mess.

Usually we don't want compile-time configuration, like ifdefs for various features, but this is an optional dependency. It's different.

@zuiderkwast
Copy link
Contributor

  • Do we have data on how Lua-related CVEs compare to other types of CVEs historically? Specifically, does Lua contribute a significantly higher number of CVEs, or is it comparable to other components?

The CVEs I can find are these two, listed on https://www.cvedetails.com/version/1198481/LUA-LUA-5.1.html

  1. CVE-2014-5461
    Buffer overflow in the vararg functions in ldo.c in Lua 5.1 through 5.2.x before 5.2.3 allows context-dependent attackers to cause a denial of service (crash) via a small number of arguments to a function with a large number of fixed arguments.
  2. CVE-2021-43519
    Stack overflow in lua_resume of ldo.c in Lua Interpreter 5.1.0~5.4.4 allows attackers to perform a Denial of Service via a crafted script file.

In our deps/README.md, we have described the patches we have to our vendored Lua 5.1. It's unclear if we have mitigated both CVEs.

Currently we have at least the following differences between official Lua 5.1
and our version:

  1. Makefile is modified to allow a different compiler than GCC.
  2. We have the implementation source code, and directly link to the following external libraries: lua_cjson.o, lua_struct.o, lua_cmsgpack.o and lua_bit.o.
  3. There is a security fix in ldo.c, line 498: The check for LUA_SIGNATURE[0] is removed in order to avoid direct bytecode execution.
  4. In lstring.c, the luaS_newlstr function's hash calculation has been upgraded from a simple hash function to MurmurHash3, implemented within the same file, to enhance performance, particularly for operations involving large strings.

If we have patches for all the CVEs, we should definitely document it better, so users don't feel they need to disable Lua.

We never updated Lua because we don't want to break any scripts. I wonder how big that risk is. Considering Lua 5.1 is end-of-life since 2012, maybe we should reconsider this decision at some point? If we do, then I believe LuaJIT is the better option compared to Lua 5.4.

@hpatro
Copy link
Contributor Author

hpatro commented Oct 28, 2024

  1. Do we have data on how Lua-related CVEs compare to other types of CVEs historically? Specifically, does Lua contribute a significantly higher number of CVEs, or is it comparable to other components?

Recent one around LUA:
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-24736
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-24735
https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2022-24834

There were lot more which antirez fixed in the past http://antirez.com/news/119 (when the project wasn't focused around security aspect)

  1. I'm open to making Lua support optional, but I don't think we should start by building it as a separate module or creating a Lua-free Docker image. This IMO is a regression and could lead to fragmentation, which echoes @madolson's concern. However, allowing users to remove Lua at build time is worth discussing, provided we have data (from question 1 above) that shows Lua is a significant contributor to security vulnerabilities.

I know it will cause fragmentation but I think we should consider reducing the attack vector(s) possible on Valkey and for easy migration we would need to provide concrete alternatives.

We never updated Lua because we don't want to break any scripts. I wonder how big that risk is. Considering Lua 5.1 is end-of-life since 2012, maybe we should reconsider this decision at some point? If we do, then I believe LuaJIT is the better option compared to Lua 5.4.

This is another tangent we can think about.

@hwware
Copy link
Member

hwware commented Oct 29, 2024

I have some experience on Lua script issue (It is really annoyed), but I have no too much strong point to disable it during build:

  1. Compared with other CVE, the number of lua is smaller
  2. Lua script is a traditional call way in Redis and Valkey, it brings little benefit for Valkey if disable it.

@madolson
Copy link
Member

My preference is we leave this issue open and see if anyone actually wants to build Valkey without Lua. AWS won't want to build without Lua. I find it unlikely the other managed providers will want to build without Lua. I don't want to add the flag without a flag end-user for it.

@hpatro
Copy link
Contributor Author

hpatro commented Oct 29, 2024

Yes, I think that's reasonable. Let's hear out the community and understand better. 👍

@neomantra
Copy link
Contributor

The comments here dovetail with #1229, but I need to put it here as foundation for my vote for disabling Lua integration.

For a while, I managed my own fork of Redis with LuaJIT, modifying the build process.

Even though LuaJIT is significantly faster than Lua, Redis Lua Scripting uses the Lua C API which is inherently slow (albeit multiples faster with LuaJIT's implementation). All the Redis arguments are marshalled onto a stack, the EVAL handler called via a stack, then the result unmarshalled back off the stack. LuaJIT didn't make a huge difference for many of our workloads by itself -- the stack/CAPI would tend to dominate. I eventually abandoned the fork, except for some specific workloads that needed LuaJIT FFI features or tight math loops.

That work was from 2013 (!)... once Modules were introduced (2016!), I was excited to experiment with access to it via modules with LuaJIT FFI. The Module API way was such a natural way to interact with the Redis core and the LuaJIT FFI is very efficient. As I noted in one of Redis' Lua issues, such a module with dynamically linked LuaJIT conflicts with the statically linked Lua. But I did experiment with this on another LuaJIT-forked Redis. I never used it in production, but I found it ironic that other languages can use the Module API but Lua could not.

If Lua were disabled at build-level, such a LuaJIT Module could load its own LuaJIT (or the system LuaJIT, etc). Or get a LuaJIT EVAL that way too? Perhaps EVAL itself wouldn't exist if the Modules API had already existed. Currently, no work in this direction can happen.

Beyond all that, removing Lua is a simple hardening technique for the vast amount cases that don't use Lua at all.

Thanks for all the stewardship with ValKey.

@zuiderkwast
Copy link
Contributor

zuiderkwast commented Nov 1, 2024

@neomantra Thanks for providing the use cases for this.

Running Lua FFI and module API from EVAL is a big security issue in setups where not all clients can be guaranteed to be inside a trusted environement. I understand you run your clients inside a local or at least trusted network, which is very valid.

The official Lua support is sandboxed (at least it's supposed to be), because clients can not always be trusted, while the module API is not. Manage database service providers obviously don't allow the customers to use their own modules, but not all setups are like that.

IMO, we should not disable Lua by default, but only provide a way to compile without it, so users who want this don't need a patched Valkey version to do that. I would accept a PR that adds a make variable for this, such as BUILD_LUA=no, with relevant ifdefs for the scripting commands.

@neomantra
Copy link
Contributor

BUILD_LUA=no would work great and allow one to develop with other Lua runtimes without maintaining a fork (my main problem). Orgs developing and deploying modules can handle their own Valkey builds/deployments. Thanks!

@hpatro
Copy link
Contributor Author

hpatro commented Nov 1, 2024

Thanks @neomantra for explaining your setup and use case. 👍

neomantra added a commit to neomantra/valkey that referenced this issue Nov 1, 2024
Adds the ability to disable Lua scripting using the build
conifguration `USE_LUA`.  This will prevent the building
of the Lua static library and will keep Lua and most Lua
scripting internals out of the build.

This approach strived to minimize the change surface, stubbing out
keys requried functions and otherwise `ifdef`ing out large swathes of code.

The base Lua scripting commands like `EVAL` remain intact, but reply
with an error message `Lua scripting disabled`.  `INFO` commands
still include scripting statistics to prevent breaking any DevOps scripts, etc.
neomantra added a commit to neomantra/valkey that referenced this issue Nov 1, 2024
Adds the ability to disable Lua scripting using the build
conifguration `USE_LUA`.  This will prevent the building
of the Lua static library and will keep Lua and most Lua
scripting internals out of the build.

This approach strived to minimize the change surface, stubbing out
keys requried functions and otherwise `ifdef`ing out large swathes of code.

The base Lua scripting commands like `EVAL` remain intact, but reply
with an error message `Lua scripting disabled`.  `INFO` commands
still include scripting statistics to prevent breaking any DevOps scripts, etc.

Signed-off-by: Evan Wies <[email protected]>
@hpatro
Copy link
Contributor Author

hpatro commented Nov 1, 2024

@neomantra Could you submit a PR against valkey?

@neomantra
Copy link
Contributor

I did earlier today... working on test coverage now.

#1245

neomantra added a commit to neomantra/valkey that referenced this issue Nov 2, 2024
Adds the ability to disable Lua scripting using the build
conifguration `USE_LUA`.  This will prevent the building
of the Lua static library and will keep Lua and most Lua
scripting internals out of the build.

This approach strived to minimize the change surface, stubbing out
keys requried functions and otherwise `ifdef`ing out large swathes of code.

The base Lua scripting commands like `EVAL` remain intact, but reply
with an error message `Lua scripting disabled`.  `INFO` commands
still include scripting statistics to prevent breaking any DevOps scripts, etc.

Signed-off-by: Evan Wies <[email protected]>
neomantra added a commit to neomantra/valkey that referenced this issue Nov 2, 2024
Adds the ability to disable Lua scripting using the build
conifguration `USE_LUA`.  This will prevent the building
of the Lua static library and will keep Lua and most Lua
scripting internals out of the build.

This approach strived to minimize the change surface, stubbing out
keys requried functions and otherwise `ifdef`ing out large swathes of code.

The base Lua scripting commands like `EVAL` remain intact, but reply
with an error message `Lua scripting disabled`.  `INFO` commands
still include scripting statistics to prevent breaking any DevOps scripts, etc.

Signed-off-by: Evan Wies <[email protected]>
@neomantra
Copy link
Contributor

With this Lua-less Valkey, I was able to update my previous Module work -- migrated it to valkey-mod_luajit -- and was able to build a module linked with LuaJIT and load it into Valkey. It's just a POC and very rough, but at least it's possible to play now.

$ valkey-cli 
127.0.0.1:6379> module load /home/evan/projects/valkey-mod_luajit/libvalkey-mod_luajit.so
OK
127.0.0.1:6379> luajit.eval "VKM.EvalCtx():ReplyWithSimpleString('hello from LuaJIT')"
hello from LuaJIT

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
major-decision-pending Major decision pending by TSC team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants