Unsafe memoization in PFLUA #1513

Open
@alexandergall

Description

In a multi-process setup, the ipfix probe program occasionally crashes on startup with an assertion failure in the PFLUA optimizer. Here is an excerpt of such an event:

../lib/pflua/src/pf/optimize.lua:0: attempt to compare string with number

Stack Traceback
===============
(1) Lua metamethod '__lt' at file 'core/main.lua:172'
        Local variables:
         (*temporary) = string: "../lib/pflua/src/pf/optimize.lua:0: attempt to compare string with number"
(2) Lua upvalue 'd' at file '../lib/pflua/src/pf/optimize.lua:0'
        Local variables:
         (*temporary) = string: "[]"
         (*temporary) = string: "[]"
         (*temporary) = nil
         (*temporary) = nil
         (*temporary) = nil
         (*temporary) = string: "attempt to compare string with number"
(3) Lua upvalue '' at file '../lib/pflua/src/pf/optimize.lua:93'
        Local variables:
         (*temporary) = table: 0x7fedcb989ff8  {1:[], 2:23, 3:1}
         (*temporary) = table: 0x7fedcaae5118  {1:(}
         (*temporary) = number: 1
         (*temporary) = number: 3
         (*temporary) = number: 1
         (*temporary) = number: 1
         (*temporary) = number: 2
(4) Lua upvalue '' at file '../lib/pflua/src/pf/optimize.lua:78'
        Local variables:
         (*temporary) = table: 0x7fedcb989ff8  {1:[], 2:23, 3:1}
         (*temporary) = nil
...

The problem disappears when the memoize() wrapper is removed from the definition of cfkey:

cfkey = memoize(function (expr)

The function cfkey() takes a Lua table (expr) as input, and that table is used as the lookup key in the cfkey_cache table. The validity of the memoization relies on two assumptions about expr:

  • it's immutable
  • it never goes out of scope between calls to cfkey()
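
To illustrate what keying a cache on a table means, here is a minimal sketch. It assumes a simplified memoize() and a made-up key function key_of(); both are hypothetical stand-ins, not the PFLUA code. It only demonstrates the immutability requirement, which is deterministic. It does not reproduce the garbage-collection case.

```lua
-- Hypothetical stand-in for a memoize() keyed on the argument table itself;
-- this is NOT the PFLUA implementation.
local function memoize(f)
   local cache = {}
   return function (arg)
      local result = cache[arg]      -- lookup by table identity, not contents
      if result == nil then
         result = f(arg)
         cache[arg] = result
      end
      return result
   end
end

-- Hypothetical key function: joins the array part of expr into a string.
local key_of = memoize(function (expr)
   local parts = {}
   for i = 1, #expr do parts[i] = tostring(expr[i]) end
   return table.concat(parts, " ")
end)

local expr = { "[]", 23, 1 }
local before = key_of(expr)      -- computed and cached
expr[2] = 42                     -- mutate in place; the table's identity is unchanged
local after = key_of(expr)       -- cache hit: the result still reflects the old contents

local copy = { "[]", 42, 1 }
local fresh = key_of(copy)       -- a different table, so the result is computed anew

print(before, after, fresh)
```

Here after equals before even though expr has changed, while fresh reflects the new contents. A cached result is tied to the identity of the table, so it is only as valid as the guarantee that this identity always denotes the same contents.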

The first assumption must hold because only the address of the expr table is used as the key into the cache, so a later change to the table's contents would not be reflected in the cached result. The second assumption guarantees that the table is not garbage-collected and its address reused by a different table between calls. At least the second assumption appears to be violated. To confirm this, modify

function optimize_inner(expr)

by inserting an explicit GC run

function optimize_inner(expr)
   expr = simplify(expr, true)
   expr = simplify(cfold(expr, {}), true)
   collectgarbage()  -- added for debugging: force a full GC cycle at this point
   expr = simplify(infer_ranges(expr), true)
   expr = simplify(lhoist(expr), true)
   clear_cache()
   return expr
end

The problem can then be triggered by the following script

local pf = require("pf")

local filter = "(ip or ip6) and tcp and (dst port 80 or dst port 443 or dst port 8443)"
-- Keep every compiled filter referenced so that the results stay alive.
local foo = {}
for n = 1, 300 do
   foo[#foo+1] = pf.compile_filter(filter)
end

On my system, the assertion failure occurs in roughly 1 out of 5 runs of this program.
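
To see which iteration fails and with what message, the loop can be wrapped in pcall. The following is a rough sketch; count_failures() is a hypothetical helper, not part of PFLUA, and it takes the compile function as a parameter so that it can be tried without PFLUA installed. Note that once a call fails inside optimize_inner(), the clear_cache() at the end of that function is not reached, so later iterations may be affected by leftover cache entries.

```lua
-- Hypothetical helper: call compile(filter) `runs` times under pcall and
-- record the iteration number and error message of every failure.
local function count_failures(compile, filter, runs)
   local failures = {}
   local results = {}   -- keep successful results referenced, like `foo` in the script
   for n = 1, runs do
      local ok, result = pcall(compile, filter)
      if ok then
         results[#results + 1] = result
      else
         failures[#failures + 1] = { run = n, err = tostring(result) }
      end
   end
   return failures, results
end

-- Usage against PFLUA (assumes pflua is on package.path):
-- local pf = require("pf")
-- local filter = "(ip or ip6) and tcp and (dst port 80 or dst port 443 or dst port 8443)"
-- local failures = count_failures(pf.compile_filter, filter, 300)
-- for _, f in ipairs(failures) do print(f.run, f.err) end
```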

The issue should really be opened in https://github.com/Igalia/pflua but that code appears to be abandoned.
