Friday, August 2, 2019

How to systematically populate a whitelist for a sandboxing program?

On pp. 260-263 of Programming in Lua (4th ed.), the author discusses how to implement "sandboxing" (i.e. the running of untrusted code) in Lua.

When it comes to limiting the functions that untrusted code can run, he recommends a "whitelist approach":

We should never think in terms of what functions to remove, but what functions to add.


This question is about tools and techniques for putting this suggestion into practice. (I expect there will be confusion on this point, so I want to emphasize it up front.)


The author gives the following code as an illustration of a sandbox program based on a whitelist of allowed functions. (I have added or moved around some comments and removed some blank lines, but I have copied the executable content verbatim from the book.)

-- From p. 263 of *Programming in Lua* (4th ed.)
-- Listing 25.6. Using hooks to bar calls to unauthorized functions

local debug = require "debug"
local steplimit = 1000    -- maximum "steps" that can be performed
local count = 0           -- counter for steps

local validfunc = {       -- set of authorized functions
  [string.upper] = true,
  [string.lower] = true,
  ... -- other authorized functions
}

local function hook (event)
  if event == "call" then
    local info = debug.getinfo(2, "fn")
    if not validfunc[info.func] then
      error("calling bad function: " .. (info.name or "?"))
    end
  end
  count = count + 1
  if count > steplimit then
    error("script uses too much CPU")
  end
end

local f = assert(loadfile(arg[1], "t", {}))  -- load chunk
debug.sethook(hook, "", 100)                 -- set hook
f()                                          -- run chunk

Right off the bat, I am puzzled by this code: the hook tests for the event type (if event == "call" then ...), and yet, when the hook is set, only count events are requested (debug.sethook(hook, "", 100)). With that mask the hook never receives a "call" event, so the whole song-and-dance with validfunc is for naught.
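
The effect of the mask can be checked in isolation. The following is a minimal sketch, assuming a stock Lua 5.3 interpreter; the helper name events_for is hypothetical and not part of the book's listing. It records which event names a hook receives under a given mask and count:

```lua
-- Minimal sketch: which event names does a hook receive for a given mask?
-- `events_for` is a hypothetical helper, not part of the book's listing.
local function events_for(mask, count)
  local seen = {}
  local function record(event) seen[event] = true end
  debug.sethook(record, mask, count)
  local _ = string.upper("x")   -- a call that a "c" mask would report
  debug.sethook()               -- remove the hook again
  return seen
end

local count_only = events_for("", 1)    -- empty mask, as in the book's listing
local with_calls = events_for("c", 1)   -- mask that also requests call events

print("count-only mask saw call:", count_only.call, "count:", count_only.count)
print("call mask saw call:", with_calls.call, "count:", with_calls.count)
```

With the empty mask the hook should only ever be invoked for "count" events, which is why the event == "call" branch in the listing is never reached.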

Maybe it is a typo, so I tried experimenting with this code, but I found it very difficult to put the whitelist technique into practice. The example below is a very simplified illustration of the type of problem I ran into.

First, here is a slightly modified version of the author's code.

#!/usr/bin/env lua5.3
-- Filename: sandbox
-- ----------------------------------------------------------------------------
local debug = require "debug"

local steplimit = 1000    -- maximum "steps" that can be performed
local count = 0           -- counter for steps

local validfunc = {       -- set of authorized functions
  [string.upper] = true,
  [string.lower] = true,
  [io.stdout.write] = true,
  -- ...    -- other authorized functions
}

local function hook (event)
  if event == "call" then
    local info = debug.getinfo(2, "fnS")
    if not validfunc[info.func] then
      error(string.format("calling bad function (%s:%d): %s",
                          info.short_src, info.linedefined, (info.name or "?")))
    end
  end
  count = count + 1
  if count > steplimit then
    error("script uses too much CPU")
  end
end

local f = assert(loadfile(arg[1], "t", {}))     -- load chunk
validfunc[f] = true
debug.sethook(hook, "c", 100)                   -- set hook
f()                                             -- run chunk

The most significant differences in the second snippet relative to the first one are:

  1. the call to debug.sethook has "c" as mask;
  2. the f function for the loaded chunk gets added to the validfunc whitelist;
  3. io.stdout.write is added to the validfunc whitelist.

When I use this sandbox program to run the one-line script shown below:

# Filename: helloworld.lua

io.stdout:write("Hello, World!\n")

...I get the following error:

% ./sandbox helloworld.lua
lua5.3: ./sandbox:20: calling bad function ([C]:-1): __index
stack traceback:
    [C]: in function 'error'
    ./sandbox:20: in function <./sandbox:16>
    [C]: in metamethod '__index'
    helloworld.lua:3: in local 'f'
    ./sandbox:34: in main chunk
    [C]: in ?

I tried to fix this by adding the following to validfunc:

  [getmetatable(io.stdout).__index] = true,

...but I still get pretty much the same error.
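
Two small checks may help narrow down why that addition changes nothing. This is only a sketch of things that can be verified in a stock Lua 5.3 interpreter, not a full diagnosis of the error above. First, in the standard io library the __index field of a file handle's metatable is a table of methods rather than a function, so using it as a key in validfunc cannot match any function the hook sees. Second, sandbox loads the chunk with {} as its environment, so the global name io is not visible inside the chunk at all. The name probe below is hypothetical.

```lua
-- Check 1: __index on the file metatable is a table, not a function.
local index = getmetatable(io.stdout).__index
print(type(index))                      --> table
print(index.write == io.stdout.write)   --> true

-- Check 2: inside a chunk whose environment is an empty table,
-- the global name `io` resolves to nil.
local probe = assert(load("return io", "probe", "t", {}))
print(probe())                          --> nil
```

Whatever ends up in the whitelist, the chunk can only call functions that are also reachable through the environment table passed to loadfile.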


I have two related questions:

  1. What can I add to validfunc so that sandbox will run helloworld (as is) to completion?
  2. More importantly, what is a systematic way to determine what to add to a whitelist table?

Part (2) is the heart of this post. I am looking for tools/techniques that remove the guesswork from the problem of populating a whitelist table.

(I know that I can get helloworld to work if I replace io.stdout:write with print, and register print in sandbox's validfunc, but doing this does not answer the general question of how to systematically determine what needs to be added to the whitelist to allow specific code.)
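
To make the kind of tool I have in mind concrete, here is one possible approach, offered as a sketch rather than an established recipe: run a trusted sample of the code under a hook that records every called function instead of rejecting it, then review the log before copying entries into validfunc. The helper name record_calls and the label format are hypothetical, and this only makes sense for code that is safe to run unrestricted.

```lua
-- Hypothetical discovery helper: run `fn` under a call hook that logs each
-- distinct function it sees, instead of raising an error.
local function record_calls(fn, ...)
  local called = {}   -- function value -> human-readable label
  local function hook()
    local info = debug.getinfo(2, "fnS")   -- the function being called
    if info and info.func and not called[info.func] then
      called[info.func] = string.format("%s (%s:%d)",
        info.name or "?", info.short_src, info.linedefined)
    end
  end
  debug.sethook(hook, "c")
  local ok, err = pcall(fn, ...)
  debug.sethook()
  -- Note: the helper's own calls to pcall and debug.sethook show up in the
  -- log as well, and need to be discounted when reviewing it.
  return called, ok, err
end

-- Usage: log what a small trusted function calls.
local sample = function ()
  local s = string.upper("hello")
  return s
end

local called, ok = record_calls(sample)
for _, label in pairs(called) do
  print(label)
end
```

The keys of the returned table are the function values themselves, so reviewed entries can be copied directly into a whitelist table keyed the same way as validfunc.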




