Serpent: Lua serializer and pretty printer

While working on the Lua debugger for ZeroBrane Studio, I realized that I needed a table serializer to send complex structures, such as stack traces and table values, between components of the debugger. Serpent is the Lua serializer and pretty printer I wrote to do the job.

Lua doesn't provide a built-in method to serialize a table mostly due to drastically varying requirements that may need to be handled by such a method (as has been discussed in this thread). My first stop was the TableSerialization page, which provided a lot of useful information on how to do table serialization with numerous examples on how this has been done so far.

The review of the existing modules revealed that none of them satisfied my requirements: (1) pure Lua (this excluded modules like PlutoLibrary), (2) does both pretty printing and robust serialization (this excluded serialize.lua from metalua), (3) handles shared and self-references (this excluded many implementations), (4) serializes keys of various types, including tables as keys (this excluded pretty.lua from Penlight), and (5) is short and doesn't have too many dependencies to be included with another module (this excluded tserialize/tpretty from lua-nucleo).

serialize.lua from metalua comes close, but it doesn't handle global functions or math.huge numbers, and doesn't do pretty printing (which would require a different method); table.tostring and serialize combined would be almost half the size of the debugger code I wanted to include them with.

pretty.lua from Penlight does pretty printing, but relies on two other modules and doesn't handle boolean values and tables as keys, shared tables, or self-references.

tserialize.lua from lua-nucleo also does a good job serializing various types (and works correctly with math.huge numbers, which the other two modules don't serialize in a portable way), but doesn't handle functions as either keys or values and doesn't degrade nicely (as it generates invalid output when functions are present).

My preference is a result that is as readable as possible while still being a valid fragment that I can load with loadstring. For example, I want numeric keys to be listed first and keys to be shown only when needed: {'a', 'b'} instead of {[1] = 'a', [2] = 'b'}, {1, nil, 3} instead of {1, [3]=3}, and {foo = 'foo'} instead of {['foo'] = 'foo'}.
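To make the key rules above concrete, here is a minimal sketch of one way to produce that kind of output. It is not Serpent's implementation: readable and literal are hypothetical helpers, and the sketch assumes a flat table whose keys and values are only strings or finite numbers.

```lua
-- Hypothetical sketch, not Serpent's code. Assumes a flat table whose keys
-- and values are strings or finite numbers.
local keywords = {}
for word in ("and break do else elseif end false for function goto if in " ..
             "local nil not or repeat return then true until while"):gmatch("%a+") do
  keywords[word] = true
end

-- Turn a string or number into Lua source text.
local function literal(v)
  return type(v) == "string" and ("%q"):format(v) or tostring(v)
end

local function readable(t)
  local out, n = {}, #t
  -- Array part: positional values, with explicit nils for any holes below #t.
  for i = 1, n do
    out[#out + 1] = t[i] == nil and "nil" or literal(t[i])
  end
  -- Remaining keys, sorted so that the output is deterministic.
  local keys = {}
  for k in pairs(t) do
    local positional = type(k) == "number" and k >= 1 and k <= n and k % 1 == 0
    if not positional then keys[#keys + 1] = k end
  end
  table.sort(keys, function(a, b) return tostring(a) < tostring(b) end)
  for _, k in ipairs(keys) do
    -- Identifier-like keys that are not keywords can drop the brackets.
    local plain = type(k) == "string" and k:match("^[%a_][%w_]*$") and not keywords[k]
    out[#out + 1] = (plain and k or "[" .. literal(k) .. "]") .. " = " .. literal(t[k])
  end
  return "{" .. table.concat(out, ", ") .. "}"
end

print(readable({'a', 'b'}))                   -- {"a", "b"}
print(readable({foo = 'foo', ['true'] = 1}))  -- {foo = "foo", ["true"] = 1}
```

Note that a keyword such as 'true' keeps its brackets, since {true = 1} would not be a valid fragment.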

Serpent does not handle upvalues and metatables; there is no special support for weak references or transient objects (but io.stdin and other userdata objects in the io.* table are serialized by name). Functions are serialized when possible; global functions are replaced with their names.
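Replacing global functions with their names needs some way to go from a function value back to a name. A minimal sketch of that idea follows; it is an illustration under my own assumptions rather than Serpent's code, and names and function_name are hypothetical. It only looks at the global table and a fixed list of standard libraries.

```lua
-- Hypothetical sketch, not Serpent's code: map known functions to their names.
local names = {}
for _, lib in ipairs({"coroutine", "debug", "io", "math", "string", "table", "os"}) do
  for k, v in pairs(_G[lib] or {}) do
    if type(v) == "function" then names[v] = lib .. "." .. k end
  end
end
for k, v in pairs(_G) do
  if type(v) == "function" then names[v] = k end
end

-- Returns the global name of f, or nil when f is not reachable by such a name.
local function function_name(f)
  return names[f]
end

print(function_name(print))          -- print
print(function_name(string.format))  -- string.format
print(function_name(function() end)) -- nil
```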

This is the (convoluted) example I set up that handles various cases I care about:

local b = {text="ha'ns", ['co\nl or']='bl"ue', str="\"\n'\\\000"}
local c = function() return 1 end
local a = {
  x=1, [true] = {b}, [not true]=2, -- boolean as key
  ['true'] = 'some value', -- keyword as a key
  z = c, -- function as value
  list={'a',nil,nil, -- embedded nils
        [9]='i','f',[5]='g',[7]={}, ['3'] = 33}, -- empty table
  [c] = print, -- function as key, global as value
  [io.stdin] = 3, -- global userdata as key
  ['label 2'] = b, -- shared reference
  [b] = 0/0, -- table as key, undefined value as value
  [math.huge] = -math.huge, -- huge as number value
}
a.c = a -- self-reference
a[a] = a -- self-reference with table as key
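
Shared and self-references like the ones in this example can be found by walking the structure once and counting how often each table is reached. The following is a minimal sketch of that idea, not Serpent's implementation; count_references is a hypothetical helper and it ignores metatables.

```lua
-- Hypothetical sketch, not Serpent's code: count how many times each table is
-- reached, as a key or as a value, while walking a structure.
local function count_references(root)
  local seen = {}
  local function visit(v)
    if type(v) ~= "table" then return end
    seen[v] = (seen[v] or 0) + 1
    if seen[v] > 1 then return end  -- already walked; this also stops cycles
    for k, val in pairs(v) do
      visit(k)
      visit(val)
    end
  end
  visit(root)
  return seen
end

local shared = {}
local t = {first = shared, second = shared, unique = {}}
t.self = t
local seen = count_references(t)
print(seen[shared], seen[t], seen[t.unique])  -- prints 2, 2 and 1
```

Any table with a count above one is either shared or part of a cycle, so it cannot be written inline more than once without losing identity.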

The library provides three functions -- dump, line, and block -- with the last two being shortcuts for the main dump function.

Pretty multi-line printing

This is useful for debugging if you want to see a table with subtables printed one element per line with indentation.

local serpent = require("serpent")
print(serpent.block(a))

will print

{
  [1/0 --[[math.huge]]] = -1/0 --[[-math.huge]],
  c = nil --[[ref]],
  ["label 2"] = {
    ["co\nl or"] = "bl\"ue",
    str = "\"\n'\\\000",
    text = "ha'ns"
  } --[[table: 001752B0]],
  list = {
    "a",
    nil,
    nil,
    "f",
    "g",
    nil,
    {} --[[table: 00175350]],
    [9] = "i",
    ["3"] = 33
  } --[[table: 00175328]],
  ["true"] = "some value",
  x = 1,
  z = loadstring("LuaQ...",'@serialized') --[[function: 00171020]],
  [false] = 2,
  [io.stdin --[[file (767D0958)]]] = 3,
  [true] = {
    nil --[[ref]]
  } --[[table: 00175300]]
} --[[table: 001752D8]]

Pretty single-line printing

This is a more compact representation with the serialized string being one line (note that this is not a complete representation, as pretty-printed strings don't include the self-reference section that restores shared and circular references).

local serpent = require("serpent")
print(serpent.line(a))

will print (new lines added for readability)

{[1/0 --[[math.huge]]] = -1/0 --[[-math.huge]], c = nil --[[ref]], ["label 2"] = 
{["co\nl or"] = "bl\"ue", str = "\"\n'\\\000", text = "ha'ns"} --[[table: 001752B0]], 
list = {"a", nil, nil, "f", "g", nil, {} --[[table: 00175350]], [9] = "i", ["3"] = 33} 
--[[table: 00175328]], ["true"] = "some value", x = 1, z = loadstring("LuaQ...",
'@serialized') --[[function: 00171020]], [false] = 2, [io.stdin --[[file (767D0958)]]] = 3,
[true] = {nil --[[ref]]} --[[table: 00175300]]} --[[table: 001752D8]]

This is in fact guaranteed to be one line, unlike the output of other libraries that use ("%q"):format(...) to safely serialize strings, which may produce multi-line output if your strings include '\n' (this has been addressed in Lua 5.2).
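
One way to keep quoted strings on a single line is to post-process the result of %q. This is a minimal sketch of that approach under my own assumptions, not necessarily what Serpent does internally; quote_one_line is a hypothetical helper.

```lua
-- Hypothetical helper: quote a string so that the result never contains a
-- literal newline character.
local function quote_one_line(s)
  -- %q writes a newline as a backslash followed by a real newline;
  -- replacing that newline character with "n" turns the pair into \n.
  local quoted = ("%q"):format(s):gsub("\n", "n")
  return quoted
end

local q = quote_one_line("two\nlines")
print(q)  -- "two\nlines"
```

The quoted string is still a valid Lua literal, so loading "return " .. q gives back the original two-line string.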

Full serialization

This option provides serialization that includes assignments to restore shared and circular references.

local serpent = require("serpent")
print(serpent.dump(a, {name = 'a'}))

will print (new lines added for readability)

do local a={[false]=2,[true]={[1]={str="\"\n'\\\000",text="ha'ns",["co\nl or"]
="bl\"ue"}},c=nil --[[ref]],["label 2"]=nil --[[ref]],[1/0 --[[math.huge]]]=-1/0
--[[-math.huge]],x=1,z=loadstring("LuaQ...",'@serialized'),[io.stdin]=3,list=
{[1]="a",[4]="f",[5]="g",[7]={},["3"]=33,[9]="i"},["true"]="some value"};
a.c=a;a[a]=a;a["label 2"]=a[true][1];a[a[true][1]]=0/0;a[a.z]=print;return a;end

Note that this representation includes additional code that sets all the shared and circular references and is also used to set those keys that have complex values:

a.c=a;
a[a]=a;
a["label 2"]=a[true][1];
a[a[true][1]]=0/0;
a[a.z]=print;
return a;

As you can see, this does look similar to the original code that created this structure. To restore the original table, you can use the following code:

local str = serpent.dump(a)
local fun, err = loadstring(str)
if err then error(err) end
local _a = fun() --<-- this is the restored copy of the original table
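
The snippet above relies on loadstring, which is available in Lua 5.1 and LuaJIT; from Lua 5.2 on, load accepts strings directly. Here is a minimal sketch of a version-tolerant round trip. The fragment is hand-written in the same shape as the dump output rather than produced by Serpent, so that the example runs on its own.

```lua
-- Use loadstring where it exists (Lua 5.1, LuaJIT) and load otherwise.
local loadstring = loadstring or load

-- A hand-written fragment in the same shape as the dump output above.
local str = "do local a={x=1};a.c=a;return a;end"

local fun, err = loadstring(str)
if not fun then error(err) end
local copy = fun()
print(copy.x, copy.c == copy)  -- prints 1 and true
```

Keep in mind that loading a serialized string executes it as code, so this should only be done with strings from a trusted source.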

It is called "Serpent" because it handles self-references and reminds me of a serpent eating its own tail. Serpent is available here, along with tests and a simple benchmark.

[Updated 06/13/2012] to match the new interface.

[Updated 12/26/2012] to fix method names and clarify handling of boolean values in pretty.lua.

3 Comments

I think pretty.lua from Penlight does handle booleans.

@Vadim, thanks for the comment; my statement was poorly worded as I meant "doesn't handle boolean values [and] tables as keys". Fixed.

What library supports what could have been written as a table: more compact and informative.

And serializable ;)

About

I am Paul Kulchenko.
I live in Kirkland, WA with my wife and three kids.
I do consulting as a software developer.
I study robotics and artificial intelligence.
I write books and open-source software.
I teach introductory computer science.
I develop a slick Lua IDE and debugger.
