Maybe I wanna do my lambda things with Lua instead of <previous thing>. This is kind of devbloggy I guess. Writing it down as I go.
I want the thing to be able to:
I've changed my mind a few times about how to do this. One large post vs. multiple smaller ones. I like small posts, but. I want a post to be self contained in the sense of having all the necessary code in the post, so eh. It's kind of split into parts at least:
go to
StatementsA thing that comes up a fair amount is that in some function there's a lambda calculus expression passed in as an argument and we wanna do like case analysis and do different stuff depending on which it is: Variable reference, function abstraction or function application. I have in the past often done that by pattern matching or something, something kind of structured with named cases. In the Lua code below it's mostly just if-elseing and checking stuff like:
<exp>.param
then that means <exp> (has a parameter so) it must be a lambda<exp>.fun
then it must be a function application<exp>.var
then that means it's a variable referenceAlso stuff like <exp>.var == <name>
combines checking if something is a variable reference with checking if it refers to the <name> I am interested in. And like sometimes I do or-and-Luaisms instead of if-else. I dunno. So like at other times I might do stuff that feels maybe more structured and disciplined than this, but I think doing it this way is kind of fun too.
And also there's some helper/util stuff that I've done in the past but don't do here. Things like mapping over the result of parsing a string, or having a general "look for a sub-expression"-function that takes a predicate function as an argument. Some of it probably has something to do with Lua and how I do things there compared to in other languages, sometimes something to do with throwing more stateful and imperative stuff in there I think. Or I just haven't bothered: Lambda calculus isn't an enormous language with hundreds of syntax-cases to handle and there aren't that many different things I'm doing with it. Blah blah.
Hi hello we'll reserve these characters for stuff:
reserved = "λ\\≜:=.()"
Whitespace characters and reserved ones can't be used in identifiers. An attempt to parse an identifier will start at the start of the string (^
), skip any whitespace characters (%s
), and then look for at least one non-whitespace-or-reserved character. Successfully parsing an identifier will return the identifier name and a remainder string for the stuff that comes after it:
id = "%s*([^%s" .. reserved .. "]+)"
function identifier(s)
return s:match("^" .. id .. "(.*)")
end
Helper thing for testing:
function p(a, ...)
if a == nil then
print(nil)
return
end
local res = { '"' .. tostring(a) .. '"' }
for _, v in ipairs({...}) do
table.insert(res, '"' .. tostring(v) .. '"')
end
print(table.unpack(res))
end
Testing:
p(identifier([[foo]]))
p(identifier([[ foo]]))
p(identifier([[ foo (blah)]]))
p(identifier([[ foo(blah)]]))
p(identifier([[(foo)]]))
Seems okay!
We'll use "λ" for lambda but we will allow using "\" instead, for input convenience. Uuuuuh Lua patterns are kind of not exactly regex uuuh. We'll just use two patterns. Successfully parsing a lambda will return its parameter name and a remainder string that will contain the body expression of the lambda. (So "successfully" here just means that the "lambda, parameter, dot" part is fine, not that the full expression is okay and well formed and stuff.)
lamlam = "^%s*λ" .. id .. "%s*[.](.*)"
lamslash = "^%s*%\\" .. id .. "%s*[.](.*)"
function lambda(s)
local res, rest = s:match(lamlam)
if res then return res, rest end
return s:match(lamslash)
end
Testim:
p(lambda([[λx.x x]]))
p(lambda([[\x.x x]]))
p(lambda([[ λ xy . x x ]]))
p(lambda([[λ x y . x x ]]))
p(lambda([[x.x x]]))
p(lambda([[foo]]))
Okay let's see. Lua patterns.
%bxy, where x and y are two distinct characters; such item matches strings that start with x, end with y, and where the x and y are balanced. This means that, if one reads the string from left to right, counting +1 for an x and -1 for a y, the ending y is the first y where the count reaches 0. For instance, the item %b() matches expressions with balanced parentheses.
There's a chance we could use that for parentheses(!). Successfully matching returns a string with the stuff that's within the parentheses. And as per usual it also returns a remainder string for the stuff that comes after.
function parens(s)
local res, rest = s:match("^%s*(%b())(.*)")
if not res then return nil end
return res:sub(2, #res - 1), rest
end
p(parens([[(foo bar) baz buzz]]))
p(parens([[ (foo bar) baz buzz]]))
p(parens([[ foo (bar) baz buzz]]))
p(parens([[)foo bar( baz buzz]]))
We clearly also wanna parse nothing:
function nothing(s)
return s:match("^%s*$")
end
Test:
p(nothing(""))
p(nothing(" "))
p(nothing("foo"))
p(nothing(" foo"))
So those should kind of be the things that can be combined to make the things. An expression is one of the following:
<identifier>
(variable reference)λ<identifier>.<exp>
(function abstraction)<exp> <exp>
(function application)function var(id)
return { var = id }
end
function lam(param, body)
return { param = param, body = body }
end
function app(fun, arg)
return { fun = fun, arg = arg }
end
And we have kind of dealt with variables references and function abstractions. And the parenthesis stuff has something to do with function application. Um.
Okay then so one way to deal with stuff: Parse stuff so we end up with a list of things, and:
So there's some "one thing" vs. "a series of things" stuff. Let's see how it goes:
function parseone(s)
local res, rest = identifier(s)
if res then return var(res), rest end
res, rest = lambda(s)
if res then
local body = parse(rest)
if body.error then return body, ""
else return lam(res, body), ""
end
end
res, rest = parens(s)
if res then return parse(res), rest end
return { error = 'no good: "' .. s .. '"' }
end
function parselist(s)
local res = {}
while not nothing(s) do
local parsed
parsed, s = parseone(s)
if parsed.error then return parsed end
table.insert(res, parsed)
end
return res
end
function parse(s)
local list = parselist(s)
if list.error then return list end
if #list == 0 then return { error = "found: nothing. expected: something." } end
local res = list[1]
for i = 2, #list do
res = app(res, list[i])
end
return res
end
parseone
parses one thing but not the entire thing. It returns a fully parsed thing along with a remainder string for the stuff after.parselist
parses an entire thing, but it returns a list and not one syntax tree thing.parse
parses an entire thing.For testing:
function pl(t, indent)
indent = indent or 0
local spaces = (" "):rep(indent)
if t.error then
print(spaces .. "error: " .. t.error)
elseif t.var then
print(spaces .. "var: " .. t.var)
elseif t.param then
print(spaces .. "lambda: " .. t.param)
pl(t.body, indent + 1)
elseif t.fun then
print(spaces .. "apply:")
pl(t.fun, indent + 1)
print(spaces .. "to:")
pl(t.arg, indent + 1)
end
end
Testing. These should work:
pl(parse([[λx.x x]]))
pl(parse([[λa.λb.λc.a b c]]))
pl(parse([[λa.λb.λc.a (b c)]]))
pl(parse([[(λa.λb.λf.λx.a f (b f x)) (λf.λx.f (f x)) (λf.λx.f x)]]))
Probably fine. We can kind of tell that the parenthesis stuff works out okay by looking at the shape of things of the a b c
vs. the a (b c)
one.
These shouldn't work:
pl(parse([[]]))
pl(parse([[foo ()]]))
pl(parse([[λx.(]]))
pl(parse([[λx.(]]))
pl(parse([[λx x]]))
Also good. Not great. But eh, "good."
Blargh blargh blargh. We're going to want our parentheses back:
function pstring(s)
return "(" .. s .. ")"
end
Unparsing turns a lambda expression object thing into a string. Again with the three kinds of things:
<identifier>
(variable reference)λ<identifier>.<exp>
(function abstraction)<exp> <exp>
(function application)It's only the function application that splits like a tree and contains more than one proper subexpression. Unparsing would work okay by slapping parantheses around the function parts and argument parts of the applications, but some of those parentheses would be unnecessary:
unparse
will delegate to funstring
and argstring
:
function funstring(x)
local str = unparse(x)
return x.param and pstring(str) or str
end
function argstring(x)
local str = unparse(x)
return x.var and str or pstring(str)
end
-- uns the parse
function unparse(x)
return
x.var
or (x.param and "λ" .. x.param .. "." .. unparse(x.body))
or (funstring(x.fun) .. " " .. argstring(x.arg))
end
(I guess we don't need parenthesis around lambda abstractions in argument positions, but I like it better with.)
Another helpy testy function:
function pup(s)
print(unparse(parse(s)))
end
Some tests:
pup([[λx.x x]])
pup([[(λx.((x) ((x))))]])
pup([[λx.x λx.x]])
pup([[λa.λb.λc.a b c]])
pup([[λa.λb.λc.a (b c)]])
pup([[(λa.λb.λf.λx.a f (b f x)) (λf.λx.f (f x)) (λf.λx.f x)]])
That's it I think.
So if in our syntax tree we find something that looks kind of like this:
(λx.a x b) foo
Like, if there's some function application that has a lambda/function abstraction as its function part, then we can do a rewrite. It doesn't matter what the argument part of the application is. As long as the function part is a lambda we're good, and can:
So if we have (λx.a x b) foo
, we can substitute foo
for x
in a x b
, and get a foo b
. This is called "beta reduction."
The subst
function will take substitute replacement
for every reference to var
in exp
:
function subst(replacement, var, exp)
local function halp(x)
return
(x.var == var and replacement)
or (x.param and x.param ~= var and lam(x.param, halp(x.body)))
or (x.fun and app(halp(x.fun), halp(x.arg)))
or x
end
return halp(exp)
end
Variable references to var
are replaced with replacement
. We recursively handle lambda bodies if the lambda parameter does not shadow var
(cos then the outer var
can't be referred to from within that body). And we recursively handle the function and argument parts of function applications.
Testing: This kind of "application with a lambda argument" formation can appear wherever in the syntax tree, but we're not dealing with that right now and will test with expressions where a suitable function application is the top/outermost thing. Here's a helper function:
function ps(str)
local app = parse(str)
local lam = app.fun
print(unparse(subst(app.arg, lam.param, lam.body)))
end
And a test:
ps([[(λx.a x b) foo]])
Some more:
ps([[(λx.a x) (λy.y y)]])
ps([[(λx.a x (λy.b x)) foo]])
ps([[(λx.a x (λx.b x)) foo]])
Those are fine and the shadowing stuff works like it should.
This one works out more weirdly:
ps([[(λx.λy.x) y]])
The outer y
that is the function argument is a free variable here, but it is "captured" by the "lambda y" bit when it is substituted for x
. We end up with λy.y
and like that y
wasn't supposed to refer to that parameter.
We'll deal with that later.
When a variable reference appears in an expression it is either a "free" or a "bound" variable. It is bound if it appears somewhere within the body of a lambda that has a parameter with the same name, and it is bound to(?) that parameter. And if that's not the case then it's free. I'm not super certain about the terminology here and whenever I look it up in order to make sure I'm using the words correctly I get a little confused by what I find, but uh that's what those words are going to mean here.
(λx.λy.x) y
with variable references marked as bound/freeSo in (λx.λy.x) y
, the rightmost variable reference, the y
, is free. The x
within the lambdas is bound to the parameter of the outermost lambda. And our problem here is that we want to substitute the free y
for that x
but we don't want it to "become bound" by the inner lambda with the y
parameter.
Side note: We tend to look at a sub-expression and say stuff like "y
is a free variable in (λx.λy.x) y
." This might not be the full expression we're working with and y
could very well be bound to the parameter of some outer lambda if we looked at the whole thing, like if the whole thing was λy.(λx.λy.x) y
. This would be equally problematic, since doing the same rewrite within that expression would change the y
reference from refering to the outermost y
parameter to the innermost one. But like that's a nice thing, kind of: If a variable reference is free in the argument of a function application, then it's possible for it to get captured. We don't need to think too much about the surrounding context.
A function for finding free variables will be useful:
function free(exp)
local bound, res = {}, {}
function halp(x)
if x.var then
if not bound[x.var] then res[x.var] = true end
elseif x.param then
local new = not bound[x.param]
if new then bound[x.param] = true end
halp(x.body)
if new then bound[x.param] = nil end
else
halp(x.fun)
halp(x.arg)
end
end
halp(exp)
return res
end
That was stateful and awkward, I mean that was fun. Lua sets are typically tables where the the keys are the elements we're interested in and the values are all true
or something. Like <set>[<element>] = true
adds an element to a set and <set>[<element>] = nil
removes an element.
Another helper function for testing:
function pf(str)
local res = {}
for k, _ in pairs(free(parse(str))) do
table.insert(res, k)
end
print(table.concat(res, ", "))
end
And some tests:
pf([[(λx.λy.x) y]])
pf([[λy.(λx.λy.x) y]])
pf([[λy.(λx.λy.z) y]])
pf([[λy.(λx.λy.z) a]])
Is fine.
Okay okay okay, so! We have (λx.λy.x) y
, we wanna do the beta reduction thing, but the argument y
is kind of in conflict with the inner lambda y
. But if we rename the lambda y
to like a
or something then we can avoid the conflict. (λx.λy.x) y
is "alpha equivalent" to (λx.λa.x) y
. We can rename things however we'd like, as long as we don't mess up which references go with which parameters, and the expressions will kind of have the same "meaning." λx.λy.y x
is alpha equivalent to λa.λb.b a
and so on and etc.
We'll make a function for renaming the parameter of a lambda and all the references to it. Or actually we won't. subst
pretty much does that:
local exp = parse([[λy.x]])
print(unparse(lam("a", subst(var("a"), exp.param, exp.body))))
Something like that. If we have found the lambda that's causing the conflict, we can rename its parameter like that. subst
would take case of all the references to the parameter, but since this is a horrible example there aren't any in this one :)
Anyway subst
does the stuff and we mostly just have to deal with finding a new name. We should avoid any names that appear as free variables in either the argument expression or in the body of the lambda where we're doing the renaming. We kind of might as well just avoid all names that are in use in the entire function application, or maybe the entire expression we're working with. Also it's nice if the new name has something to do with the old one instead of just being something random. Like x
being renamed x2
could seem more sensible to a human than if it's being renamed to like asdzxcsar
or something.
all
finds all identifiers used in an expression. It's similar to free
but simpler:
function all(exp)
local res = {}
function halp(x)
if x.var then
res[x.var] = true
elseif x.param then
res[x.param] = true
halp(x.body)
else
halp(x.fun)
halp(x.arg)
end
end
halp(exp)
return res
end
And its test helper function is similar to free
's test helper function:
function pa(str)
local res = {}
for k, _ in pairs(all(parse(str))) do
table.insert(res, k)
end
print(table.concat(res, ", "))
end
Test:
pa([[(λx.λy.x) y]])
pa([[(λa.λb.c) d]])
pa([[λx.x]])
The unique
function will return a new name that is based on the name
argument and also is not in the used
argument set:
idsplit = "^([^%s" .. reserved .. "]-)(%d*)$"
function unique(name, used)
local start, num = name:match(idsplit)
num = num == "" and 1 or tonumber(num)
while true do
num = num + 1
local new = start .. num
if not used[new] then return new end
end
end
Test:
local used = all(parse([[x x2 y]]))
print(unique("x", used))
print(unique("x1", used))
print(unique("x2", used))
print(unique("x3", used))
print(unique("y", used))
print(unique("a2b", used))
print(unique("a2b3", used))
print(unique("a2b3e", used))
Beep boop. There's a conflict if:
The rename
function will take a reducible function application as its argument. If there's a conflict it will do one bit of renaming and return a new expression. If there isn't it will return nil.
function rename(exp)
local param = exp.fun.param
local argfree = free(exp.arg)
local function halp(x)
if x.var then return nil
elseif x.param == param then return nil
elseif argfree[x.param] and free(x.body)[param] then
local new = unique(x.param, all(exp))
return lam(new, subst(var(new), x.param, x.body))
elseif x.param then
local body = halp(x.body)
return body and lam(x.param, body)
else
local fun = halp(x.fun)
if fun then return app(fun, x.arg) end
local arg = halp(x.arg)
return arg and app(x.fun, arg)
end
end
local body = halp(exp.fun.body)
return body and app(lam(param, body), exp.arg)
end
Test halpy function:
function pr(str)
local res = rename(parse(str))
print(res and unparse(res))
end
Test:
pr([[(λx.λy.x) y]])
pr([[(λx.λy.x y) y]])
pr([[(λx.λx.x) y]])
Mlep.
Attempting a step
of evaluation will result in one of three things:
function step(exp)
if exp.var then return "normal", exp
elseif exp.param then
local res, body = step(exp.body)
return res, lam(exp.param, body)
else
if exp.fun.param then
local renamed = rename(exp)
if renamed then
return "rename", renamed
else
return "reduction", subst(exp.arg, exp.fun.param, exp.fun.body)
end
else
local res, fun = step(exp.fun)
if res ~= "normal" then
return res, app(fun, exp.arg)
else
local res, arg = step(exp.arg)
return res, app(fun, arg)
end
end
end
end
And eval
will parse a string and then and do a bunch of steps:
function eval(str, steps)
local res, exp = nil, parse(str)
if exp.error then
print(exp.error)
return
end
local i = 0
res, exp = step(exp)
print(unparse(exp))
while not steps or i < steps do
i = i + 1
res, exp = step(exp)
if res == "normal" then return end
print(unparse(exp))
end
end
Testing:
eval([[(λx.x) foo]])
eval([[λx.x]])
eval([[(λx.λy.y x) foo bar]])
eval([[(λx.λy.x) (foo bar)]])
eval([[(boop (λx.λy.x)) (foo bar)]])
eval([[(λx.λy.x y) y]])
eval([[(λx.λx.x) y]])
Good and fine. Does the extremely good error reporting work?
eval([[(λx.x) (λyλ]])
No good is good. Let's addition:
local two = [[λf.λx.f (f x)]]
local three = [[λf.λx.f (f (f x))]]
local plus = [[λa.λb.λf.λx.a f (b f x)]]
eval("(" .. plus .. ") (" .. two .. ") (" .. three .. ")")
It's five!
If we wanna make something infinite run for not forever we can say how many steps we wanna do:
eval([[(λx.x x) (λx.x x)]], 100)
Blap.
The stuff for evaluating expressions is like done, but in order to make it practical lambda calculus, it's nice to have some additional stuff for convenience. Let's support having a list of definitions:
function adddef(defs, name, exp)
if not defs[name] then table.insert(defs, name) end
defs[name] = exp
end
function removedef(defs, name)
defs[name] = nil
local found = false
for i = 1, #defs do
if defs[i] == name then found = true end
if found then defs[i] = defs[i + 1] end
end
end
For testing:
function pd(defs)
local res = {}
for _, name in ipairs(defs) do
table.insert(res, name .. " = " .. defs[name])
end
print(table.concat(res, ", "))
end
We're going to use it with parsed expressions, but we can test it with just some strings:
local defs = {}
adddef(defs, "x", "foo")
pd(defs)
adddef(defs, "y", "bar")
pd(defs)
adddef(defs, "z", "baz")
pd(defs)
adddef(defs, "y", "blep")
pd(defs)
removedef(defs, "y")
pd(defs)
adddef(defs, "y", "bar")
pd(defs)
removedef(defs, "x")
pd(defs)
removedef(defs, "x")
pd(defs)
removedef(defs, "y")
pd(defs)
removedef(defs, "z")
pd(defs)
This is probably mostly just preference stuff:
In order to let newer definitions refer to older ones we run through the list in reverse order and use subst
to replace stuff in an expression:
function replacedefs(defs, exp)
for i = #defs, 1, -1 do
exp = subst(defs[defs[i]], defs[i], exp)
end
return exp
end
Probably works! We'll find out later.
We will use ≜
or :=
to make definitions.
defdef = "^" .. id .. "%s*≜" .. "%s*(.*)"
defequal = "^" .. id .. "%s*:=" .. "%s*(.*)"
function def(s)
local res, rest = s:match(defdef)
if res then return res, rest end
return s:match(defequal)
end
Testing:
p(def([[id ≜ λx.x x]]))
p(def([[0 := λf.λx.x x]]))
p(def([[id ≜]]))
p(def([[λx.x x]]))
We'll undefine terms by not having an expression following the ≜
/:=
, so this should be all we need.
makerepl
makes REPL:
function makerepl()
local defs = {}
return {
execute = function(str)
local name, rest = def(str)
if name then
if nothing(rest) then
removedef(defs, name)
else
local exp = parse(rest)
if exp.error then print(exp.error)
else
adddef(defs, name, exp)
print(name .. " is defined :)")
end
end
else eval(str, 1000)
end
end,
replacedefs = function(str)
local exp = parse(str)
if exp.error then print(exp.error)
else print(unparse(replacedefs(defs, exp)))
end
end
}
end
Bit of testing:
local repl = makerepl()
repl.execute([[2 ≜ λf.λx.f (f x)]])
repl.execute([[+ ≜ λa.λb.λf.λx.a f (b f x)]])
repl.execute([[4 ≜ + 2 2]])
repl.execute([[+ 2 4]])
repl.replacedefs([[+ 2 4]])
Trying to execute + 2 4
doesn't do much, but replacing the defined names with their definitions works. Is how I want it to be. The REPL thing just prints stuff and doesn't e.g. return a useful value to the caller. I think that should be okay. We'll just manually test the string it printed here:
makerepl().execute([[(λa.λb.λf.λx.a f (b f x)) (λf.λx.f (f x)) ((λa.λb.λf.λx.a f (b f x)) (λf.λx.f (f x)) (λf.λx.f (f x)))]])
I think that's enough REPL. Maybe plug it into web page later.
HTML and JS for getting a textarea with lambdas in it. A lot of it is oddly "garoof dot nope Lua infrastructure" specific. I'll show the code like for completeness, but it's just gonna be a bunch of code with no explanation:
<textarea id="repl">
2 ≜ λf.λx.f (f x)
+ ≜ λa.λb.λf.λx.a f (b f x)
4 ≜ + 2 2
+ 2 4
</textarea>
<div class="toolbar">
<button id="execute" class="toolbar-button" title="Execute">Execute</button>
<button id="replacedefs" class="toolbar-button" title="Replace with definitions">Replacedefs</button>
</div>
<script>
const luastr = (str) => {
let i = 1;
while (true) {
const eqs = "=".repeat(i);
const start = `[${eqs}[`
const stop = `]${eqs}]`
if (!str.includes(stop)) {
return `${start}${str}${stop}`;
}
i++;
}
};
const ta = document.querySelector("#repl");
ta.setAttribute("style", `height: ${ta.clientHeight * 10}px;`);
let pos;
const selected = () => {
const str = ta.value;
const start = ta.selectionStart;
const end = ta.selectionEnd;
if (start !== end) {
pos = end;
return luastr(str.substring(start, end));
}
let lineStart = str.lastIndexOf("\n", start - 1) + 1;
if (lineStart < 0) { lineStart = 0; }
let lineEnd = str.indexOf("\n", start);
if (lineEnd < 0) { lineEnd = str.length; }
pos = lineEnd;
return luastr(str.substring(lineStart, lineEnd));
};
const print = (str) => {
ta.focus();
str = `\n${str}`;
ta.setRangeText(str, pos, pos, "end");
pos += str.length;
};
const handler = {
print: print,
error: print,
return: () => {}
};
const runner = `return web.run`;
const replacedefs = () => {
luarun(runner, `if not repl then repl = makerepl() end ; repl.replacedefs(${selected()})`, handler);
};
const execute = () => {
luarun(runner, `if not repl then repl = makerepl() end; repl.execute(${selected()})`, handler);
};
ta.onkeyup = (e) => {
if ((e.ctrlKey || e.altKey || e.metaKey) && e.key === "Enter") {
e.preventDefault();
execute();
} else if ((e.ctrlKey || e.altKey || e.metaKey) && e.key.toLowerCase() === "r") {
e.preventDefault();
replacedefs();
}
};
document.querySelector("#execute").onclick = execute;
document.querySelector("#replacedefs").onclick = replacedefs;
</script>
Bleh. That gets us:
Ctrl/Alt/Command(?) and Enter should execute the current line, or selection if there is any, and Ctrl/Alt/Command(?) and and R should to the replacedefs thing. I just threw some selection of modifier keys into the the check there. On my browser I can't get Ctrl+R to do the thing because it refreshes the page instead, although I think I've gotten that to work previously? Alt+Enter/Alt+R seems to work though. But if no keys work it should be possible to use the buttons instead.
So something like Alt+Enter on the three lines with the definitions, then Alt+R on the + 2 4
line, and then Alt+Enter again on the line you get from that. If that works then it works.