The expression problem is a well known problem in programming language theory dealing with the expressivity of a programming language. In this post I’ll examine its impact on functional and object-oriented code, then talk about solving the problem in a language with tables as a programmable entity.
Say you have some data, like Animals. And some actions that they can perform. We want to implement both actions for both animals.
cat | fox | |
---|---|---|
sound | {…} | {…} |
eat | {…} | {…} |
The expression problem asks what facilities the language provides to deal with this table. Here’s Philip Wadler’s original statement of the problem.
The Expression Problem is a new name for an old problem. The goal is to define a datatype by cases, where one can add new cases to the datatype and new functions over the datatype, without recompiling existing code, and while retaining static type safety (e.g., no casts).
Let’s take a look at two how the object-oriented and functional styles cope with the expression problem.
Object-oriented languages primarily look at the data (“objects”). We say that a cat can eat or a fox can make a sound.
class Cat {
def sound {...}
def eat {...}
}
class Fox {
def sound {...}
def eat {...}
}
We’re implicitly filling in columns of a table.
class Cat { | class Fox { | |
---|---|---|
} | } | |
sound |
|
|
eat |
|
|
It’s relatively easy to add a new datatype (ie a column). It’s all contained in a new class declaration.
class Bat {
def sound {...}
def eat {...}
}
class Cat { | class Fox { | class Bat { | |
---|---|---|---|
} | } | } | |
sound |
|
|
|
eat |
|
|
|
To add a new action (ie a row) is more difficult. We need to open up all the objects (columns) to modify them.
class Cat { | class Fox { | |
---|---|---|
} | } | |
sound |
|
|
eat |
|
|
attack |
|
|
Not a big deal when there are only two classes, but the point is we need to find and modify each class implementing this interface. The implementations are spread across space, likely in different files. The difficulty is not in the amount of code being written, but how many distinct places one must edit.
Perhaps more importantly, since we spend more time reading than writing code, how many places must you look to understand a concept. If I want to to understand what it means to be a Cat
, I look at the Cat
class. But there’s no single place where I can understand what it means to eat
.
Functional languages start from functions. sound applies to cats and foxes.
data Animal = Cat | Fox | Bat
sound :: Animal -> Sound
sound Cat = {...}
sound Fox = {...}
sound Bat = {...}
eat :: Animal -> Action
eat = {...}
Cat | Fox | |
---|---|---|
sound | Cat -> {...} | Fox -> {...} |
eat | Cat -> {...} | Fox -> {...} |
It’s relatively easy to add a new function.
attack :: Animal -> Action
attack Cat = {...}
attack Fox = {...}
attack Bat = {...}
Cat | Fox | |
---|---|---|
sound | Cat -> {} | Fox -> {} |
eat | Cat -> {} | Fox -> {} |
attack | Cat -> {} | Fox -> {} |
But it’s difficult to add a new data constructor.
data Animal = Cat | Fox | Bat | Dog
sound :: Animal -> Sound
sound Cat = {...}
sound Fox = {...}
sound Bat = {...}
sound Dog = {...}
eat = {...}
Cat | Fox | Bat | |
---|---|---|---|
sound | Cat -> {...} | Fox -> {...} | Bat -> {...} |
eat | Cat -> {...} | Fox -> {...} | Bat -> {...} |
We need to open up the implementation of every function dealing with the Animal
datatype. Which, again, may be spread across different files.
Both approaches work well only half the time when editing objects and actions they perform.
In both the object-oriented and functional code, we’re implicitly filling the cells in a table. An object-oriented approach tends to focus on columns, while a fuctional approach tends to focus on rows.
Imagine using tables as a first-class programming interface, where you can scan the rows and columns or add new ones. You become empowered to see the table from the point of view of either the functional or object-oriented style, depending on what you’re trying to understand.
Cat | Fox | Bat | |
---|---|---|---|
sound | Cat -> {...} | Fox -> {...} | Bat -> {...} |
eat | Cat -> {...} | Fox -> {...} | Bat -> {...} |
attack | Cat -> {...} | Fox -> {...} | Bat -> {...} |
Aside: It's very important to understand that I'm not proposing to program this way in text files. It would be far too much work to maintain table layout manually. Rather, this type of editing can only be done in a richer environment.
Additionally, by treating the table as an object in our programming environment, we gain the power to query:
> animals which eat
Cat | Fox | Bat | |
---|---|---|---|
sound | Cat -> {...} | Fox -> {...} | Bat -> {...} |
eat | Cat -> {...} | Fox -> {...} | Bat -> {...} |
attack | Cat -> {...} | Fox -> {...} | Bat -> {...} |
> unimplemented cases
Cat | Fox | Bat | |
---|---|---|---|
sound | Cat -> {...} | Fox -> {...} | Bat -> {...} |
eat | Cat -> {...} | Fox -> {...} | Bat -> {...} |
attack | Cat -> {...} | Fox -> {...} | Bat -> {...} |
To make this more interesting, say our environment also provides queryable metadata. It could conceivable provide info on:
Then you could query interesting things, like:
> functions written by me,
with no tests,
that have thrown an exception in production
Cat | Fox | Bat | |
---|---|---|---|
sound | Cat -> {...} | Fox -> {...} | Bat -> {...} |
eat | Cat -> {...} | Fox -> {...} | Bat -> {...} |
attack | Cat -> {...} | Fox -> {...} | Bat -> {...} |
> implementations calling a deprecated function
Cat | Fox | Bat | |
---|---|---|---|
sound | Cat -> {...} | Fox -> {...} | Bat -> {...} |
eat | Cat -> {...} | Fox -> {...} | Bat -> {...} |
attack | Cat -> {...} | Fox -> {...} | Bat -> {...} |
> implementations marked TODO
Cat | Fox | Bat | |
---|---|---|---|
sound | Cat -> {...} | Fox -> {...} | Bat -> {...} |
eat | Cat -> {...} | Fox -> {...} | Bat -> {...} |
attack | Cat -> {...} | Fox -> {...} | Bat -> {...} |
Inheritance and type classes are both means of coping with the combinatorial explosion of table cells (as n
datatypes and m
operations on each requires n*m
function implementations) by filling in cells automatically.
By making tables programmable - think spreadsheets - we can implement rules to simulate inheritance and type classes.
I’ve been thinking about what it might look like to simulate inheritance or type classes in this framework. Here’s a sketch of the direction I’m going.
We have a few different tables - animal
, machine
, inheritance
, and the animal
/ machine
composite table.
First a couple tables implementing the operations for the parent class.
animal | |
---|---|
sound | {...} |
eat | {...} |
machine | |
---|---|
power up | {...} |
manufacture | {...} |
Now a table specifying the inheritance pattern.
animal | machine | … | |
---|---|---|---|
cat | x | ||
car | x | ||
… |
Those three tables are the whole implementation. Let’s look at a derived table of all animals and machines together.
animal
/ machine
cat | car | |
---|---|---|
sound | (animal) | |
eat | (animal) | |
power up | (machine) | |
manufacture | (machine) |
Some benefits:
There’s a related problem with testing. Given two orthogonal axes - code and tests - we need to decide how to lay them out. Most code I’ve worked with has had code in one file and tests in another:
// code
function f() {...}
function g() {...}
// tests
function test_f() {...}
function test_g() {...}
Although I have seen code that does this all inline
// f
function f() {...}
function test_f() {...}
// g
function g() {...}
function test_g() {...}
There are good arguments to be made for both styles. It’s more important to note that they’re both valid. Sometimes I want to see f
{.javascript} and g
{.javascript} at the same time. Sometimes I want to see f
{.javascript} and f_test
{.javascript} at the same time.
Let’s solve this in the same way as the expression problem
f | g | |
---|---|---|
implementation | f | g |
test | f_test | g_test |
Viewing a column I can see code and test together. Viewing a row I can see all code or all tests.
The expression problem is really a layout problem. It’s impossible to write a one-dimensional (plaintext) file such that data and actions are both immediate. Inheritance and typeclasses are useful, but they’re really rules for building an implicit table.
My proposed solution is to change the constraints of the problem. By considering a richer representation of our program we gain the power to view it from multiple directions.