This is part 3 of a series that began with Functional Web Programming, Part 0 — Introduction.
What is immutable data?
Let’s start with something we’re all really familiar with: an assignment statement.
In PHP, we’d assign a value to a variable like this:
$x = 9;
Here are some other things that we’re doing every day in our code:
// Reassign an existing variable
$x = 9;
$x = 2;

// Build up the value of a variable iteratively
$message = "Hey girl, ";
$message .= "you ever throw down some Clojure? ";
$message .= "Or maybe Haskell?";

// Set the property value of an existing object
$user->name = "Taylor Blue";
Again, all super-common everyday programming stuff in the College.
It’s also the biggest thing that’s making your programs a hassle to develop and an even bigger hassle to debug and maintain. The introduction of state into computer programs is the root of many (and in my career, most) bugs.
How does state jam us up?
Here are some of the ways maintaining and changing the state of our program causes us problems:
- A variable or property was never assigned a value
- The value of a variable or property has been changed out from under us
- The value of a variable or property was correct but later set to an incorrect value
- The service responsible for providing the value of a variable or property at runtime isn’t responding
- The value of a variable or property is fine, but other parts of the program downstream can break in non-obvious ways if we change the value
- The value of a variable or property is fine, but it’s not obvious where that value came from in the first place and what else has touched it along the way
The list above pretty much covers the root causes of most of the bugs I encounter (and cause). They are all made possible by introducing statefulness into our programs.
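To make "changed out from under us" concrete, here’s a minimal sketch. The Cart class and applyDiscount() function are made up for illustration and don’t come from any real codebase; amounts are in cents so the arithmetic stays in integers.

```php
<?php
// Hypothetical example: Cart and applyDiscount() are invented for illustration.
final class Cart
{
    public int $totalCents = 10000;
}

// Looks like it only computes a price, but it also overwrites
// the total on the cart it was handed.
function applyDiscount(Cart $cart, int $percent): int
{
    $cart->totalCents = intdiv($cart->totalCents * (100 - $percent), 100);
    return $cart->totalCents;
}

$cart = new Cart();
$preview = applyDiscount($cart, 10); // we only wanted to preview the price...
$charged = $cart->totalCents;        // ...but the cart itself now holds 9000
```

Nothing at the call site hints that $cart was modified, which is exactly the kind of bug that’s tedious to track down later.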
In functional programming languages, data is immutable. You can give values names, but you can’t change them. In functional programming, a value and its identity are not separate ideas, whereas in imperative programming they are independent concepts (more on that below).
If we bind $age to the value 29, $age will always be 29, and we can’t change the value of $age just as we can’t change what the number 29 means. In functional programs, $age isn’t just a reference to 29; for all purposes, $age is 29.
Think of it this way: in procedural and object-oriented programming, variables contain a value. In functional programming, variables are the value.
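PHP’s standard library happens to ship both flavors of the same idea, which makes for a handy side-by-side. This is only a sketch: the variable names are mine, and it assumes PHP 5.5 or later, where DateTimeImmutable is available.

```php
<?php
// Mutable: the variable "contains" an object whose value can change.
$mutable = new DateTime('2020-01-01');
$alias = $mutable;        // same object, second name
$alias->modify('+1 day'); // changes the object both names point at

// Immutable: modify() returns a new value and leaves the original alone.
$original = new DateTimeImmutable('2020-01-01');
$nextDay = $original->modify('+1 day');

echo $mutable->format('Y-m-d'), "\n";  // 2020-01-02
echo $original->format('Y-m-d'), "\n"; // 2020-01-01
echo $nextDay->format('Y-m-d'), "\n";  // 2020-01-02
```

With the mutable version, whoever holds $mutable has to wonder who else holds it. With the immutable version, $original means the same date for as long as it exists.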
Benefits of immutable data
Here are the benefits of using immutable data in your programs:
Immutable data makes concurrency easier
We don’t write multithreaded code in my office, but one of the biggest benefits of immutable data is how it enables concurrent programming. I’ll skip this point since it’s not relevant to most of you, but I encourage you to scope out the article on concurrent programming in the Clojure docs.
Rest assured, immutable data structures are pretty handy when it comes to writing multithreaded programs.
Immutable data prevents bugs caused by bad state
Like the mathematical functions which are functional programming’s namesake, the functions in a functional program are pure and total.
By definition, they can’t take invalid input, can only create a new representation of (but never change) their inputs, and can’t affect the world outside themselves. They also can’t reach out into the world halfway through execution to get some needed value — everything has to be provided up front to a function (and to a functional program, which is just a bunch of smaller functions threaded together).
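Here’s a rough sketch of the difference in PHP. Both function names are hypothetical. The first one reaches out to the system clock halfway through; the second gets everything it needs up front.

```php
<?php
// Impure: the result depends on when the function happens to run.
function greetingNow(string $name): string
{
    $hour = (int) date('G'); // reaches out into the world for a value
    return ($hour < 12 ? 'Good morning, ' : 'Good afternoon, ') . $name;
}

// Pure: same inputs, same output, every time.
function greeting(string $name, int $hour): string
{
    return ($hour < 12 ? 'Good morning, ' : 'Good afternoon, ') . $name;
}

echo greeting('Taylor', 9), "\n";  // Good morning, Taylor
echo greeting('Taylor', 15), "\n"; // Good afternoon, Taylor
```

Note that PHP itself won’t stop greeting() from being handed a nonsense hour like 99, so this version is pure but not total. In PHP, totality is something we have to approximate with type declarations and validation at the edges.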
In typical procedural or object-oriented code, order of execution is important at the function level as we have to track and maintain constantly-changing state to ensure that the data we’re working with stays in a state that won’t cause incorrect results or runtime errors.
In functional programs, functions get all the valid values they need up front, and references to values can’t change. Since functional programs are just referentially transparent pipelines of functions that are themselves referentially transparent, execution order doesn’t matter at the function level. In fact, a big draw of functional programming is that we can reason about our programs mathematically, and in some cases prove that they do exactly what we want them to.
When we use immutable data (and process it with referentially transparent functions), most of the big list of state-related problems above dries up.
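Referential transparency just means a function call can be swapped for its result without changing the program. A small sketch, with made-up function names and an arbitrary 8% tax rate chosen only for the example:

```php
<?php
// Hypothetical pure functions; amounts are integer cents.
function subtotal(array $prices): int
{
    return array_sum($prices);
}

function tax(int $amount): int
{
    return intdiv($amount * 8, 100); // 8% is an assumption for this example
}

$prices = [1000, 2500];

// Step by step, in one order...
$sub = subtotal($prices);
$first = $sub + tax($sub);

// ...or nested, with the calls written in a different order.
$second = tax(subtotal($prices)) + subtotal($prices);

// ...or with each call replaced by the value it returns.
$third = 280 + 3500;
```

All three come out to 3780. Try the same substitution with a function that bumps a counter or reads a global and the results start depending on who ran first.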
Immutable data discourages side effects
Most side effects involve changing state — (over)writing a new value to a variable, object property, persistent data storage, or cache. Because data is immutable in functional programming, we literally can’t do most side effects, or at least not without jumping through some hoops to make sure we don’t jack up our app.
Immutable data is easier to recall or cache
Because functional programs can only create new data from existing data, the “old” data is still around and easily accessible. Likewise, because data can’t be changed by the program, the interpreter or compiler (or the programmer) is free to cache that data and call it up later on the cheap.
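As a sketch of the caching half of that claim, here’s a tiny memoize() helper. It’s hypothetical, not a built-in PHP function, and it keys the cache on serialize($args), which is fine for simple scalar arguments but not a general-purpose choice.

```php
<?php
// Hypothetical helper: wraps a pure function and remembers its results.
function memoize(callable $fn): callable
{
    $cache = [];
    return function (...$args) use ($fn, &$cache) {
        $key = serialize($args);
        if (!array_key_exists($key, $cache)) {
            $cache[$key] = $fn(...$args);
        }
        return $cache[$key];
    };
}

$calls = 0;
$slowSquare = function (int $n) use (&$calls): int {
    $calls++; // impure on purpose: counts real invocations for this demo only
    return $n * $n;
};

$fastSquare = memoize($slowSquare);
$first = $fastSquare(12);  // computed
$second = $fastSquare(12); // served from the cache
```

After both calls, $first and $second are 144 and $calls is 1. This is only safe because squaring a number always gives the same answer; memoizing a function that depends on mutable state would happily hand back stale results.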
Contrast this with a lot of imperative programming (and databases that aren’t Datomic), where values are mutated in place, wiping out previous values and, along with them, any guarantees we had about which variable held which value.
Imperative programming vs. functional programming
As I mentioned a couple times today, working with immutable data throws a spotlight on the differences between the structure and execution of imperative programs (which are 99% of the programs we write in the College) and functional programs.
MSDN has a short but informative article on the differences between imperative and functional programming. In it, the author writes that:
“To solve problems, [object-oriented] developers design class hierarchies, focus on proper encapsulation, and think in terms of class contracts. The behavior and state of object types are paramount, and language features, such as classes, interfaces, inheritance, and polymorphism, are provided to address these concerns.
In contrast, functional programming approaches computational problems as an exercise in the evaluation of pure functional transformations of data collections. Functional programming avoids state and mutable data, and instead emphasizes the application of functions.”
Put another way, object-oriented programs tackle problems by modeling the state of the world and tweaking that state in a very specific order until we get it to a state we like. Functional programs, on the other hand, don’t mess around with keeping track of state, and instead lay down a one-way, self-contained conveyor belt that takes some value in and spits some new value out.
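Here’s the same small job done both ways. The $orders data is invented for the example, and the arrow functions assume PHP 7.4 or later.

```php
<?php
// Hypothetical data: order totals in cents.
$orders = [
    ['total' => 1200, 'paid' => true],
    ['total' => 800,  'paid' => false],
    ['total' => 4300, 'paid' => true],
];

// Imperative: tweak a piece of state until it holds the answer.
$revenue = 0;
foreach ($orders as $order) {
    if ($order['paid']) {
        $revenue += $order['total'];
    }
}

// Functional: a conveyor belt of transformations, no reassignment.
$paid = array_filter($orders, fn (array $o): bool => $o['paid']);
$totals = array_map(fn (array $o): int => $o['total'], $paid);
$pipelineRevenue = array_sum($totals);
```

Both arrive at 5500. In the second version every intermediate value has a name and never changes, so any step can be inspected or reused without worrying about what the loop has done to it since.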
As I’ve discussed in other parts of this series, it’s possible to write “good enough” programs in an imperative style, but the reliance on state definitely makes things harder than they need to be.
const keyword, which is helpful but explicitly does not make data immutable).
Still, there are some things you can do in your object-oriented code to take advantage of immutable data’s benefits:
- Don’t reassign variables. Once you’ve assigned a value to a variable, don’t change it.
- Make object properties read-only. The value of an object’s properties should only be set once — when the object is hydrated.
- Write pure functions. Try not to write functions that change the state of the program or cause side effects. Get nervous about functions that don’t return a value.
- Avoid action at a distance. Related to the above: don’t write functions that modify the state of some distant part of the program. Try not to use static methods, and never ever use
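As one possible way to follow the first three tips in PHP, here’s a sketch using readonly properties. It assumes PHP 8.1 or later; the User class, the withName() method, and the email address are all made up for illustration.

```php
<?php
// Hypothetical value object: properties are set once, in the constructor.
final class User
{
    public function __construct(
        public readonly string $name,
        public readonly string $email,
    ) {}

    // Instead of changing this object, return a new one with the change applied.
    public function withName(string $name): self
    {
        return new self($name, $this->email);
    }
}

$user = new User('Taylor Blue', 'taylor@example.com');
$renamed = $user->withName('Taylor Green');

// Writing to a readonly property throws an Error at runtime.
$blocked = false;
try {
    $user->name = 'Someone Else';
} catch (Error $e) {
    $blocked = true;
}
```

$user still says Taylor Blue, $renamed says Taylor Green, and the attempted overwrite is rejected. It’s more typing than a plain assignment, but nobody downstream can change the object out from under you.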
Functional programming differs from procedural and object-oriented (imperative) programming in that data is immutable. We can give names to values, but we can’t change those values.
Immutable data helps us avoid the sticky problems that come with writing stateful programs, and plays nicely with functions that are pure and total. Immutable data also makes writing multithreaded programs a lot less of a headache.