
Articles

July 05, 2018

Purely functional dependency injection in TypeScript


A very common pattern I see in web applications on the server side is a database connection module that exposes a bunch of functions. These functions run queries to the database and may call other functions defined in the module to abstract some of the common database operations. Let’s look at one possible solution.

function query(client: pg.Client, sql: Query, params: Param[]): Task<pg.QueryResult> {
  return client.query(sql, params)
}

function fetchUser(client: pg.Client, id: number): Task<User> {
  return query(client, "SELECT first_name, last_name from users where id = ?", [id])
    .map(result => result.rows[0])
}

This is the most obvious solution. We just pass around the client to every function. But that is also the problem with this approach. The database functions quickly become infectious: every function that uses one of those needs to also receive the pg.Client as argument.

This is a problem we want to solve with dependency injection. In object oriented programming we usually create a class and inject the DB client to it. The functions become methods in that class and have access to the client.

class Database {
  // `private readonly` makes the client available as `this.client`
  constructor(private readonly client: pg.Client) { }

  query(sql: Query, params: Param[]): Task<pg.QueryResult> {
    return this.client.query(sql, params)
  }

  fetchUser(id: number): Task<User> {
    return this.query("SELECT first_name, last_name from users where id = ?", [id])
      .map(result => result.rows[0])
  }
}

This somewhat solves the problem: we no longer need to pass the pg.Client around. However, we now need to pass around the class instance we just created. We also need to write code in a more object oriented style: we can no longer just pass functions around, because we have an object instead of plain functions. And I like functions; they are easy to work with.

How can we keep the simplicity of functions without having to pass that client around?

The essence of the database layer

The trick is to identify the essence of the database abstraction. Looking at the functions we have, what are the parts that are common to both?

function query(client: pg.Client, sql: Query, params: Param[]): Task<pg.QueryResult> {
  // ...
}

function fetchUser(client: pg.Client, id: number): Task<User> {
  // ...
}

Both of them take the raw database client as input and return a Task. Ok, so extracting the common parts out we are left with a function type:

function query<A>(client: pg.Client): Task<A>

Note the introduction of a type variable A for the return value inside the Task.

So, with the common parts extracted out into a type, what do we do with it?

What we can do is, instead of returning a Task from our fetchUser function, we return the query function itself.

function fetchUser(id: number): (client: pg.Client) => Task<User> {
  // Returning the `query` instead of calling it!
  return function query(client: pg.Client): Task<User> {
    return client.query("SELECT first_name, last_name from users where id = ?", [id])
      .map(result => result.rows[0])
  }
}

The function we return now has the same type as the query function had! In fact, let’s extract that type out so it becomes a bit clearer, and let’s call it DatabaseFn because it represents our database abstraction.

type DatabaseFn<A> = (client: pg.Client) => Task<A>

function fetchUser(id: number): DatabaseFn<User> {
  return function (client: pg.Client): Task<User> {
    return client.query("SELECT first_name, last_name from users where id = ?", [id])
      .map(result => result.rows[0])
  }
}

So now our fetchUser itself returns a function. This also means that calling the fetchUser function doesn’t require us to have the client available.

To actually run our query (really just to give us a Task) we must call the database function we returned:

fetchUser(42)(client)

What all this has now given us is a pure function which we can just import from anywhere and call without needing to provide the client value. We can rely on the fact that the client will be passed in later at the top level of the call stack when we actually want to run the query.
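To make the deferred client a bit more concrete, here is a minimal self-contained sketch. The tiny Task class, the Client interface and fakeClient are stand-ins I’m assuming purely for illustration; they are not the real data.task or pg APIs.

```typescript
// Minimal lazy Task, a stand-in for a library such as data.task (assumption).
class Task<A> {
  constructor(
    readonly fork: (reject: (e: Error) => void, resolve: (a: A) => void) => void
  ) {}

  map<B>(f: (a: A) => B): Task<B> {
    return new Task<B>((reject, resolve) =>
      this.fork(reject, a => resolve(f(a))))
  }
}

// Hypothetical stand-in for pg.Client whose `query` returns a Task.
interface QueryResult { rows: any[] }
interface Client {
  query(sql: string, params: unknown[]): Task<QueryResult>
}

interface User { first_name: string; last_name: string }

type DatabaseFn<A> = (client: Client) => Task<A>

function fetchUser(id: number): DatabaseFn<User> {
  return client =>
    client
      .query("SELECT first_name, last_name from users where id = ?", [id])
      .map(result => result.rows[0] as User)
}

// A fake client that records the SQL it receives instead of hitting a database.
const executed: string[] = []
const fakeClient: Client = {
  query(sql, _params) {
    return new Task<QueryResult>((_reject, resolve) => {
      executed.push(sql)
      resolve({ rows: [{ first_name: "Ada", last_name: "Lovelace" }] })
    })
  },
}

// Building the query needs no client and runs nothing.
const userQuery = fetchUser(42)
const executedBeforeRun = executed.length

// Only supplying the client and forking the Task actually runs it.
const fetchedUsers: User[] = []
userQuery(fakeClient).fork(
  err => { throw err },
  user => { fetchedUsers.push(user) }
)
```

Swapping fakeClient for a real connection at the top level is the only change needed to run the same fetchUser against an actual database, which is also what makes this style easy to test.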

More complex domain logic

Now, what if we want to further abstract things and call fetchUser from another DB function? Do we then need to have the client available? No. We can just keep returning those database functions.

// Fetch one row, fail if no rows were found
function queryOne(sql: Query, params: Param[]): DatabaseFn<QueryResultRow> {
  return (client: pg.Client): Task<QueryResultRow> => {
    return client.query(sql, params)
      .chain(result => {
        return result.rows.length > 0
          ? Task.of(result.rows[0])
          : Task.rejected(new Error('Query returned no rows'))
      })
  }
}

function authenticate(username: string, password: string): DatabaseFn<User> {
  return function (client: pg.Client): Task<User> {
    const query = "SELECT id from users where username = ? and password = ?"
    // `queryOne` returns a `DatabaseFn` so we need to pass it a `client`
    // in order to give us a `Task` so we can `chain`.
    return queryOne(query, [username, password])(client)
      .chain(row => {
        // Same here, need to unwrap our `DatabaseFn`.
        return fetchUser(row.id)(client)
      })
  }
}

We are still returning a DatabaseFn. Note that chain is a method from Task, and since we are inside a Task we must also return a Task. Each time we call a query function we need to unwrap the DatabaseFn to be able to chain with another call.

All that DatabaseFn unwrapping creates too much boilerplate and is error prone to write. It doesn’t compose well. Can we do better?

Make it a datatype

To combine our database functions more easily we need some helper functions. There are several solutions to this, but I’m proposing a solution here that uses classes.

Instead of passing the raw DatabaseFn function around, let’s wrap it in a class.

class DatabaseFn<A> {
  constructor(readonly fn: (client: pg.Client) => Task<A>) {}
}

We can now add methods to this class to work with the wrapped function. Let’s add chain which allows call chaining like we did in authenticate.

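class DatabaseFn<A> {
  constructor(readonly fn: (client: pg.Client) => Task<A>) {}

  chain<B>(f: (a: A) => DatabaseFn<B>): DatabaseFn<B> {
    return new DatabaseFn((client: pg.Client) => {
      // Run this function, then feed its result to `f` and run
      // the returned DatabaseFn with the same client.
      return this.fn(client).chain(v => f(v).fn(client))
    })
  }
}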

This is nice but we don’t have any way of running queries, we can only chain them! Let’s write a query combinator for this.

function query(sql: Query, params: Param[]): DatabaseFn<QueryResultRow[]> {
  return new DatabaseFn((client: pg.Client) => {
    // Keep only the rows of the query result
    return client.query(sql, params).map(result => result.rows)
  })
}

Now we can rewrite queryOne and authenticate with this.

// Fetch one row, fail if no rows were found
function queryOne(sql: Query, params: Param[]): DatabaseFn<QueryResultRow> {
  return query(sql, params)
    .chain(rows => {
      return rows.length > 0
        ? DatabaseFn.of(rows[0])
        : DatabaseFn.throw(new Error('Query returned no rows'))
    })
}

function authenticate(username: string, password: string): DatabaseFn<User> {
  return queryOne("SELECT id from users where username = ? and password = ?", [username, password])
    .chain(row => fetchUser(row.id))
}

Much nicer! (The implementations of DatabaseFn.of and DatabaseFn.throw are left as an exercise for the reader :)
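If you want to check your answer, here is one possible sketch of those two helpers. It is only an assumption of how they could look: the small Task class and the Client interface below are stand-ins I made up so the example runs on its own, not the real data.task or pg types.

```typescript
// Minimal lazy Task, a stand-in for a library such as data.task (assumption).
class Task<A> {
  constructor(
    readonly fork: (reject: (e: Error) => void, resolve: (a: A) => void) => void
  ) {}

  static of<A>(a: A): Task<A> {
    return new Task<A>((_reject, resolve) => resolve(a))
  }

  static rejected<A>(e: Error): Task<A> {
    return new Task<A>((reject, _resolve) => reject(e))
  }

  chain<B>(f: (a: A) => Task<B>): Task<B> {
    return new Task<B>((reject, resolve) =>
      this.fork(reject, a => f(a).fork(reject, resolve)))
  }
}

// Hypothetical stand-in for pg.Client.
interface Client { name: string }

class DatabaseFn<A> {
  constructor(readonly fn: (client: Client) => Task<A>) {}

  // Lift a plain value: ignores the client and resolves immediately.
  static of<A>(value: A): DatabaseFn<A> {
    return new DatabaseFn<A>(_client => Task.of(value))
  }

  // Fail without touching the database.
  static throw<A>(error: Error): DatabaseFn<A> {
    return new DatabaseFn<A>(_client => Task.rejected<A>(error))
  }

  chain<B>(f: (a: A) => DatabaseFn<B>): DatabaseFn<B> {
    return new DatabaseFn<B>(client =>
      this.fn(client).chain(a => f(a).fn(client)))
  }
}

// Usage: pass the first row on, or fail when there are no rows.
const firstOrFail = (rows: string[]): DatabaseFn<string> =>
  rows.length > 0
    ? DatabaseFn.of(rows[0])
    : DatabaseFn.throw<string>(new Error("Query returned no rows"))

const client: Client = { name: "fake" }
const outcomes: string[] = []
const run = (dbFn: DatabaseFn<string>): void =>
  dbFn.fn(client).fork(
    err => { outcomes.push("error: " + err.message) },
    value => { outcomes.push("ok: " + value) }
  )

run(DatabaseFn.of(["a", "b"]).chain(firstOrFail))
run(DatabaseFn.of<string[]>([]).chain(firstOrFail))
```

Both helpers simply ignore the client, which is why they can be used anywhere inside a chain without knowing where the connection comes from.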

We can now write pure functions to run queries against a database without having to pass the client argument around. What’s more, by wrapping the database function into its own datatype we can attach methods to it for easier chaining. In contrast to the object oriented approach to dependency injection, we don’t pass the client to any object beforehand. Instead, we combine our database interactions using chain and then, at the end, pass in the client to run the program.

The reader pattern

When you see that you have many functions that receive the same input, like a resource of some kind, I propose the following:

The “pattern” I’m describing here can itself be generalized into a datatype. In Haskell and PureScript this is called the reader monad transformer, ReaderT. While it is possible to implement ReaderT in TypeScript and JavaScript, I think this simple pattern will get you a long way. There is an implementation of ReaderT for TypeScript in fp-ts if you want to dig deeper.
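As a rough idea of what that generalization looks like, here is a hand-rolled sketch of a plain Reader where the environment type is a parameter instead of being fixed to a database client. This is my own minimal version, not the fp-ts API, and it leaves out the Task layer that ReaderT would add back. The Env interface and the function names are made up for illustration.

```typescript
// A Reader wraps a function from some environment R to a value A.
class Reader<R, A> {
  constructor(readonly run: (env: R) => A) {}

  // A Reader that ignores the environment.
  static of<R, A>(a: A): Reader<R, A> {
    return new Reader<R, A>(_env => a)
  }

  // A Reader that returns the environment itself.
  static ask<R>(): Reader<R, R> {
    return new Reader<R, R>(env => env)
  }

  map<B>(f: (a: A) => B): Reader<R, B> {
    return new Reader<R, B>(env => f(this.run(env)))
  }

  // Run this Reader, then run the Reader produced by `f` in the same environment.
  chain<B>(f: (a: A) => Reader<R, B>): Reader<R, B> {
    return new Reader<R, B>(env => f(this.run(env)).run(env))
  }
}

// Hypothetical environment shared by several functions.
interface Env { tableName: string; limit: number }

const selectAll: Reader<Env, string> =
  Reader.ask<Env>().map(env => "SELECT * FROM " + env.tableName)

const withLimit = (sql: string): Reader<Env, string> =>
  Reader.ask<Env>().map(env => sql + " LIMIT " + env.limit)

// Nothing needs the environment until `run` is called at the very end.
const sqlReader = selectAll.chain(withLimit)
const sql = sqlReader.run({ tableName: "users", limit: 10 })
```

DatabaseFn from above is this same shape with R fixed to the client and the result wrapped in a Task.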



January 07, 2018

Timeboxing

I’ve been experimenting recently with timeboxing for personal work. I don’t remember what triggered the interest this time, but it probably was the chronic procrastination that I suffer from.

Pomodoro

I googled “time boxing for personal tasks” or something like that and all the links seemed to lead to the pomodoro technique. I was already familiar with the pomodoro technique so I wasn’t that interested in that right now. I was mainly looking for tips on hard timeboxing of personal tasks. Like, set a time-box, work that amount, ship. Pomodoro is for splitting your work into manageable chunks, but I was looking for advice on how to really squeeze out the important stuff and ship.

That turned out to be a little too hardcore as a time management tactic, but it got me into pomodoro yet again.

I’m not going to explain what the pomodoro technique is; you can find a lot of stuff about it elsewhere. But basically what I now do is start a timer and work for 25 minutes, try to have a break of 5 minutes after that, and repeat.

I’m quite happy with how this has improved my productivity and focus. It has helped a lot with procrastination, which is the most important thing. Here are some thoughts and insights I’ve had with the pomodoro technique:

In conclusion, I would say the pomodoro technique, or timeboxing in general, is really about feedback. And it’s the best procrastination hack there is.

Here are a few related posts I liked:



December 25, 2017

Promise is the wrong abstraction

"Swallowed Stop" by Theen Moy

If you are using Promises in JavaScript or about to, I’d suggest you reconsider. Here’s why:

It’s a bad API

I almost didn’t have to write this article because just as I was writing my first draft Aldwin Vlasblom wrote an excellent article on broken promises. That article summarizes well what’s wrong with the Promises API.

While the API is bad, I think it’s the abstraction itself that causes more problems. The abstraction leaks, and just isn’t suitable in many cases. In particular, it’s the eagerness, or impurity, of Promises that makes them bad.

The alternative I’m going to talk about here is a similar abstraction for asynchronous values called a Task. There are other libraries like it, such as Fluture, and what I’m about to say applies to both of those and to many other pure alternatives to Promises.

The wrong abstraction

There’s nothing inherently wrong with the Promise abstraction, but I think it’s the wrong abstraction in many places where it is used.

Promises in JavaScript represent processes which are already happening, which can be chained with callback functions. - MDN

A Promise is like a box that has a value or will have a value at some point in the future. When we create a Promise it is immediately run:

const twelve = new Promise((resolve, reject) => {
  resolve(12)
})

twelve is now a Promise containing the number 12. The function we pass into the Promise, called an executor, is run immediately: by the time the call to the Promise constructor returns, the executor has already been called.

A Task looks similar to a Promise except it is run only when we call fork.

const twelve = new Task((reject, resolve) => {
  resolve(12)
})
.fork(
  err => console.log("Cannot compute 12"),
  twelve => console.log("We got " + twelve)
)

Tasks are thus pure and lazy in nature, and this has huge consequences.

A Promise can be thought of as the result of the computation. You either already have the result (the Promise has been resolved) or the value is about to be delivered. When you have a reference to a Promise the side effect has already been run and we are just waiting for the result to arrive.

A Task on the other hand represents the computation itself. We can combine Tasks in interesting ways and pass them around, yet no side effects have been run until we call fork. In other words: Task is a side effect we can pass around. It’s side effects as data.
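The difference is easy to observe by recording when each executor runs. This is just a sketch: the Task class here is a minimal stand-in I wrote for illustration, not data.task itself, and the log array is only there to capture the order of events.

```typescript
// Minimal lazy Task, a stand-in for a library such as data.task (assumption).
class Task<A> {
  constructor(
    readonly fork: (reject: (e: Error) => void, resolve: (a: A) => void) => void
  ) {}
}

const log: string[] = []

// The Promise executor runs right here, during construction.
const eager = new Promise<number>(resolve => {
  log.push("promise executor ran")
  resolve(12)
})

// The Task executor is only stored; nothing runs yet.
const lazy = new Task<number>((_reject, resolve) => {
  log.push("task executor ran")
  resolve(12)
})

// Snapshot of what has happened after merely creating both values.
const logAfterConstruction = log.slice()

// Only now does the Task's side effect happen.
lazy.fork(
  err => { throw err },
  value => { log.push("task resolved with " + value) }
)

// The Promise's value is delivered later, asynchronously; its side effect
// already happened above.
eager.then(value => { log.push("promise resolved with " + value) })
```

After construction the log contains only the Promise entry; the Task entry appears when fork is called.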

Side effects as data

Since Tasks are computations represented as values, we can pass them around, combine and sequence them. We can wrap computations in other computations and we can pass computations into other functions and decide later if and when to run them. I’ll explain this with a few examples.

Let’s say we are writing some code to interact with a SQL database and we have a Promise which makes database updates that we want to run in a transaction. We need a function like this:

wrapInTransaction :: Promise a -> Promise a

It’s a function that takes a Promise and returns a new Promise. That is, we want that function to wrap our update statements in a transaction and give us back a Promise which resolves if the transaction is committed. If the transaction fails, the Promise is rejected.

However, there’s no way to implement that function. Why? The Promise that we pass into the function, the one that makes the updates to the database, could already have been resolved before we try to wrap it in a transaction. Remember, Promise represents the possible result of the computation that is already “in flight”. We cannot wrap it in a transaction because it’s already running the query.

We have no problem implementing that function using a Task instead. The implementation looks something like this:

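const withTransaction = task =>
  db.query('BEGIN TRANSACTION')
    // `task` is only injected here; nothing runs until the result is forked
    .chain(() => task)
    // commit, but keep the task's own result
    .chain(result => db.query('COMMIT').map(() => result))
    // roll back and stay rejected if anything above failed
    .orElse(err => db.query('ROLLBACK').chain(() => Task.rejected(err)))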

Notice how we are taking the task as a parameter and injecting it into the chain of other computations, yet it has not been run until the whole returned computation is forked.
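The ordering can be checked with a small self-contained sketch. Everything here is an assumption made for illustration: the Task class is a minimal stand-in for data.task, and db is a fake that only records the statements it is asked to run.

```typescript
// Minimal lazy Task, a stand-in for a library such as data.task (assumption).
class Task<A> {
  constructor(
    readonly fork: (reject: (e: Error) => void, resolve: (a: A) => void) => void
  ) {}

  static rejected<A>(e: Error): Task<A> {
    return new Task<A>((reject, _resolve) => reject(e))
  }

  map<B>(f: (a: A) => B): Task<B> {
    return new Task<B>((reject, resolve) =>
      this.fork(reject, a => resolve(f(a))))
  }

  chain<B>(f: (a: A) => Task<B>): Task<B> {
    return new Task<B>((reject, resolve) =>
      this.fork(reject, a => f(a).fork(reject, resolve)))
  }

  // Recover from a rejection by running another Task.
  orElse(f: (e: Error) => Task<A>): Task<A> {
    return new Task<A>((reject, resolve) =>
      this.fork(e => f(e).fork(reject, resolve), resolve))
  }
}

// Fake database: records each statement when (and only when) it is run.
const statements: string[] = []
const db = {
  query(sql: string): Task<string> {
    return new Task<string>((_reject, resolve) => {
      statements.push(sql)
      resolve(sql)
    })
  },
}

function withTransaction<A>(task: Task<A>): Task<A> {
  return db.query("BEGIN TRANSACTION")
    .chain(() => task)
    .chain(result => db.query("COMMIT").map(() => result))
    .orElse(err => db.query("ROLLBACK").chain(() => Task.rejected<A>(err)))
}

// Creating and wrapping the update runs nothing.
const update = db.query("UPDATE users SET active = true")
const transaction = withTransaction(update)
const statementsBeforeFork = statements.length

const outcomes: string[] = []
const report = (t: Task<string>): void =>
  t.fork(
    err => { outcomes.push("rejected: " + err.message) },
    result => { outcomes.push("committed: " + result) }
  )

report(transaction)
report(withTransaction(Task.rejected<string>(new Error("constraint violated"))))
```

The first run records BEGIN, the update and COMMIT in that order; the second records BEGIN followed by ROLLBACK and stays rejected.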

Controlling the order of computations

A problem that I (and many others) often face is having an array of items and for each of them you want to perform an asynchronous operation. This can be for example an ajax request. Often you want to do these requests in sequence, so that the next request starts only after the previous one has finished. What we know about promises by now is that creating a Promise will also trigger the request, right then and there. That means the looping of the items and making the requests are tied with the promise creation. What does that mean?

It means we need a special helper function specialized to Promises to simultaneously loop and fire the requests. The standard Promises API doesn’t even have such a function, but for example Bluebird does.

Tasks can deal with this much more elegantly. We can first map each id to a Task and then later use sequence from ramda to collect the results from each Task into an array. We can also further map over the Tasks and later decide to run them in parallel instead if we prefer.

Side note: traverse would work here as well but I find myself using sequence more often, and it’s a bit simpler.
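Here is a sketch of the idea with no library at all. The Task class and the sequence function below are hand-written stand-ins for data.task and ramda’s sequence (they are my assumptions, not those libraries’ implementations), and fetchItem is a made-up request that just records when it starts.

```typescript
// Minimal lazy Task, a stand-in for a library such as data.task (assumption).
class Task<A> {
  constructor(
    readonly fork: (reject: (e: Error) => void, resolve: (a: A) => void) => void
  ) {}

  static of<A>(a: A): Task<A> {
    return new Task<A>((_reject, resolve) => resolve(a))
  }

  map<B>(f: (a: A) => B): Task<B> {
    return new Task<B>((reject, resolve) =>
      this.fork(reject, a => resolve(f(a))))
  }

  chain<B>(f: (a: A) => Task<B>): Task<B> {
    return new Task<B>((reject, resolve) =>
      this.fork(reject, a => f(a).fork(reject, resolve)))
  }
}

// Turn an array of Tasks into one Task of an array. Each Task starts only
// after the previous one has resolved.
function sequence<A>(tasks: Task<A>[]): Task<A[]> {
  return tasks.reduce(
    (acc, task) =>
      acc.chain(results => task.map(value => results.concat([value]))),
    Task.of<A[]>([])
  )
}

// Hypothetical request that records when it is actually started.
const events: string[] = []
const fetchItem = (id: number): Task<string> =>
  new Task<string>((_reject, resolve) => {
    events.push("request " + id)
    resolve("item " + id)
  })

// Step 1: map ids to Tasks. No request has been made yet.
const tasks = [1, 2, 3].map(fetchItem)
const eventsAfterMapping = events.length

// Step 2: sequence them and run the whole thing.
const collected: string[][] = []
sequence(tasks).fork(
  err => { throw err },
  items => { collected.push(items) }
)
```

Mapping and sequencing stay two separate steps, and nothing fires until the final fork.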

The important thing to note is that both map and sequence know nothing about Tasks and are pure functions that just operate on data. We have solved the problem using nothing but very simple and pure functions, both found in ramda (we could have used Array.prototype.map for mapping as well). We have also been able to divide the problem into two composable pieces, mapping and sequencing, which makes reasoning about the code much easier. The implementation of Bluebird’s each, in turn, delegates to Bluebird’s reduce, whose implementation is 178 lines long and very specific to Promises.

Because Tasks are data, just like any other data structures, the same rules apply to them. We can use the same powerful concept of manipulating data with pure functions and apply it to side effects.

Conclusions

A note about data.task v2: There is a new version of data.task which has better support for cancellation and resource handling. I haven’t used it yet but you should definitely check it out if you are wondering which library to use.


