Who's scared of phantom types?

PHP’s type system evolves with every new release of the language, and more shiny new features are coming with PHP 8, like union types and the static return type. Still, if you want to push things further, you are better off using tools such as PHPStan and Psalm to get access to features like type variables and purity checks. For this blog post I’m going to use Psalm as a type checker and show how to use a little type-level trick with phantom types.

Identify everything

I’ll try to make my point by going through a concrete example that illustrates the issue and shows how phantom types help us achieve a nice solution.

Let’s consider a common entity of your domain, let’s call it Foo, which is identified by some kind of identifier. Let’s say, for the sake of the example, that we are using integers as identifiers, probably coming from an auto-incrementing field on a relational database.

To make sure not to confuse our identifiers with other integer values floating around our application, we decide to wrap them in a class called Id.

final class Id
{
  /** @var int */
  private $intId;

  public function __construct(int $intId)
  {
    $this->intId = $intId;
  }

  public function asInt(): int
  {
    return $this->intId;
  }
}

In this way we can distinguish, at the type level, an Id from any other integer in our application. For example, in our entity Foo we are going to type hint its identifier with Id and not simply with int. This has the drawback that we have to wrap and unwrap our integer value, but it has the big advantage of preventing us from passing an age, a day, a year or any other numeric value where an identifier was expected.
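To make the wrapping and unwrapping concrete, here is a minimal sketch. The shape of the Foo entity (its constructor and the id() accessor) is my own assumption for illustration, and Id is repeated so that the snippet runs on its own.

```php
<?php
declare(strict_types=1);

// Id exactly as defined above, repeated to keep the sketch self-contained.
final class Id
{
  /** @var int */
  private $intId;

  public function __construct(int $intId)
  {
    $this->intId = $intId;
  }

  public function asInt(): int
  {
    return $this->intId;
  }
}

// Hypothetical shape for the Foo entity; only its identifier matters here.
final class Foo
{
  /** @var Id */
  private $id;

  public function __construct(Id $id)
  {
    $this->id = $id;
  }

  public function id(): Id
  {
    return $this->id;
  }
}

$age = 42;

// Wrapping: the raw integer becomes an Id before it reaches the entity.
$foo = new Foo(new Id(1));

// Unwrapping: we go back to the integer only where we really need it.
echo $foo->id()->asInt(), PHP_EOL;

// A bare integer such as an age is rejected where an Id is expected.
try {
  new Foo($age);
} catch (TypeError $e) {
  echo 'Rejected: a plain int is not an Id', PHP_EOL;
}
```

PHP itself raises a TypeError for the last call at runtime; a static checker like Psalm should flag the same call before the code ever runs.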

We’re now happy because we have increased the type safety of our application!

Until the next entity comes along…

Consider now another entity Bar which needs the same kind of identifier Foo has. We are immediately tempted to use the same Id class we were using for Foo; it doesn’t make sense to write another class which behaves exactly like Id does. Or does it?

If we reuse the same class, we’ll still be able to distinguish between Ids and other integer values, but we’re not going to be able to distinguish between Ids for Foo and Ids for Bar. That’s something we might actually want to do: distinguish between FooId and BarId, so that we’ll always know at compile time whether we’re dealing with an identifier for a Foo entity or for a Bar one. Consider for example repositories for Foo and Bar; the repository FooRepository could have a method loadFoo(FooId $id): Foo which accepts only identifiers for Foo, making it impossible to pass an identifier for Bar. This is definitely a win for type safety, but we’re basically duplicating code in two identical classes which differ only in their names.

One could avoid the code duplication by defining an abstract class Id and concrete classes FooId and BarId inheriting from it.

abstract class Id
{
  /** @var int */
  private $intId;

  public function __construct(int $intId)
  {
    $this->intId = $intId;
  }

  public function asInt(): int
  {
    return $this->intId;
  }
}

final class FooId extends Id {}

final class BarId extends Id {}

In this way we are not duplicating code, but every time we introduce a new entity Baz in our system, we still need to define a new class BazId extends Id for its identifiers. In the long run, for big projects, this could become cumbersome.
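Before moving on, here is a small sketch of what the inheritance approach buys us in the repository scenario. The InMemoryFooRepository class, its save method and the empty Foo class are hypothetical names I’m introducing only for this example.

```php
<?php
declare(strict_types=1);

abstract class Id
{
  /** @var int */
  private $intId;

  public function __construct(int $intId)
  {
    $this->intId = $intId;
  }

  public function asInt(): int
  {
    return $this->intId;
  }
}

final class FooId extends Id {}

final class BarId extends Id {}

// Placeholder entity, just enough for the sketch.
final class Foo {}

// Hypothetical in-memory repository, keyed by the wrapped integer.
final class InMemoryFooRepository
{
  /** @var array<int, Foo> */
  private $foos = [];

  public function save(FooId $id, Foo $foo): void
  {
    $this->foos[$id->asInt()] = $foo;
  }

  public function loadFoo(FooId $id): Foo
  {
    return $this->foos[$id->asInt()];
  }
}

$repository = new InMemoryFooRepository();
$repository->save(new FooId(1), new Foo());

// A FooId is accepted.
$foo = $repository->loadFoo(new FooId(1));

// A BarId wrapping the very same integer is not.
$rejected = false;
try {
  $repository->loadFoo(new BarId(1));
} catch (TypeError $e) {
  $rejected = true;
}

echo $rejected ? 'BarId rejected' : 'BarId accepted', PHP_EOL;
```

The safety is real, but it costs one extra class for every entity.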

Let’s move to the type level

If we carefully examine our desire to distinguish between FooId and BarId, we quickly realise that we are trying to communicate information at the type level. This is evident from the fact that the implementations are precisely the same. This observation suggests that maybe we should try to use type-level mechanisms to convey such information, and not value-level tools like inheritance.

One trick we could actually use is to add a type variable to our Id class and use it to tag the identifier with the entity it is actually referring to. Let’s see how that would work.

/**
 * @template A
 */
final class Id
{
  /** @var int */
  private $intId;

  public function __construct(int $intId)
  {
    $this->intId = $intId;
  }

  public function asInt(): int
  {
    return $this->intId;
  }
}

The only difference from our first version of Id is that we have now added the @template annotation to introduce the A type variable. Now we can simply refer to Id<Foo> and Id<Bar> to speak about identifiers for Foo and Bar, respectively.

Reconsidering the repository example from above, the same method loadFoo would become

interface FooRepository
{
  /**
   * @psalm-param Id<Foo> $id
   * @return Foo
   */
  public function loadFoo(Id $id): Foo;
}

As far as Psalm is concerned, this guarantees the same type safety as our intermediate solution using inheritance, but completely avoids the need to create a specific BazId class for every new Baz entity.

Type variables of this kind, which are used only at the type level and do not refer to anything at the value level, are usually called phantom types.
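Here is a minimal sketch of how the tag could be used in practice. The describeFooId function and the empty Foo and Bar classes are hypothetical, and tagging a value with an inline @var annotation is just one possible approach that I’m assuming for this example.

```php
<?php
declare(strict_types=1);

/**
 * @template A
 */
final class Id
{
  /** @var int */
  private $intId;

  public function __construct(int $intId)
  {
    $this->intId = $intId;
  }

  public function asInt(): int
  {
    return $this->intId;
  }
}

// Placeholder entities, used only as tags.
final class Foo {}
final class Bar {}

/**
 * Hypothetical helper which only wants identifiers of Foo.
 *
 * @psalm-param Id<Foo> $id
 */
function describeFooId(Id $id): string
{
  return 'Foo #' . $id->asInt();
}

/** @var Id<Foo> $fooId */
$fooId = new Id(1);

/** @var Id<Bar> $barId */
$barId = new Id(1);

echo describeFooId($fooId), PHP_EOL;

// Assumption: since Foo and Bar are unrelated classes, I would expect
// Psalm to report the call below as an invalid argument. PHP itself
// cannot see the tag, so at runtime the call would simply go through.
// describeFooId($barId);
```

Note that the tag lives only in the docblocks: at runtime $fooId and $barId are instances of the very same class, so the distinction is checked by Psalm and not by PHP itself.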

Conclusion

Now that PHP is getting more and more features to play with types, it is important to learn to use them to their full potential and to explore their practical value. One very relevant skill that developers need in a language with a rich type system is telling apart information which is needed at the value level from information which matters only at the type level. Distinguishing between the two levels and separating information accordingly allows us to design simpler and safer code and to reduce code duplication.

The usage of phantom types is a simple type level trick which can help with this in practical settings, without requiring complicated features or making your code too abstract and less readable.