Despite recent discussions in the PHP community about whether type hints should be considered 'visual debt', at Moxio we still strongly value adding types to our code. Writing type-safe code lets us catch bugs early, enables static analysis, and makes the code self-documenting. Still, it can be a challenge to write type-safe code in PHP, especially as it lacks a feature known as 'generics'. In this blog post I will show how (the lack of) generics influences type-safe design, how parameter types and return types may change when extending a class or interface, and how we can keep our package design sound while doing so.
Extension and return type hints
Suppose we have an interface that represents a file, from which we can get the raw contents. Instances of this interface are created by a file reader, which accepts a filepath and returns an object corresponding to that file on disk:
<?php
interface File {
    public function getContents(): string;
}
interface FileReader {
    public function readFile(string $filepath): File;
}
These interfaces (and their implementations) may be part of our own code, or they could be defined by some vendor package. Either way, we would like to extend File with some functionality specific to PHP files, like retrieving their PHP Abstract Syntax Tree (AST):
<?php
interface PhpAst {
    /* (omitted) */
}
interface PhpFile extends File {
    public function getAst(): PhpAst;
}
Now how does this relate to our FileReader interface? What should the interface for a class that reads and returns PHP files look like? Can we use such a PHP file reader in places where a FileReader is expected?
With generics
Languages that have generics, like Java, allow an elegant solution to this problem. We can just parameterize the FileReader interface with the type of file returned:
interface PhpAst {}
interface File {
    public String getContents();
}
interface PhpFile extends File {
    public PhpAst getAst();
}
interface FileReader<T> {
    public T readFile(String filepath);
}
Now a reader that returns instances of PhpFile implements FileReader<PhpFile>, while readers that are only guaranteed to return more generic File objects implement FileReader<File>. Code that needs just any file reader can declare it as FileReader<? extends File>. It then accepts both FileReader<File> and FileReader<PhpFile> (or any other reader that returns a subtype of File), with the guarantee that the objects returned from it are always a File and thus at least have a getContents() method.
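As a minimal sketch of such a consumer, the snippet below repeats the interfaces from above and adds a method that takes a FileReader<? extends File>. The class ExtendsWildcardDemo, its helper methods and the in-memory stub readers are hypothetical names introduced only for this illustration.

```java
interface PhpAst {}

interface File {
    String getContents();
}

interface PhpFile extends File {
    PhpAst getAst();
}

interface FileReader<T> {
    T readFile(String filepath);
}

// Hypothetical demo class, not part of the original design.
class ExtendsWildcardDemo {
    // Accepts a reader of File or of any subtype of File. Whatever it
    // returns is at least a File, so getContents() is always available.
    static int contentLength(FileReader<? extends File> reader, String filepath) {
        File file = reader.readFile(filepath);
        return file.getContents().length();
    }

    // Stub reader for generic files; it ignores the path and returns fixed contents.
    static FileReader<File> genericReader() {
        return filepath -> new File() {
            public String getContents() { return "plain text"; }
        };
    }

    // Stub reader for PHP files; also returns fixed contents.
    static FileReader<PhpFile> phpReader() {
        return filepath -> new PhpFile() {
            public String getContents() { return "<?php echo 1;"; }
            public PhpAst getAst() { return new PhpAst() {}; }
        };
    }

    public static void main(String[] args) {
        // Both calls compile, because both readers match FileReader<? extends File>.
        System.out.println(contentLength(genericReader(), "notes.txt"));
        System.out.println(contentLength(phpReader(), "index.php"));
    }
}
```

Declaring the parameter as plain FileReader<File> instead would make the second call a compile error, since FileReader<PhpFile> is not a subtype of FileReader<File> in Java.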
Without generics
In languages without generics, like PHP, this is a bit more difficult. We could of course just create a separate interface for a PHP file reader:
<?php
interface PhpFileReader {
    public function readFile(string $filepath): PhpFile;
}
However, can we declare such an interface as an extension of FileReader, and thus pass it wherever a FileReader is required, even though the signature of readFile is not exactly the same? In (type) theory, the answer would be 'yes'. After all, a FileReader is something obeying the contract that we can call readFile() on it with a filename to receive a File, where a File is an object guaranteed to have a getContents() method returning a string. Our PhpFileReader::readFile() method returns a PhpFile, which is (a particular specialization of) a File. This makes a PhpFileReader obey the contract of a FileReader.
This type-theoretical principle is called covariance: when extending a type (class or interface), we are allowed to 'tighten' the types of the values returned by each method (to a more specific subtype) without breaking its contract. Consumers calling such a method when they expect to deal with the parent type will still get an object of the type they desire, having all the methods they expect to find on that object. Subtypes may thus be stricter in what they return than their parent types.
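The principle is perhaps easiest to see in a language that permits it. The sketch below uses Java (the language of the generics examples above), which has allowed covariant return types on overriding methods since Java 5. It uses a non-generic FileReader to mirror the PHP interfaces; CovariantReturnDemo and the stub reader are hypothetical names for illustration only.

```java
interface PhpAst {}

interface File {
    String getContents();
}

interface PhpFile extends File {
    PhpAst getAst();
}

interface FileReader {
    File readFile(String filepath);
}

// Covariant return type: the subtype redeclares readFile() with the
// narrower return type PhpFile, which is still a File.
interface PhpFileReader extends FileReader {
    @Override
    PhpFile readFile(String filepath);
}

// Hypothetical demo class, not part of the original design.
class CovariantReturnDemo {
    // Written against the parent type only; it knows nothing about PHP files.
    static String contentsVia(FileReader reader, String filepath) {
        return reader.readFile(filepath).getContents();
    }

    // Stub PHP reader that ignores the path and returns fixed contents.
    static PhpFileReader stubPhpReader() {
        return new PhpFileReader() {
            @Override
            public PhpFile readFile(String filepath) {
                return new PhpFile() {
                    public String getContents() { return "<?php echo 1;"; }
                    public PhpAst getAst() { return new PhpAst() {}; }
                };
            }
        };
    }

    public static void main(String[] args) {
        // A PhpFileReader is accepted wherever a FileReader is required.
        System.out.println(contentsVia(stubPhpReader(), "index.php"));
    }
}
```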
Does this mean we can just write interface PhpFileReader extends FileReader? The practical answer is, unfortunately, 'no'. PHP only supports covariance when extending or implementing classes or interfaces in the specific situation where the parent type has no return type hint on the method and the subtype adds one. We can view this as tightening the return type from literally anything to a specific type. In all other situations, it requires any return type hints to be identical between the parent and the child. This does not allow us to use our PhpFileReader where a FileReader is required without removing type hints, making our code less type-safe. I will show a solution to this problem in a while. First, let's look at how subtypes affect the input parameters of methods.
Extension and parameter type hints
Suppose that in our system we also have rules operating on files, e.g. for checking coding standards or detecting bugs. We could model such a rule on a generic file using a type-safe interface like this:
<?php
interface FileRule {
    public function check(File $file): bool;
}
Rules implementing this interface can retrieve the contents of the file they receive using getContents(), and then for example check for "@todo" markers or trailing whitespace. But what if we want to write rules that are specific to PHP files, using the more powerful information available inside the AST?
With generics
In languages with generics we have an easy solution: we parameterize the interface again, this time by the type of the file argument we intend to check:
interface FileRule<T> {
    public boolean check(T file);
}
A rule that operates on PHP files then implements FileRule<PhpFile>, can only be called with instances of PhpFile and has access to the getAst() method on the file object. A rule that checks any type of file can implement FileRule<File> and will accept any File instance, but will not know about methods like getAst(), which are specific to the PhpFile subtype.
If part of our code needs to accept (e.g. as a parameter) a rule that can check a PHP file, this must include instances of FileRule<File>. After all, since a PhpFile is just a specific kind of File, it can be given in any place where a File is expected. This means that a FileRule<File> can also check a PhpFile, and thus is a valid replacement for a FileRule<PhpFile>. To allow the rule parameter to accept both a FileRule<PhpFile> and a FileRule<File> we can declare it as a FileRule<? super PhpFile>, i.e. "any rule that will check a PhpFile".
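A small sketch of such a parameter is shown below. The interfaces are the ones defined earlier; SuperWildcardDemo, the two example rules and the phpFile() helper are hypothetical and exist only to make the example self-contained.

```java
interface PhpAst {}

interface File {
    String getContents();
}

interface PhpFile extends File {
    PhpAst getAst();
}

interface FileRule<T> {
    boolean check(T file);
}

// Hypothetical demo class, not part of the original design.
class SuperWildcardDemo {
    // Accepts any rule that is able to check a PhpFile: a FileRule<PhpFile>,
    // but also a FileRule<File>, since every PhpFile is a File.
    static boolean passes(FileRule<? super PhpFile> rule, PhpFile file) {
        return rule.check(file);
    }

    // Generic rule: only needs getContents(), so it works on any File.
    static FileRule<File> noTodoMarkers() {
        return file -> !file.getContents().contains("@todo");
    }

    // PHP-specific rule: relies on getAst(), so it only works on a PhpFile.
    static FileRule<PhpFile> hasAst() {
        return file -> file.getAst() != null;
    }

    // In-memory PhpFile stub with the given contents and an empty AST.
    static PhpFile phpFile(String contents) {
        return new PhpFile() {
            public String getContents() { return contents; }
            public PhpAst getAst() { return new PhpAst() {}; }
        };
    }

    public static void main(String[] args) {
        PhpFile file = phpFile("<?php // @todo clean this up");
        System.out.println(passes(noTodoMarkers(), file));
        System.out.println(passes(hasAst(), file));
    }
}
```

Note the asymmetry with the reader example: a value the code reads from uses ? extends, while a value the code hands objects to uses ? super.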
Without generics
In PHP we cannot resort to the solution we would have used in a language that does support generics. For this situation, too, we can create a separate interface:
<?php
interface PhpFileRule {
    public function check(PhpFile $file): bool;
}
Again we can ask what the relationship between FileRule and PhpFileRule is. Is PhpFileRule an extension (subtype) of FileRule? We can see fairly easily that it isn't. After all, a FileRule is something that can check any File object, while PhpFileRule will not accept (for example) a PythonFile. Using a PhpFileRule in place of a FileRule would therefore violate the Liskov Substitution Principle, which establishes that PhpFileRule is not a proper subtype of FileRule.
What most people find confusing and surprising is that the (type-theoretical) relationship between FileRule and PhpFileRule is actually the other way around: FileRule is a subtype of PhpFileRule! It makes sense though, given that FileRule will accept anything that a PhpFileRule accepts as a parameter (i.e. all instances of PhpFile, which are all a File by inheritance). This type-theoretical principle is called contravariance: when extending a type we are allowed to loosen the types of all input parameters of methods in the subtype. Subtypes thus may be more liberal in what they accept than their parent types.
Does that mean that in practice we could (and should) write interface FileRule extends PhpFileRule in PHP? The answer is (two times) 'no'. First, PHP does not support contravariance of method parameters when extending or implementing a class or interface. This means that, for now, the type hints on method parameters need to be identical between parent type and subtype. An RFC which will allow one specific type of contravariance (completely dropping the type constraint from a parameter, allowing any input value) has been accepted for PHP 7.2. This RFC is mainly intended for library authors to start adding scalar type hints to interfaces without breaking subclasses. Full covariance and contravariance support in PHP is not expected in the short term, as their implementation requires changes to autoloading.
Package Dependency Principles
Still, even if PHP supported it, it would not be a good idea to declare that FileRule extends PhpFileRule. This has to do with the principles of package design as described by Robert "Uncle Bob" Martin. If we think of the functionality and interfaces for generic files as being in one package and the ones for PHP files as belonging to another, the 'php files' package has a dependency on the 'generic files' package, as PhpFile extends File. If we let FileRule extend PhpFileRule we would also introduce a dependency the other way around, creating a package dependency cycle and thus violating the Acyclic Dependencies Principle.
The Stable Dependencies Principle ("Depend in the direction of stability") and the Stable Abstractions Principle ("Abstractness increases with stability") point us in the 'correct' direction of dependence. Since the 'generic files' package is more abstract than the 'php files' package (and thus more stable), 'php files' can depend on 'generic files'. Dependencies the other way around should be avoided. We can even imagine that the 'generic files' package is a third-party library that we use. We are in no position to ask that library's author to make their FileRule interface extend our PhpFileRule interface.
Adapter pattern to the rescue!
Then how are we supposed to use a FileRule where a PhpFileRule is expected, or pass a PhpFileReader to a method that wants a FileReader? It turns out we can solve all our problems with a simple Adapter design pattern. We create two small classes. One adapts FileRule to the interface of PhpFileRule, and the other adapts PhpFileReader to a FileReader:
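<?php
// Lets a generic FileRule be used wherever a PhpFileRule is expected.
class FileRuleAsPhpFileRule implements PhpFileRule {
    private $file_rule;
    public function __construct(FileRule $file_rule) {
        $this->file_rule = $file_rule;
    }
    public function check(PhpFile $file): bool {
        // A PhpFile is a File, so the wrapped rule can check it.
        return $this->file_rule->check($file);
    }
}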
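<?php
// Lets a PhpFileReader be used wherever a generic FileReader is expected.
class PhpFileReaderAsFileReader implements FileReader {
    private $php_file_reader;
    public function __construct(PhpFileReader $php_file_reader) {
        $this->php_file_reader = $php_file_reader;
    }
    public function readFile(string $filepath): File {
        // The wrapped reader returns a PhpFile, which satisfies the File return type.
        return $this->php_file_reader->readFile($filepath);
    }
}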
If we now want to use a FileRule where a PhpFileRule is asked for we can just pass new FileRuleAsPhpFileRule($file_rule). Similarly we can use new PhpFileReaderAsFileReader($php_file_reader) to have a reader for PHP files act as a generic file reader. This allows us to adapt between our related interfaces in a type-safe way and, since the adapters are both part of the 'php files' package, prevents dependency cycles.
Of course there is a small downside to this approach. Compared to the implementation with generics, this pattern requires two additional adapter classes and two PHP-file-specific interfaces to be written, and the delegation calls inside the adapters incur a very small runtime overhead. However, we consider these disadvantages negligible compared to the benefits of type safety and proper package dependencies.
Summary
When extending a type, the subtype may widen the types of input parameters (contravariance) and narrow down the types of return values (covariance). In other words, the subtype may be more liberal in what it accepts and more strict in what it returns.
- In languages without generics, type-safe programming may require creating separate interfaces that are covariant or contravariant.
- PHP only allows covariance and contravariance in a very limited set of cases.
- Adding an explicit extends clause for contravariance situations may break package design principles.
- An Adapter pattern is an easy solution to overcome these situations.
This post was originally published on the Moxio blog.
Top comments (2)
You've expressed that the trade-off between the extra classes/overheads and type safety are worthwhile. Is the need to use extra classes to get type safety enough that you'd consider a different language (e.g. one with generics) given the option?
For us, switching to another language is currently not an option (in terms of existing codebase, libraries and infrastructure, developer knowledge and that PHP isn't that bad). However, in a completely new context I would probably choose another language over PHP.