Protocols in Python: Why You Need Them

Python 3.8 – released in October 2019 – came with lots of goodies. Among them are assignment expressions and positional-only parameters. Another great but lesser-known addition is protocols, or static duck typing. So what are they, and how are they useful?

In order to give you a good overview of where protocols fit in, and why they are useful, I’ll first discuss the following subjects:

Dynamic versus Static Typing
Type Hints
ABCs
And finally: Protocols

Dynamic versus Static Typing

Python is a dynamically typed language. What does that mean?

Firstly: type declarations are not required. I can define the following function without ever specifying what types I expect the arguments to be, nor do I have to name a return type:

def my_function(a, b, c):
    return a + b - c

Secondly: types are handled – and checked – at runtime. I can run my_function with either integers, floats or a mix of both as input. The return type depends on the input:

result = my_function(5, 3, 2)
# type(result) -> int

result = my_function(5.1, 3, 2)
# type(result) -> float

Comparing this to C, which is a statically typed language, we see we have to provide type declarations:

int my_function(int a, int b, int c) { return a + b - c; }

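The types are now fixed at compile time, and passing an argument of an incompatible type (a struct, for example) is rejected by the compiler. Numeric types are a softer case: the following call does compile, but 5.1 is silently converted to the int 5 (some compilers warn about this):

int result = my_function(5.1, 3, 2);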

That is a benefit of a statically typed language: types are checked at compile time, so many type errors are caught before the program ever runs. In Python you may encounter type errors at runtime that a statically typed language would have caught up front.
On the other hand, dynamically typed languages are more flexible when it comes to types that are accepted. And they do not require type declarations, which is great for lazy programmers.
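
To make the runtime nature of these checks concrete, here is a minimal sketch that reuses my_function from above. The string argument is just an illustrative bad input: Python only notices the problem once the addition is actually executed.

```python
def my_function(a, b, c):
    return a + b - c

# Nothing stops us from passing a string; the mistake only
# surfaces when the offending expression is actually executed.
try:
    my_function("5", 3, 2)
except TypeError as error:
    message = str(error)
    print(f"TypeError: {message}")
```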

Duck Typing

Dynamic typing goes hand in hand with duck typing, named after the saying:

If it walks like a duck and it quacks like a duck, then it must be a duck.

Or in other words: if any object has the required functionality, then we should accept it as an argument.

For example, suppose that we have a class named Duck, and this class can walk and quack:

class Duck:
    def walk(self):
        ...

    def quack(self):
        ...

Then we can make an instance of this class and make it walk and quack:

duck = Duck()
duck.walk()
duck.quack()

Now suppose we have a class Donkey that can walk, but cannot quack:

class Donkey:
    def walk(self):
        ...

Then if we try to make an instance of Donkey walk and quack:

duck = Donkey()
duck.walk()
duck.quack()

We will get an >> AttributeError: 'Donkey' object has no attribute 'quack'. Note that we only get this at runtime!
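
A small sketch of what "only at runtime" means here: the walk() call succeeds, and the program fails only when it reaches quack(). The try/except and the return value are added purely for illustration.

```python
class Donkey:
    def walk(self):
        return "walking"

duck = Donkey()
steps = duck.walk()  # works: Donkey has a walk method

try:
    duck.quack()  # fails only when this line is reached
except AttributeError as error:
    message = str(error)
    print(message)
```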

However, we can replace Duck with any other class that can walk and quack. For example:

class ImpostorDuck:
    def walk(self):
        ...

    def quack(self):
        ...  # not quite quacking, but the method exists

duck = ImpostorDuck()
duck.walk()
duck.quack()

So, wrapping up: Python is a dynamically typed language, which is great because it gives lots of flexibility and type declarations are not required. But no type checking happens except at runtime, which can lead to unexpected issues. Which leads us to…

Type Hints

Type hints, or optional static typing, were introduced in Python 3.5 to overcome this downside. They let you optionally specify the types of arguments and return values, which can then be checked by a static type checker such as mypy.

For example, suppose we have a type Duck that can swim and eat bread:

class Duck:
    def eat_bread(self):
        ...  

    def swim(self):
        ...

We can then define a function feed_the_duck that makes a Duck eat bread. We can specify the type of the argument to be Duck:

def feed_the_duck(duck: Duck):
    duck.eat_bread()

duck = Duck() 
feed_the_duck(duck)

Now trying to feed bread to a Monkey, for example, will not work:

class Monkey:
    def eat_bananas(self):
        ...

    def climb_tree(self):
        ...

monkey = Monkey() 
feed_the_duck(monkey)

At runtime, this will give you: >> AttributeError: 'Monkey' object has no attribute 'eat_bread'.

But mypy can spot issues like these before you run your code. In this case, it will tell you:

error: Argument 1 to "feed_the_duck" has incompatible type "Monkey"; expected "Duck"
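
It is worth seeing that the hints are only used by the checker and are not enforced by the interpreter. The sketch below uses a hypothetical function add (not part of the examples above) to show this.

```python
def add(a: int, b: int) -> int:
    return a + b

# The hints are stored as metadata on the function ...
hints = add.__annotations__

# ... but the interpreter does not enforce them. mypy reports an
# incompatible argument type for this call, while Python simply runs it.
result = add("foo", "bar")
```

So a type checker has to be run as a separate step, for example in your editor or in CI, for the hints to have any effect.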

These type hints can make your life as a developer much easier, but they aren’t perfect. For example, if we want a more generic function feed_bread that also accepts other types of animal, we need to explicitly list all accepted types:

from typing import Union

class Pig:
    def eat_bread(self):
        pass

def feed_bread(animal: Union[Duck, Pig]):
    animal.eat_bread()

And another downside: if code like the above is provided by an external package that is not under your control (let’s suppose it is called animals), the type checker will not accept your own types. For example, my baby son Mees’ favourite activities include eating bread and drinking milk:

from animals import feed_bread

class Mees:
    def eat_bread(self):
        pass

    def drink_milk(self):
        pass 

mees = Mees() 
feed_bread(mees)

At runtime the above code will work perfectly fine, but mypy will complain:

>> error: Argument 1 to "feed_bread" has incompatible type "Mees";  expected "Union[Duck, Pig]"

If we do not have control over the animals package, there is nothing we can do about this – except tell mypy to ignore the offending line.
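
A sketch of that workaround. Duck, Pig and feed_bread are defined inline here as stand-ins for the animals package, and the return values are only there to make the example checkable. As far as I know, arg-type is the error code mypy uses for incompatible argument types; a bare "# type: ignore" also works, but silences every error on the line.

```python
from typing import Union

# Stand-ins for the external animals package
class Duck:
    def eat_bread(self) -> str:
        return "duck eats bread"

class Pig:
    def eat_bread(self) -> str:
        return "pig eats bread"

def feed_bread(animal: Union[Duck, Pig]) -> str:
    return animal.eat_bread()

# Our own type, unknown to the package
class Mees:
    def eat_bread(self) -> str:
        return "Mees eats bread"

# Works at runtime; the comment tells mypy to skip this one error.
fed = feed_bread(Mees())  # type: ignore[arg-type]
```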

So, wrapping up: type hints are great because they give you the option to have static type checking. You are still not obliged to add type declarations, but if you do, you get some of the benefits of a statically typed language. However, because you cannot adapt the type hints of imported code, the static hints can clash with Python’s dynamic, duck-typed nature.

ABCs

Abstract Base Classes take away some of the pain of the conflict described above. As the name says, they are base classes – classes that you are supposed to inherit from – but they cannot be instantiated. They are used to define the interface that the subclasses of the ABC should implement.

For example (and forgive me for assuming that all animals can walk):

from abc import ABCMeta, abstractmethod

class Animal(metaclass=ABCMeta):
    @abstractmethod
    def walk(self):
        pass  # Needs implementation by subclass

Instantiating this class is impossible: my_animal = Animal() will yield >> TypeError: Can't instantiate abstract class Animal with abstract methods walk (the exact wording differs slightly between Python versions).
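
The same restriction applies to a subclass that forgets to implement the abstract method. A minimal sketch, where Statue is a hypothetical class added for illustration:

```python
from abc import ABCMeta, abstractmethod

class Animal(metaclass=ABCMeta):
    @abstractmethod
    def walk(self):
        pass

# A subclass that does not implement walk() is still abstract.
class Statue(Animal):
    pass

errors = []
for cls in (Animal, Statue):
    try:
        cls()
    except TypeError as error:
        errors.append(str(error))
```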

However, if we define a subclass, we can instantiate it:

class Duck(Animal):
    def walk(self):
        ...  

duck = Duck()
assert isinstance(duck, Animal)  # <-- True

For a more practical example, one may create an ABC called EatsBread that defines that its subclasses can indeed eat bread (or, in other words, they must have a method with the signature eat_bread(self)):

from abc import ABCMeta, abstractmethod

class EatsBread(metaclass=ABCMeta):
    @abstractmethod
    def eat_bread(self):
        pass

class Duck(EatsBread):
    def eat_bread(self):
        ...

class Pig(EatsBread):
    def eat_bread(self):
        ...

def feed_bread(animal: EatsBread):
    animal.eat_bread()

Now if I want to use this implementation of feed_bread with my Mees class, I can make Mees a subclass of EatsBread and all will be fine:

from animals import EatsBread, feed_bread

class Mees(EatsBread):
    def eat_bread(self):
        ...

    def drink_milk(self):
        ...

feed_bread(Mees())  # <-- OK at runtime and for mypy

Although this is much better, it still is not perfect. Base classes are often not exposed at the top level of a package, meaning I need ugly imports to get what I need:

from animals import feed_bread
from animals.base.eats import EatsBread

In addition, you have to explicitly inherit from the base class for this to work, which is not as nice as the implicit behaviour of duck typing. (Registering your class as a virtual subclass, e.g. EatsBread.register(Mees), satisfies isinstance() checks at runtime, but mypy does not take such registrations into account.)

And still there can be situations which would not quite work. Suppose we use two external packages:

From package animals:

from abc import ABCMeta, abstractmethod

class Animal(metaclass=ABCMeta):
    @abstractmethod
    def walk(self):
        pass  

class Dog(Animal):
    def walk(self):
        ...

def walk_animal(animal: Animal):
    animal.walk()

And from package llamas:

class Llama:
    def walk(self):
        ...

Now if you combine those in your code:

from animals import walk_animal
from llamas import Llama

llama = Llama()
walk_animal(llama)  # <-- Not OK for mypy

This last line will work fine at runtime – but as Llama does not inherit from Animal, mypy will complain.

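One might try to solve this by registering Llama as a virtual subclass of Animal:

Animal.register(Llama)

llama = Llama()
walk_animal(llama)  # <-- isinstance(llama, Animal) is now True, but mypy still complains

That satisfies isinstance() checks at runtime, but mypy does not take register() calls into account, so the static error remains. And even if it did, I would say that this is far from pretty.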
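
A sketch of what register() actually does at runtime. Rock is a hypothetical class added to show that registration is taken on trust: nothing checks that the registered class implements the abstract methods. Also note that, as far as I know, static checkers such as mypy do not follow register() calls, so this only affects isinstance() and issubclass().

```python
from abc import ABCMeta, abstractmethod

class Animal(metaclass=ABCMeta):
    @abstractmethod
    def walk(self):
        pass

class Llama:
    def walk(self):
        return "llama walking"

class Rock:  # hypothetical class without a walk method
    pass

before = isinstance(Llama(), Animal)  # not registered yet
Animal.register(Llama)
after = isinstance(Llama(), Animal)   # registered as a virtual subclass

# register() takes our word for it: it does not check that the
# class actually implements the abstract methods.
Animal.register(Rock)
rock_is_animal = isinstance(Rock(), Animal)
```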

Wrapping up once more: ABCs provide structure to your types, which is great. This means that type hints do not need updates for new subclasses. But we may still have issues combining classes from multiple packages. And all this typing – be it inheriting or registering virtual subclasses – needs to happen explicitly, which conflicts with the implicit nature of duck typing.

Protocols

And this is where protocols come in. A protocol is a special case of ABC which works implicitly:

from typing import Protocol

class EatsBread(Protocol):
    def eat_bread(self):
        pass

def feed_bread(animal: EatsBread):
    animal.eat_bread()

class Duck:
    def eat_bread(self):
        ...

feed_bread(Duck())  # <-- OK

In the above code, Duck is implicitly considered to be a subtype of EatsBread. There is no need to explicitly inherit from the protocol. Any class that implements all attributes and methods defined in the protocol (with matching signatures) is seen as a subtype of that protocol.
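
The flip side is that a class which does not match is rejected. A minimal sketch, reusing the Monkey from earlier; the return annotations and return values are additions for illustration:

```python
from typing import Protocol

class EatsBread(Protocol):
    def eat_bread(self) -> str:
        ...

def feed_bread(animal: EatsBread) -> str:
    return animal.eat_bread()

class Duck:  # matches the protocol without inheriting from it
    def eat_bread(self) -> str:
        return "duck eats bread"

class Monkey:  # no eat_bread method, so not an EatsBread
    def eat_bananas(self) -> str:
        return "monkey eats bananas"

fed = feed_bread(Duck())  # accepted by mypy

# mypy reports an incompatible argument type for the call below
# before the code ever runs; at runtime it fails with an AttributeError.
try:
    feed_bread(Monkey())
    rejected = False
except AttributeError:
    rejected = True
```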

So if the animals package defined feed_bread using this protocol, we could use it with our own class:

from animals import feed_bread

class Mees:
    def eat_bread(self):
        ...

    def drink_milk(self):
        ...

feed_bread(Mees())  # <-- OK

Here Mees is also implicitly a subtype of EatsBread. Again there is no need to explicitly specify that: as long as the signatures match it just works! This is why protocols are also called static duck typing.

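Note that protocols are only checked by the static type checker, not at runtime! By default, isinstance(Mees(), EatsBread) raises a TypeError. If you do want isinstance() checks to work at runtime, you can decorate the protocol with typing.runtime_checkable. Even then, only the presence of the required members is checked, not their signatures.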
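
A short sketch of the difference. EatsBreadAtRuntime is a hypothetical name, used here only to put the decorated and undecorated variants side by side:

```python
from typing import Protocol, runtime_checkable

class EatsBread(Protocol):
    def eat_bread(self) -> None:
        ...

@runtime_checkable
class EatsBreadAtRuntime(Protocol):
    def eat_bread(self) -> None:
        ...

class Mees:
    def eat_bread(self) -> None:
        ...

class Monkey:
    def eat_bananas(self) -> None:
        ...

# Without the decorator, isinstance() refuses to work with a protocol.
try:
    isinstance(Mees(), EatsBread)
    plain_check_failed = False
except TypeError:
    plain_check_failed = True

# With the decorator, isinstance() checks that the members exist
# (it does not compare signatures).
mees_eats_bread = isinstance(Mees(), EatsBreadAtRuntime)
monkey_eats_bread = isinstance(Monkey(), EatsBreadAtRuntime)
```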

Wrapping up one last time: protocols are awesome, because:

There is no need to explicitly inherit from a protocol or register your class as a virtual subclass.
There are no more difficulties combining packages: it works as long as the signatures match.
We now have the best of both worlds: static type checking of dynamic types.
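
To illustrate the point about combining packages, here is a sketch of how the earlier Llama example could look if the animals package exposed a protocol instead of an ABC. Walks is a hypothetical name, and both "packages" are inlined so the example is self-contained:

```python
from typing import Protocol

# What the animals package could expose (hypothetical Walks protocol)
class Walks(Protocol):
    def walk(self) -> str:
        ...

def walk_animal(animal: Walks) -> str:
    return animal.walk()

# What the llamas package exposes; it knows nothing about Walks
class Llama:
    def walk(self) -> str:
        return "llama walking"

walked = walk_animal(Llama())  # OK at runtime and for mypy
```

Neither package needs to import from, inherit from, or register with the other.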