Automatically catch many common errors while coding
One of the most common complaints about the Python language is that variables are Dynamically Typed. That means you declare variables without giving them a specific data type. Types are automatically assigned at based on what data was passed in:
In this case, the variable president_name is created as str type because we passed in a string. But Python didn?t know it would be a string until it actually ran that line of code.
By comparison, a language like Java is Statically Typed. To create the same variable in Java, you have to declare the string explicitly with a String type:
Because Java knows ahead of time that president_name can only hold a String, it will give you a compile error if you try to do something silly like store an integer in it or pass it into a function that expects something other than a String.
Why should I care about types?
It?s usually faster to write new code in a dynamically-typed language like Python because you don?t have to write out all the type declarations by hand. But when your codebase starts to get large, you?ll inevitably run into lots of runtime bugs that static typing would have prevented.
Here?s an example of an incredibly common kind of bug in Python:
All we are doing is asking the user for their name and then printing out ?Hi, <first name>!?. And if the user doesn?t type anything, we want to print out ?Hi, UserFirstName!? as a fallback.
This program will work perfectly if you run it and type in a name? BUT it will crash if you leave the name blank:
Traceback (most recent call last): File “test.py”, line 14, in <module> first_name = get_first_name(fallback_name) File “test.py”, line 2, in get_first_name return full_name.split(” “)[0]AttributeError: ‘dict’ object has no attribute ‘split’
The problem is that fallback_name isn?t a string ? it?s a Dictionary. So calling get_first_name on fallback_name fails horribly because it doesn?t have a .split() function.
It?s a simple and obvious bug to fix, but what makes this bug insidious is that you will never know the bug exists until a user happens to run the program and leave the name blank. You might test the program a thousand times yourself and never notice this simple bug because you always typed in a name.
Static typing prevents this kind of bug. Before you even try to run the program, static typing will tell you that you can?t pass fallback_name into get_first_name() because it expects a str but you are giving it a Dict. Your code editor can even highlight the error as you type!
When this kind of bug happens in Python, it?s usually not in a simple function like this. The bug is usually buried several layers down in the code and triggered because the data passed in is slightly different than previously expected. To debug it, you have to recreate the user?s input and figure out where it went wrong. So much time is wasted debugging these easily preventable bugs.
The good news is that you can now use static typing in Python if you want to. And as of Python 3.6, there?s finally a sane syntax for declaring types.
Fixing our buggy program
Let?s update the buggy program by declaring the type of each variable and each function input/output. Here?s the updated version:
In Python 3.6, you declare a variable type like this:
variable_name: type
If you are assigning an initial value when you create the variable, it?s as simple as this:
my_string: str = “My String Value”
And you declare a function?s input and output types like this:
def function_name(parameter1: type) -> return_type:
It?s pretty simple ? just a small tweak to the normal Python syntax. But now that the types are declared, look what happens when I run the type checker:
$ mypy typing_test.pytest.py:16: error: Argument 1 to “get_first_name” has incompatible type Dict[str, str]; expected “str”
Without even executing the program, it knows there?s no way that line 16 will work! You can fix the error right now without waiting for a user to discover it three months from now.
And if you are using an IDE like PyCharm, it will automatically check types and show you where something is wrong before you even hit ?Run?:
It?s that easy!
More Python 3.6 Typing Syntax Examples
Declaring str or int variables is simple. The headaches happen when you are working with more complex data types like nested lists and dictionaries. Luckily Python 3.6?s new syntax for this isn?t too bad? at least not for a language that added typing as an afterthought.
The basic pattern is to import the name of the complex data type from the typing module and then pass in the nested types in brackets.
The most common complex data types you?ll use are Dict, List and Tuple. Here?s what it looks like to use them:
from typing import Dict, List# A dictionary where the keys are strings and the values are intsname_counts: Dict[str, int] = { “Adam”: 10, “Guido”: 12}# A list of integersnumbers: List[int] = [1, 2, 3, 4, 5, 6]# A list that holds dicts that each hold a string key / int valuelist_of_dicts: List[Dict[str, int]] = [ {“key1”: 1}, {“key2”: 2}]
Tuples are a little bit special because they let you declare the type of each element separately:
from typing import Tuplemy_data: Tuple[str, int, float] = (“Adam”, 10, 5.7)
You also create aliases for complex types just by assigning them to a new name:
from typing import List, TupleLatLngVector = List[Tuple[float, float]]points: LatLngVector = [ (25.91375, -60.15503), (-11.01983, -166.48477), (-11.01983, -166.48477)]
Sometimes your Python functions might be flexible enough to handle several different types or work on any data type. You can use the Union type to declare a function that can accept multiple types and you can use Any to accept anything.
Python 3.6 also supports some of the fancy typing stuff you might have seen in other programming languages like generic types and custom user-defined types.
Running the Type Checker
While Python 3.6 gives you this syntax for declaring types, there?s absolutely nothing in Python itself yet that does anything with these type declarations. To actually enforce type checking, you need to do one of two things:
- Download the open-source mypy type checker and run it as part of your unit tests or development workflow.
- Use PyCharm which has built-in type checking in the IDE. Or if you use another editor like Atom, download it?s own type checking plug-in.
I?d recommend doing both. PyCharm and mypy use different type checking implementations and they can each catch things that the other doesn?t. You can use PyCharm for realtime type checking and then run mypy as part of your unit tests as a final verification.
Great! Should I start writing all my Python code with type declarations?
This type declaration syntax is very new to Python ? it only fully works as of Python 3.6. If you show your typed Python code to another Python developer, there?s a good chance they will think you are crazy and not even believe that the syntax is valid. And the mypy type checker is still under development and doesn?t claim to be stable yet.
So it might be a bit early to go whole hog on this for all your projects. But if you are working on a new project where you can make Python 3.6 a minimum requirement, it might be worth experimenting with typing. It has the potential to prevent lots of bugs.
One of the neat things is that you easily can mix code with and with out type declarations. It?s not all-or-nothing. So you can start by declaring types in the places that are the most valuable without changing all your code. And since Python itself won?t do anything with these types at runtime, you can?t accidentally break your program by adding type declarations.
Thanks for reading! If you are interested in machine learning (or just want to understand what it is), check out my Machine Learning is Fun! series.
You can also follow me on Twitter at @ageitgey or find me on linkedin.