Understanding TOTP Two-Factor Authentication: An ELI5

Hendrik Erz

Abstract: When logging in online, you usually have to provide a second factor before the website lets you in: A six digit code from your smartphone. Have you ever wondered how this works? After a recent incident where I lost all my 2FA keys, I decided to understand the algorithm.

Published on Sunday, June 2nd, 2024 by Hendrik | 19 min reading time

In today’s article, I want to explain to you how the most common two-factor-authentication system works: TOTP. It’s the primary one that you likely already use across several online accounts. You set it up by scanning a QR code on a website with an authenticator app, and then enter a six-digit code calculated by that app whenever you log in.

The reason for why I write this article is because of a recent daunting experience: My authenticator app just decided to delete all my TOTP codes after an update. This would have essentially locked myself out of all my online accounts, were it not for a backup that I stored of those. After setting up two new additional apps and verifying that I can still log in, I thought to myself: “Why is the space of authenticator apps so awful, especially since it is such a simple procedure?”

I decided to research the TOTP algorithm to get a better understanding of what it is, how it works, and how to ensure I will never lose access to my online accounts. I wrote my own implementation (in Python) to do so, and I share the full code below.

The Background: Losing My TOTP Keys

First a bit of background. Yesterday I returned from this year’s INAS conference in Leipzig. And somehow, my usual good luck with regard to technology always forsakes me when it comes to INAS. Two years ago, in Florence, my smartphone decided to not accept my SIM card anymore, so I was essentially stuck without cell reception in a foreign country.

This year it was my TOTP app. For a long time I used Raivo, because it was Open Source and worked very well. However, the emphasis is on “was”: Last year, its developer sold the app to some shady company. Of course, this wasn’t great, but as long as they didn’t do anything, it should be fine. However, during the conference, there was an update, and I clicked on “Update now” too quickly. I still read a line of the changelog that essentially said “please ensure to back up before updating”. Frantically, I started the app, and it just immediately crashed. A few seconds later, a second update was available, which made the app start again. However, after the app booted, I was greeted with an empty screen: No TOTP tokens to be found.

Now, this is not great. Indeed, without my TOTP codes, I wasn’t able to log into any account, and I was already fearing an awful week of having to use up the recovery tokens for my various 2FA setups. Fortunately, one reason for why I used Raivo is because it allowed me to create an encrypted backup of its contents, and so I still had all the 2FA-tokens. However, I couldn’t continue to use the app anymore, because the new owners decided it to be a great idea to lock up all the importing and exporting functionality behind a paywall. €4.99 a month just for an app that one could write in about a day? No, thanks. I’ll pass.

So I searched for an alternative. But it turns out that there really aren’t many good 2FA apps: You either have to set up online accounts, or the backup functionality is locked behind a subscription.

This piqued my interest: TOTP authentication should be really simple to implement, so there should be tons of FOSS alternatives out there. Right? Well, there aren’t many FOSS alternatives out there. But it turns out that TOTP is indeed very simple to implement.

How TOTP Works: The User’s Perspective

From a user’s perspective, TOTP works extremely simple. It involves three components: (1) An authenticator app on your smartphone; (2) a shared key that only your app and the server know; (3) a six-digit code, or token, that is regenerated every 30 seconds.

First, you have to set it up. Usually, this works like so:

Start the workflow to set up 2FA for your online account.
The website will then provide you with a QR code.
You scan this QR code with an authenticator app of your choice (You don’t have to use one from the big five, even if they lie about that and tell you that you can only use theirs).
Your app will immediately start generating these six-digit tokens.
The website asks you to provide one of these six-digit tokens so that it can verify that your app has stored the correct key.

From now on, whenever you log in to this account, the website will prompt you to provide a TOTP token after entering your password.

Usually, a website will also provide you a series of backup codes and ask you to save them. And indeed, you should save them in a secure place. Because should you ever lose access to your TOTP app, you won’t be able to log in, so this is used as a backup to your backup. So please take my horrifying experience and save them somewhere!

How TOTP Works: The Technical Perspective

From a technical perspective, TOTP also involves three components: (1) A shared key; (2) an algorithm that uses this key to generate a hash; (3) a way to turn this hash into a six-digit token.

It turns out that a lot of the factors that play into TOTP are actually configurable. For example: It doesn’t have to be 6 digits. The specification allows for any amount of digits between 4 and 10 (but recommends at least 6). Also, the tokens don’t have to be valid for 30 seconds, it could be anything from a single second to days or even weeks. Moreover, while SHA1 is the common algorithm used to turn the shared secret into a code, there are various other encryption algorithms that are allowed.

To understand what changes to these factors mean, it helps to understand the algorithm, so let’s get into it!

The Secret Key

First, we need a secret key. This secret key is just a series of random bytes. A common choice is to generate 20 random bytes (for example, using the OpenSSL library), and encode them using base32. Here is one code that we can use: CLAH6OEOV52XVYTKHGKBERP42IUZHY4D.

This is exactly what is contained in that QR code whenever you set up 2FA on a website. This is how the server provides its key to your app. The QR code also contains the various settings (such as how long each token is valid, and how many digits to generate).

The QR code will contain a URL in the following format: otpauth://totp/Service%20Name:test@example.com?secret=CLAH6OEOV52XVYTKHGKBERP42IUZHY4D&issuer=Example%20Service

(Which also means: If anyone takes a photo of your screen while the QR code is shown, well, someone else has your second factor.)

Generating a Token

After the secret has been shared between server and your app, the app will begin generating tokens. This involves three steps:

Take the secret and the current time to generate a so-called HMAC hash (using, e.g., SHA1), which is always 20 bytes long.
Extract four bytes from that hash.
Turn them into a number with 4–10 digits

This is usually straight forward, since the corresponding algorithms one needs are available in any language. Here I’m using Python, but in principle, you can implement this in any language you want.

So, first: Take the current time and convert it a bit so that we have a number that changes only every 30 seconds.

import time
from math import floor

# By default, use a 30 seconds time window
VALIDITY_DURATION = 30

def generate_counter_value ():
  """Generates the counter value for the TOTP algorithm's hash generator."""
  timestamp = time.time() # Gets current time as a float (= with microseconds)
  timestamp = floor(timestamp) # Turn it into an integer (= only seconds)
  # Doing the following ensures that the value increments by one every 30
  # seconds
  counter_value = floor(timestamp / VALIDITY_DURATION)
  return counter_value

This is called a counter value. The reason for this is that TOTP is actually based on HOTP, which was its predecessor. HOTP assumes that you only generate a code when you require it. Therefore, the server and you have to additionally store a counter that starts at zero and is incremented when a code is being generated. If you mindlessly generate codes with an HOTP app, you won’t be able to log in any longer because the counter that your app uses is out of sync with the server’s counter. TOTP solves this problem because usually the time available for both your smartphone and the server is relatively close to each other. We will get back to this.

Next: Based on this counter value, generate an HMAC-SHA1 hash:

import base64 # To convert the secret into the 20 random bytes
import hmac # Provides the HMAC algorithm
import hashlib # Provides the SHA1 algorithm

def generate_hash (K: str, C: int):
  """Generates a TOTP compatible HMAC hash based on the shared secret (K) and
  the current time window/counter value C."""
  # Secrets are usually provided in base32 format, so we need to decode it
  # (cf. https://stackoverflow.com/a/57325779)
  key_bytes = base64.b32decode(K)
  # We have to specify how many bytes to use to represent the time. 8 means
  # 64 bit which helps avoid the Y2k38 bug. Python will throw an error if the
  # number is too big to be represented.
  counter_bytes = C.to_bytes(8, byteorder = 'big')
  # Take the secret and the counter and generate a hash.
  hash = hmac.digest(key_bytes, counter_bytes, hashlib.sha1)
  # NOTE: The hash must be big endian. This code here assumes that
  # hmac.digest() always uses big endian.
  return hash

You may see that there are a few assumptions in there.

First, we assume that the secret is encoded in base32. This is an unusual choice. Normally, binary data is being encoded using base64. The reason for base32 is that it is (at least according to a Stack Overflow answer) more readable than base64 because it uses only half as many characters. But the specification for TOTP actually doesn’t specify which encoding to use, so one could use base64 or even binary. It doesn’t matter, but since everyone seems to use simply base32, we assume that here, too.

Next is the amount of bytes to store the counter. We have to provide this number because the amount of bytes required to store a number grows the bigger the number becomes. Any number between 0 and 255 can be represented in a single byte, which consists of 8 bits: 1111 1111 (= 255). Above that, you need a second byte (e.g., 1000 0000 0000 0000 = 256), and so forth. By specifying eight bytes, it means that the counter value will always have 64 bit (8×8). I don’t think that this has a lot of impact on the algorithm, but it makes sense to be overly verbose (especially if you try to understand something).

Lastly, there is a note on “endianness” in there. Endianness, or “byte order”, refers to the ordering of multiple bytes that belong together. For example, any number above 255 must be stored in more than one byte, so it is essentially “split up”. Since a byte is the smallest atomic unit in computing, the computer needs to know where the smaller parts of the number are stored, and where the bigger parts of the number are stored. The smaller parts (numbers 0-255) are called the “least significant byte”, the bigger ones (256 and above) the “most significant byte”. If you attach these bytes wrongly, then you can inadvertently change the number itself.

Here’s an example of the number 256 in “Big Endian” byte order:

  Byte 2  |  Byte 1
0000 0001 | 0000 0000 # 2 bytes = 16 bit
|||| ||||   |||| |||+ 0 x      1
|||| ||||   |||| ||+- 0 x      2
|||| ||||   |||| |+-- 0 x      4
|||| ||||   |||| +--- 0 x      8
|||| ||||   |||+----- 0 x     16
|||| ||||   ||+------ 0 x     32
|||| ||||   |+------- 0 x     64
|||| ||||   +-------- 0 x    128
|||| ||||
|||| |||+------------ 1 x    256
|||| ||+------------- 0 x    512
|||| |+-------------- 0 x  1,024
|||| +--------------- 0 x  2,048
|||+----------------- 0 x  4,096
||+------------------ 0 x  8,192
|+------------------- 0 x 16,384
+-------------------- 0 x 32,768

If you now reverse the ordering of these two bytes, it will start counting this 1, 2, 4, 8, … sequence from the other side and the number will be seen as a “1”. This may seem inconspicuous now, but it will become crucial later. Specifically, when it comes to the next step: Taking the 20 bytes hash and turning it into a 6-digit code.

So let’s take a look at that. The turning of the 20 byte hash into a code involves two steps: First, you truncate that to just 4 bytes (= a 32-bit number), and then you ensure that the number fits into 6 digits (or however many you specify).

First the “dynamic truncation”:

def truncate_dynamically (hash: bytes):
  """Truncates a generated HMAC hash and returns it as an integer"""
  # Extract the four least significant bits from the hash, which is by
  # definition between zero (0000) and 15 (1111)
  offset = hash[-1] & 0x0F
  # Extract four bytes, or 32 bits from the offset
  truncated = hash[offset:offset + 4]
  # Convert them into a number. The biggest number that can be represented in
  # 32 bit is 2,147,483,647.
  code_number = int.from_bytes(truncated, byteorder = 'big')
  # Return the least significant 31 bits. This here essentially removes bit 32
  # to avoid ambiguity with signed/unsigned integers, because the most
  # significant bit is usually 0 for a positive, and 1 for a negative number,
  # but there can be issues with that.)
  return code_number & 0x7FFFFFFF

Here we see bitwise operations for the first time. Specifically, what happens here is to use so-called “bit masks”. First we have an AND with 0x0F. 0F in hexadecimal notation looks as follows in binary: 0000 1111. A bitwise AND operation sets each bit in the result to 1 if both numbers have a 1 there, and 0 otherwise. Why does this always result in a number between zero and fifteen? Let’s take a look at a few examples:

  0001 0000 # 16
& 0000 1111 # 0F (or 15)
-----------
  0000 0000 # 0

  0100 0101 # 69
& 0000 1111 # 0F (or 15)
-----------
  0000 0101 # 5

Based on this offset, 4 bytes are extracted. Since the hash will always contain 20 bytes, this means that this offset will always ensure that any sequence of four bytes in the hash will be extracted. If the offset is 0, the first four bytes will be extracted, if the offset is 15, the last four bytes will be extracted. Remember that indexing usually starts at 0, not at 1, so index 19 (15+4) refers to the last byte.

Then, these four bytes will be converted into an integer (here we again can see that the ordering of the bytes is important). The last bitwise operation does the same as the first operation — masking —, except that it throws away not an entire sequence of 4 bits, but only the final one. That “most significant” bit is typically used to store whether the number itself is positive or negative.

This final code number will therefore be any number between 0 and 2,147,483,647. As you will notice: That number uses 10 digits — the maximum number of digits allowed by the TOTP protocol. That’s where this number comes from!

Now that we have that number with 10 digits, we only have to do one final step: Reduce that number until it has only as many digits as required, usually 6:

TOKEN_LENGTH = 6 # How many digits should each token have?

def truncated_hash_to_token (code: int, digits: int = TOKEN_LENGTH):
  """Takes a truncated HMAC code number and returns the modulo of that to
  return the requested number of digits"""
  code = code % 10 ** digits # This is basically a rounding up
  # NOTE: Sometimes the resulting code is less than the amount of digits. In
  # that case, we must left-pad with zeros.
  code = str(code)
  if len(code) < digits:
    code = code.rjust(digits, "0")
  return code

The first operation involves a modulo: code = code % 10 ** digits. A modulo means to divide a number by another one, but store not the result, but the remainder of that division. To understand that, let us first look at the divisor: 10 ** digits. Essentially, this number is a power of ten (some programming languages use ^ for that instead of **). So, if digits = 6, it is 10 ** 6 = 1,000,000. If you divide any number by 1 million, the remainder of that division will have at most six digits, because it will be at most 999,999. This is how you end up with a number that has exactly the number of digits that you need.

However, there is a weird if/else construct below that. The reason is that you may have a remainder of less than 100,000. And then you only have 5 digits instead of six, so what TOTP apps commonly do is pad the number with leading zeros.

In Python itself, you can actually do something that looks wrong, but works pretty well: return str(code)[-digits:]. This essentially converts the number into a string and then returns the final digits characters from that. That would then also include those leading zeros. However, that will break if the number is smaller than the amount of digits. Remember: The resulting code number can be any number, including numbers between 0 and 100,000. In that case, the code would break, because turning 12,345 into a string has fewer characters than required by the index.

The code that is returned from this function is then the six-digit code that you have to enter on the website. The server performs the same operation and will get the same code. So if your code is the one that the server also has generated, the authentication works.

Dealing with Time Differences

There is one final caveat to account for, though: Even though modern computers are very precise, the time your smartphone shows will never be exactly the same your computer has. This means that about every 30 seconds, there will be a short window where the server has a different code than your smartphone. In addition, a time difference of up to 5 minutes are usually considered “within acceptable range”. Thus, if your smartphone’s clock differs by more than 30 seconds from the server’s clock, you would never be able to log in.

It is for that reason that any server typically generates not one but multiple tokens, and checks whether the token you provided is in this list. So here’s a final code snippet:

VALID_START = -2 # Allow 2 time steps in the past to be considered valid
VALID_END = 2 # Allow 2 time steps in the future to be considered valid

def generate_totp_tokens (
    key: str,
    timestep_start = VALID_START,
    timestep_end = VALID_END
):
  """Generates a list of valid tokens within the valid window provided."""
  tokens: list[str] = []
  counter_value = generate_counter_value()

  for timestep in range(timestep_start, timestep_end + 1):
    hm = generate_hash(key, counter_value + timestep)
    code = truncate_dynamically(hm)
    valid_token = truncated_hash_to_token(code)
    tokens.append(valid_token)

  return tokens

Here you can see that there is a window of two minutes in which each code is valid. If the app’s clock were to differ for more than a minute from the server, you would not be able to log in, so depending on the scenario, you may want to increase these numbers.

In this example, the application will generate 5 codes; two “old” ones, two “future” ones, and the current one. One can then check the user-provided code against this list, and grant access if it is in this time window.

If you want to try this out, simply copy all the snippets into a Python file and add the following code to the bottom. When you then run the code, it asks you for a token and tells you whether it’s valid.

if __name__ == "__main__":
  secret = "CLAH6OEOV52XVYTKHGKBERP42IUZHY4D"
  client_token = input("Enter TOTP code from device (Ctrl+C to abort): ")
  valid_tokens = generate_totp_tokens(secret)

  if client_token in valid_tokens:
    print("Token is valid!")
  else:
    print("Sorry, but the token is not valid.")

Final Remarks

As you can see: The most common Two-Factor Authentication method TOTP is actually quite simple: You only need some secret key, the current time, and generate a six-digit number based on that. Given how prevalent TOTP 2FA is nowadays, it is quite atrocious that there is no official Apple app for this functionality. With the Google Authenticator, there is a default app for Android devices, but on iOS, users are left alone to find an open source app that does not put crucial security features behind a paywall. Especially not one that is as simple as TOTP.

But this also shows the shortcomings of this approach: Essentially, it means that you need to have two passwords instead of one. While the digits do not allow any attacker to guess the underlying secret key (also because of the dynamic truncation), there is one short time frame in which this password is literally out in the open: When the website displays you this QR code. Anyone who is with you in the room could take a photo and would therefore have this second password. That QR code is akin to selecting the “Show password” option that some websites have implemented.

And indeed, Apple has realized this: Your Apple-account is always secured with TOTP 2FA, but this 2FA does one crucial thing differently: It provides the secret key via the Apple account to your Apple devices. That means that the key is never publicly displayed. The downside, obviously, is that you need to have at least one Apple device for that.

So TOTP is definitely not the final step in the evolution of online security, but a good one. And now you know how it actually works!