I was messing around with a project recently, and I thought everything was going swimmingly. My unit tests were passing; the output to stdout looked like I’d expect. And then I made one small change, and instead of the binary number I was expecting, I saw:

```
1010101001001020
```

“Ones and zeroes everywhere. And I think I saw a two!”; “It was just a dream, Bender. There’s no such thing as a two.” I added a bunch more unit tests but they didn’t seem to be finding the two. It took me quite a while to figure out what the problem was because there were in fact two problems: one with the function code, and one with the tests.

# Background

First, a little background.
I’ve seen a few examples of so-called “binary clocks”
where what they do is give you a binary representation of
the digits in a standard 24-hour clock.
There’s a binary representation of the hours,
a binary representation of the minutes,
and one for the seconds.
That’s no fun.
What I wanted to do was create a binary clock that just gives you a
binary number that represents how far through the day you are.
So it starts at `000000` at midnight
and ticks through until hitting `111111` just before midnight the next day.
`100000` is midday.
It is currently `110101`, which is about 9pm in old money.
So that’s what I wanted to do:
take a number of seconds elapsed since midnight,
and turn it into a binary number that represents how far through the day you are.

Now, six digits only gives you one tick every 22 and a half minutes, so that’s no good. There are 86400 seconds in a day (give or take leap seconds) and $2^{16} = 65536$ which is near enough. Better than one tick every two seconds. Sixteen bits is probably a good enough resolution for a clock, I think.
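As a quick sanity check on those numbers (this throwaway arithmetic is mine, not part of the clock code):

```python
# Seconds per tick for a couple of candidate precisions.
SECONDS_IN_DAY = 86400  # 24 * 60 * 60

for bits in (6, 16):
    seconds_per_tick = SECONDS_IN_DAY / 2 ** bits
    print(f"{bits} bits: one tick every {seconds_per_tick:g} seconds")
```

Six bits gives the 1350-second (22.5-minute) tick; sixteen bits gets you down to about 1.3 seconds.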

# The code

It’s a small enough project that I’m just going to replicate most of it here. Before continuing, see if you can find both problems.

```
import pytest
from hypothesis import given
from hypothesis.strategies import integers, floats

SECONDS_IN_DAY = 86400  # 24*60*60


# The actual function I'm testing
def binclock(seconds, precision=6):
    """Represent elapsed time as binary number."""
    digits = []
    rem = seconds
    divisor = SECONDS_IN_DAY
    for i in range(1, precision + 1):
        divisor >>= 1
        digits.append(rem // divisor)
        rem %= divisor
    return "".join([str(int(x)) for x in digits])


# Fixtures and helper functions for tests
seconds_ints = integers(min_value=0, max_value=86399)


def check_binary(x, precision=6):
    return all([y in "01" for y in binclock(x)])


# Tests
class TestBinClock:
    def test_first_division(self):
        assert binclock(0, precision=1) == "0"
        assert binclock(43201, precision=1) == "1"

    def test_second_division(self):
        assert binclock(0, precision=2) == "00"
        assert binclock(43201, precision=2) == "10"
        assert binclock(21601, precision=2) == "01"

    @given(seconds_ints)
    def test_binary_six(self, x):
        assert check_binary(x)

    @given(seconds_ints)
    def test_binary_seven(self, x):
        assert check_binary(x, 7)

    @given(seconds_ints)
    def test_binary_eight(self, x):
        assert check_binary(x, 8)

    @given(seconds_ints)
    def test_binary_sixteen(self, x):
        assert check_binary(x, 16)

    def test_nearby_range(self):
        seconds = 57066
        for i in range(300):
            assert check_binary(seconds + i)

    @given(floats(min_value=0, max_value=86399.999))
    def test_fractions(self, x):
        assert check_binary(x)

    def test_specific_fraction(self):
        assert check_binary(57468.627816)
```

# What’s going on?

Now, all these tests pass.
And you can see me getting more desperate as I add tests trying to make it fail.
But I *saw* a two!
There shouldn’t be a two.

I decided to try using the `hypothesis` module to test a bunch of values.
This didn’t seem to help.

Do some debugging… I saw a two at about 57066 seconds! Nope. Oh, maybe it’s something to do with floats? Nope. OK, it’s this precise value that gives me trouble in the output. Nope, the test passes.

What’s going on here?
Maybe it’s obvious, but I wasn’t seeing the wood for the trees.
The more obvious problem is that I added a parameter to the `check_binary`
function, but I didn’t actually do anything with it.
So when I thought I was testing higher precision values,
I was not: I was always testing the default `precision=6`.
Once I modified `check_binary` to actually use its argument,
all my tests failed, except for the tests with `precision=6` and `precision=7`.
So it all works with the default precision,
but turn it up at all and it falls over.

The unlucky thing for me is that I had accidentally picked a default value for this
precision parameter
that masks the flaw in my way of calculating the binary fraction.
I’m going to go through how I got to my implementation of `binclock`
to explain how I ended up with this faulty version,
before explaining exactly what’s going wrong.

# What is binclock doing?

I want to explain how I ended up with the version of
the algorithm in `binclock`: why I thought it should work.
You might recognise this algorithm as kind of similar to the algorithm
you use to turn an integer into its binary representation.
In that case, you do something like this:

```
def make_binary1(num):
    digits = []
    while num > 0:
        digits.append(str(num % 2))
        num //= 2
    return "".join(digits[::-1])
```

Keep dividing your number by two, keeping track of all the remainders.
Flip that string around and that’s your binary representation.
So for the number 6 you do the following:
divide it by 2 and get 3 remainder 0.
So after one loop, the `digits` list is holding `[0]` and `num` is now 3.
(`digits` is actually a list of strings, because `join`
isn’t smart enough to cast what gets passed into it,
but let’s ignore that for now.)
Divide 3 by 2 – we’re doing integer division – so that’s 1 remainder 1.
So `digits` is now `[0, 1]` and `num` is 1.
Repeat again and we add another 1 to `digits`, and `num` is now 0,
so the while loop ends.
Reverse `[0, 1, 1]` and turn it into a string to get `110`,
which is 6 in binary.

Now, here’s another algorithm that gives the same output, but is doing something slightly different internally.

```
def make_binary2(num):
    divisor = 2
    while divisor <= num:
        divisor <<= 1
    digits = []
    while divisor > 1:
        divisor >>= 1
        digits.append(str(num // divisor))
        num %= divisor
    return "".join(digits)
```

The first part of this function finds a power of two strictly bigger than the number we’re looking at.
Then, instead of generating the digits of the binary representation
in reverse order (smallest to biggest),
we generate them in the order they appear in the final output.
Note here that instead of using `divisor *= 2`, I use `divisor <<= 1`.
This is a useful trick if you’re dealing with powers of two,
but if you wanted to make your function able to output, say, ternary representations of numbers as well,
then you’d be better off sticking with multiplication and division, instead of bit shifts.

This function gives the same outputs as `make_binary1` for positive integer inputs.
(They give different outputs for `0`; see if you can figure out how they differ.)
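If you want to convince yourself of the first claim, a quick property check does it (the comparison below is mine; both functions are reproduced from above):

```python
def make_binary1(num):
    # Reproduced from above: remainders, smallest digit first, then reversed.
    digits = []
    while num > 0:
        digits.append(str(num % 2))
        num //= 2
    return "".join(digits[::-1])


def make_binary2(num):
    # Reproduced from above: digits generated biggest-first via a shrinking divisor.
    divisor = 2
    while divisor <= num:
        divisor <<= 1
    digits = []
    while divisor > 1:
        divisor >>= 1
        digits.append(str(num // divisor))
        num %= divisor
    return "".join(digits)


# The two versions agree on every positive integer we try...
assert all(make_binary1(n) == make_binary2(n) for n in range(1, 1000))
# ...and the only disagreement is at 0 (left as the exercise above).
print(make_binary2(6))  # "110"
```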

Why would you want to do conversion to binary in this way rather than the first way? Well, one reason is that this second approach can be more easily modified to generate binary representations of floats, rather than just integers. Here’s a first attempt at doing that.

```
# THIS CODE DOESN'T WORK
def make_binary3(num, precision=6):
    divisor = 2
    power = 1
    while divisor <= num:
        divisor <<= 1
        power += 1
    digits = []
    for i in range(power + precision):
        if i == power:
            digits.append(".")
        divisor >>= 1
        digits.append(str(int(num // divisor)))
        num %= divisor
    return "".join(digits)
```

Obviously we need to specify the precision we want to generate binary representations of floats with:
how else would we prevent looping forever?
But there is something wrong with the code above.
You might be able to guess what it is from what I said about the
`binclock` code above.

The problem is that at some point, `divisor >>= 1` yields a zero,
and thus `num // divisor` causes a divide-by-zero error.
Let’s try working through `make_binary3(1.5)`.
The first part of the function doesn’t do anything, because `divisor`
is already larger than `num`.
Now, for the second loop.
`i` is 0, but `power` is 1, so that first `if` gets skipped.
We right shift `divisor` by 1 so it’s now 1.
Then we append `1.5 // 1.0` to `digits` (that is, 1).
We then assign `1.5 % 1.0` – which is `0.5` – to `num`.
(Yeah, “integer” division and modulo work fine with floats;
a little bit surprising, but once you think about what they’re doing, it kind of makes sense.)
The next time through the loop, `i == power`,
so we add a “`.`” to `digits`
(this is the decimal point, or, I guess, the binary point).
Then we right shift `divisor` by 1.
Now, `1 >> 1 == 0`,
so we get a divide-by-zero error when we come to do `num // divisor`,
which at this stage is `0.5 // 0`.
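And indeed, running the broken version confirms the walkthrough (the `try`/`except` wrapper here is just mine, for demonstration):

```python
def make_binary3(num, precision=6):
    # The broken version from above: right-shifting the divisor
    # eventually rounds it down to zero.
    divisor = 2
    power = 1
    while divisor <= num:
        divisor <<= 1
        power += 1
    digits = []
    for i in range(power + precision):
        if i == power:
            digits.append(".")
        divisor >>= 1
        digits.append(str(int(num // divisor)))
        num %= divisor
    return "".join(digits)


try:
    make_binary3(1.5)
except ZeroDivisionError as e:
    print("ZeroDivisionError:", e)
```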

It was this incorrect implementation of a function to create binary representations of floats
that I used as the basis for `binclock`.
I’ll explain how `binclock` is different from this once I’ve
laid out how the correct version of `make_binary` works.

The correct thing to do is to replace `divisor >>= 1` with `divisor /= 2`,
so that instead of rounding to zero, the divisor keeps getting smaller.
Let’s work through just that second part of the calculation again with an amended loop.

```
digits = []
for i in range(power + precision):
    if i == power:
        digits.append(".")
    divisor /= 2
    digits.append(str(int(num // divisor)))
    num %= divisor
return "".join(digits)
```

The first go through the loop works just the same.
At the end of the first iteration,
`digits` is `[1]`, `divisor` is 1 and `num` is `0.5`.
Now, on the second iteration,
we add our binary point as before,
and then we *divide* `divisor` by 2, giving `0.5`.
`0.5 // 0.5 == 1`, so we append `1` to `digits`.
We then assign `0.5 % 0.5` – which is 0 – to `num`.
All the remaining iterations have `num` equal to 0,
so although we keep halving `divisor`,
we keep adding `0` to `digits`.
Then we gather up our digits to produce `1.100000`.
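Reassembling the whole function with that one change (I’ll call it `make_binary4`; the name is mine), we get something like:

```python
def make_binary4(num, precision=6):
    # Same as make_binary3, but the divisor is halved with true division,
    # so it keeps shrinking past 1 instead of rounding down to zero.
    divisor = 2
    power = 1
    while divisor <= num:
        divisor <<= 1
        power += 1
    digits = []
    for i in range(power + precision):
        if i == power:
            digits.append(".")
        divisor /= 2
        digits.append(str(int(num // divisor)))
        num %= divisor
    return "".join(digits)


print(make_binary4(1.5))  # "1.100000", matching the walkthrough
```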

# Fixing binclock

The algorithm in `binclock` is pretty much the same as our `make_binary3` algorithm,
except we already know the maximum number that will be passed in:
it’s the number of seconds in a day, 86400.
We’re actually calculating the fraction that `input / 86400` represents,
but we’ve dropped the leading binary point.
So rather than divide the input by 86400, we just multiply the initial divisor by 86400.
But other than that, it’s the same calculation.
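You can see the “fraction of the day” framing directly with a different, textbook method (this doubling trick is mine, not from the project): repeatedly double the fraction `seconds / 86400` and peel off the integer part as the next binary digit.

```python
seconds = 43200  # midday
frac = seconds / 86400

# Each doubling shifts the binary point one place right;
# the integer part that falls out is the next digit.
bits = []
for _ in range(6):
    frac *= 2
    bit = int(frac)
    bits.append(str(bit))
    frac -= bit

print("".join(bits))  # "100000" -- the string binclock is aiming for
```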

Now we can see that `binclock` has basically the same problem that `make_binary3` has:
using right shift causes some rounding down that we don’t want.
The problem boils down to the fact that using the right shift operator `>>=`
leads to rounding errors once you start shifting out digits that are set.
So the algorithm in `binclock` works so long as you’re bit shifting out a zero
(so long as you’re dividing a number that’s divisible by 2).
But at some point, that stops being the case,
you get some rounding, and then things start coming off the rails,
and you start seeing twos.

Here’s an example.
Let’s say that our divisor is 14,
and we’re trying to see what proportion of 14 is 6.
That is, imagine there are 14 seconds in a day, and we’re at second 6:
what time is it in binary?
Step through the main loop of `binclock`.
`14` is `1110`, which means we can safely right shift without rounding down.
So `14 >> 1` is `7`.
We append `6 // 7` to `digits` (that’s a `0`).
Then we assign `6 % 7` to `rem` (that’s a `6`).
So far so good.
But now the trouble starts on the second time through the loop.
`7` is `111` and so, right shifting, we chop off that rightmost `1`
to leave ourselves with `11`.
Thus, `7 >> 1` is `3`.
Then we do `6 // 3`, which gives us our `2`.
The rest of the calculation proceeds as you’d expect.
What we should have done, instead, is *divide* 7 by 2, yielding 3.5;
then `6 // 3.5` is `1`, while `6 % 3.5` is `2.5`.
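That rogue step is easy to reproduce in isolation (this side-by-side is mine):

```python
# Right-shifting 7 drops a set bit, and the next digit overflows to 2.
print(7 >> 1)         # 3
print(6 // (7 >> 1))  # 2 -- there's the rogue digit

# True division keeps the half, and the digit stays binary.
print(7 / 2)          # 3.5
print(6 // (7 / 2))   # 1.0
print(6 % (7 / 2))    # 2.5
```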

At what point do you start seeing twos in `binclock`?
Well, `(86400 >> 7) << 7 == 86400`, but `(86400 >> 8) << 8 != 86400`.
Or, to put it another way, `f"{86400:b}"[-8:] == "10000000"`.
What I’m saying is that you can right shift 86400 seven times
and you won’t lose any information, but that eighth shift,
that eighth digit of precision, starts causing problems.
So the fact that I’d picked the value of 6 for the default precision,
coupled with the fact that I’d forgotten to make my test helper function actually use
the parameter value,
meant that a bunch of tests passed when they shouldn’t have.
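To pin all of that down in code, here’s a sketch comparing the shift-based version from the top of the post with the division-based fix described above (`binclock_buggy` and `binclock_fixed` are my names):

```python
SECONDS_IN_DAY = 86400


def binclock_buggy(seconds, precision=6):
    # The shift-based version from the top of the post.
    digits = []
    rem = seconds
    divisor = SECONDS_IN_DAY
    for _ in range(precision):
        divisor >>= 1
        digits.append(rem // divisor)
        rem %= divisor
    return "".join(str(int(x)) for x in digits)


def binclock_fixed(seconds, precision=6):
    # Same loop, but halving with true division so nothing is rounded away.
    digits = []
    rem = seconds
    divisor = float(SECONDS_IN_DAY)
    for _ in range(precision):
        divisor /= 2
        digits.append(rem // divisor)
        rem %= divisor
    return "".join(str(int(x)) for x in digits)


# Precision 7 is safe for the buggy version, but the eighth shift bites:
assert not any("2" in binclock_buggy(s, 7) for s in range(SECONDS_IN_DAY))
first = next(s for s in range(SECONDS_IN_DAY) if "2" in binclock_buggy(s, 8))
print(first, binclock_buggy(first, 8))  # 674 00000002

# The fixed version only ever emits ones and zeroes:
assert all(c in "01" for c in binclock_fixed(57066, 16))
print(binclock_fixed(43200))  # midday: 100000
```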

In hindsight, this was an obvious error to make, but, then, in hindsight, aren’t they all?