# The problem

Python has a Counter object – it’s found in the collections module – and I’ve become convinced that it’s… a bit weird? The basic idea is fine: it’s a dictionary but the keys are things you want to count, and the values are the amount of the thing you have counted. The neat thing is if you initialise a counter with an iterable like a string or a list, it’ll give you the counts for the number of times each element appears. And it gives you convenience methods for finding the most common elements and that sort of thing. It’s pretty neat and I’ve used it a few times.

The main problem with the Counter object is that it has some fairly inconsistent behaviour when it comes to negative numbers.

Here’s a warmup.

>>> a = Counter(z=1)
>>> b = Counter(z=2)
>>> a-b
Counter()


Create two counters, and subtract one away from the other. Because the count would go negative if we subtract 2 from 1, the outcome is the empty counter. And the same thing happens if we do the subtraction inplace:

>>> a = Counter(z=1)
>>> b = Counter(z=2)
>>> a-=b
>>> a
Counter()


OK, so Counter just doesn’t want to permit negative counts, well, maybe that’s fine. I’d want, perhaps a warning that the count went negative, but I guess that’s fair enough. But wait:

>>> a = Counter(z=1)
>>> b = Counter(z=2)
>>> a.subtract(b)
>>> a
Counter({'z': -1})


So Counter is willing to permit negative counts if we use subtract rather than -= (or __isub__). And it won’t make any kind of complaint if we do Counter(z=-1). So it is sort of permitting negative counts, but only sometimes? And it’s not permitting negative counts in any way that really permits them to be used in any sensible way.

>>> a = Counter(z=-1)
>>> b = Counter(z=-2)
>>> a+b
Counter()


So negative counts get dropped whenever we do addition or subtraction, unless we use subtract or update, in which case they don’t. Why permit negative values in Counters at all if they’re going to be dealt with kind of inconsistently? (I mean, Counter(z = 3.5) is also something that python does not complain about, but that’s a different story). Wouldn’t it be better to build a Counter class that handles negative numbers better?

Well, part of the issue there is that it isn’t clear that there’s one obviously right answer to the question “How should a counter respond to negative numbers?” One option would be to raise an error whenever a Counter encounters a negative value. Another would be to raise a warning. Another would be to just let negative values through. Any of these options seems preferable to the current implementation.

# A Solution

For a project, I wanted a counter that would throw an error when it hit a negative number, and so I decided to try subclassing the Counter object. At first I just explored the code for the collections module, but this wasn’t quite enough, since Counter is a subclass of dict and so there might be methods of dict that get used by Counter that I wouldn’t necessarily notice if I’m just skimming the method definitions in Counter. It was part of this exercise that led me to learn about inspect and do the stuff I talked about in a previous post. I’m not sure whether the final project will make use of this solution, since it’s kind of slow and I’m exploring other implementations, but I’ll discuss it here.

One interesting thing is that each way of subtracting numbers (__sub__, __isub__, and subtract) is defined explicitly: none is defined in terms of the others. I presume this is done for efficiency reasons, to remove one layer of indirection from some method calls? But it means that I needed to redefine all of those methods.

So using the MethodInspector I defined in a previous post, I poked about in the definition of Counter and copied out the methods I needed to modify. The modifications are all rather minor (except that I just defined subtract in terms of __isub__ because I’m lazy). For your viewing pleasure, here’s the subclass.

from collections import Counter

class NegativeNumberError(ValueError):
def __init__(self, message="values must be non-negative"):
super().__init__(message)

class SubtractionError(ValueError):
def __init__(self, message="subtraction would drop count below zero"):
super().__init__(message)

class NNCounter(Counter):
def __init__(self, iterable=None, /, **kwds):
"""
Create a new NNCounter object.
"""
if (
iterable is not None
and issubclass(type(iterable), dict)
and any([x < 0 for x in iterable.values()])
):
raise NegativeNumberError

if kwds and any([x < 0 for x in kwds.values()]):
raise NegativeNumberError
super().__init__()
self.update(iterable, **kwds)

def __isub__(self, other):
"""
Inplace subtract counter.

Raises a SubtractionError if any count drops below 0.
"""
for elem, count in other.items():
newcount = self[elem] - count
if newcount < 0:
raise SubtractionError
else:
self[elem] = newcount
return self

def __setitem__(self, k, v):
if v < 0:
raise NegativeNumberError
else:
super().__setitem__(k, v)

def __sub__(self, other):
"""
Subtract one NNCounter from another.

Raises a SubtractionError if any count drops below 0.
"""
if not isinstance(other, NNCounter):
return NotImplemented
result = NNCounter()
for elem, count in self.items():
newcount = count - other[elem]
if newcount > 0:
result[elem] = newcount
elif newcount < 0:
raise SubtractionError
for elem, count in other.items():
if elem not in self and count > 0:
raise SubtractionError

return result

def subtract(self, *args, **kwargs):
other = NNCounter(*args, **kwargs)
self -= other


Looking at this again, I think I would probably prefer to have one new kind of error, and then maybe adjust the message depending on whether it’s a subtraction or an init problem. Or at least maybe have both errors subclass a “negative number error” superclass so you can catch both without catching all other kinds of ValueError. I’m not going ot change it because, as I said, I’m probably not going to use this in the project I built it for.

The other limitation is that I haven’t redefined the addition methods for NNCounter, and so adding two NNCounters together returns a Counter. This is because the definition of Counter explicitly initialises the counters it returns in addition and subtraction operations as Counter() rather than being aware it might be subclassed and doing something like type(self)(). For my use case, this isn’t an issue, but if I were being thorough, I would have to redefine __add__, __iadd__ and __update__ as well, and possibly __and__, __or__ etc which also hardcode the type of their output.

At some point, I hope to get round to actually explaining what the project is that I was using this for, but that will have to wait.