The other day my co-worker and I stumbled across an interesting find in MySQL— and by ‘stumbled’ I mean spent a good 30 minutes trying to figure out why some tests weren’t passing. Here’s what was happening:

We have an endpoint that receives incoming messages and in our database we have an incoming_messages table with a sender column. However, it is possible that sender might be null if we receive a message from an unknown source.

So, what I wanted to test was a function that counted all of the records that didn’t have a specific value for the sender column. For instance, I wanted to count all the incoming_mesages that didn’t have a sender value of “system”, meaning they’d been generated by our internal systems.

My function under test looked liked this:

public int nonSystemMessageCount() {  
IncomingMessage.count("sender != 'system'")
}

and my test looked like this:

IncomingMessage.create("body", "some message body", "sender", null)  
the(nonSystemMessageCount()).shouldEqual(1);

However, no matter how many times I ran the test nonSystemMessageCount() returned 0, and no matter which way we debugged and how deep we went down our ORM’s call stack- it still returned 0. One of the debugging steps that finally started to narrow it down for us was trying to create a messages set from Message.select("sender != 'system'") — which also returned an empty set.

Fortunately, while I was down in the rabbit hole my co-worker found the MySQL documentation (as well as MariaDB docs- which we use) that revealed that neither count nor select will include any record where the value is null if doing a comparison, equality, or non-equality check.

So, ultimately we considered two options:
1) add the AND IS NULL check to our SQL statement
2) just not have null values in the sender column

Since we control the interface to the DB (and this is a new feature with no production data yet) we decided that we’d rather just not deal with null. We placed both a DB constraint and model validation on IncomingMessage so that sender can never be null. If we don’t know who the sender is then we set it to “unknown”. That’s probably the best design anyway, even if it does feel a little defensive.

Although I definitely understand why null doesn’t actually count for anything in a binary world, sometimes nothing is something! In this case, all of us on the team were pretty surprised to find out that records with a null value for a column didn’t count.