By CodeTwice

2011-02-16 00:39:28 8 Comments

There is a question about MySQL's COUNT() aggregate function that keeps popping into my head time to time. I would like to get some explanation to why it is working the way it is.

When I started working with MySQL I quickly learned that its COUNT(condition) seems only to work properly if condition also contains an OR NULL in the end. In case of more complicated COUNT conditions it was an empirical process to find out where to put it exactly. In MSSQL you do not need this OR NULL to get proper results, so I would like to know the explanation for it. So, here is an example.

Lets have a very basic table with the following structure and data:

  `value` int(11) NOT NULL

INSERT INTO test (value) VALUES(1);
INSERT INTO test (value) VALUES(4);
INSERT INTO test (value) VALUES(5);
INSERT INTO test (value) VALUES(6);
INSERT INTO test (value) VALUES(4);
INSERT INTO test (value) VALUES(4);
INSERT INTO test (value) VALUES(5);
INSERT INTO test (value) VALUES(2);
INSERT INTO test (value) VALUES(8);
INSERT INTO test (value) VALUES(1);

Scenario: I would like to count how many rows I have where the value = 4. An obvious solution would be to filter for it using a WHERE and do a COUNT(*) but I am interested in a COUNT(condition) based solution.

So, the solution that comes to my mind is:

SELECT COUNT(value=4) 
  FROM test

The result is 10. This is obviously wrong.

Second attempt with OR NULL:

  FROM test

The result is 3. It is correct.

Can someone explain the logic behind this? Is this some bug in MySQL or is there a logical explanation why I need to add that strange-looking OR NULL to the end of the COUNT condition to get the correct result?


@Mark Byers 2011-02-16 00:42:05

COUNT(expression) counts the number of rows for which the expression is not NULL. The expression value=4 is only NULL if value is NULL, otherwise it is either TRUE (1) or FALSE (0), both of which are counted.

1 = 4         | FALSE
4 = 4         | TRUE
1 = 4 OR NULL | NULL
4 = 4 OR NULL | TRUE

You could use SUM instead:

SELECT SUM(value=4) FROM test

This is not particularly useful in your specific example but it can be useful if you want to count rows satisfying multiple different predicates using a single table scan such as in the following query:

    SUM(a>b) AS foo,
    SUM(b>c) AS bar,
    COUNT(*) AS total_rows
FROM test

@Andriy M 2012-01-07 21:43:05

Um, SUM(a>b) can still be replaced with COUNT(a>b OR NULL), can it not? So I do not really see how SUM() better suits here than in the OP's example.

@Mark Byers 2012-01-08 10:55:46

Well for one thing, sum is less to type. As I said, it doesn't give much in the specific example the op gave - a where clause can be used instead. But it can be useful in other situations.

@zerkms 2011-02-16 00:42:23

COUNT() function accepts an argument, that is treated as NULL or NOT NULL. If it is NOT NULL - then it increments the value, and doesn't do anything otherwise.

In your case expression value=4 is either TRUE or FALSE, obviously both true and false are not null, that is why you get 10.

but I am interested in a COUNT(condition) based solution.

The count-based solution will be always slower (much slower), because it will cause table fullscan and iterative comparison of each value.

@RichardTheKiwi 2011-02-16 01:06:52

You answered half of the question.. incomplete was the word I was after

@RichardTheKiwi 2011-02-16 00:47:48

This should reveal all

SELECT 4=4, 3=4, 1 or null, 0 or null


1   |   0   |   1   |   NULL


  1. COUNT adds up the columns / expressions that evaluate to NOT NULL. Anything will increment by 1, as long as it is not null. Exception is COUNT(DISTINCT) where it increments only if it is not already counted.

  2. When a BOOLEAN expression is used on its own, it returns either 1 or 0.

  3. When a boolean is OR-ed with NULL, it is NULL only when it is 0 (false)

To others

Yes if the count is the ONLY column desired, one could use WHERE value=4 but if it is a query that wants to count the 4's as well as retrieving other counts/aggregates, then the filter doesn't work. An alternative would have been SUM(value=4), e.g.

SELECT sum(value=4)
  FROM test

@Ken Bloom 2011-02-16 00:53:14

+1 for spelling out the truth table of boolean operations with NULL

@paxdiablo 2011-02-16 01:07:42

And +1 for this one for the explanation as to why it behaves as it does, and the use cases where just using where is inadequate.

@CodeTwice 2011-02-16 01:42:00

Great answer, it explains it all. I would be interesting to know why they decided to implement it this way, e.g. count based on IS NULL / IS NOT NULL instead of the much more common true / false approach.

@RichardTheKiwi 2011-02-16 01:45:41

@CodeTwice you may have it about face. COUNT ( boolean ) is actually less common. SQL Server doesn't even let you use boolean expressions alone (bit != boolean). More people count columns (existence of value == [not] null) rather than evaluate a condition.

@Milan Babuškov 2011-02-16 00:47:20

It's because COUNT(expression) counts VALUES. In SQL theory, NULL is a STATE, not a VALUE and thus is it not counted. NULL is a state that means that field's value is unknown.

Now, when you write "value=4" this evaluates to boolean TRUE or FALSE. Since both TRUE and FALSE are VALUES, the result is 10.

When you add "OR NULL", you actually have "TRUE OR NULL" and "FALSE OR NULL". Now, "TRUE OR NULL" evaluates to TRUE, while "FALSE OR NULL" evaluates to NULL. Thus the result is 3, because you only have 3 values (and seven NULL states).

@Karl 2011-02-16 00:45:51

I would suggest that you will find the more standard syntax moves better between different database engines and will always give the correct result.

 select count(*)
 from test
 where value = 4

Is the syntax you used a Mysql variant?

Related Questions

Sponsored Content

6 Answered Questions

[SOLVED] MySQL load NULL values from CSV data

11 Answered Questions

[SOLVED] Can I concatenate multiple MySQL rows into one field?

2 Answered Questions

[SOLVED] why mysql count(*) has better performance than count(id)

  • 2016-02-26 13:30:38
  • zhuguowei
  • 1906 View
  • 7 Score
  • 2 Answer
  • Tags:   mysql

6 Answered Questions

[SOLVED] Using IS NULL or IS NOT NULL on join conditions - Theory question

4 Answered Questions

[SOLVED] Row average in MySQL - how to exclude NULL rows from the count

  • 2013-07-04 12:46:39
  • mtmacdonald
  • 1995 View
  • 4 Score
  • 4 Answer
  • Tags:   mysql sql

2 Answered Questions

The relation of mysql insert count to disk io write count

1 Answered Questions

[SOLVED] Counting different tables in MyISAM MySQL tables

  • 2010-11-04 16:11:40
  • Mac Taylor
  • 122 View
  • 0 Score
  • 1 Answer
  • Tags:   mysql

6 Answered Questions

[SOLVED] MySQL Query Issue - Count Query Uses lot of CPU

  • 2010-02-03 07:27:15
  • Mihir
  • 964 View
  • 2 Score
  • 6 Answer
  • Tags:   mysql count cpu

5 Answered Questions

[SOLVED] MySQL COUNT(DISTINCT()) unexpected results

  • 2008-10-28 17:59:20
  • ejunker
  • 3458 View
  • 3 Score
  • 5 Answer
  • Tags:   sql mysql

Sponsored Content