2009-12-17 17:21:36 8 Comments

I have a vector of numbers:

```
numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,
453,435,324,34,456,56,567,65,34,435)
```

How can I have R count the number of times a value *x* appears in the vector?


## 17 comments

## @tmfmnk 2020-06-26 12:15:49

One option could be to use the `vec_count()` function from the `vctrs` library:

The default ordering puts the most frequent values at the top. To sort by key instead (a `table()`-like output):

## @Nik 2020-03-12 16:40:15

This is a very fast solution for one-dimensional atomic vectors. It relies on `match()`, so it is compatible with `NA`:

You could also tweak the algorithm so that it doesn't run `unique()`.

In cases where that output is desirable, you probably don't even need it to re-return the original vector, and the second column is probably all you need. You can get that in one line with the pipe:
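A minimal sketch of a `match()`-based count, for illustration only. The helper name `count_match()` and its output columns are assumptions made for this sketch, not the code from this answer:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

# Hypothetical helper (name and output layout chosen for this sketch).
count_match <- function(x) {
  u <- unique(x)                                  # distinct values, NA included
  n <- tabulate(match(x, u), nbins = length(u))   # occurrences of each distinct value
  data.frame(value = u, n = n)
}

counts <- count_match(c(numbers, NA))
counts[which(counts$value == 435), ]   # row for the value 435
counts[is.na(counts$value), ]          # row for NA, which match() handles
```

Because `match()` pairs `NA` with `NA`, the missing value gets its own row instead of being dropped.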

## @Taz 2020-05-25 14:00:21

Really great solution! That's also the fastest one I could come up with. It can be improved a little for factor input using `u <- if(is.factor(x)) x[!duplicated(x)] else unique(x)`.

## @Pascal Martin 2020-02-21 15:09:30

A method that is relatively fast on long vectors and gives a convenient output is to use `lengths(split(numbers, numbers))` (note the S at the end of `lengths`):

The output is simply a named vector.

The speed appears comparable to `rle` proposed by JBecker and even a bit faster on very long vectors. Here is a microbenchmark in R 3.6.2 with some of the functions proposed:

Importantly, the only function that also counts the number of missing values `NA` is `plyr::count`. These can also be obtained separately using `sum(is.na(vec))`.
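A short sketch of the `lengths(split(...))` idea on the vector from the question, with the `NA` count taken separately. The variable names `counts` and `vec` are chosen here for illustration:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

counts <- lengths(split(numbers, numbers))  # named integer vector, one entry per distinct value
counts[["435"]]                             # look up a single value by its name

# split() drops NA groups, so missing values are counted on their own.
vec <- c(numbers, NA, NA)
sum(is.na(vec))
```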

## @GWD 2018-12-17 15:52:21

This can be done with `outer` to get a matrix of equalities followed by `rowSums`, with an obvious meaning.

In order to have the counts and `numbers` in the same dataset, a data.frame is first created. This step is not needed if you want separate input and output.

## @Therii 2018-11-16 16:56:04

There are different ways of counting a specific element:
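A few base R variants, as a sketch. These are common idioms and not necessarily the ones this answer originally listed; the value `435` is just an example:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)
x <- 435

n_sum    <- sum(numbers == x)                         # TRUE/FALSE summed as 1/0
n_which  <- length(which(numbers == x))               # positions of the matches, then their number
n_subset <- length(numbers[numbers == x])             # keep the matches, then count them
n_table  <- as.integer(table(numbers)[as.character(x)])  # count everything, pick one entry
```

All four agree for a value that is present. The `table()` lookup returns `NA` for a value that never occurs, while the other three return 0.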

## @ishandutta2007 2017-06-07 13:14:06

## @geotheory 2013-06-06 14:49:12

There is also `count(numbers)` from the `plyr` package. Much more convenient than `table` in my opinion.

## @stevec 2020-05-09 03:41:07

Is there a dplyr equivalent of this?

## @uttkarsh dharmadhikari 2016-02-18 09:34:54

You can change the number to whatever you wish in the following line:

## @Berny 2015-05-15 12:35:40

If you want to count the number of appearances subsequently, you can make use of the `sapply` function:

Output:
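A minimal sketch of what an `sapply`-based count might look like, assuming the goal is a count for every element (or every distinct value) of `numbers`. The names `per_element` and `per_value` are made up for this example:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

# For each element, how often does its value occur in the whole vector?
per_element <- sapply(numbers, function(v) sum(numbers == v))

# The same idea, computed once per distinct value and labelled by that value.
u <- unique(numbers)
per_value <- setNames(sapply(u, function(v) sum(numbers == v)), u)
per_value[["435"]]
```

Each call scans the full vector, so the work grows with the number of values times the vector length.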

## @Garini 2018-05-30 13:24:28

Is this by any means faster than table??

## @pomber 2014-12-26 17:06:41

Using table but without comparing with `names`:

`table` is useful when you are using the counts of different elements several times. If you need only one count, use `sum(numbers == x)`.
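As a sketch of that trade-off: build the table once and index it by name for repeated lookups, or use `sum()` for a one-off count. The table is called `a` here, following the name used elsewhere in this thread:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

a <- table(numbers)   # counts for every distinct value, computed once
a["435"]              # look up by name, no comparison against names(a)
a["23"]

one_off <- sum(numbers == 435)   # single count without building the table
```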

## @Akash 2014-12-26 07:11:31

One more way I find convenient is:

This converts the dataset to factor, and then summary() gives us the control totals (counts of the unique values).

Output is:

This can be stored as dataframe if preferred.

Here `row.names` has been used to rename the row names. Without `row.names`, the column names in `s` are used as row names in the new data frame.

Output is:
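A minimal sketch of the two steps described above, assuming the summary is stored in `s` as the answer suggests. The data frame layout (`value`, `count`) is an assumption for this example:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

# summary() of a factor returns a named integer vector of counts per level.
# Note: with more than 100 levels, summary() lumps the rest into "(Other)".
s <- summary(as.factor(numbers))

# Store the counts as a data frame with the values in their own column.
df <- data.frame(value = names(s), count = as.vector(s))
df[df$value == "435", ]
```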

## @JBecker 2012-12-13 21:43:28

My preferred solution uses `rle`, which will return a value (the label, `x` in your example) and a length, which represents how many times that value appeared in sequence.

By combining `rle` with `sort`, you have an extremely fast way to count the number of times any value appeared. This can be helpful with more complex problems.

Example:

If the value you want doesn't show up, or you need to store that value for later, make `a` a `data.frame`.

I find it is rare that I want to know the frequency of one value and not all of the values, and `rle` seems to be the quickest way to get the counts and store them all.
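A sketch of the `rle(sort(...))` approach on the vector from the question. The data frame column names `number` and `n` are chosen here for illustration:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

a <- rle(sort(numbers))        # sorting puts equal values next to each other, one run per value
a$lengths[a$values == 435]     # count for a single value

# As a data frame, a value that never occurs simply yields an empty result.
counts <- data.frame(number = a$values, n = a$lengths)
counts[counts$number == 435, ]
counts$n[counts$number == 999]
```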

## @Heather Stark 2013-01-31 13:54:48

Is the advantage of this, vs table, that it gives a result in a more readily usable format? thanks

## @JBecker 2013-04-22 20:42:11

@HeatherStark I would say there are two advantages. The first is definitely that it is a more readily used format than the table output. The second is that sometimes I want to count the number of elements "in a row" rather than within the whole dataset. For example, `c(rep('A', 3), rep('G', 4), 'A', rep('G', 2), rep('C', 10))` would return `values = c('A','G','A','G','C')` and `lengths=c(3, 4, 1, 2, 10)`, which is sometimes useful.

## @ClementWalter 2016-06-21 16:54:09

Using microbenchmark, it appears that `table` is faster when the vector is long (I tried 100000) but slightly slower when it is shorter (I tried 1000).

## @skan 2016-12-13 19:46:17

This is going to be really slow if you have a lot of numbers.

## @Sergej Andrejev 2012-04-19 13:13:15

There is a standard function in R for that

`tabulate(numbers)`
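A short sketch of how the result of `tabulate()` is laid out, assuming positive integer-valued input as in the question. The name `bins` is chosen for this example:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

bins <- tabulate(numbers)   # bins[i] is the count of the integer i, for i in 1..max(numbers)
length(bins)                # one bin per integer up to the maximum
bins[435]                   # count of 435
bins[1]                     # a value that never occurs gets a zero
```

The result is indexed by position rather than by name, so the count of a value is read directly at that index.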

## @omar 2016-06-01 15:55:10

The disadvantage of `tabulate` is that you cannot deal with zero and negative numbers.

## @Dodgie 2017-01-31 00:26:43

But you can deal with zero instances of a given number, which the other solutions do not handle

## @pglpm 2019-07-05 08:36:34

Fantastically fast! And as omar says, it gives zero count for non-appearing values, extremely useful when we want to build a frequency distribution. Zero or negative integers can be handled by adding a constant before using `tabulate`. Note: `sort` seems to be necessary for its correct use in general: `tabulate(sort(numbers))`.

## @hadley 2009-12-17 18:09:42

The most direct way is `sum(numbers == x)`.

`numbers == x` creates a logical vector which is TRUE at every location that x occurs, and when `sum`ing, the logical vector is coerced to numeric which converts TRUE to 1 and FALSE to 0.

However, note that for floating point numbers it's better to use something like: `sum(abs(numbers - x) < 1e-6)`.

## @JD Long 2009-12-17 18:13:56

good point about the floating point issue. That bites my butt more than I generally like to admit.
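The floating point issue can be seen in a small sketch. The vector `vals` is made up for this example, and the tolerance `1e-6` is the one suggested above:

```r
vals <- c(0.3, 0.1 + 0.2, 0.5)
x <- 0.3

exact    <- sum(vals == x)               # 0.1 + 0.2 is not stored as exactly 0.3
tolerant <- sum(abs(vals - x) < 1e-6)    # compare within a tolerance instead
```

Here the exact comparison misses the computed value, while the tolerance-based count includes it.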

## @JBecker 2013-04-22 20:46:07

@Jason while it does answer the question directly, my guess is that folks liked the more general solution that provides the answer for all `x` in the data rather than a specific known value of `x`. To be fair, that was what the original question was about. As I said in my answer below, "I find it is rare that I want to know the frequency of one value and not all of the values..."

## @Jesse 2009-12-17 17:55:16

I would probably do something like this

But really, a better way is

## @Ken Williams 2009-12-18 19:41:20

`table(numbers)` is going to do a lot more work than the easiest solution, `sum(numbers==x)`, because it's going to figure out the counts of all the other numbers in the list too.

## @skan 2015-12-02 12:16:16

The problem with `table` is that it's more difficult to include it inside more complex calculations, for example using `apply()` on data frames.
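For comparison, a sketch of how the `sum(col == x)` form drops into column-wise or row-wise calculations. The data frame `df` and its columns are invented for this example:

```r
df <- data.frame(a = c(4, 23, 4), b = c(4, 4, 4), c = c(1, 2, 3))
x <- 4

per_column <- sapply(df, function(col) sum(col == x))    # count of x in each column
per_row    <- apply(df, 1, function(row) sum(row == x))  # count of x in each row
```

Each call returns a plain numeric vector, which is easy to feed into further calculations.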

## @Shane 2009-12-17 17:25:59

You can just use `table()`:

Then you can subset it:

Or convert it into a data.frame if you're more comfortable working with that:
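A sketch of the three steps on the vector from the question. The table is stored as `a`, the name used in the comments below; `counts_df` is a name chosen for this example:

```r
numbers <- c(4, 23, 4, 23, 5, 43, 54, 56, 657, 67, 67, 435,
             453, 435, 324, 34, 456, 56, 567, 65, 34, 435)

a <- table(numbers)          # counts for every distinct value
a[names(a) == 435]           # subset to a single value

counts_df <- as.data.frame(a)   # columns: numbers (factor) and Freq
counts_df[counts_df$numbers == "435", ]
```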

## @hadley 2009-12-17 18:10:25

Don't forget about potential floating point issues, especially with table, which coerces numbers to strings.

## @Shane 2009-12-17 18:18:17

That's a great point. These are all integers, so it isn't a real issue in this example, right?

## @Ian Fellows 2009-12-18 02:11:37

Not exactly. The elements of the table are of class integer (`class(table(numbers)[1])`), but 435 is a floating point number. To make it an integer you can use `435L`.
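The distinction can be checked directly, as a quick sketch (the variable names are chosen for this example):

```r
type_plain  <- typeof(435)     # numeric literals are doubles by default
type_suffix <- typeof(435L)    # the L suffix requests an integer

same_value  <- 435 == 435L           # equal as numbers
same_object <- identical(435, 435L)  # but not the same type

# The counts stored inside a table are integers.
count_type <- typeof(table(c(4, 4, 23))[[1]])
```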

## @Heather Stark 2013-01-31 13:52:05

@Ian - I am confused about why 435 is a float in this example. Can you clarify a bit? thanks.

## @baudtack 2013-11-05 05:31:43

@HeatherStark This is because all numbers, unless integers are explicitly requested, are floats by default.

## @pomber 2014-12-26 17:08:17

Why not `a["435"]` instead of `a[names(a)==435]`?

## @skan 2016-12-13 17:00:51

@pomber if you also had the count for NAs, `a["NA"]` wouldn't work.
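A sketch of one way to reach the `NA` count, assuming the table was built with `useNA = "ifany"`. The vector `x` is made up for this example:

```r
x <- c(4, 23, NA, 4, NA)

a <- table(x, useNA = "ifany")   # adds an entry for missing values

# The NA entry's name is itself a missing value, not the string "NA",
# so select it by testing the names with is.na().
na_count <- a[is.na(names(a))]

na_direct <- sum(is.na(x))       # the same count without a table
```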

## @Garini 2018-05-30 13:25:10

Is the table option faster than a simple sapply as in one of the following answers?

## @JD Long 2009-12-17 17:27:54

here's one fast and dirty way: