By RQuestions


2009-12-17 17:21:36 8 Comments

I have a vector of numbers:

numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,
         453,435,324,34,456,56,567,65,34,435)

How can I have R count the number of times a value x appears in the vector?

14 comments

@GWD 2018-12-17 15:52:21

This can be done with outer to get a matrix of equalities, followed by rowSums, which sums each row to give the count for each element.
In order to have the counts and numbers in the same dataset, a data.frame is first created. This step is not needed if you want separate input and output.

df <- data.frame(No = numbers)
df$count <- rowSums(outer(df$No, df$No, FUN = `==`))
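A possible alternative, offered only as a sketch and not part of the answer above: outer builds an n-by-n matrix, so for long vectors a grouped approach such as ave may use less memory. The names count_outer and count_ave are illustrative.

```r
numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,
             453,435,324,34,456,56,567,65,34,435)

# Per-element counts via the outer/rowSums approach from the answer
count_outer <- rowSums(outer(numbers, numbers, FUN = `==`))

# ave() applies length() within each group of equal values,
# returning one count per element without the n-by-n matrix
count_ave <- ave(numbers, numbers, FUN = length)

df <- data.frame(No = numbers, count = count_ave)
head(df)
```

Both vectors should agree element by element; all(count_outer == count_ave) is a quick way to check.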

@Therii 2018-11-16 16:56:04

There are several ways to count a specific element:

library(plyr)
numbers =c(4,23,4,23,5,43,54,56,657,67,67,435,453,435,7,65,34,435)

print(length(which(numbers==435)))

#Sum counts number of TRUE's in a vector 
print(sum(numbers==435))
print(sum(c(TRUE, FALSE, TRUE)))

#count comes from the plyr package
#its output is a data frame; freq is one of its columns
print(count(numbers[numbers==435]))
print(count(numbers[numbers==435])[['freq']])

@ishandutta2007 2017-06-07 13:14:06

numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,453,435,324,34,456,56,567,65,34,435)

> length(grep(435, numbers))
[1] 3


> length(which(435 == numbers))
[1] 3


> require(plyr)
> df = count(numbers)
> df[df$x == 435, ] 
     x freq
11 435    3


> sum(435 == numbers)
[1] 3


> sum(grepl(435, numbers))
[1] 3


> tabulate(numbers)[435]
[1] 3


> table(numbers)['435']
435 
  3 


> length(subset(numbers, numbers=='435')) 
[1] 3

@geotheory 2013-06-06 14:49:12

There is also count(numbers) from the plyr package. In my opinion it is much more convenient than table.

@uttkarsh dharmadhikari 2016-02-18 09:34:54

You can change the number to whatever you wish in the following line:

length(which(numbers == 4))

@Berny 2015-05-15 12:35:40

If you want a running count, i.e. how many times each value has appeared so far, you can use the sapply function:

index<-sapply(1:length(numbers),function(x)sum(numbers[1:x]==numbers[x]))
cbind(numbers, index)

Output:

        numbers index
 [1,]       4     1
 [2,]      23     1
 [3,]       4     2
 [4,]      23     2
 [5,]       5     1
 [6,]      43     1
 [7,]      54     1
 [8,]      56     1
 [9,]     657     1
[10,]      67     1
[11,]      67     2
[12,]     435     1
[13,]     453     1
[14,]     435     2
[15,]     324     1
[16,]      34     1
[17,]     456     1
[18,]      56     2
[19,]     567     1
[20,]      65     1
[21,]      34     2
[22,]     435     3
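As a hedged aside, the same running index can probably be obtained with ave and seq_along, which avoids re-scanning the start of the vector for every element. This is a sketch of an alternative, not the method used in the answer; index_sapply and index_ave are illustrative names.

```r
numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,
             453,435,324,34,456,56,567,65,34,435)

# Running count as in the answer: compare each element with everything up to it
index_sapply <- sapply(seq_along(numbers),
                       function(i) sum(numbers[1:i] == numbers[i]))

# ave() numbers the elements within each group of equal values
index_ave <- ave(numbers, numbers, FUN = seq_along)

cbind(numbers, index_sapply, index_ave)
```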

@Garini 2018-05-30 13:24:28

Is this by any means faster than table??

@pomber 2014-12-26 17:06:41

Using table but without comparing with names:

numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435)
x <- 67
numbertable <- table(numbers)
numbertable[as.character(x)]
#67 
# 2 

table is useful when you need the counts of different elements, or need them several times. If you need only one count, use sum(numbers == x).
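To illustrate reusing one table for several lookups, here is a minimal sketch. The helper count_of is hypothetical (it is not a base R function), and returning 0 for values that never occur is a design choice made for this example.

```r
numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,
             453,435,324,34,456,56,567,65,34,435)

# Build the table once
counts <- table(numbers)

# Hypothetical helper: look up one or more values by name,
# returning 0 for values that do not occur in the vector
count_of <- function(x) {
  idx <- match(as.character(x), names(counts))
  ifelse(is.na(idx), 0L, as.integer(counts)[idx])
}

count_of(c(435, 67, 999))
```

For a single lookup, sum(numbers == 435) gives the same answer without building the table.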

@Akash 2014-12-26 07:11:31

One more way I find convenient is:

numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,453,435,324,34,456,56,567,65,34,435)
(s<-summary (as.factor(numbers)))

This converts the vector to a factor, and then summary() gives us the control totals (counts of the unique values).

Output is:

4   5  23  34  43  54  56  65  67 324 435 453 456 567 657 
2   1   2   2   1   1   2   1   2   1   3   1   1   1   1 

This can be stored as a data frame if preferred.

as.data.frame(cbind(Number = names(s), Freq = s), stringsAsFactors = FALSE, row.names = 1:length(s))

Here row.names is used to rename the rows. Without it, the names in s are used as the row names of the new data frame.

Output is:

     Number Freq
1       4    2
2       5    1
3      23    2
4      34    2
5      43    1
6      54    1
7      56    2
8      65    1
9      67    2
10    324    1
11    435    3
12    453    1
13    456    1
14    567    1
15    657    1

@JBecker 2012-12-13 21:43:28

My preferred solution uses rle, which will return a value (the label, x in your example) and a length, which represents how many times that value appeared in sequence.

By combining rle with sort, you have an extremely fast way to count the number of times any value appeared. This can be helpful with more complex problems.

Example:

> numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435,453,435,324,34,456,56,567,65,34,435)
> a <- rle(sort(numbers))
> a
  Run Length Encoding
    lengths: int [1:15] 2 1 2 2 1 1 2 1 2 1 ...
    values : num [1:15] 4 5 23 34 43 54 56 65 67 324 ...

If the value you want doesn't show up, or you need to store that value for later, convert a to a data.frame.

> b <- data.frame(number=a$values, n=a$lengths)
> b
    number n
 1       4 2
 2       5 1
 3      23 2
 4      34 2
 5      43 1
 6      54 1
 7      56 2
 8      65 1
 9      67 2
 10    324 1
 11    435 3
 12    453 1
 13    456 1
 14    567 1
 15    657 1

I find it is rare that I want to know the frequency of one value and not all of the values, and rle seems to be the quickest way to get the counts and store them all.

@Heather Stark 2013-01-31 13:54:48

Is the advantage of this, vs table, that it gives a result in a more readily usable format? thanks

@JBecker 2013-04-22 20:42:11

@HeatherStark I would say there are two advantages. The first is definitely that it is a more readily used format than the table output. The second is that sometimes I want to count the number of elements "in a row" rather than within the whole dataset. For example, c(rep('A', 3), rep('G', 4), 'A', rep('G', 2), rep('C', 10)) would return values = c('A','G','A','G','C') and lengths=c(3, 4, 1, 2, 10) which is sometimes useful.
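The run-counting behaviour described in this comment can be sketched as follows. The variable names s, runs and totals are illustrative.

```r
s <- c(rep('A', 3), rep('G', 4), 'A', rep('G', 2), rep('C', 10))

# Without sorting, rle() reports consecutive runs
runs <- rle(s)
runs$values   # the label of each run, in order
runs$lengths  # how long each run is

# Sorting first merges the runs, so each distinct value gets one total
totals <- rle(sort(s))
data.frame(value = totals$values, n = totals$lengths)
```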

@ClementWalter 2016-06-21 16:54:09

Using microbenchmark, it appears that table is faster when the vector is long (I tried 100000 elements) but slightly slower when it is short (I tried 1000).

@skan 2016-12-13 19:46:17

This is going to be really slow if you have a lot of numbers.

@Sergej Andrejev 2012-04-19 13:13:15

There is a standard function in R for that

tabulate(numbers)

@omar 2016-06-01 15:55:10

The disadvantage of tabulate is that it cannot deal with zero and negative numbers.

@Dodgie 2017-01-31 00:26:43

But you can deal with zero instances of a given number, which the other solutions do not handle

@pglpm 2019-07-05 08:36:34

Fantastically fast! And as Dodgie says, it gives zero count for non-appearing values, extremely useful when we want to build a frequency distribution. Zero or negative integers can be handled by adding a constant before using tabulate. Note: sort seems to be necessary for its correct use in general: tabulate(sort(numbers)).
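A minimal sketch of the "add a constant" idea from this comment, assuming integer input. The sample vector vals and the names offset and counts are made up for illustration.

```r
# Example data containing zero and negative integers
vals <- c(-2L, 0L, 0L, 3L, -2L, 3L, 3L)

# Shift so that the smallest value maps to bin 1
offset <- 1L - min(vals)
counts <- tabulate(vals + offset, nbins = max(vals) + offset)

# Label the bins with the original values; absent values keep a count of 0
names(counts) <- seq(min(vals), max(vals))
counts
```

Here counts[["0"]] gives the number of zeros, and values in the range that never occur, such as -1, show up with a count of 0.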

@hadley 2009-12-17 18:09:42

The most direct way is sum(numbers == x).

numbers == x creates a logical vector which is TRUE at every location where x occurs, and when summing, the logical vector is coerced to numeric, which converts TRUE to 1 and FALSE to 0.

However, note that for floating point numbers it's better to use something like: sum(abs(numbers - x) < 1e-6).
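A small illustration of why the tolerance matters. The vector vals is made up for this sketch, and the 1e-6 threshold is simply the one suggested above; a suitable tolerance depends on the data.

```r
# 0.1 + 0.2 is not stored as exactly 0.3 in double precision
vals <- c(0.1 + 0.2, 0.3, 0.3, 1.5)
x <- 0.3

# Exact comparison misses the first element
exact <- sum(vals == x)

# Comparison within a tolerance counts it
tolerant <- sum(abs(vals - x) < 1e-6)

c(exact = exact, tolerant = tolerant)
```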

@JD Long 2009-12-17 18:13:56

good point about the floating point issue. That bites my butt more than I generally like to admit.

@JBecker 2013-04-22 20:46:07

@Jason while it does answer the question directly, my guess is that folks liked the more general solution that provides the answer for all x in the data rather than a specific known value of x. To be fair, that was what the original question was about. As I said in my answer below, "I find it is rare that I want to know the frequency of one value and not all of the values..."

@Jesse 2009-12-17 17:55:16

I would probably do something like this

length(which(numbers==x))

But really, a better way is

table(numbers)

@Ken Williams 2009-12-18 19:41:20

table(numbers) is going to do a lot more work than the easiest solution, sum(numbers==x), because it's going to figure out the counts of all the other numbers in the list too.

@skan 2015-12-02 12:16:16

The problem with table is that it's more difficult to use inside more complex calculations, for example with apply() on data frames.

@Shane 2009-12-17 17:25:59

You can just use table():

> a <- table(numbers)
> a
numbers
  4   5  23  34  43  54  56  65  67 324 435 453 456 567 657 
  2   1   2   2   1   1   2   1   2   1   3   1   1   1   1 

Then you can subset it:

> a[names(a)==435]
435 
  3

Or convert it into a data.frame if you're more comfortable working with that:

> as.data.frame(table(numbers))
   numbers Freq
1        4    2
2        5    1
3       23    2
4       34    2
...

@hadley 2009-12-17 18:10:25

Don't forget about potential floating point issues, especially with table, which coerces numbers to strings.

@Shane 2009-12-17 18:18:17

That's a great point. These are all integers, so it isn't a real issue in this example, right?

@Ian Fellows 2009-12-18 02:11:37

Not exactly. The elements of the table are of class integer (see class(table(numbers)[1])), but 435 is a floating point number. To make it an integer you can use 435L.

@Heather Stark 2013-01-31 13:52:05

@Ian - I am confused about why 435 is a float in this example. Can you clarify a bit? thanks.

@baudtack 2013-11-05 05:31:43

@HeatherStark This is because all numbers, unless integers are explicitly requested, are floats by default.
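The distinction discussed in these comments can be checked directly. This is only a sketch; the variable names are illustrative.

```r
numbers <- c(4,23,4,23,5,43,54,56,657,67,67,435)

# A bare numeric literal is a double; the L suffix requests an integer
type_plain  <- typeof(435)    # "double"
type_suffix <- typeof(435L)   # "integer"

# The counts stored in a table are integers
type_counts <- typeof(table(numbers))

# == compares by value, identical() also compares the type
same_value <- 435 == 435L
same_object <- identical(435, 435L)
```

For whole numbers such as these, == still compares equal across the two types; the difference matters mainly for identical() and for non-integer values.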

@pomber 2014-12-26 17:08:17

Why not a["435"] instead of a[names(a)==435]?

@skan 2016-12-13 17:00:51

@pomber if you also had the count for NAs a["NA"] wouldn't work.
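A hedged sketch of one way to get at the NA count when the table includes it. It assumes the table was built with useNA = "ifany", which is not shown in the answer; the vector v is made up for illustration.

```r
v <- c(1, NA, 2, NA, 1)

# Ask table() to keep a slot for missing values
a <- table(v, useNA = "ifany")

# The NA slot's name is a missing value rather than the string "NA",
# so select it by testing the names with is.na()
na_count <- as.integer(a[is.na(names(a))])

# Counting the missing values directly gives the same number
na_direct <- sum(is.na(v))
```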

@user1113953 2017-07-10 11:21:54

User @hadley named it: sum(numbers == x). Much more precise, and quicker to understand.

@Garini 2018-05-30 13:25:10

Is the table option faster than a simple sapply as in one of the following answers?

@JD Long 2009-12-17 17:27:54

Here's one quick and dirty way:

x <- 23
length(subset(numbers, numbers==x))
