By user592419


2011-11-18 17:07:18 8 Comments

I am trying to obtain counts of each combination of levels of two variables, "week" and "id". I'd like the result to have "id" as rows, and "week" as columns, and the counts as the values.

Example of what I've tried so far (tried a bunch of other things, including adding a dummy variable = 1 and then fun.aggregate = sum over that):

library(plyr)
ddply(data, .(id), dcast, id ~ week, value_var = "id", 
        fun.aggregate = length, fill = 0, .parallel = TRUE)

However, I must be doing something wrong because this function is not finishing. Is there a better way to do this?

Input:

id      week
1       1
1       2
1       3
1       1
2       3

Output:

  1  2  3
1 2  1  1
2 0  0  1

4 comments

@Ronak Shah 2019-02-05 00:49:54

A tidyverse option could be :

library(dplyr)
library(tidyr)

df %>%
  count(id, week) %>%
  pivot_wider(names_from = week, values_from = n, values_fill = list(n = 0))
  #spread(week, n, fill = 0) #In older version of tidyr

#     id   `1`   `2`   `3`
#   <dbl> <dbl> <dbl> <dbl>
#1     1     2     1     1
#2     2     0     0     1

data

df <- structure(list(id = c(1L, 1L, 1L, 1L, 2L), week = c(1L, 2L, 3L, 
1L, 3L)), class = "data.frame", row.names = c(NA, -5L))

@Joshua Ulrich 2011-11-18 17:16:25

You could just use the table command:

table(data$id,data$week)

    1 2 3
  1 2 1 1
  2 0 0 1

If "id" and "week" are the only columns in your data frame, you can simply use:

table(data)
#    week
# id  1 2 3
#   1 2 1 1
#   2 0 0 1

@Andrie 2011-11-18 17:17:55

+1 Blast. You have a knack of making my solutions look totally long-winded, roundabout and pedestrian.

@Patrick Burns 2011-11-18 18:15:45

If you have a lot of data and operations that can't be simplified so much, then the 'data.table' package may help you.

@mnel 2012-09-14 02:42:45

The reason ddply is taking so long is that the splitting by group is not run in parallel (only the computations on the 'splits'), therefore with a large number of groups it will be slow (and .parallel = T) will not help.

An approach using data.table::dcast (data.table version >= 1.9.2) should be extremely efficient in time and memory. In this case, we can rely on default argument values and simply use:

library(data.table) 
dcast(setDT(data), id ~ week)
# Using 'week' as value column. Use 'value.var' to override
# Aggregate function missing, defaulting to 'length'
#    id 1 2 3
# 1:  1 2 1 1
# 2:  2 0 0 1

Or setting the arguments explicitly:

dcast(setDT(data), id ~ week, value.var = "week", fun = length)
#    id 1 2 3
# 1:  1 2 1 1
# 2:  2 0 0 1

For pre-data.table 1.9.2 alternatives, see edits.

@Andrie 2011-11-18 17:14:59

You don't need ddply for this. The dcast from reshape2 is sufficient:

dat <- data.frame(
    id = c(rep(1, 4), 2),
    week = c(1:3, 1, 3)
)

library(reshape2)
dcast(dat, id~week, fun.aggregate=length)

  id 1 2 3
1  1 2 1 1
2  2 0 0 1

Edit : For a base R solution (other than table - as posted by Joshua Uhlrich), try xtabs:

xtabs(~id+week, data=dat)

   week
id  1 2 3
  1 2 1 1
  2 0 0 1

Related Questions

Sponsored Content

3 Answered Questions

[SOLVED] data.table vs dplyr: can one do something well the other can't or does poorly?

  • 2014-01-29 15:21:45
  • BrodieG
  • 115897 View
  • 729 Score
  • 3 Answer
  • Tags:   r data.table dplyr

1 Answered Questions

[SOLVED] Converting to wide format from long with multiple id and value columns

2 Answered Questions

[SOLVED] Converting data from wide to frequency counts of columns

  • 2018-03-01 18:45:41
  • Anne
  • 30 View
  • 0 Score
  • 2 Answer
  • Tags:   r plyr

1 Answered Questions

[SOLVED] R formating long data to wide data... but with linked results

  • 2018-02-25 20:50:29
  • Ruth Johnson
  • 68 View
  • -4 Score
  • 1 Answer
  • Tags:   r reshape2

0 Answered Questions

Long to Wide format: dcast does not conserve values (just aggregates)

  • 2017-07-20 14:45:45
  • Tamara Dominguez Poncelas
  • 43 View
  • 0 Score
  • 0 Answer
  • Tags:   r format reshape2

2 Answered Questions

[SOLVED] Reshape and combine data from long to wide format with R

3 Answered Questions

[SOLVED] trouble getting unbalanced df from wide to long

  • 2016-07-14 03:43:34
  • Eric Green
  • 86 View
  • 1 Score
  • 3 Answer
  • Tags:   r reshape reshape2

2 Answered Questions

[SOLVED] Frequency of all combinations of two

  • 2014-05-27 14:14:35
  • jenswirf
  • 94 View
  • 0 Score
  • 2 Answer
  • Tags:   r

Sponsored Content