By user3315563


2015-10-10 07:17:08 8 Comments

My df looks like this:

Id  Task Type    Freq  
3     1    A       2
3     1    B       3
3     2    A       3
3     2    B       0
4     1    A       3
4     1    B       3
4     2    A       1
4     2    B       3

I want to restructure by Id and get:

Id   A    B …  Z    
3    5    3      
4    4    6        

I tried:

df_wide <- dcast(df, Id + Task ~ Type, value.var="Freq")

and got the error:

Aggregation function missing: defaulting to length

I can't figure out what to put in the fun.aggregate. What's the problem?

1 comments

@Jaap 2015-10-10 07:35:10

The reason you are getting this warning is given in the description of fun.aggregate (see ?dcast):

aggregation function needed if variables do not identify a single observation for each output cell. Defaults to length (with a message) if needed but not specified

So, an aggregation function is needed when there is more than one value for one spot in the wide dataframe.

An explanation based on your data:

When you use dcast(df, Id + Task ~ Type, value.var="Freq") you get:

  Id Task A B
1  3    1 2 3
2  3    2 3 0
3  4    1 3 3
4  4    2 1 3

Which is logical, because for each combination of Id, Task and Type there is only one value in Freq. But when you use dcast(df, Id ~ Type, value.var="Freq") you get this (including a warning message):

Aggregation function missing: defaulting to length
  Id A B
1  3 2 2
2  4 2 2

Now, looking back at the top part of your data:

Id  Task Type    Freq  
3     1    A       2
3     1    B       3
3     2    A       3
3     2    B       0

You can see why this is the case. For each combination of Id and Type there are two values in Freq (for Id 3: 2 and 3 for Type A, and 3 and 0 for Type B), while there is room for only one value in each Id/Type cell of the wide dataframe. Therefore dcast has to aggregate these values into one value. The default aggregation function is length, but you can use other aggregation functions like sum, mean, sd or a custom function by specifying them with fun.aggregate.
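To check in advance whether a given formula will need aggregation, you can count how many rows fall into each output cell. The following is a minimal sketch in base R, assuming the data frame from the question is rebuilt by hand; the names cell_counts and needs_aggregation are only illustrative:

```r
# Rebuild the example data from the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Count how many Freq values land in each Id x Type cell
cell_counts <- table(df$Id, df$Type)
print(cell_counts)

# TRUE when at least one cell holds more than one value,
# i.e. when dcast(df, Id ~ Type, ...) would have to aggregate
needs_aggregation <- any(cell_counts > 1)
print(needs_aggregation)
```

Here every cell contains two values, which is exactly what the default length aggregation reported above.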

For example, with fun.aggregate = sum you get:

  Id A B
1  3 5 3
2  4 4 6

Now there is no warning because dcast is being told what to do when there is more than one value: return the sum of the values.
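Put together as a reproducible sketch (assuming the reshape2 package is installed and that df is rebuilt by hand from the question), the call could look like this. The xtabs line is not part of the original answer; it is only a base R cross-check of the same sums:

```r
library(reshape2)

# Rebuild the example data from the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# One row per Id, one column per Type; duplicates are summed
df_wide <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = sum)
print(df_wide)

# Base R cross-check: the same sums as a contingency table
check <- xtabs(Freq ~ Id + Type, data = df)
print(check)
```

Replacing sum with mean or another function that returns a single value changes only how the duplicates within each cell are combined.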

@NelsonGon 2018-12-17 03:47:47

Nice explanation. How do you aggregate this for, say, characters?

@Jaap 2018-12-17 09:19:54

@NelsonGon For characters you can use, for example, the toString function to aggregate them: dcast(df, Id ~ Type, value.var="Freq", fun.aggregate = toString). Alternatively, you can define your own aggregation function, e.g. f.agg <- function(x) paste(x, collapse = "-"), and use that: dcast(df, Id ~ Type, value.var="Freq", fun.aggregate = f.agg)
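As a small sketch of the two calls from this comment (assuming reshape2 is installed and using the numeric Freq column from the question, which both functions convert to text), they could be run like this; the result names wide_str and wide_dash are only illustrative:

```r
library(reshape2)

# Rebuild the example data from the question
df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Collapse the values of each cell into one comma-separated string
wide_str <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = toString)
print(wide_str)

# Custom aggregation function: join the values with a dash instead
f.agg <- function(x) paste(x, collapse = "-")
wide_dash <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = f.agg)
print(wide_dash)
```

Both aggregation functions return a single string per cell, so the A and B columns of the result are character rather than numeric.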
