By Dirk Calloway


2014-12-02 10:27:19 8 Comments

I have a long data set I would like to make wide and I'm curious if there is a way to do this all in one step using the reshape2 or tidyr packages in R.

The data frame df looks like this:

id  type    transactions    amount
20  income       20          100
20  expense      25          95
30  income       50          300
30  expense      45          250

I'd like to get to this:

id  income_transactions expense_transactions    income_amount   expense_amount
20       20                           25                 100             95
30       50                           45                 300             250

I know I can get part of the way there with reshape2 via for example:

dcast(df, id ~  type, value.var="transactions")

But is there a way to reshape the entire df in one shot addressing both the "transactions" and "amount" variables at once? And ideally with new more appropriate column names?

2 comments

@Arun 2015-03-16 01:40:58

With data.table v1.9.6+, we can cast multiple value.var columns simultaneously (and also use multiple aggregation functions in fun.aggregate). Please see ?dcast for more and also the examples section.

require(data.table) # v1.9.6+
dcast(dt, id ~ type, value.var=names(dt)[3:4])
#    id transactions_expense transactions_income amount_expense amount_income
# 1: 20                   25                  20             95           100
# 2: 30                   45                  50            250           300

@A5C1D2H2I1M1N2O1R2T1 2014-12-02 10:31:48

In "reshape2", you can use recast (though in my experience, this isn't a widely known function).

library(reshape2)
recast(mydf, id ~ variable + type, id.var = c("id", "type"))
#   id transactions_expense transactions_income amount_expense amount_income
# 1 20                   25                  20             95           100
# 2 30                   45                  50            250           300

You can also use base R's reshape:

reshape(mydf, direction = "wide", idvar = "id", timevar = "type")
#   id transactions.income amount.income transactions.expense amount.expense
# 1 20                  20           100                   25             95
# 3 30                  50           300                   45            250

Or, you can melt and dcast, like this (here with "data.table"):

library(data.table)
library(reshape2)
dcast.data.table(melt(as.data.table(mydf), id.vars = c("id", "type")), 
                 id ~ variable + type, value.var = "value")
#    id transactions_expense transactions_income amount_expense amount_income
# 1: 20                   25                  20             95           100
# 2: 30                   45                  50            250           300

In later versions of dcast.data.table from "data.table" (1.9.8) you will be able to do this directly. If I understand correctly, what @Arun is trying to implement would be doing the reshaping without first having to melt the data, which is what happens presently with recast, which is essentially a wrapper for a melt + dcast sequence of operations.


And, for thoroughness, here's the tidyr approach:

library(dplyr)
library(tidyr)
mydf %>% 
  gather(var, val, transactions:amount) %>% 
  unite(var2, type, var) %>% 
  spread(var2, val)
#   id expense_amount expense_transactions income_amount income_transactions
# 1 20             95                   25           100                  20
# 2 30            250                   45           300                  50

@David Arenburg 2014-12-02 11:09:04

This is a nice answer (Enlightened + Nice answer badges on the way...), but not sure what we need all this mess with tidyr, dplyr, data.table, reshape etc. when there is such a simple way of doing it with base R

@A5C1D2H2I1M1N2O1R2T1 2014-12-02 11:10:47

@DavidArenburg, speed at scale, really. I'm an old hardcore fan of reshape, but it would not be my first recommendation if someone is working with remotely large data.

@David Arenburg 2014-12-02 11:13:24

Now that I'm looking into reshapes source code it looks like it's all about endless for/apply loops. It is pity it wasn't written in C/C++ as all base R function should be

@hadley 2014-12-02 16:09:31

reshape() is simple? All I have to say to that is "bahahahaha"

@A5C1D2H2I1M1N2O1R2T1 2014-12-02 16:12:05

@hadley, I can't speak for David, but I didn't interpret his comment to mean that reshape() was simple, but rather implied that the reshape() approach here is actually reasonably straightforward.

@hadley 2014-12-02 19:49:44

In that case, the whole point of "mess"ing with tidyr, dplyr, data.table, reshape2 etc, is that they generalise better to new problems, which reshape() does not.

Related Questions

Sponsored Content

10 Answered Questions

[SOLVED] How to select rows from a DataFrame based on column values?

19 Answered Questions

[SOLVED] How to sort a dataframe by multiple column(s)

3 Answered Questions

[SOLVED] Rearrange dataframe using dcast using a dummy

  • 2019-01-28 14:29:45
  • Leprechault
  • 54 View
  • 2 Score
  • 3 Answer
  • Tags:   r reshape reshape2

1 Answered Questions

[SOLVED] Reshape DF from long to wide in R using Reshape2 without an aggregation function

  • 2016-01-25 19:07:15
  • Derek Darves
  • 404 View
  • 1 Score
  • 1 Answer
  • Tags:   r reshape2

1 Answered Questions

[SOLVED] R: reshaping wide to long

2 Answered Questions

[SOLVED] Reshape a data frame to matrix from text file value.var errors

2 Answered Questions

[SOLVED] POSIXct values become numeric in reshape2 dcast

  • 2012-09-05 21:02:24
  • gkaupas
  • 1891 View
  • 6 Score
  • 2 Answer
  • Tags:   r reshape2

2 Answered Questions

[SOLVED] dcast - concatenate column values with column names

  • 2015-07-01 13:15:20
  • Chris
  • 1005 View
  • 1 Score
  • 2 Answer
  • Tags:   r reshape2

2 Answered Questions

[SOLVED] How to "unmelt" data with reshape r

  • 2014-09-19 13:57:55
  • LP_640
  • 17437 View
  • 16 Score
  • 2 Answer
  • Tags:   r reshape reshape2

2 Answered Questions

[SOLVED] reshape of a large data

  • 2013-07-22 20:56:03
  • S Das
  • 726 View
  • 2 Score
  • 2 Answer
  • Tags:   r reshape

Sponsored Content