By Btibert3


2010-11-19 16:40:52 8 Comments

I have a nested list of data. Its length is 132 and each item is a list of length 20. Is there a quick way to convert this structure into a data frame that has 132 rows and 20 columns of data?

Here is some sample data to work with:

l <- replicate(
  132,
  list(sample(letters, 20)),
  simplify = FALSE
)

19 comments

@trevi 2019-04-23 10:12:10

For a parallel (multicore, multisession, etc.) solution in the purrr family, use:

library(furrr)
plan(multisession) # see below for how to find the most efficient plan()
myTibble <- future_map_dfc(l, ~.x)

Where l is the list.

To benchmark the most efficient plan() you can use:

library(tictoc)
plan(sequential) # reference time
# plan(multisession) # the plan() to benchmark goes here. See ?plan().
tic()
myTibble <- future_map_dfc(l, ~.x)
toc()

@Ahmad 2019-04-11 14:29:06

The following simple command worked for me:

myDf <- as.data.frame(myList)

Reference (Quora answer)

> myList <- list(a = c(1, 2, 3), b = c(4, 5, 6))
> myList
$a
[1] 1 2 3

$b
[1] 4 5 6

> myDf <- as.data.frame(myList)
> myDf
  a b
1 1 4
2 2 5
3 3 6
> class(myDf)
[1] "data.frame"

But this will fail if it’s not obvious how to convert the list to a data frame:

> myList <- list(a = c(1, 2, 3), b = c(4, 5, 6, 7))
> myDf <- as.data.frame(myList)
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 3, 4
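
If padding the shorter vectors with NA is an acceptable outcome (an assumption, since the desired result for ragged input is not defined here), a minimal base R sketch could look like this. The name `padded` is only illustrative:

```r
# Ragged input from the failing example above
myList <- list(a = c(1, 2, 3), b = c(4, 5, 6, 7))

# Length of the longest element
n <- max(lengths(myList))

# Extending a vector with `length<-` fills the new slots with NA
padded <- lapply(myList, function(v) {
  length(v) <- n
  v
})

myDf <- as.data.frame(padded)
myDf
```

Whether NA padding is the right choice depends on what the missing positions mean in your data.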

@Gregor 2019-04-11 15:20:20

Note that on the input from the question this only sort of works. The OP asks for 132 rows and 20 columns, but this gives 20 rows and 132 columns.

@Gregor 2019-04-11 15:21:22

For your example with different-length input where it fails, it's not clear what the desired result would be...

@Ahmad 2019-04-11 17:51:50

@Gregor True, but the question title is "R - list to data frame". Many visitors of the question and those who voted it up don't have the exact problem of OP. Based on the question title, they just look for a way to convert list to data frame. I myself had the same problem and the solution I posted solved my problem

@Gregor 2019-04-11 18:39:36

Yup, just noting. Not downvoting. It might be nice to note in the answer that it does something similar--but distinctly different than--pretty much all the other answers.

@nico 2010-11-19 16:46:09

Assuming your list of lists is called l:

df <- data.frame(matrix(unlist(l), nrow=length(l), byrow=TRUE))

The above will convert all character columns to factors in R versions before 4.0.0, where stringsAsFactors defaulted to TRUE. To avoid this, add a parameter to the data.frame() call:

df <- data.frame(matrix(unlist(l), nrow=132, byrow=TRUE), stringsAsFactors=FALSE)
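
As a quick sanity check, here is the same approach run on the sample data from the question. The `set.seed()` call is an addition so the run is reproducible:

```r
set.seed(1)  # added for reproducibility; not part of the original sample

# Sample data from the question: 132 items, each a list holding 20 letters
l <- replicate(
  132,
  list(sample(letters, 20)),
  simplify = FALSE
)

# Flatten everything, then refill row by row so each list item becomes a row
df <- data.frame(
  matrix(unlist(l), nrow = length(l), byrow = TRUE),
  stringsAsFactors = FALSE
)

dim(df)  # rows, then columns
```

Using `byrow = TRUE` matters here: without it the matrix is filled column by column and the letters from one list item end up scattered across rows.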

@nico 2010-11-19 18:27:30

@Joshua Ulrich: Ooops! I don't know why, but I thought he was asking for a matrix :)

@Btibert3 2010-11-19 21:30:19

unlist did the trick. After that, I could manipulate/change what I needed. Thx!

@Ian Sudbery 2013-03-15 10:15:18

Careful here if your data is not all of the same type. Passing through a matrix means that all data will be coerced into a common type. I.e. if you have one column of character data and one column of numeric data the numeric data will be coerced to string by matrix() and then both to factor by data.frame().
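
A small sketch of that coercion, using a made-up two-row list (`mixed` is a hypothetical name, not data from the question):

```r
# Each inner list holds one character value and one numeric value
mixed <- list(list("a", 1), list("b", 2))

# unlist() has to pick one common type, so the numbers become strings
m <- matrix(unlist(mixed), nrow = length(mixed), byrow = TRUE)

typeof(m)  # the whole matrix is character
m[, 2]     # the former numbers are now strings
```

If you need the numeric column back, it has to be converted explicitly afterwards, for example with as.numeric().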

@Dave 2013-11-25 18:29:21

What is the best way to do this where the list has missing values, or to include NA in the data frame?

@nico 2013-11-25 19:53:15

@Dave: I don't think I quite follow you... this works even if there are NAs around...

@Dave 2013-11-27 17:58:51

@nico for me, if there was a ragged data frame, it wouldn't incorporate NA

@nico 2013-11-27 18:47:56

@Dave: Works for me... see here r-fiddle.org/#/fiddle?id=y8DW7lqL&version=3

@Dave 2013-11-27 20:01:09

@nico must be because mine isn't a matrix, it's a data frame

@nico 2013-11-28 07:11:21

@Dave: note that I am outputting a data frame, not a matrix. I pass through a matrix just to split the data onto columns. Anyway, maybe it is better to post a new question with more details and a reproducible example. :)

@Alex Brown 2014-05-16 18:12:37

Also take care if you have character data type - data.frame will convert it to factors.

@SigmaX 2015-06-07 17:32:58

@mropa's answer is far superior, since it's simpler and preserves data types.

@MySchizoBuddy 2015-07-25 19:16:53

@mropa's answer doesn't work with the sample in the question. It works with its own sample data. Of all the answers, only this one gets the correct result.

@nico 2015-08-16 16:44:34

@VickyZhang well, I'm not too well versed in Python to be honest (I didn't know it had data frame at all for instance...). In any case you can probably list a series of other things that are longer in Python than in R, but that means pretty much nothing. They are different languages built for different purposes. And, above all, it's a one-liner, it does not seem that complicated to me.

@N.Varela 2015-12-18 21:47:26

@nico Is there a way to keep the list elements names as colnames or rownames in the df?

@ha_pu 2019-03-01 17:42:27

To make it more dynamic you might write nrow = length(l) instead of nrow = 132.

@nico 2019-03-02 12:18:49

@ha_pu good point - will edit

@Jinhua Wang 2019-04-24 16:27:58

Amazing answer. Simplistic and useful!

@Will C 2018-12-21 23:46:58

A short (but perhaps not the fastest) way to do this is to use base R, since a data frame is just a list of equal-length vectors. Thus the conversion between your input list and a 20 x 132 data.frame would be: df <- data.frame(l). From there we can transpose it to a 132 x 20 matrix and convert it back to a data frame:

new_df <- data.frame(t(df))

As a one-liner: new_df <- data.frame(t(data.frame(l)))

The rownames will be pretty annoying to look at, but you could always rename those with

rownames(new_df) <- 1:nrow(new_df)

@Will C 2019-01-15 19:19:24

Why was this downvoted? I'd like to know so I don't continue to spread misinformation.

@Arthur Yip 2019-03-07 00:05:24

I've definitely done this before, using a combination of data.frame and t! I guess the people who downvoted feel there are better ways, particularly those that don't mess up the names.

@Will C 2019-03-12 21:05:45

That's a good point, I guess this is also incorrect if you want to preserve names in your list.

@sbha 2018-07-11 01:50:32

Depending on the structure of your lists there are some tidyverse options that work nicely with unequal length lists:

l <- list(a = list(var.1 = 1, var.2 = 2, var.3 = 3)
        , b = list(var.1 = 4, var.2 = 5)
        , c = list(var.1 = 7, var.3 = 9)
        , d = list(var.1 = 10, var.2 = 11, var.3 = NA))

df <- dplyr::bind_rows(l)
df <- purrr::map_df(l, dplyr::bind_rows)
df <- purrr::map_df(l, ~.x)

# all create the same data frame:
# A tibble: 4 x 3
  var.1 var.2 var.3
  <dbl> <dbl> <dbl>
1     1     2     3
2     4     5    NA
3     7    NA     9
4    10    11    NA

You can also mix vectors and data frames:

library(dplyr)
bind_rows(
  list(a = 1, b = 2),
  tibble(a = 3:4, b = 5:6),  # data_frame() is deprecated in favor of tibble()
  c(a = 7)
)

# A tibble: 4 x 2
      a     b
  <dbl> <dbl>
1     1     2
2     3     5
3     4     6
4     7    NA

@GGAnderson 2019-03-20 02:15:46

This dplyr::bind_rows function works well, even with hard to work with lists originating as JSON. From JSON to a surprisingly clean dataframe. Nice.

@SavedByJESUS 2018-05-30 02:00:12

This method uses a tidyverse package (purrr).

The list:

x <- as.list(mtcars)

Converting it into a data frame (a tibble more specifically):

library(purrr)
map_df(x, ~.x)

@Matt Dancho 2017-04-09 11:10:14

The tibble package has a function enframe() that solves this problem by coercing nested list objects to nested tibble ("tidy" data frame) objects. Here's a brief example from R for Data Science:

x <- list(
    a = 1:5,
    b = 3:4, 
    c = 5:6
) 

df <- enframe(x)
df
#> # A tibble: 3 × 2
#>    name     value
#>   <chr>    <list>
#>    1     a <int [5]>
#>    2     b <int [2]>
#>    3     c <int [2]>

Since you have several levels of nesting in your list, l, you can use unlist(recursive = FALSE) to remove the unnecessary nesting and get a single-level list, then pass it to enframe(). I use tidyr::unnest() to unnest the output into a single-level "tidy" data frame with two columns (one for the group name and one for the observations within each group). If you want a wide layout instead, add a column with add_column() that repeats the position index 1:20 for each of the 132 groups, then spread() the values.


library(tidyverse)

l <- replicate(
    132,
    list(sample(letters, 20)),
    simplify = FALSE
)

l_tib <- l %>% 
    unlist(recursive = FALSE) %>% 
    enframe() %>% 
    unnest()
l_tib
#> # A tibble: 2,640 x 2
#>     name value
#>    <int> <chr>
#> 1      1     d
#> 2      1     z
#> 3      1     l
#> 4      1     b
#> 5      1     i
#> 6      1     j
#> 7      1     g
#> 8      1     w
#> 9      1     r
#> 10     1     p
#> # ... with 2,630 more rows

l_tib_spread <- l_tib %>%
    add_column(index = rep(1:20, 132)) %>%
    spread(key = index, value = value)
l_tib_spread
#> # A tibble: 132 x 21
#>     name   `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`  `11`
#> *  <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1      1     d     z     l     b     i     j     g     w     r     p     y
#> 2      2     w     s     h     r     i     k     d     u     a     f     j
#> 3      3     r     v     q     s     m     u     j     p     f     a     i
#> 4      4     o     y     x     n     p     i     f     m     h     l     t
#> 5      5     p     w     v     d     k     a     l     r     j     q     n
#> 6      6     i     k     w     o     c     n     m     b     v     e     q
#> 7      7     c     d     m     i     u     o     e     z     v     g     p
#> 8      8     f     s     e     o     p     n     k     x     c     z     h
#> 9      9     d     g     o     h     x     i     c     y     t     f     j
#> 10    10     y     r     f     k     d     o     b     u     i     x     s
#> # ... with 122 more rows, and 9 more variables: `12` <chr>, `13` <chr>,
#> #   `14` <chr>, `15` <chr>, `16` <chr>, `17` <chr>, `18` <chr>,
#> #   `19` <chr>, `20` <chr>

@Frank 2017-04-09 19:37:03

Quoting the OP: "Is there a quick way to convert this structure into a data frame that has 132 rows and 20 columns of data?" So maybe you need a spread step or something.

@Matt Dancho 2017-04-10 20:03:28

Ah yes, there just needs to be an index column that can be spread. I will update shortly.

@ecerulm 2016-11-06 12:37:07

For the general case of deeply nested lists with 3 or more levels like the ones obtained from a nested JSON:

{
"2015": {
  "spain": {"population": 43, "GNP": 9},
  "sweden": {"population": 7, "GNP": 6}},
"2016": {
  "spain": {"population": 45, "GNP": 10},
  "sweden": {"population": 9, "GNP": 8}}
}

consider the approach of melt() to convert the nested list to a tall format first:

myjson <- jsonlite::fromJSON(file("test.json"))
tall <- reshape2::melt(myjson)[, c("L1", "L2", "L3", "value")]
    L1     L2         L3 value
1 2015  spain population    43
2 2015  spain        GNP     9
3 2015 sweden population     7
4 2015 sweden        GNP     6
5 2016  spain population    45
6 2016  spain        GNP    10
7 2016 sweden population     9
8 2016 sweden        GNP     8

followed by dcast() to go wide again, giving a tidy dataset where each variable forms a column and each observation forms a row:

wide <- reshape2::dcast(tall, L1+L2~L3) 
# left side of the formula defines the rows/observations and the 
# right side defines the variables/measurements
    L1     L2 GNP population
1 2015  spain   9         43
2 2015 sweden   6          7
3 2016  spain  10         45
4 2016 sweden   8          9

@user36302 2016-10-24 21:46:57

Sometimes your data may be a list of lists of vectors of the same length.

lolov = list(list(c(1,2,3),c(4,5,6)), list(c(7,8,9),c(10,11,12),c(13,14,15)) )

(The inner vectors could also be lists, but I'm simplifying to make this easier to read).

Then you can make the following modification. Remember that you can unlist one level at a time:

lov = unlist(lolov, recursive = FALSE )
> lov
[[1]]
[1] 1 2 3

[[2]]
[1] 4 5 6

[[3]]
[1] 7 8 9

[[4]]
[1] 10 11 12

[[5]]
[1] 13 14 15

Now use your favorite method mentioned in the other answers:

library(plyr)
>ldply(lov)
  V1 V2 V3
1  1  2  3
2  4  5  6
3  7  8  9
4 10 11 12
5 13 14 15
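
If you would rather not load plyr for this step, a base R sketch that should give the same shape (under the assumption that all inner vectors have the same length) is:

```r
lolov <- list(
  list(c(1, 2, 3), c(4, 5, 6)),
  list(c(7, 8, 9), c(10, 11, 12), c(13, 14, 15))
)

# Remove only the outer level of nesting
lov <- unlist(lolov, recursive = FALSE)

# Stack the vectors as rows of a matrix, then convert to a data frame
df <- as.data.frame(do.call(rbind, lov))
df
```

The columns get the default names V1, V2, V3 because the vectors are unnamed.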

@zhan2383 2016-04-20 17:48:48

l <- replicate(10,list(sample(letters, 20)))
a <-lapply(l[1:10],data.frame)
do.call("cbind", a)

@Amit Kohli 2015-12-11 11:15:24

This is what finally worked for me:

do.call("rbind", lapply(S1, as.data.frame))

@laubbas 2015-04-28 10:31:40

Extending @Marek's answer: if you want to avoid strings being turned into factors and efficiency is not a concern, try

do.call(rbind, lapply(your_list, data.frame, stringsAsFactors=FALSE))
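
A small worked example of that call, assuming each inner list is named and holds one value per field (the sample data is made up for illustration):

```r
# Hypothetical input: one inner list per record
your_list <- list(
  list(id = 1, name = "a"),
  list(id = 2, name = "b")
)

# Turn each record into a one-row data frame, then stack the rows
df <- do.call(rbind, lapply(your_list, data.frame, stringsAsFactors = FALSE))

str(df)  # id stays numeric, name stays character
```

Because every record becomes its own data frame first, column types are preserved per field, at the cost of speed on long lists.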

@Pankaj Kaundal 2018-11-13 06:59:57

excellent, worked for me too :)

@jdeng 2014-03-24 14:49:15

assume your list is called L,

data.frame(Reduce(rbind, L))

@jxramos 2014-10-23 19:47:23

Nice one! There is one difference with @Alex Brown's solution compared to yours, going your route yielded the following warning message for some reason: `Warning message: In data.row.names(row.names, rowsi, i) : some row.names duplicated: 3,4 --> row.names NOT used'

@Anastasia Pupynina 2015-10-09 10:36:32

Very good!! Worked for me here: stackoverflow.com/questions/32996321/…

@The Red Pea 2015-10-26 20:17:13

Works well unless the list has only one element in it: data.frame(Reduce(rbind, list(c('col1','col2')))) produces a data frame with 2 rows, 1 column (I expected 1 row 2 columns)
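
The single-element case can be reproduced as follows. With only one element, Reduce() returns that element unchanged and never calls rbind, whereas do.call(rbind, ...) still calls rbind once. The names `reduced` and `bound` are only illustrative:

```r
one_row <- list(c("col1", "col2"))

# Reduce() with a single element returns the bare vector
reduced <- data.frame(Reduce(rbind, one_row))
dim(reduced)  # 2 rows, 1 column

# do.call() still calls rbind(), producing a 1 x 2 matrix first
bound <- data.frame(do.call(rbind, one_row))
dim(bound)    # 1 row, 2 columns
```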

@Jack Ryan 2013-05-16 16:55:05

Reshape2 yields the same output as the plyr example above:

library(reshape2)
l <- list(a = list(var.1 = 1, var.2 = 2, var.3 = 3)
          , b = list(var.1 = 4, var.2 = 5, var.3 = 6)
          , c = list(var.1 = 7, var.2 = 8, var.3 = 9)
          , d = list(var.1 = 10, var.2 = 11, var.3 = 12)
)
l <- melt(l)
dcast(l, L1 ~ L2)

yields:

  L1 var.1 var.2 var.3
1  a     1     2     3
2  b     4     5     6
3  c     7     8     9
4  d    10    11    12

If you were almost out of pixels you could do this all in 1 line w/ recast().

@Marek 2010-11-19 17:04:47

With rbind

do.call(rbind.data.frame, your_list)

Edit: The previous version returned a data.frame of lists instead of vectors (as @IanSudbery pointed out in the comments).

@eykanal 2011-12-21 17:03:41

Why does this work but rbind(your_list) returns a 1x32 list matrix?

@Marek 2011-12-21 22:30:13

@eykanal do.call passes the elements of your_list as arguments to rbind. It's equivalent to rbind(your_list[[1]], your_list[[2]], your_list[[3]], ....., your_list[[length of your_list]]).
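
To make the difference concrete, here is a short sketch with a two-element list of plain vectors (sample values invented for illustration):

```r
your_list <- list(c(1, 2, 3), c(4, 5, 6))

# do.call() spreads the list elements out as separate arguments
via_do_call <- do.call(rbind, your_list)

# ...which is the same as writing the call out by hand
by_hand <- rbind(your_list[[1]], your_list[[2]])
identical(via_do_call, by_hand)

# Passing the list itself gives rbind() a single argument,
# so the result is a one-row list matrix
single_arg <- rbind(your_list)
dim(single_arg)
```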

@Frank Wang 2012-05-09 09:38:35

This method suffers from the null situation.

@Marek 2012-05-09 20:42:39

@FrankWANG But this method is not designed for the NULL situation. It requires that your_list contain equally sized vectors. NULL has length 0, so it should fail.

@Ian Sudbery 2013-03-15 10:18:24

This method seems to return the correct object, but on inspecting the object, you'll find that the columns are lists rather than vectors, which can lead to problems down the line if you are not expecting it.

@Marek 2013-03-19 21:29:58

@IanSudbery You're right. I'll edit my answer. Don't know why I was thinking that he got list of vectors, not list of lists. Nice catch.

@MySchizoBuddy 2015-07-25 19:14:15

doesn't work with the sample data provided in the question

@Marek 2015-07-27 20:46:38

@MySchizoBuddy Example added recently is not consistent with original description in question.

@John Haberstroh 2017-07-22 20:41:00

Sort of minor gripe, but this returns a list when it should be returning a data frame. Fortunately, you can call as.data.frame() on the return of do.call() and it will format correctly as a data.frame object.

@Marek 2017-07-22 21:09:54

@JohnH. It should return data.frame. Could you provide example?

@Simone 2017-12-31 09:55:32

Neat solution Marek especially because it takes column and row names (which the unlist solution doesn't). The output I got was a numeric DF - so no involuntary factor conversion. Thanks!

@Arthur Yip 2019-03-07 00:01:07

bind_rows "is an efficient implementation of the common pattern of do.call(rbind, dfs)"

@mnel 2013-03-25 21:59:44

The package data.table has the function rbindlist which is a superfast implementation of do.call(rbind, list(...)).

It can take a list of lists, data.frames or data.tables as input.

library(data.table)
ll <- list(a = list(var.1 = 1, var.2 = 2, var.3 = 3)
  , b = list(var.1 = 4, var.2 = 5, var.3 = 6)
  , c = list(var.1 = 7, var.2 = 8, var.3 = 9)
  , d = list(var.1 = 10, var.2 = 11, var.3 = 12)
  )

DT <- rbindlist(ll)

This returns a data.table, which inherits from data.frame.

If you really want to convert back to a data.frame use as.data.frame(DT)

@Frank 2016-04-20 20:25:03

Regarding the last line, setDF now allows for returning to data.frame by reference.

@tallharish 2018-06-07 21:02:00

For my list with 30k items, rbindlist worked way faster than ldply

@Ian Sudbery 2013-03-15 10:57:41

More answers, along with timings in the answer to this question: What is the most efficient way to cast a list as a data frame?

The quickest way that doesn't produce a data frame with list columns (rather than vector columns) appears to be (from Martin Morgan's answer):

l <- list(list(col1="a",col2=1),list(col1="b",col2=2))
f = function(x) function(i) unlist(lapply(x, `[[`, i), use.names=FALSE)
as.data.frame(Map(f(l), names(l[[1]])))
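
The closure above can be hard to read at first. A more spelled-out sketch of the same idea, assuming every inner list has the same names as the first one, might be:

```r
l <- list(list(col1 = "a", col2 = 1), list(col1 = "b", col2 = 2))

# Field names are taken from the first record
fields <- names(l[[1]])

# For each field, pull that field out of every record and flatten to a vector
cols <- lapply(fields, function(nm) {
  unlist(lapply(l, `[[`, nm), use.names = FALSE)
})
names(cols) <- fields

# A named list of equal-length vectors converts directly to a data frame
df <- as.data.frame(cols, stringsAsFactors = FALSE)
str(df)
```

Building one vector per column, rather than one data frame per row, is what avoids both the list columns and the type coercion of the matrix route.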

@Alex Brown 2010-11-19 17:07:38

data.frame(t(sapply(mylistlist,c)))

sapply converts it to a matrix. data.frame converts the matrix to a data frame.
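
One thing worth checking with this approach is whether the resulting columns are lists rather than atomic vectors, which can happen when the inner elements are themselves lists. A sketch of the check and one possible fix, using made-up sample data:

```r
# Hypothetical input: inner elements are lists with mixed types
mylistlist <- list(list(id = 1, name = "a"), list(id = 2, name = "b"))

df <- data.frame(t(sapply(mylistlist, c)))

# TRUE for a column means it is a list column, not a plain vector
list_cols <- sapply(df, is.list)
list_cols

# Flatten each column in place; this assumes one value per cell
df[] <- lapply(df, unlist)
str(df)
```

The flattening step only makes sense when every cell holds exactly one value; otherwise unlist() would change the column length.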

@Alex Brown 2010-11-19 17:20:19

updated to take inner lists as rows.

@d_a_c321 2014-01-11 02:42:15

best answer by far! None of the other solutions get the types/column names correct. THANK YOU!

@jxramos 2014-10-23 19:42:13

What role are you intending c to play here, one instance of the list's data? Oh wait, c for the concatenate function right? Getting confused with @mnel's usage of c. I also concur with @dchandler, getting the column names right was a valuable need in my use case. Brilliant solution.

@Alex Brown 2014-10-23 21:35:03

That's right - the standard c function; from ?c : Combine Values into a Vector or List

@MySchizoBuddy 2015-07-25 19:12:33

doesn't work with the sample data provided in the question

@Alex Brown 2015-07-26 14:19:14

Someone (not the originator) changed the question. Should be changed back.

@Carl 2016-05-26 21:40:10

Doesn't this generate a data.frame of lists?

@Alex Brown 2016-05-26 22:07:35

@Carl why are you asking? What result did you get?

@Florent 2017-12-08 18:16:02

It works but df$id returns a list instead of a dataframe.

@mropa 2010-11-19 17:07:46

You can use the plyr package. For example a nested list of the form

l <- list(a = list(var.1 = 1, var.2 = 2, var.3 = 3)
      , b = list(var.1 = 4, var.2 = 5, var.3 = 6)
      , c = list(var.1 = 7, var.2 = 8, var.3 = 9)
      , d = list(var.1 = 10, var.2 = 11, var.3 = 12)
      )

now has a length of 4, and each list in l contains another list of length 3. Now you can run

  library(plyr)
  df <- ldply(l, data.frame)

and should get the same result as in the answers by @Marek and @nico.

@Michael Barton 2012-10-16 18:59:27

Great answer. Could you explain a little how that works? Does it simply return a data frame for each list entry?

@Roah 2013-08-24 14:00:11

IMHO the BEST answer. It returns an honest data.frame. All the data types (character, numeric, etc.) are correctly transformed. If the list has different data types, they will all be transformed to character with the matrix approach.

@MySchizoBuddy 2015-07-25 19:21:08

the sample provided here isn't the one provided by the question. the result of this answer on the original dataset is incorrect.

@bAN 2016-07-31 11:57:30

Works great for me! And the names of the columns in the resulting Data Frame are set! Tx

@Garglesoap 2019-06-15 21:03:25

Is plyr multicore? Or is there a lapply version for use with mclapply?
