By Aaron McDaid


2012-01-13 18:47:49 8 Comments

The language R confuses me. Entities have modes and classes, but even this is insufficient to fully describe the entity.

This answer says

In R every 'object' has a mode and a class.

So I did these experiments:

> class(3)
[1] "numeric"
> mode(3)
[1] "numeric"
> typeof(3)
[1] "double"

Fair enough so far, but then I passed in a vector instead:

> mode(c(1,2))
[1] "numeric"
> class(c(1,2))
[1] "numeric"
> typeof(c(1,2))
[1] "double"

That doesn't make sense. Surely a vector of integers should have a different class, or different mode, than a single integer? My questions are:

  • Does everything in R have (exactly one) class ?
  • Does everything in R have (exactly one) mode ?
  • What, if anything, does 'typeof' tell us?
  • What other information is needed to fully describe an entity? (Where is the 'vectorness' stored, for example?)

Update: Apparently, a literal 3 is just a vector of length 1. There are no scalars. OK, but... I tried mode("string") and got "character", which led me to think that a string was a vector of characters. But if that were true, then c('h','i') == "hi" should be TRUE, and it's not!
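The point in the update can be checked directly. A minimal sketch (the variable name x is made up for illustration): the "character" type holds whole strings, so "hi" is a character vector of length 1, not a vector of two characters.

```r
x <- c("h", "i")

length(x)      # 2: two strings, each one character long
length("hi")   # 1: a single string
nchar("hi")    # 2: characters are counted by nchar(), not length()

# == compares element-wise and recycles "hi", so neither element matches
x == "hi"                          # FALSE FALSE

# Joining the pieces gives the single string back
paste(x, collapse = "") == "hi"    # TRUE

# Going the other way: split one string into one-character strings
strsplit("hi", "")[[1]]            # "h" "i"
```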


@Dominic Comtois 2014-10-18 03:22:53

Adding to one of your sub-questions :

  • What other information is needed to fully describe an entity?

In addition to class, mode, typeof, attributes, str, and so on, is() is also worth noting.

is(1)
[1] "numeric" "vector"

While useful, it is also unsatisfactory. In this example, 1 is more than just that; it is also atomic, finite, and a double. The following function should show all that an object is according to all available is.(...) functions:

what.is <- function(x, show.all=FALSE) {

  # set the warn option to -1 to temporarily ignore warnings
  op <- options("warn")
  options(warn = -1)
  on.exit(options(op))

  list.fun <- grep(methods(is), pattern = "<-", invert = TRUE, value = TRUE)
  result <- data.frame(test=character(), value=character(), 
                       warning=character(), stringsAsFactors = FALSE)

  # loop over all "is.(...)" functions and store the results
  for(fun in list.fun) {
    res <- try(eval(call(fun,x)),silent=TRUE)
    if(inherits(res, "try-error")) {
      next # ignore tests that yield an error
    } else if (length(res)>1) {
      warn <- "*Applies only to the first element of the provided object"
      value <- paste(res,"*",sep="")
    } else {
      warn <- ""
      value <- res
    }
    result[nrow(result)+1,] <- list(fun, value, warn)
  }

  # sort the results
  result <- result[order(result$value,decreasing = TRUE),]
  rownames(result) <- NULL

  if(show.all)
    return(result)
  else
    return(result[which(result$value=="TRUE"),])
}

So now we get a more complete picture:

> what.is(1)
        test value warning
1  is.atomic  TRUE        
2  is.double  TRUE        
3  is.finite  TRUE        
4 is.numeric  TRUE        
5  is.vector  TRUE 

> what.is(CO2)
           test value warning
1 is.data.frame  TRUE        
2       is.list  TRUE        
3     is.object  TRUE        
4  is.recursive  TRUE 

You also get more information with the argument show.all=TRUE. I am not pasting any example here as the results are over 50 lines long.

Finally, this is meant as a complementary source of information, not as a replacement for any of the other functions mentioned earlier.
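For a quick check without scanning every is.* function, a much smaller sketch of the same idea is to run a hand-picked set of predicates. The helper name quick.is and the choice of predicates are assumptions for illustration; they are not part of the function above.

```r
quick.is <- function(x,
                     preds = c("is.atomic", "is.numeric", "is.double",
                               "is.integer", "is.character", "is.list",
                               "is.function")) {
  # call each predicate by name; isTRUE() guards against non-scalar results
  res <- vapply(preds, function(f) isTRUE(do.call(f, list(x))), logical(1))
  names(res)[res]   # keep only the names of the predicates that hold
}

quick.is(1)         # "is.atomic" "is.numeric" "is.double"
quick.is(list(1))   # "is.list"
quick.is(mean)      # "is.function"
```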

EDIT

To include even more "is" functions, as per @Erdogan's comment, you could add this bit to the function:

  # right after 
  # list.fun <- grep(methods(is), pattern = "<-", invert = TRUE, value = TRUE)
  list.fun.2 <- character()

  packs <- c('base', 'utils', 'methods') # include more packages if needed

  for (pkg in packs) {
    library(pkg, character.only = TRUE)
    objects <- grep("^is.+\\w$", ls(envir = as.environment(paste('package', pkg, sep = ':'))),
                    value = TRUE)
    objects <- grep("<-", objects, invert = TRUE, value = TRUE)
    if (length(objects) > 0) 
      list.fun.2 <- append(list.fun.2, objects[sapply(objects, function(x) class(eval(parse(text = x))) == "function")])
  }

  list.fun <- union(list.fun, list.fun.2)

  # ...and continue with the rest
  result <- data.frame(test=character(), value=character(), 
                       warning=character(), stringsAsFactors = FALSE)
  # and so on...

@Dominic Comtois 2015-01-29 02:50:52

I included this function (with a few added features) in my package summarytools. Make sure to use devtools::install_github("dcomtois/summarytools") to get the most up-to-date version. The function also uses the functions mentioned in the other answers (mode, class, typeof, attributes, among others) to summarize as much as possible what an object really is.

@Erdogan CEVHER 2018-07-23 11:37:01

Very nice approach. There are 55 is... functions in my R version. That said, the icing on the cake would be adding the other is... functions. For example, isS4 is not among the 55 found by grep(methods(is), ...) because those 55 functions all follow the is.name pattern, whereas isS4 has no "." in its name; i.e. it is not is.S4. Dominic, can you catch the is... functions that violate the is.name naming?

@Dominic Comtois 2018-07-23 23:40:06

@ErdoganCEVHER Pls see my Edit at the bottom... Including all packages would be much more complicated, but having only a few makes good sense!

@Richie Cotton 2016-10-21 08:12:49

Here's some code to determine what the four type functions, class, mode, typeof, and storage.mode return for each of the kinds of R object.

library(methods)
library(dplyr)
library(xml2)

setClass("dummy", representation(x="numeric", y="numeric"))

types <- list(
  "logical vector" = logical(),
  "integer vector" = integer(),
  "numeric vector" = numeric(),
  "complex vector" = complex(),
  "character vector" = character(),
  "raw vector" = raw(),
  factor = factor(),
  "logical matrix" = matrix(logical()),
  "numeric matrix" = matrix(numeric()),
  "logical array" = array(logical(8), c(2, 2, 2)),
  "numeric array" = array(numeric(8), c(2, 2, 2)),
  list = list(),
  pairlist = .Options,
  "data frame" = data.frame(),
  "closure function" = identity,
  "builtin function" = `+`,
  "special function" = `if`,
  environment = new.env(),
  null = NULL,
  formula = y ~ x,
  expression = expression(),
  call = call("identity"),
  name = as.name("x"),
  "paren in expression" = expression((1))[[1]],
  "brace in expression" = expression({1})[[1]],
  "S3 lm object" = lm(dist ~ speed, cars),
  "S4 dummy object" = new("dummy", x = 1:10, y = rnorm(10)),
  "external pointer" = read_xml("<foo><bar /></foo>")$node
)

type_info <- Map(
  function(x, nm)
  {
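    tibble( # data_frame() is deprecated; dplyr re-exports tibble()
      "spoken type" = nm,
      class = class(x)[1], # first class only, so each object gives one row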
      mode  = mode(x),
      typeof = typeof(x),
      storage.mode = storage.mode(x)
    )
  },
  types,
  names(types)
) %>% bind_rows

knitr::kable(type_info)

Here's the output:

|spoken type         |class       |mode        |typeof      |storage.mode |
|:-------------------|:-----------|:-----------|:-----------|:------------|
|logical vector      |logical     |logical     |logical     |logical      |
|integer vector      |integer     |numeric     |integer     |integer      |
|numeric vector      |numeric     |numeric     |double      |double       |
|complex vector      |complex     |complex     |complex     |complex      |
|character vector    |character   |character   |character   |character    |
|raw vector          |raw         |raw         |raw         |raw          |
|factor              |factor      |numeric     |integer     |integer      |
|logical matrix      |matrix      |logical     |logical     |logical      |
|numeric matrix      |matrix      |numeric     |double      |double       |
|logical array       |array       |logical     |logical     |logical      |
|numeric array       |array       |numeric     |double      |double       |
|list                |list        |list        |list        |list         |
|pairlist            |pairlist    |pairlist    |pairlist    |pairlist     |
|data frame          |data.frame  |list        |list        |list         |
|closure function    |function    |function    |closure     |function     |
|builtin function    |function    |function    |builtin     |function     |
|special function    |function    |function    |special     |function     |
|environment         |environment |environment |environment |environment  |
|null                |NULL        |NULL        |NULL        |NULL         |
|formula             |formula     |call        |language    |language     |
|expression          |expression  |expression  |expression  |expression   |
|call                |call        |call        |language    |language     |
|name                |name        |name        |symbol      |symbol       |
|paren in expression |(           |(           |language    |language     |
|brace in expression |{           |call        |language    |language     |
|S3 lm object        |lm          |list        |list        |list         |
|S4 dummy object     |dummy       |S4          |S4          |S4           |
|external pointer    |externalptr |externalptr |externalptr |externalptr  |

The types of objects available in R are discussed in the R Language Definition manual. There are a few types not mentioned here: you can't test for objects of type "promise", "...", and "ANY", and "bytecode" and "weakref" are only available at the C-level.

The table of available types in the R source is here.

@Aaron McDaid 2016-10-21 08:26:29

The first column of the last table is interesting. Can we say that every object is exactly one of those types? My knowledge of R has improved a lot since I first asked this, but I'm still confused about the fundamentals. For example, you gave list and data.frame as two different items, but now I feel that a data.frame is really just a list with a certain class

@Richie Cotton 2016-10-24 08:45:17

@AaronMcDaid Yes, a data frame is just a list, but with extra checking that each element is the same length, and a row.names attribute. And yes, all objects have exactly one of the 20-something possible values of typeof.
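A short sketch of the "a data frame is just a list" point. The names df, l and df2 are made up for illustration, and building a data frame by hand with structure() is shown only to make the point; normally you would call data.frame().

```r
df <- data.frame(a = c(1, 2), b = c(3, 4))

typeof(df)                    # "list"
is.list(df)                   # TRUE
sort(names(attributes(df)))   # "class" "names" "row.names"

# Stripping the class attribute leaves a plain list
# (it still carries the names and row.names attributes)
class(unclass(df))            # "list"

# Going the other way: a list plus the right attributes
# is accepted as a data frame
l <- list(a = c(1, 2), b = c(3, 4))
df2 <- structure(l, class = "data.frame", row.names = 1:2)
is.data.frame(df2)            # TRUE
dim(df2)                      # 2 2
```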

@Mike Williamson 2017-05-06 20:09:39

Wow... this is like gold here! It was also educational to me how you were able to create particular syntactic structures. E.g., expression({1})[[1]] is how we can recreate a brace in an expression.
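The same kind of object can also be built with quote(), which returns its argument unevaluated. A small sketch (the list name calls and its element names are made up for illustration) showing that every one of these has typeof "language", while class() reports the syntactic construct:

```r
calls <- list(
  paren  = quote((1)),
  brace  = quote({1}),
  `if`   = quote(if (a) b),
  assign = quote(x <- 1),
  plain  = quote(f(x))
)

sapply(calls, typeof)   # all "language"
sapply(calls, class)    # "(" "{" "if" "<-" "call"
```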

@coder.in.me 2018-05-11 09:18:22

Why did Hadley Wickham state that "all those answers are wrong" with reference to this table? twitter.com/richierocks/status/789380495033376768

@Richie Cotton 2018-05-14 14:05:15

@coder.in.me Because at this point, mode and storage.mode are legacy features left over from S. You should only ever need to care about class() and typeof().
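As a rough illustration of why those two are enough: typeof() says how the data is stored, and class() says how methods will treat it. The variable names below are made up for the example.

```r
f <- factor(c("a", "b"))
d <- as.Date("1970-01-02")

typeof(f)    # "integer": how the data is stored
class(f)     # "factor":  how R's methods treat it

typeof(d)    # "double"
class(d)     # "Date"

# Removing the class exposes the underlying storage
unclass(d)   # 1 (days since 1970-01-01)
```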

@Erdogan CEVHER 2018-07-18 09:19:32

Shouldn't the spoken type "primitive function" be "builtin function"? I have an objection there, because is.primitive returns TRUE for both special and builtin functions. I propose changing the spoken type "primitive function" in the above table to "builtin function", to make the distinction between builtin and special functions clear.

@Erdogan CEVHER 2018-07-18 09:52:50

My second objection: the spoken type "character vector" should be "string vector". The word "character" wrongly suggests a one-character string. However, elements with many characters are possible: class(c("ac3","b")) # character. Note the "ac3". For class(c("ac3","b")), I have a vector whose components are "strings", not single characters.

@Richie Cotton 2018-07-19 20:21:33

@ErdoganCEVHER I've changed "primitive" to "builtin" as suggested. I don't agree with your second point: nobody says "string vector". "character vector" is in all the documentation; "string" is more informal.

@Tommy 2012-01-13 21:22:07

I agree that the type system in R is rather weird. The reason for it being that way is that it has evolved over (a long) time...

Note that you missed one more type-like function, storage.mode, and one more class-like function, oldClass.

So, mode and storage.mode are the old-style types (where storage.mode is more accurate), and typeof is the newer, even more accurate version.

mode(3L)                  # numeric
storage.mode(3L)          # integer
storage.mode(`identical`) # function
storage.mode(`if`)        # function
typeof(`identical`)       # closure
typeof(`if`)              # special

Then class is a whole different story. class is mostly just the class attribute of an object (that's exactly what oldClass returns). But when the class attribute is not set, the class function makes up a class from the object type and the dim attribute.

oldClass(3L) # NULL
class(3L) # integer
class(structure(3L, dim=1)) # array
class(structure(3L, dim=c(1,1))) # matrix
class(list()) # list
class(structure(list(1), dim=1)) # array
class(structure(list(1), dim=c(1,1))) # matrix
class(structure(list(1), dim=1, class='foo')) # foo
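
A small sketch of the implicit class in action (the variable m is made up for illustration). One caveat, stated as a fact about newer versions rather than about the examples above: from R 4.0.0 onwards, class() of a matrix returns c("matrix", "array"), so the sketch checks with inherits() instead of comparing strings.

```r
m <- matrix(1:4, nrow = 2)

oldClass(m)             # NULL: there is no class attribute
names(attributes(m))    # "dim": the only attribute present
inherits(m, "matrix")   # TRUE: the implicit class is derived from dim

# Setting a class attribute overrides the implicit class
class(m) <- "foo"
oldClass(m)             # "foo"
class(m)                # "foo"
typeof(m)               # "integer": the storage type is unaffected
```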

Finally, the class can return more than one string, but only if the class attribute is like that. The first string value is then kind of the main class, and the following ones are what it inherits from. The made-up classes are always of length 1.

# Here "A" inherits from "B", which inherits from "C"
class(structure(1, class=LETTERS[1:3])) # "A" "B" "C"

# an ordered factor:
class(ordered(3:1)) # "ordered" "factor"
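
To see what "inherits from" means in practice, here is a minimal S3 sketch. The generic describe and its methods are hypothetical names invented for this example: dispatch walks the class vector from left to right and uses the first method it finds.

```r
# Hypothetical generic with a method for "B" and a default only
describe <- function(x, ...) UseMethod("describe")
describe.B <- function(x, ...) "handled by the B method"
describe.default <- function(x, ...) "handled by the default method"

obj <- structure(1, class = c("A", "B", "C"))

describe(obj)   # "handled by the B method" (there is no A method)
describe(1)     # "handled by the default method"

inherits(obj, "C")                          # TRUE
inherits(obj, c("C", "Z"), which = TRUE)    # 3 0: position in the class vector
```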

@Josh O'Brien 2012-01-14 02:22:20

What an excellent, lucid explanation. You've cleared up many mysteries for me with this one answer. Thanks!

@Tommy 2012-01-14 06:52:03

@JoshO'Brien - Glad you found it useful!

@Aaron McDaid 2012-01-14 15:16:43

Thanks. I have another question. Does everything have 'attributes'? It appears that everything does, I was able to do class(structure(c(1,2), class="list")) and now it thinks the vector's class is "list"!

@Tommy 2012-01-15 04:25:17

@AaronMcDaid - Yes, all objects can have attributes. And setting the class attribute to something wrong (like you setting class of a numeric vector to "list"), can lead to errors. But is.list would still return FALSE because it uses the type information, not the class.
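
A quick sketch of that difference, reusing the mislabelled vector from the comment above (the name x is made up):

```r
x <- structure(c(1, 2), class = "list")

class(x)              # "list": just the attribute that was set
inherits(x, "list")   # TRUE: also based on the class attribute
typeof(x)             # "double": the underlying data is unchanged
is.list(x)            # FALSE: checks the type, not the class

unclass(x)            # 1 2: the plain numeric vector again
```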

@Dambo 2017-08-29 15:49:04

@Tommy you say that "class makes up a class from the object type". So why does typeof "double" correspond to class "numeric"?

@Erdogan CEVHER 2018-07-30 10:34:10

Could you elaborate a little on "The made-up classes are always of length 1"? In the example you gave, length(c("B","C")) is 2!

@Ari B. Friedman 2012-01-13 19:09:25

Does everything in R have (exactly one) class ?

Exactly one is definitely not right:

> x <- 3
> class(x) <- c("hi","low")
> class(x)
[1] "hi"  "low"

Everything has (at least one) class.

Does everything in R have (exactly one) mode ?

Not certain but I suspect so.

What, if anything, does 'typeof' tell us?

typeof gives the internal type of an object. Possible values according to ?typeof are:

The vector types "logical", "integer", "double", "complex", "character", "raw" and "list", "NULL", "closure" (function), "special" and "builtin" (basic functions and operators), "environment", "S4" (some S4 objects) and others that are unlikely to be seen at user level ("symbol", "pairlist", "promise", "language", "char", "...", "any", "expression", "externalptr", "bytecode" and "weakref").

mode relies on typeof. From ?mode:

Modes have the same set of names as types (see typeof) except that

  • types "integer" and "double" are returned as "numeric".
  • types "special" and "builtin" are returned as "function".
  • type "symbol" is called mode "name".
  • type "language" is returned as "(" or "call".
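
The mapping quoted from ?mode can be seen side by side with a small sketch. The names objs and tab, and the choice of sample objects, are made up for illustration.

```r
objs <- list(
  integer = 1L,
  double  = 1,
  builtin = sum,
  special = `if`,
  closure = mean,
  symbol  = quote(x),
  call    = quote(f(x)),
  paren   = quote((x))
)

tab <- data.frame(
  typeof = sapply(objs, typeof),
  mode   = sapply(objs, mode)
)
tab
# typeof: integer double  builtin  special  closure  symbol language language
# mode:   numeric numeric function function function name   call     (
```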

What other information is needed to fully describe an entity? (Where is the 'listness' stored, for example?)

A list has class list:

> y <- list(3)
> class(y)
[1] "list"

Do you mean vectorization? length should be sufficient for most purposes:

> z <- 3
> class(z)
[1] "numeric"
> length(z)
[1] 1

Think of 3 as a numeric vector of length 1, rather than as some primitive numeric type.
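
A few checks that back this up (a minimal sketch; z is the same variable as above):

```r
z <- 3

is.vector(z)         # TRUE
length(z)            # 1
identical(z, c(3))   # TRUE: c() of one value gives the same thing
z[1]                 # 3: indexing works as on any vector
z[2]                 # NA: out of range, as for any vector

# Arithmetic treats 3 as a length-1 vector and recycles it
c(1, 2) * z          # 3 6
```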

Conclusion

You can get by just fine with class and length. By the time you need the other stuff, you likely won't have to ask what they're for :-)

@Ben Bolker 2012-01-13 19:42:17

attributes may be handy too.

@Tommy 2012-01-13 21:44:46

As I show in my answer, a list with a dim attribute is not of class "list".

@Ari B. Friedman 2012-01-13 22:28:26

Good points both. Feel free to edit, or I'll update later.

@Ari B. Friedman 2012-01-14 11:56:04

I didn't realize you could set dim on a list. Oddness.
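
For the curious, a minimal sketch of what setting dim on a list gives you (the name l is made up): the result is still a list underneath, but it can be indexed like a matrix.

```r
l <- list(1, "a", TRUE, 2i)
dim(l) <- c(2, 2)

typeof(l)               # "list": still a list underneath
is.list(l)              # TRUE
inherits(l, "matrix")   # TRUE: the implicit class now comes from dim
l[[1, 2]]               # TRUE: the third element, filled column by column
```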
