Subset views in R

I don’t know how to do this in R. So let me just say why I can’t.

I wanted something akin to Boost’s sub-matrix views, where indexes map back to the original matrix, so you don’t create a new object.

Sounds straightforward: just overload ‘[[’ to subtract the offset and check the length. Alas, no dice. R copies objects so zealously that this is not possible (as far as I know, which isn’t much).
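
For concreteness, the overload in question might look something like the sketch below. The names subview, offset and len are made up for illustration, and the sketch adds the offset to the view index, which is the same arithmetic seen from the view's side. The wrapper keeps the vector in an ordinary list, so whether building it copies the vector is exactly the question the timings further down try to answer.

```r
# Hypothetical helper (not from any package): wrap a vector together
# with an offset and a length.
subview <- function(x, offset, len) {
  stopifnot(offset >= 0, len >= 0, offset + len <= length(x))
  structure(list(data = x, offset = offset, len = len), class = "subview")
}

# S3 method for `[[`: check the index against the view length, then
# forward to element offset + i of the wrapped vector.
# .subset2() is used to read the fields so that the method does not
# call itself recursively.
`[[.subview` <- function(x, i) {
  if (i < 1 || i > .subset2(x, "len")) stop("subscript out of bounds")
  .subset2(x, "data")[[.subset2(x, "offset") + i]]
}

x <- as.numeric(1:10)
v <- subview(x, offset = 3, len = 4)
v[[1]]  # element 4 of x
```

Indexing works as intended; the open question is only what happens to the memory behind `data`.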

To demonstrate, the following function executes and times expressions operating on a vector called “M”.

time.op = function(N, Exp) {
        Exp = parse(text=Exp)
        M = numeric(N)

        N.Trials = 10
        Times = numeric(N.Trials)

        for (II in 1:N.Trials) {
                Times[[II]] = system.time(eval(Exp))[['elapsed']]
        }

        mean(Times)
}

Like

> time.op(1e5, 'sqrt(M)')
[1] 0.0024

Next, check whether the size of M affects the time of the operation:

time.test = function(Exp) {
        Ns = 10^(1:6)
        Times = sapply(Ns, time.op, Exp=Exp)

        data.frame(Ns, Times)
}

> time.test('sqrt(M)')
     Ns  Times
1 1e+01 0.0000
2 1e+02 0.0000
3 1e+03 0.0001
4 1e+04 0.0004
5 1e+05 0.0027
6 1e+06 0.0274

(obviously)

And here’s why we know it’s copying:

> time.test('list(M)')
     Ns  Times
1 1e+01 0.0001
2 1e+02 0.0001
3 1e+03 0.0002
4 1e+04 0.0000
5 1e+05 0.0004
6 1e+06 0.0086
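
Timings are only indirect evidence. A more direct check, assuming your R build has memory profiling enabled, is tracemem(), which prints a message each time the traced object is duplicated. The sketch below uses the made-up names M1 and M2 and deliberately shows no expected output, since what gets reported may depend on the R version.

```r
M1 <- numeric(1e6)
M2 <- numeric(1e6)

# tracemem() is only available when R was built with memory profiling.
traced <- capabilities("profmem")
if (traced) {
  invisible(tracemem(M1))
  invisible(tracemem(M2))
}

L <- list(M1)               # a tracemem message here means M1 was copied
attr(M2, "name") <- "mike"  # likewise for M2

if (traced) {
  untracemem(M1)
  untracemem(M2)
}
```

If no tracemem line appears for a statement, that statement did not duplicate the traced vector in your session.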

Or with attributes

> time.test('attr(M, "name") = "mike"')
     Ns  Times
1 1e+01 0.0000
2 1e+02 0.0000
3 1e+03 0.0000
4 1e+04 0.0000
5 1e+05 0.0006
6 1e+06 0.0081

Good luck making a subset without copying!

And here are the relevant parts of the R source.

Making a list (main/builtin.c)

    for (i = 0; i < n; i++) {
        if (TAG(args) != R_NilValue) {
            SET_STRING_ELT(names, i, PRINTNAME(TAG(args)));
            havenames = 1;
        }
        else {
            SET_STRING_ELT(names, i, R_BlankString);
        }
        if (NAMED(CAR(args)))
            SET_VECTOR_ELT(list, i, duplicate(CAR(args)));
        else
            SET_VECTOR_ELT(list, i, CAR(args));
        args = CDR(args);
    }
    if (havenames) {
        setAttrib(list, R_NamesSymbol, names);
    }

Note the call to “duplicate” inside the loop: every argument whose NAMED count is nonzero gets copied.

And yes, duplicate does copy, and it is deep (main/duplicate.c):

case VECSXP:
        n = LENGTH(s);
        PROTECT(s);
        PROTECT(t = allocVector(TYPEOF(s), n));
        for(i = 0 ; i < n ; i++)
            SET_VECTOR_ELT(t, i, duplicate1(VECTOR_ELT(s, i)));
        DUPLICATE_ATTRIB(t, s);
        SET_TRUELENGTH(t, TRUELENGTH(s));
        UNPROTECT(2);
        break;
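
Seen from R code, the effect of these copies is plain value semantics: changing the element stored in a list never changes the vector it was built from. The sketch below (variable names are made up) shows only that semantics, not the moment at which the copy physically happens, which may differ between R versions.

```r
v <- c(1, 2, 3)
wrapped <- list(v)

# Modify the element held by the list...
wrapped[[1]][1] <- 99

# ...and the original vector is untouched.
v[1]             # still 1
wrapped[[1]][1]  # 99
```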

4 Responses to Subset views in R

  1. Charles says:

    It can be done; it would just be a bit hacky, even in pure R. You would need to return an environment (to get reference semantics when you pass it elsewhere), and store the parent frame and a reference to the object so that assignments can be rewritten and forwarded later.

    An example showing a subset of this functionality:

    
    view <- function(obj, row, col) structure(new.env(), class='view', env=parent.frame(), obj=substitute(obj), row=row, col=col)
    
    `[.view` <- function(x, i, j) eval(attr(x, 'obj'), attr(x, 'env'))[attr(x, 'row') + i, attr(x, 'col') + j]
    
    `[<-.view` <- function(x, i, j, value) { eval(substitute(obj[i, j] <- value, list(obj=attr(x, 'obj'), i=attr(x, 'row') + i, j= attr(x, 'col') + j, value=value)), attr(x, 'env')); x }
    

    (I’m not bothering to implement all the other functions, nor handle missing args etc).

    
    > M = matrix(1:16, 4)
    > M2 = view(M, 1, 1)
    > M2[1, 1]
    [1] 6
    > M2[1, 1] = 42
    > M
         [,1] [,2] [,3] [,4]
    [1,]    1    5    9   13
    [2,]    2   42   10   14
    [3,]    3    7   11   15
    [4,]    4    8   12   16
    > 
    
  2. ellbur says:

    Charles:

    That’s awesome! Thanks so much!

    I suppose it would be possible to generalize this into a generic “pointer”, though it seems you’d still have to use eval() to access it without copying.

  3. priscian says:

    I’ve been goofing around with a pure-R pointer class that uses environments, but “dereferencing” probably causes a copy—I haven’t had time to test that or look at the R source code yet, though:

    
    # Create "pointer" variables for large data sets:
    ptr <- pointer <- function(x, pos=-1, envir=as.environment(pos), inherits=F)
    {
      if (missing(x))
        stop("Must supply reference object.")
    
      r <- list()
      r$object <- envir
      r$name <- as.character(match.call()[2])
      class(r) <- "pointer"
    
      return (r)
    }
    
    as.pointer <- function(x)
    {
      pointer(x)
    }
    
    is.pointer <- function(x)
    {
      return (inherits(x, "pointer"))
    }
    
    .. <- deref <- function(x)
    {
      if (is.environment(x)) return (x)
      else return (get(x$name, envir=x$object))
    }
    "..<-" <- "deref<-" <- function(x, value)
    {
      if (is.pointer(x)) assign(x$name, value, envir=x$object)
      return (x)
    }
    
    print.pointer <- function(x, ...)
    {
      environment.name <- capture.output(print(x$object))
      cat("Pointer to variable '", x$name, "' in ", environment.name, ":\n\n", sep="")
      str(..(x))
    }
    
    ## usage:
    x <- list(frog="frog", fish="!frog")
    z <- pointer(x)
    ..(z)
    ..(z)$fish <- "trout"
    ..(z)
    x
    
  4. ellbur says:

    Hi priscian,

    Sorry for the late reply. That’s really cool. I don’t have the R source in front of me right now, but I would bet that calling deref() would not copy the object, while most ways you would use the result (e.g. storing it into a list) would.

    But that’s no problem for your code because ‘..<-’ takes care of modifying the underlying object. Sweet.
