A postdoctoral researcher asked me the other day to help him expand a vector of comma-delimited values so he could do computations on them in R. I wrote an R function to solve the problem. Here is the before and after:
> data
  Name      Score1   Score2
1 Bill 1,3,4,3,6,9 F1,F3,F2
2  Bob       3,2,3 F2,F2,F4
3  Sam       2,5,3 F5,F2,F4
> expand.delimited(data)
   Name Score1
1  Bill      1
2  Bill      3
3  Bill      4
4  Bill      3
5  Bill      6
6  Bill      9
7   Bob      3
8   Bob      2
9   Bob      3
10  Sam      2
11  Sam      5
12  Sam      3
# Description
#   Accepts a data.frame where col1 represents a factor and col2 represents
#   comma or other delimited values to be expanded according to col1.
#   Returns a data.frame.
# Usage
#   expand.delimited(x, ...)
# Default
#   expand.delimited(x, col1=1, col2=2, sep=",")
# Arguments
#   x     A data.frame
#   col1  Column in data.frame to act as factor
#   col2  Column in data.frame that is delimited and will be expanded
#   sep   Delimiter

#Download data
#Read in data
data <- read.table("expand_delimited.txt", header=TRUE)

#Function to expand data
expand.delimited <- function(x, col1=1, col2=2, sep=",") {
  # Expand a single row: split the delimited string in col2 and
  # repeat the col1 value once for every piece
  expand_row <- function(y) {
    factr <- unname(y[col1])
    strng <- toString(y[col2])
    expand <- strsplit(strng, sep)[[1]]
    num <- length(expand)
    factor <- rep(factr, num)
    return(data.frame(factor, expand, stringsAsFactors=FALSE))
  }
  # apply() passes each row as a character vector and returns a list of data.frames
  expanded <- apply(x, 1, expand_row)
  df <- do.call("rbind", expanded)
  # Number the rows 1..n, as in the example output
  rownames(df) <- NULL
  names(df) <- c(names(x)[col1], names(x)[col2])
  return(df)
}

# Example
expand.delimited(data)

Just a thought: you could build the output directly from the data frame:
scoretable <- sapply(data$Score1, function(x) as.numeric(unlist(strsplit(x,','))))
scoretable$names<-data$Name
That should give the same output.
Thanks for the thought. The only part of your code I needed to change was to make sure x was a character:
scoretable <- sapply(data$Score1, function(x) as.numeric(unlist(strsplit(as.character(x), ','))))
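The sapply() call above yields one numeric vector per row, so each name still has to be repeated to line up with its values. Here is a minimal base-R sketch of that idea, not the function from the post. The example data is rebuilt inline from the "before" table, and the names `pieces` and `long` are only illustrative.

```r
# Example data rebuilt inline (values copied from the "before" table)
data <- data.frame(
  Name   = c("Bill", "Bob", "Sam"),
  Score1 = c("1,3,4,3,6,9", "3,2,3", "2,5,3"),
  Score2 = c("F1,F3,F2", "F2,F2,F4", "F5,F2,F4"),
  stringsAsFactors = FALSE
)

# Split every Score1 string at once; the result is a list with one
# character vector per row of the data frame
pieces <- strsplit(as.character(data$Score1), ",", fixed = TRUE)

# Repeat each name once per value in its row, then flatten the list
long <- data.frame(
  Name   = rep(data$Name, times = lengths(pieces)),
  Score1 = as.numeric(unlist(pieces)),
  stringsAsFactors = FALSE
)

print(long)
```

Because strsplit() is vectorized, no per-row function is needed here. The trade-off is that Score1 comes back as numeric, which only makes sense when every delimited value is a number.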
I think the “melt” function of Hadley's “reshape” package already provides exactly this functionality. http://www.statmethods.net/management/reshape.html
Thanks for the comment. From what I can tell after reading more about the melt function, in order to use it I would first need to parse the Score1 column and split it into separate columns. I think this would be problematic because the number of values in Score1 varies from row to row and the values are not in a meaningful order. I’m no expert on reshape2, so there may be a way to piece together multiple functions to achieve my goal, but this function gets the job done.
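To make the concern concrete, here is a rough base-R sketch of the "split into columns first" route. It does not use reshape or reshape2; the wide-to-long step is a hand-written stand-in for what melt would do. The column names V1 to V6 are made up for illustration, and the scores are copied from the example table.

```r
# Scores from the example table, keyed by name
scores <- c(Bill = "1,3,4,3,6,9", Bob = "3,2,3", Sam = "2,5,3")

# Parse each string into a numeric vector
pieces <- lapply(strsplit(scores, ",", fixed = TRUE), as.numeric)

# The rows have different lengths, so pad the short ones with NA
width <- max(lengths(pieces))
wide <- t(sapply(pieces, function(v) c(v, rep(NA, width - length(v)))))
rownames(wide) <- names(scores)
colnames(wide) <- paste0("V", seq_len(width))
print(wide)

# Hand-written wide-to-long step: stack the columns, then drop the padding
long <- data.frame(
  Name   = rep(rownames(wide), times = width),
  Score1 = as.vector(wide),
  stringsAsFactors = FALSE
)
long <- long[!is.na(long$Score1), ]
print(long)
```

The padding and the arbitrary column positions are the awkward parts: a value's column carries no meaning, and the NA cells have to be removed again after stacking.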