Given the output you get after readLines, the contents of your CSV file must be:

    "","myid","var"
    "1","1","0.5949020"
    "2","2","0.8515591"
    "3","3","0.8139010"
    "4","4","0.3804234"
    "5","5","0.4923082"
    "6","6","0.9933775"
    "7","7","0.1740895"
    "8","8","0.8342808"
    "9","9","0.3958154"
    "10","10","0.9690561"
That is, your values are comma-separated and enclosed in double quotes. When I read this file with readLines, I get the same output you do:
    dat
     [1] "\"\",\"myid\",\"var\""          "\"1\",\"1\",\"0.5949020\""
     [3] "\"2\",\"2\",\"0.8515591\""      "\"3\",\"3\",\"0.8139010\""
     [5] "\"4\",\"4\",\"0.3804234\""      "\"5\",\"5\",\"0.4923082\""
     [7] "\"6\",\"6\",\"0.9933775\""      "\"7\",\"7\",\"0.1740895\""
     [9] "\"8\",\"8\",\"0.8342808\""      "\"9\",\"9\",\"0.3958154\""
    [11] "\"10\",\"10\",\"0.9690561\""
So what you need to do is split each line on the commas:

    unlist(strsplit(..., split = ","))

and strip the double quotes:

    gsub("\"", "", ...)
Combined, this gives us:
    unlist(strsplit(gsub("\"", "", dat), split = ","))
     [1] ""          "myid"      "var"       "1"         "1"         "0.5949020" "2"
     [8] "2"         "0.8515591" "3"         "3"         "0.8139010" "4"         "4"
    [15] "0.3804234" "5"         "5"         "0.4923082" "6"         "6"         "0.9933775"
    [22] "7"         "7"         "0.1740895" "8"         "8"         "0.8342808" "9"
    [29] "9"         "0.3958154" "10"        "10"        "0.9690561"
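The flat vector above no longer shows where one row ends and the next begins. As a minimal sketch, assuming every line holds the same number of fields (three here), the vector can be folded back into rows with matrix(..., byrow = TRUE). The `dat` literal is a shortened stand-in for the readLines output, and `fields` and `m` are names chosen only for illustration.

```r
# Shortened stand-in for the readLines() output (assumption: 3 fields per line)
dat <- c('"","myid","var"',
         '"1","1","0.5949020"',
         '"2","2","0.8515591"')

# Strip the double quotes, then split every line on the commas
fields <- unlist(strsplit(gsub("\"", "", dat), split = ","))

# Fill the matrix row by row so each input line becomes one matrix row
m <- matrix(fields, ncol = 3, byrow = TRUE)
m
```

The first row of `m` holds the header fields and the remaining rows hold the data, still as character strings.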
Here is the complete sequence of instructions that converts the data into a data frame.
    set.seed(1234)  # Make the results reproducible
    write.csv(data.frame(myid = 1:10, var = runif(10)), "temp.csv")
    dat <- readLines("temp.csv")
    # Strip the double quotes first; otherwise as.numeric() below returns NA
    df1 <- strsplit(gsub('"', '', dat[-1], fixed = TRUE), ",")
    df1 <- do.call(rbind, df1)  # one matrix row per input line
    df1 <- df1[, -1]            # drop the row-name column
    df1 <- as.data.frame(df1)
    df1[] <- lapply(df1, function(x) as.numeric(as.character(x)))
    names(df1) <- gsub('"', '', strsplit(dat[1], ',')[[1]][-1], fixed = TRUE)
    df1
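If the lines are already in memory and follow the regular CSV layout shown earlier, a shorter route may be to hand them to read.csv through its `text` argument instead of splitting by hand. This is a different technique from the manual one above, sketched under the assumption that the first column holds row names; the `dat` literal is a shortened stand-in and `df2` is an illustrative name.

```r
# Shortened stand-in for the lines returned by readLines() (illustrative values)
dat <- c('"","myid","var"',
         '"1","1","0.5949020"',
         '"2","2","0.8515591"')

# read.csv() reads the lines via its `text` argument;
# row.names = 1 treats the first (unnamed) column as row names
df2 <- read.csv(text = dat, row.names = 1)
str(df2)
```

Here read.csv handles the quote removal and the conversion to numeric columns itself, so no gsub or as.numeric step is needed.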