I have a data.frame named core.df_long; a sample of it looks like this:
    ID 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2013
    1 6387 Aa Ba …
Consider building a list of data frames or matrices in base R, using lapply to iterate over a seq of column numbers:
Data
    txt <- '    ID 1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013
    1 6387 Aa Ba Ba Baa Caa B A Baa Baa B Ba B B A Ba B Caa Ba
    2 6403 B Caa Caa B Caa Caa Caa Caa B Caa Caa Caa B B B B B B
    3 6408 A Ba Ba Baa Baa Ba A A B Ba B B Ba Ba Ba Baa Ba B
    4 6411 B Ba B Ba Ba B Caa Caa B Ba B B Caa B Ba Caa B Ba'

    core.df_long <- read.table(text=txt, header=TRUE, stringsAsFactors=FALSE)
lapply + seq
    df_list <- lapply(seq(2, ncol(core.df_long)-1), function(x) {
      # ID plus two adjacent year columns
      sub_df <- core.df_long[,c(1, x:(x+1))]
      # Count the IDs for each (year x, year x+1) combination
      agg_df <- setNames(aggregate(ID ~ ., sub_df, length), c("i", "j", "Count"))
    })

    head(df_list)
    # [[1]]
    #    i   j Count
    # 1  A  Ba     1
    # 2 Aa  Ba     1
    # 3  B  Ba     1
    # 4  B Caa     1
    #
    # [[2]]
    #     i   j Count
    # 1  Ba   B     1
    # 2  Ba  Ba     2
    # 3 Caa Caa     1
    #
    # [[3]]
    #     i   j Count
    # 1 Caa   B     1
    # 2   B  Ba     1
    # 3  Ba Baa     2
    #
    # [[4]]
    #     i   j Count
    # 1  Ba  Ba     1
    # 2 Baa Baa     1
    # 3   B Caa     1
    # 4 Baa Caa     1
    #
    # [[5]]
    #     i   j Count
    # 1  Ba   B     1
    # 2 Caa   B     1
    # 3 Baa  Ba     1
    # 4 Caa Caa     1
    #
    # [[6]]
    #     i   j Count
    # 1   B   A     1
    # 2  Ba   A     1
    # 3   B Caa     1
    # 4 Caa Caa     1
And for xtabs, just add one line to the anonymous function:
    df_list <- lapply(seq(2, ncol(core.df_long)-1), function(x) {
      sub_df <- core.df_long[,c(1, x:(x+1))]
      agg_df <- setNames(aggregate(ID ~ ., sub_df, length), c("i", "j", "Count"))
      # Spread the counts into an i-by-j contingency table
      xtabs(Count ~ i + j, data = agg_df)
    })

    head(df_list)
    # [[1]]
    #     j
    # i    Ba Caa
    #   A   1   0
    #   Aa  1   0
    #   B   1   1
    #
    # [[2]]
    #      j
    # i     B Ba Caa
    #   Ba  1  2   0
    #   Caa 0  0   1
    #
    # [[3]]
    #      j
    # i     B Ba Baa
    #   B   0  1   0
    #   Ba  0  0   2
    #   Caa 1  0   0
    #
    # [[4]]
    #      j
    # i     Ba Baa Caa
    #   B    0   0   1
    #   Ba   1   0   0
    #   Baa  0   1   1
    #
    # [[5]]
    #      j
    # i     B Ba Caa
    #   Ba  1  0   0
    #   Baa 0  1   0
    #   Caa 1  0   1
    #
    # [[6]]
    #      j
    # i     A Caa
    #   B   1   1
    #   Ba  1   0
    #   Caa 0   1
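If only the contingency tables are needed, the aggregate step can probably be skipped by calling table() on each pair of adjacent columns. This is a minimal sketch of that variant, not the code from the answer above. It assumes a data frame shaped like core.df_long, uses only the first four year columns of the example to stay short, and the names df and tab_list are made up for illustration. Note that table() returns objects of class "table" rather than "xtabs".

```r
# Subset of the example data (first four years only), to keep the sketch short
txt <- "ID 1996 1997 1998 1999
1 6387 Aa Ba Ba Baa
2 6403 B Caa Caa B
3 6408 A Ba Ba Baa
4 6411 B Ba B Ba"

df <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE,
                 check.names = FALSE)  # keep "1996" rather than "X1996"

# One contingency table per pair of adjacent year columns;
# column 1 is ID, so the year columns start at position 2
tab_list <- lapply(seq(2, ncol(df) - 1), function(x) {
  table(i = df[[x]], j = df[[x + 1]])
})

tab_list[[2]]  # transitions from 1997 to 1998
```

Individual cells can then be read by label, for example tab_list[[2]]["Ba", "Ba"] for the number of IDs rated Ba in both 1997 and 1998.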
You can do this easily with map or lapply; no loop is needed:
    library(tidyverse)

    # First, I create your example data as a matrix. Depending on your data, this first step might look different.
    data_raw <- matrix(c("ID", "1996", "1997", "1998", "1999", "2000", "2001", "2002", "2003", "2004", "2005", "2006", "2007", "2008", "2009", "2010", "2011", "2012", "2013",
                         "6387", "Aa", "Ba", "Ba", "Baa", "Caa", "B", "A", "Baa", "Baa", "B", "Ba", "B", "B", "A", "Ba", "B", "Caa", "Ba",
                         "6403", "B", "Caa", "Caa", "B", "Caa", "Caa", "Caa", "Caa", "B", "Caa", "Caa", "Caa", "B", "B", "B", "B", "B", "B",
                         "6408", "A", "Ba", "Ba", "Baa", "Baa", "Ba", "A", "A", "B", "Ba", "B", "B", "Ba", "Ba", "Ba", "Baa", "Ba", "B",
                         "6411", "B", "Ba", "B", "Ba", "Ba", "B", "Caa", "Caa", "B", "Ba", "B", "B", "Caa", "B", "Ba", "Caa", "B", "Ba"),
                       nrow = 5, byrow = TRUE)

    # Second, I drop the first row (the column names) and the first column (the IDs), since we don't need them later on
    data <- data_raw[-1, -1]

    # Third, I map the xtabs command over the column numbers in your data
    # and keep the result as data_out, which the final step uses
    data_out <- map(seq(dim(data)[2] - 1), ~{
      out <- as_tibble(data[,.x:(.x + 1)]) %>%
        set_names(c("i", "j")) %>%
        count(i, j) %>%
        xtabs(n ~ i + j, data = .)
      return(out)
    })

    data_out
    # [[1]]
    #     j
    # i    Ba Caa
    #   A   1   0
    #   Aa  1   0
    #   B   1   1
    #
    # [[2]]
    #      j
    # i     B Ba Caa
    #   Ba  1  2   0
    #   Caa 0  0   1
    #
    # ...

    # Finally, I assign all these list elements to xtabs objects
    data_names <- data_raw[1, -1]
    for (i in seq(length(data_out))) {
      assign(str_c("res", str_sub(data_names[i], start = -2), str_sub(data_names[i + 1], start = -2), sep = ""), data_out[[i]])
    }

    res0001
    #      j
    # i     B Ba Caa
    #   Ba  1  0   0
    #   Baa 0  1   0
    #   Caa 1  0   1

    str(res0001)
    #  'xtabs' int [1:3, 1:3] 1 0 1 0 1 0 0 0 1
    #  - attr(*, "dimnames")=List of 2
    #   ..$ i: chr [1:3] "Ba" "Baa" "Caa"
    #   ..$ j: chr [1:3] "B" "Ba" "Caa"
    #  - attr(*, "call")= language xtabs(formula = n ~ i + j, data = .)
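As an alternative to creating one variable per year pair with assign(), the results could stay in a single named list, which is usually easier to loop over later. The sketch below is base R rather than the tidyverse code above, and it assumes a small stand-in for data_raw (the header row plus the 1999 to 2001 values for IDs 6387 and 6403). The names idx and two_digit are made up for illustration; the element names follow the same "res" plus two-digit-year scheme.

```r
# Small stand-in for data_raw: header row plus two ID rows (subset of the example)
data_raw <- matrix(c("ID",   "1999", "2000", "2001",
                     "6387", "Baa",  "Caa",  "B",
                     "6403", "B",    "Caa",  "Caa"),
                   nrow = 3, byrow = TRUE)

data       <- data_raw[-1, -1]   # drop header row and ID column
data_names <- data_raw[1, -1]    # the year labels

# One table per pair of adjacent year columns
idx      <- seq_len(ncol(data) - 1)
data_out <- lapply(idx, function(k) table(i = data[, k], j = data[, k + 1]))

# Name the list elements "res" + last two digits of both years
two_digit       <- substr(data_names, 3, 4)
names(data_out) <- paste0("res", two_digit[idx], two_digit[idx + 1])

names(data_out)
data_out$res0001
```

With this layout, data_out$res0001 or data_out[["res0001"]] plays the role of the res0001 variable, and nothing is written into the global environment.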
    # A solution for three years
    map(seq(dim(data)[2] - 2), ~{                 # for four years the '-2' becomes '-3'
      out <- as_tibble(data[,.x:(.x + 2)]) %>%    # for four years the '+2' becomes '+3'
        set_names(c("i", "j", "k")) %>%           # for four years you have to add the letter 'l'
        mutate(ij = str_c(i, j, sep = "-")) %>%   # for four years it's 'ijk = str_c(i, j, k, sep = "-")'
        count(ij, k) %>%                          # for four years it's 'ijk, l'
        xtabs(n ~ ij + k, data = .)               # for four years it's 'n ~ ijk + l'
      return(out)
    })
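Rather than editing the snippet by hand for each window width, the same idea could be wrapped in a function that takes the width as an argument. The following is a base R sketch under stated assumptions, not the tidyverse code above: window_tables and its width argument are hypothetical names, the input is assumed to be a character matrix of year columns like data, and the example matrix holds only the first four year columns of the sample.

```r
# Hypothetical helper: for each window of `width` adjacent columns, tabulate
# the joined values of the first (width - 1) columns against the last column
window_tables <- function(data, width) {
  stopifnot(width >= 2, width <= ncol(data))
  lapply(seq_len(ncol(data) - width + 1), function(start) {
    cols   <- start:(start + width - 1)
    prefix <- data[, cols[-width], drop = FALSE]
    # Join the leading columns row by row, e.g. "Aa-Ba"
    path   <- apply(prefix, 1, paste, collapse = "-")
    table(path = path, last = data[, cols[width]])
  })
}

# First four year columns of the example data (1996 to 1999)
data <- matrix(c("Aa", "Ba",  "Ba",  "Baa",
                 "B",  "Caa", "Caa", "B",
                 "A",  "Ba",  "Ba",  "Baa",
                 "B",  "Ba",  "B",   "Ba"),
               nrow = 4, byrow = TRUE)

three <- window_tables(data, 3)  # three-year windows
three[[2]]                       # 1997-1998 path against 1999
```

Calling window_tables(data, 2) gives the two-year tables, and window_tables(data, 4) the four-year case.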