你可以试试
dt <- fread("ID Time Time-2D Col Count A 2017-06-05 2017-06-03 M 1 A 2017-06-02 2017-05-31 M 1 A 2017-06-03 2017-06-01 M 1 B 2017-06-02 2017-05-31 K 1 B 2017-06-01 2017-05-30 M 4") dt1 <- dcast(dt, ID+Time+`Time-2D`~Col, value.var = c("Count")) dt1[, K := ifelse(is.na(K), 0, K)] dt1[, M := ifelse(is.na(M), 0, M)] ID Time Time-2D K M 1: A 2017-06-02 2017-05-31 0 1 2: A 2017-06-03 2017-06-01 0 1 3: A 2017-06-05 2017-06-03 0 1 4: B 2017-06-01 2017-05-30 0 4 5: B 2017-06-02 2017-05-31 1 0 dt1[, Col_K := K/(K+M)] dt1[, Col_M := M/(K+M)] ID Time Time-2D K M Col_K Col_M 1: A 2017-06-02 2017-05-31 0 1 0 1 2: A 2017-06-03 2017-06-01 0 1 0 1 3: A 2017-06-05 2017-06-03 0 1 0 1 4: B 2017-06-01 2017-05-30 0 4 0 1 5: B 2017-06-02 2017-05-31 1 0 1 0
也许你可以结合最后两行。就像是
dt1[, `:=`()]
我想我理解你的要求。你似乎关心观察的顺序,例如,无论是第二次观察 Time 是在第一次观察之前 Time 。这没有多大意义,但这是一个退出高效的data.table解决方案,以实现这一目标。这基本上是一个 非等距 加入 ID , Col ,两者 Time 列 和 行索引(基本上是外观顺序)。事后,它只是 dcast 从长到高转换(如在上一个问题中)。请注意,结果按日期排序,但我保留了 rowindx 变量,因此您可以使用重新排序 setorder 。此外,我将保持比率calc给你,因为这是非常基本的(提示 - 别 使用循环,它是一个完全矢量化的一个班轮)
Time
ID
Col
dcast
rowindx
setorder
library(data.table) #v1.10.4+ ## Read the data DT <- fread("ID Time Col Count A 2017-06-05 M 1 A 2017-06-02 M 1 A 2017-06-03 M 1 B 2017-06-02 K 1 B 2017-06-01 M 4") ## Prepare the variables we need for the join DT[, Time := as.IDate(Time)] DT[, Time_2D := Time - 2L] DT[, rowindx := .I] ## Non-equi join, sum `Count` by each join DT2 <- DT[DT, sum(Count), on = .(ID, Col, rowindx <= rowindx, Time <= Time, Time >= Time_2D), by = .EACHI] ## Fix column names (a known issue) setnames(DT2, make.unique(names(DT2))) ## Long to wide (You can reorder back using `rowindx` and `setorder` function) dcast(DT2, ID + Time + Time.1 + rowindx ~ Col, value.var = "V1", fill = 0) # ID Time Time.1 rowindx K M # 1: A 2017-06-02 2017-05-31 2 0 1 # 2: A 2017-06-03 2017-06-01 3 0 2 # 3: A 2017-06-05 2017-06-03 1 0 1 # 4: B 2017-06-01 2017-05-30 5 0 4 # 5: B 2017-06-02 2017-05-31 4 1 0