R - Searching for sequences of numbers


老夫的少女心
2025-01-03 07:33:01 (2 months ago)


My data frame looks like this:

nr grp start stop l1 ratio
11 1 300 350 + 1.0
12 1 400 450 - 0.8
13 1 50 550 + 1.0 ……

3 replies
  1. 0# Fire ming | 2019-08-31 10-32



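    We can use diff to create a group for each break in the sequence, then keep only the first row of every group that has more than one row, and set its stop to the last stop value in that group.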




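    library(dplyr)

    df %>%
      # start a new group whenever nr does not increase by exactly 1
      group_by(group = cumsum(c(1, diff(nr) != 1))) %>%
      # carry the last stop of the run onto every row of the group
      mutate(stop = last(stop)) %>%
      # keep the first row only, and only for runs longer than one row
      filter(n() > 1 & row_number() == 1) %>%
      ungroup() %>%
      select(-group)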

      nr grp start stop l1 ratio
    1 11   1   300  650  +     1
    2 36   1  1000 1400  +     0
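To see what the grouping expression does on its own, here is a minimal sketch that computes only the group ids. The nr values are copied from the sample data quoted in the data.table reply; the names step_breaks and group are illustrative, not from the original answer.

```r
# nr values from the sample data in this thread
nr <- c(11, 12, 13, 14, 21, 36, 37, 38, 39)

# diff(nr) is the gap to the previous row; a gap other than 1 starts a new run
step_breaks <- diff(nr) != 1

# prepend 1 so the first row opens group 1, then accumulate the breaks
group <- cumsum(c(1, step_breaks))
print(group)
# 1 1 1 1 2 3 3 3 3
```

Group 2 contains only nr 21, which is why it disappears once single-row groups are filtered out.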

  2. 1# 银角 | 2019-08-31 10-32



    This is easy with data.table:




    library(data.table)

    DT <- fread("nr grp start stop l1 ratio
    11 1 300 350 + 1.0
    12 1 400 450 - 0.8
    13 1 50 550 + 1.0
    14 1 600 650 - 1.0
    21 1 800 850 - 1.0
    36 1 1000 1050 + 0.0
    37 1 1100 1200 + 0.9
    38 1 1250 1300 - 0.7
    39 1 1350 1400 + 1.0")

    setDT(DT) # only needed if you haven't imported with fread

    # create the group ID as its own column, here for didactic reasons
    DT[, groups := cumsum(c(TRUE, diff(nr) != 1))]

    # take the first row of each group and replace stop with the value from the last row
    DT[, if (.N > 1) {
      res <- .SD[1]
      res$stop <- .SD[.N, stop]
      res
    } else NULL, by = groups]

       groups nr grp start stop l1 ratio
    1:      1 11   1   300  650  +     1
    2:      3 36   1  1000 1400  +     0
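For comparison, the same idea can be sketched in base R without extra packages. This is not taken from either reply; it is an illustration that assumes the sample data is read with read.table, and the names run_id, runs and collapsed are hypothetical.

```r
# sample data from this thread; colClasses keeps l1 as plain text
df <- read.table(text = "nr grp start stop l1 ratio
11 1 300 350 + 1.0
12 1 400 450 - 0.8
13 1 50 550 + 1.0
14 1 600 650 - 1.0
21 1 800 850 - 1.0
36 1 1000 1050 + 0.0
37 1 1100 1200 + 0.9
38 1 1250 1300 - 0.7
39 1 1350 1400 + 1.0", header = TRUE,
  colClasses = c("integer", "integer", "numeric", "numeric", "character", "numeric"))

# one id per run of consecutive nr values
run_id <- cumsum(c(TRUE, diff(df$nr) != 1))

# split into runs and drop runs that consist of a single row
runs <- Filter(function(run) nrow(run) > 1, split(df, run_id))

# keep the first row of each run, with stop taken from the run's last row
collapsed <- do.call(rbind, lapply(runs, function(run) {
  first <- run[1, ]
  first$stop <- run$stop[nrow(run)]
  first
}))
rownames(collapsed) <- NULL
print(collapsed)
```

On the sample data this leaves two rows, nr 11 with stop 650 and nr 36 with stop 1400, matching the output shown in both replies.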
