Good morning. I have this dataset:

Appendix | Change_Serial_Number | Status | Duration | Mileage | Service
20101234 | 0 | . | 60 | 120000 | Z
20101234 | …
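The lag() function is only useful when the value you need sits a fixed number of observations back. Here you do not know whether the value comes from the previous observation or from five or six observations earlier, so instead of lag() you should RETAIN an additional variable and update its value when appropriate: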
    data db_rdg2;
      retain duration_prev .;
      set db_rdg;
      by Appendix;
      /* Refresh the carried value at the start of each Appendix group
         and on every activated row. */
      if first.Appendix or status = 'Activated' then duration_prev = duration;
    run;
The RETAIN statement lets duration_prev keep its value as each new observation is read from the input, instead of being reset to missing.
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000214163.htm
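To make the effect of RETAIN concrete, here is a minimal sketch on made-up data. The dataset name demo and the variables x, kept and not_kept are hypothetical and exist only for this illustration; they are not part of the original question.

```sas
/* Hypothetical demo: compare a retained variable with an ordinary one. */
data demo;
  input x;
  retain kept .;            /* kept survives from one observation to the next */
  if x > 0 then do;
    kept = x;
    not_kept = x;           /* not retained: reset to missing on every iteration */
  end;
  datalines;
10
-1
20
-1
;
run;

proc print data=demo noobs;
run;
```

On these four rows, kept should read 10, 10, 20, 20, while not_kept should read 10, ., 20, . because an assigned variable that is not retained is set back to missing at the top of each DATA step iteration.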
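Instead of using LAG to retrieve duration from the prior row, you need to store the activation-state tracking variables (duration, mileage and serial number) in variables that are retained and updated after an explicit OUTPUT statement.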
In both code samples I also track the serial number, because you may want to know how many changes ago the prior activation occurred.
    data have;
      input Appendix Change_Serial_Number Status $ Duration Mileage Service $;
    datalines;
    20101234 0 . 60 120000 Z
    20101234 1 Proposed 48 110000 Z
    20101234 2 Activated 24 90000 Z
    20101234 3 Proposed 60 120000 Z
    20101234 4 Proposed 50 160000 B
    20101234 5 Activated 36 110000 B
    ;
    run;

    * NOTE: _APA suffix means @ prior activate;
    * NOTE: Status is read with the default length of 8, so Activated is stored as Activate;

    * version 1;
    * implicit loop with BY-group processing means an;
    * explicit first. test is needed in order to reset the apa tracking variables;
    data want;
      set have;
      by appendix;
      if first.appendix then do;
        length csn_apa dur_apa mil_apa 8;
        call missing(csn_apa, dur_apa, mil_apa);
      end;
      output;
      if status in (' ' 'Activate') then do;
        csn_apa = change_serial_number;
        dur_apa = duration;
        mil_apa = mileage;
      end;
      retain csn_apa dur_apa mil_apa;
    run;

    * version 2;
    * DOW version;
    * explicit loop over the group means first. handling is not explicitly needed;
    * the implicit loop performs the tracking variable resets;
    * retain is not needed because output and the tracking variables are modified;
    * within the current iteration of the implicit loop;
    data want2;
      do until (last.appendix);
        set have;
        by appendix;
        output;
        if status in (' ' 'Activate') then do;
          csn_apa = change_serial_number;
          dur_apa = duration;
          mil_apa = mileage;
        end;
      end;
    run;
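As a quick sanity check, and assuming both steps above ran without errors, the two outputs can be compared and inspected. This is only a sketch; it uses the dataset and variable names from the code above and adds nothing beyond PROC COMPARE and PROC PRINT.

```sas
/* Sketch: confirm that version 1 (want) and version 2 (want2) agree. */
proc compare base=want compare=want2;
run;

/* Inspect the values carried forward from the prior activation. */
proc print data=want noobs;
  var appendix change_serial_number status duration
      csn_apa dur_apa mil_apa;
run;
```

Tracing the sample data by hand, rows 3 and 4 should show csn_apa=2, dur_apa=24 and mil_apa=90000 (the activation on row 2). Rows 1 and 2 should show the values from row 0, because a blank status is treated as the starting state, and row 0 itself should show missing tracking values.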