group_by: keep all groups that do not contain a specific value, and filter the groups that do

I have the following data frame:

df <- data.frame(
  Code = c("a", "a", "a", "a", "a", "b", "b", "b", "b", "b"),
  Inst = c("Yes", "No", "No", "No", "No", "No", "No", "No", "No", "No"),
  Date = c(
    "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05", 
    "2021-01-06", "2021-01-06", "2021-01-06", "2021-01-09", "2021-01-10"
  )
)

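I want to apply dplyr::group_by to the variable Code and filter for the specific value "Yes" and the minimum Date, but I want to keep all observations for groups that contain no "Yes" value. I tried filter(any(Inst == "Yes")), but this does not work.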

I would like this result:

Code  Inst  Date
a      Yes  2021-01-01
b      No   2021-01-06
b      No   2021-01-06
b      No   2021-01-06

Answer

If a group can contain multiple "Yes" values:

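library(dplyr)

df %>%
  group_by(Code) %>%
  # Keep every row when the group has no "Yes"; otherwise keep only the "Yes" rows
  slice(if (all(Inst != "Yes")) seq_len(n()) else which(Inst == "Yes"))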

  Code  Inst 
  <chr> <chr>
1 a     Yes  
2 b     No   
3 b     No   
4 b     No   
5 b     No   
6 b     No  
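
For contrast, here is a small sketch of why the attempt from the question falls short. On grouped data, filter(any(Inst == "Yes")) evaluates to a single TRUE or FALSE per group, so it keeps or drops whole groups rather than individual rows. The snippet rebuilds the question's df so it runs on its own; the names attempt and wanted are only illustrative.

```r
library(dplyr)

# Same data as in the question, rebuilt so the snippet is self-contained
df <- data.frame(
  Code = rep(c("a", "b"), each = 5),
  Inst = c("Yes", rep("No", 9)),
  Date = c(
    "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05",
    "2021-01-06", "2021-01-06", "2021-01-06", "2021-01-09", "2021-01-10"
  )
)

# One logical value per group: group "a" is kept whole, group "b" is dropped
attempt <- df %>%
  group_by(Code) %>%
  filter(any(Inst == "Yes")) %>%
  ungroup()

# slice() chooses row positions per group instead:
# all rows if there is no "Yes", otherwise only the "Yes" rows
wanted <- df %>%
  group_by(Code) %>%
  slice(if (all(Inst != "Yes")) seq_len(n()) else which(Inst == "Yes")) %>%
  ungroup()
```

Here attempt contains only the rows of group a, including its "No" rows, while wanted contains the single "Yes" row of a plus every row of b.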

Given the updated question:

df %>%
 mutate(Date = as.Date(Date, format = "%Y-%m-%d")) %>%
 group_by(Code) %>%
 slice(if(all(Inst != "Yes")) 1:n() else which(Inst == "Yes")) %>%
 filter(Date == min(Date))

  Code  Inst  Date      
  <chr> <chr> <date>    
1 a     Yes   2021-01-01
2 b     No    2021-01-06
3 b     No    2021-01-06
4 b     No    2021-01-06
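
As a possible alternative, the same result can probably be reached with filter() alone by using a vectorised condition. This is only a sketch under the assumption that the data look like the question's df (rebuilt here so the snippet runs on its own); result is an illustrative name, not part of the original answer.

```r
library(dplyr)

# Same data as in the question, rebuilt so the snippet is self-contained
df <- data.frame(
  Code = rep(c("a", "b"), each = 5),
  Inst = c("Yes", rep("No", 9)),
  Date = c(
    "2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05",
    "2021-01-06", "2021-01-06", "2021-01-06", "2021-01-09", "2021-01-10"
  )
)

result <- df %>%
  mutate(Date = as.Date(Date)) %>%
  group_by(Code) %>%
  # Keep a row if it is a "Yes" row, or if its group has no "Yes" at all
  filter(Inst == "Yes" | !any(Inst == "Yes")) %>%
  # Within what remains of each group, keep the rows with the earliest date
  filter(Date == min(Date)) %>%
  ungroup()
```

The condition Inst == "Yes" | !any(Inst == "Yes") has one value per row, so it avoids the if/else inside slice() while expressing the same rule.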

