Filtering a dataset with a regular expression anchored at the end of the string

I am using a regular expression to filter a dataset with 10 rows, shown below:

ID     Product
1      "VENLAFAXINE HCL CAP ER 24HR 37.5 MG (BASE EQUIVALENT)"
2      "MINOXIDIL POWDER"
3      "MENTHOL LOZENGE 10 MG"
4      "ZINC CHLORIDE GRANULES"
5      "CLOPIDOGREL BISULFATE TAB 75 MG (BASE EQUIV)"
6      "METHYLPREDNISOLONE TAB THERAPY PACK 4 MG (21)"
7      "DEXAMETHASONE TAB THERAPY PACK 1.5 MG (7)"
8      "METHYLPREDNISOLONE DOSE P (16)"
9      "MILLIPRED DP (13)"
10     "ZONACORT 7 DAY"

and I would like the result to look like this:

ID     Product
6      "METHYLPREDNISOLONE TAB THERAPY PACK 4 MG (21)"
7      "DEXAMETHASONE TAB THERAPY PACK 1.5 MG (7)"
8      "METHYLPREDNISOLONE DOSE P (16)"
9      "MILLIPRED DP (13)"

In effect, I want to filter the dataset based on whether the string ends with a number enclosed in parentheses. My attempts so far have not worked.

Answer

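In base R, we can use grepl with a pattern that matches an opening parenthesis (\\(), followed by one or more digits (\\d+), followed by a closing parenthesis (\\)) at the end ($) of the string. Parentheses are regex metacharacters, so they must be escaped, and each backslash is doubled inside an R string literal.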

subset(df1, grepl("\\(\\d+\\)$", Product))
#    ID                                       Product
#6  6 METHYLPREDNISOLONE TAB THERAPY PACK 4 MG (21)
#7  7     DEXAMETHASONE TAB THERAPY PACK 1.5 MG (7)
#8  8                METHYLPREDNISOLONE DOSE P (16)
#9  9                             MILLIPRED DP (13)
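
To illustrate why the escaping matters, here is a minimal sketch on a small made-up vector (x and its values are hypothetical, not part of the original data). It compares the unescaped pattern, the escaped pattern, and an equivalent bracket-class form that needs no backslashes.

```r
# Hypothetical sample strings, chosen only to illustrate the patterns
x <- c("MILLIPRED DP (13)", "BASE (EQUIV)", "ZONACORT 7", "PACK (21) TAB")

# Unescaped parentheses form a capture group, so this pattern only
# asks whether the string ends in one or more digits
unescaped <- grepl("(\\d+)$", x)

# Escaped parentheses are matched literally: "(" digits ")" at the end
escaped <- grepl("\\(\\d+\\)$", x)

# Same idea written with bracket classes instead of backslashes
bracket <- grepl("[(][0-9]+[)]$", x)

print(data.frame(x, unescaped, escaped, bracket))
```

With the escaped or bracket form, only the first string matches, because it is the only one where a parenthesized number sits at the very end. The unescaped form instead matches only the string that ends in a bare digit.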

Data

df1 <- structure(list(ID = 1:10, Product = c("VENLAFAXINE HCL CAP ER 24HR 37.5 MG (BASE EQUIVALENT)", 
"MINOXIDIL POWDER", "MENTHOL LOZENGE 10 MG", "ZINC CHLORIDE GRANULES", 
"CLOPIDOGREL BISULFATE TAB 75 MG (BASE EQUIV)", "METHYLPREDNISOLONE TAB THERAPY PACK 4 MG (21)", 
"DEXAMETHASONE TAB THERAPY PACK 1.5 MG (7)", "METHYLPREDNISOLONE DOSE P (16)", 
"MILLIPRED DP (13)", "ZONACORT 7 DAY")), class = "data.frame", row.names = c(NA, 
-10L))

