Matching products in a list in R

html5 • September 13, 2022, 2:36 pm • Q&A

I have to classify a list of products like this one:

product_list<-data.frame(product=c('banana from ecuador 1 unit', 'argentinian meat (1 kg) cow','chicken breast','noodles','salad','chicken salad with egg'))

based on the words contained in each element of this vector:

product_to_match<-c('cow meat','deer meat','cow milk','chicken breast','chicken egg salad','anana')

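A product in the data frame belongs to a class only if all the words of that product_to_match entry appear in it.

I'm not sure what the best way to do this is. I want each product classified in a new column, so that I end up with something like this: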

product_list<-data.frame(product=c('banana from ecuador 1 unit', 'argentinian meat (1 kg) cow','chicken breast','noodles','salad','chicken salad with egg'),
                         class=c(NA,'cow meat','chicken breast',NA,NA,'chicken egg salad'))

Note that 'anana' must not match 'banana': the characters are contained in the string, but not as a whole word. I don't know how to do this.

Thank you.

Answer

Maybe this helps:

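# Split both vectors into words on whitespace (note the escaped "\\s+").
# q[i, j] is TRUE when every word of product_to_match[i] occurs as a
# whole word in product_list$product[j].
q <- outer(
  strsplit(product_to_match, "\\s+"),
  strsplit(product_list$product, "\\s+"),
  FUN = Vectorize(function(x, y) all(x %in% y))
)
# Turn each column's TRUE entry into its row index; columns without a match become NA.
product_list$class <- product_to_match[replace(colSums(q * row(q)), colSums(q) == 0, NA)]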

so that

> product_list
                      product             class
1  banana from ecuador 1 unit              <NA>
2 argentinian meat (1 kg) cow          cow meat
3              chicken breast    chicken breast
4                     noodles              <NA>
5                       salad              <NA>
6      chicken salad with egg chicken egg salad
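
For readers who find the `outer()` call dense, the same whole-word matching can be sketched as an explicit loop. This is only an illustration, not the answer's own code: `classify_product` is a hypothetical helper name, and it assumes that when several candidates match, the first one in `product_to_match` should win.

```r
# Hypothetical helper (not part of the original answer): returns the first
# entry of `candidates` whose words all occur as whole words in `product`,
# or NA if no candidate qualifies.
classify_product <- function(product, candidates) {
  product_words <- strsplit(product, "\\s+")[[1]]
  for (candidate in candidates) {
    candidate_words <- strsplit(candidate, "\\s+")[[1]]
    if (all(candidate_words %in% product_words)) {
      return(candidate)
    }
  }
  NA_character_
}

product_list <- data.frame(
  product = c('banana from ecuador 1 unit', 'argentinian meat (1 kg) cow',
              'chicken breast', 'noodles', 'salad', 'chicken salad with egg'),
  stringsAsFactors = FALSE
)
product_to_match <- c('cow meat', 'deer meat', 'cow milk', 'chicken breast',
                      'chicken egg salad', 'anana')

# Apply the helper to every product; vapply guarantees a character result.
product_list$class <- vapply(
  product_list$product, classify_product, character(1),
  candidates = product_to_match, USE.NAMES = FALSE
)
print(product_list)
```

Because `%in%` compares whole elements of the word vectors, 'anana' is never equal to 'banana', which is the behavior the question asks for. A substring test such as `grepl("anana", ...)` would match instead.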

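Also, a variation on the same theme: `product_to_match[max.col(cbind(outer(
strsplit(product_list$product, "\\s+"),
strsplit(product_to_match, "\\s+"),
FUN = Vectorize(function(x, y) all(y %in% x))
), TRUE), "first")]`. The extra `TRUE` column means a product with no match selects an index one past the end of `product_to_match`, which yields `NA`.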
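
The two variants appear to differ when a product satisfies more than one candidate. The sketch below uses a made-up product, 'cow meat milk' (an assumption for illustration, not part of the question's data), which contains all the words of both 'cow meat' and 'cow milk'. The `colSums(q * row(q))` approach adds the matching row indices together, while the `max.col(..., "first")` approach takes the first match.

```r
product_to_match <- c('cow meat', 'deer meat', 'cow milk', 'chicken breast',
                      'chicken egg salad', 'anana')
ambiguous <- "cow meat milk"  # hypothetical product matching two candidates

# Rows are candidates, the single column is the ambiguous product.
q <- outer(
  strsplit(product_to_match, "\\s+"),
  strsplit(ambiguous, "\\s+"),
  FUN = Vectorize(function(x, y) all(x %in% y))
)

# Sum-based lookup: rows 1 and 3 match, so the index becomes 1 + 3 = 4.
sum_based <- product_to_match[replace(colSums(q * row(q)), colSums(q) == 0, NA)]

# First-match lookup: transpose so products are rows, then take the first TRUE.
first_based <- product_to_match[max.col(cbind(t(q), TRUE), "first")]

print(sum_based)    # "chicken breast", an unrelated class
print(first_based)  # "cow meat"
```

With the data in the question no product matches two candidates, so both variants give the same result there. If overlapping candidates are possible, the first-match form is probably the safer choice.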
