Title-case the words in a column, except certain words

html5 • December 26, 2022, 9:26 pm • Q&A

How can I title-case every word except those in a list, keep?

keep = ['for', 'any', 'a', 'vs']
df.col
0    1. The start for one
1    2. Today's world any
2    3. Today's world vs. yesterday.

Expected output:

     number   title
0     1       The Start for One
1     2       Today's World any
2     3       Today's World vs. Yesterday.

I tried:

df['col'] = df.col.str.title().mask(~clean['col'].isin(keep))

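Answer

Here is one way, using str.replace with a replacement function:

def replace(match):
    # title-case the matched word unless it is in the keep list
    word = match.group(1)
    if word not in keep:
        return word.title()
    return word

# pass regex=True explicitly: a callable replacement only works with a regex pattern
df['title'] = df['title'].str.replace(r'(\w+)', replace, regex=True)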

   number                         title
0       1             The Start for One
1       2             Today'S World any
2       3  Today'S World vs. Yesterday.
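
The output above shows Today'S rather than Today's, because the pattern stops at the apostrophe and str.title then upper-cases the trailing "s" as its own word. A minimal sketch of one possible workaround follows, using only the standard re module. The helper name title_except and the pattern [\w']+ are illustrative assumptions, not part of the original answer, and capitalize() is swapped in for title(); note that capitalize() also lower-cases the rest of each word.

```python
import re

keep = {'for', 'any', 'a', 'vs'}

def title_except(text, keep):
    """Capitalize each word in text unless it is listed in keep (hypothetical helper)."""
    def fix(match):
        word = match.group(0)
        # capitalize() upper-cases only the first character and lower-cases
        # the rest, so "today's" becomes "Today's" rather than "Today'S"
        return word if word in keep else word.capitalize()
    # [\w']+ treats an apostrophe as part of the word; the period after
    # "vs" is not matched, so "vs" is still found in keep
    return re.sub(r"[\w']+", fix, text)

titles = ["The start for one", "Today's world any", "Today's world vs. yesterday."]
result = [title_except(t, keep) for t in titles]
print(result)
```

With a DataFrame, the same helper could be applied per row, for example df['title'].map(lambda s: title_except(s, keep)).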

Answer

First, we create your number and title columns. Then we use Series.explode to get one word per row. If a word is in keep we leave it alone; otherwise we apply Series.str.title:

keep = ['for', 'any', 'a', 'vs']

# create 'number' and 'title' column
df[['number', 'title']] = df['col'].str.split(".", expand=True, n=1)
df = df.drop(columns='col')

# apply str.title if not in keep
words = df['title'].str.split().explode()
# look words up in keep without their periods, but leave the words themselves intact
in_keep = words.str.replace(".", "", regex=False).isin(keep)
words = words.mask(in_keep).str.title().fillna(words)
df['title'] = words.groupby(level=0).agg(" ".join)

Output

  number                         title
0      1             The Start for One
1      2             Today'S World any
2      3  Today'S World vs. Yesterday.
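
The output above still differs from the expected output in one respect: str.title turns Today's into Today'S. As a rough sketch of how the explode approach might be adapted, the variant below swaps in Series.str.capitalize and uses Series.str.extract to split off the number. The extract pattern and the set of punctuation characters stripped before the keep lookup are assumptions chosen for this sample data.

```python
import pandas as pd

keep = ['for', 'any', 'a', 'vs']
df = pd.DataFrame({'col': ["1. The start for one",
                           "2. Today's world any",
                           "3. Today's world vs. yesterday."]})

# split "1. The start ..." into the leading number and the remaining title
df[['number', 'title']] = df['col'].str.extract(r'^\s*(\d+)\.\s*(.*)$')
df = df.drop(columns='col')

# one word per row; the original row label is repeated for each word
words = df['title'].str.split().explode()
# compare against keep without surrounding punctuation, but keep the word intact
in_keep = words.str.strip('.,;:!?').isin(keep)
# capitalize() upper-cases only the first character, so "Today's" is preserved
words = words.where(in_keep, words.str.capitalize())
# join the words back into one title per original row
df['title'] = words.groupby(level=0).agg(' '.join)
print(df)
```

The number column holds strings here; df['number'].astype(int) would convert it if integers are needed.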
