Split a pandas column into two at the number (which contains a time)

html5 • September 13, 2022, 5:03 PM • Q&A

I have a dataframe:

col_1
Agent AB 7:00 AM
Agent AB 7:00 AM
Cust XY 8:00 AM
Cust XY 9:00 AM
Agent AB 11:00 AM

I want to split it into 2 columns so that the time ends up in a new column.

Expected output:

col_1        col_2
Agent AB     7:00 AM
Agent AB     7:00 AM
Cust XY      8:00 AM
Cust XY      9:00 AM
Agent AB     11:00 AM

I did some research and found that this can be done with string slicing.

Something like:

df['col_2'] = df['col_1'].str[-8:].str.strip()

Is there a better way?

Answer

df["col_1"].str.extract(r"^(\D+)(.+)$").rename(columns={0: "col_1", 1: "col_2"})

gives

       col_1     col_2
0  Agent AB    7:00 AM
1  Agent AB    7:00 AM
2   Cust XY    8:00 AM
3   Cust XY    9:00 AM
4  Agent AB   11:00 AM

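The regex looks for a run of consecutive non-digits with (\D+), then captures the rest of the string with (.+). Then we rename the columns.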
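As a self-contained sketch of the approach above, the following rebuilds the sample dataframe from the question and applies the same pattern. One detail worth noting: because `\D+` matches every non-digit, it also takes the space just before the time, so the first column is stripped afterwards. The variable name `parts` is only an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col_1": [
        "Agent AB 7:00 AM",
        "Agent AB 7:00 AM",
        "Cust XY 8:00 AM",
        "Cust XY 9:00 AM",
        "Agent AB 11:00 AM",
    ]
})

# Group 1: leading run of non-digits; group 2: everything from the first digit on.
parts = df["col_1"].str.extract(r"^(\D+)(.+)$").rename(columns={0: "col_1", 1: "col_2"})

# \D+ also consumes the space before the first digit, so remove it.
parts["col_1"] = parts["col_1"].str.strip()

print(parts)
```

This relies on the name part containing no digits at all; a value such as "Agent 7 9:00 AM" would be split at the first digit rather than at the time.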

Answer

With your shown samples, please try the following.

import pandas as pd
df["col_1"].str.extract(r"^(.*?)\s+(\d{1,2}:\d{1,2} [AP]M)$").rename(columns={0: "col_1", 1: "col_2"})

Online demo for the above regex

Explanation: a detailed breakdown of the above regex.

^(.*?)                      ## 1st capturing group: non-greedy match from the start of the value, up to the run of one or more whitespace characters that follows.
\s+                         ## One or more whitespace characters.
(\d{1,2}:\d{1,2} [AP]M)$    ## 2nd capturing group: 1 or 2 digits, a colon, 1 or 2 digits, a space, then AM or PM, anchored at the end of the value.

The output for the shown samples is as follows:

      col_1    col_2
0  Agent AB  7:00 AM
1  Agent AB  7:00 AM
2   Cust XY  8:00 AM
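A variant of the same pattern, as a minimal sketch: pandas names the output columns after named capturing groups, so the `rename` step can be dropped. The last step, which parses the text into `datetime.time` values, is an optional addition that is not part of the original answer, and the column name `col_2_time` is hypothetical.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col_1": [
        "Agent AB 7:00 AM",
        "Agent AB 7:00 AM",
        "Cust XY 8:00 AM",
        "Cust XY 9:00 AM",
        "Agent AB 11:00 AM",
    ]
})

# Named groups (?P<name>...) become the column names of the extracted frame.
pattern = r"^(?P<col_1>.*?)\s+(?P<col_2>\d{1,2}:\d{1,2} [AP]M)$"
result = df["col_1"].str.extract(pattern)

# Optional: parse the 12-hour clock strings into time objects
# (col_2_time is an illustrative column name).
result["col_2_time"] = pd.to_datetime(result["col_2"], format="%I:%M %p").dt.time

print(result)
```

Because this pattern anchors on the time format itself, rows that do not end in a matching time come back as NaN in both columns instead of being split at an arbitrary digit.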
