30天滚动窗口中的行数
我有一个示例数据框
Account Date Amount
10 2020-06-01 100
10 2020-06-11 500
10 2020-06-21 600
10 2020-06-25 900
10 2020-07-11 1000
10 2020-07-15 600
11 2020-06-01 100
11 2020-06-11 200
11 2020-06-21 500
11 2020-06-25 1500
11 2020-07-11 2500
11 2020-07-15 6700
我想获取每个帐户每 30 天间隔的行数,即
Account Date Amount
10 2020-06-01 1
10 2020-06-11 2
10 2020-06-21 3
10 2020-06-25 4
10 2020-07-11 4
10 2020-07-15 4
11 2020-06-01 1
11 2020-06-11 2
11 2020-06-21 3
11 2020-06-25 4
11 2020-07-11 4
11 2020-07-15 4
我尝试过石斑鱼和重新采样,但这些给了我每 30 天的计数,而不是滚动计数。
提前致谢!
回答
def get_rolling_amount(grp, freq):
return grp.rolling(freq, on="Date", closed="both").count()
df["Date"] = pd.to_datetime(df["Date"])
df["Amount"] = df.groupby("Account").apply(get_rolling_amount, "30D").values
print(df)
印刷:
def get_rolling_amount(grp, freq):
return grp.rolling(freq, on="Date", closed="both").count()
df["Date"] = pd.to_datetime(df["Date"])
df["Amount"] = df.groupby("Account").apply(get_rolling_amount, "30D").values
print(df)