逐月变换逐月变换

初始表

salesman    training_date   01/20   02/20   03/20   04/20   05/20   06/20   07/20   08/20   09/20   10/20   11/20   12/20
0   John    2020-11-01       100      20     200     250      0       28      80      30     150     100     300    250
1   Ruddy   2020-07-12       90       50      30     225     300     100      95      10      20      0     20      100

在 Python 中:

t1 = {'salesman': ['John', 'Ruddy'],
     'training_date':['2020-11-30','2020-07-12'],
     '01/20': [100, 90], '02/20':[20,50], '03/20':[200,30],'04/20':[250,225],'05/20':[0,300],'06/20':[28,100],
     '07/20': [80, 95], '08/20':[30,10], '09/20':[150,20],'10/20':[100,0],'11/20':[300,20],'12/20':[250,100],
     }
t1a = pd.DataFrame(data=t1)
t1a

数据帧预期:

    salesman    training_date   training_month  1m_prior    2m_prior    3m_prior    4m_prior    5m_prior    6m_prior
0   John          2020-11-30         300           100        150           30         80      28       0
1   Ruddy         2020-07-12          95           100        300          225         30      50      90

在 Python 中:

t2 = {'salesman': ['John', 'Ruddy'],
     'training_date':['2020-11-30','2020-07-12'],
     'training_month': [300, 95], '1m_prior':[100,100], '2m_prior':[150,300],
     '3m_prior':[30,225],'4m_prior':[80,30],'5m_prior':[28,50], '6m_prior': [0, 90]}
t2a = pd.DataFrame(data=t2)
t2a

说明:
约翰在 11 月 1 日接受了培训。在 11 月 1 日之前的 100 万美元,即 10 月,约翰产生了 100 美元。在 11 月 1 日和 9 月之前的 200 万美元,John 产生了 150 美元。

Ruddy 于 7 月 12 日接受了培训。7 月 12 日之前的 100 万,即 6 月,Ruddy 产生了 100 美元。在 5 月 12 日之前的 200 万,Ruddy 产生了 300 美元。

在理想情况下,我们开始计算 1 个整月,始终从每个月的 1 日开始。因此,如果 Ruddy 在 2020 年 7 月 12 日受雇,那么前一个月应该是 6 月 1 日 - 6 月 30 日。

到目前为止,我们在 Excel 中手动转换数据。

回答

熔化+枢轴

t1a['training_date'] = pd.to_datetime(t1a['training_date'])

如果您确实需要在 6 个月前:

u = t1a.melt(['salesman','training_date'])
diff_ = (pd.to_datetime(u['variable'],format='%m/%y').dt.month
           .sub(u['training_date'].dt.month))

u = u.assign(k=diff_).loc[lambda x: x['k'].between(-6,0)]
choices = (diff_.abs().astype(str).replace("0","").add("")+
            np.select([diff_<0,diff_==0],['months_prior','training_month']))

out = (u.assign(Key=choices).pivot_table(index=["salesman","training_date"]
,columns="Key",values="value").set_index("training_month",append=True))

print(out)
Key                                    1months_prior  2months_prior  
salesman training_date training_month                                 
John     2020-11-30    300                       100            150   
Ruddy    2020-07-12    95                        100            300   

Key                                    3months_prior  4months_prior  
salesman training_date training_month                                 
John     2020-11-30    300                        30             80   
Ruddy    2020-07-12    95                        225             30   

Key                                    5months_prior  6months_prior  
salesman training_date training_month                                
John     2020-11-30    300                        28              0  
Ruddy    2020-07-12    95                         50             90 

如果您想要包括之前和之后的所有月份:

u = t1a.melt(['salesman','training_date'])
diff_ = (pd.to_datetime(u['variable'],format='%m/%y').dt.month
           .sub(u['training_date'].dt.month))

choices = (diff_.abs().astype(str).replace("0","").add("")+
           np.select([diff_>0,diff_<0,diff_==0],
           ['months_later','months_prior','training_month']))

out = (u.assign(Key=choices).pivot_table(index=["salesman","training_date"]
   ,columns="Key",values="value").set_index("training_month",append=True))

print(out)

Key                                    10months_prior  1months_later  
salesman training_date training_month                                  
John     2020-11-30    300.0                    100.0          250.0   
Ruddy    2020-07-12    95.0                       NaN           10.0   

Key                                    1months_prior  2months_later  
salesman training_date training_month                                 
John     2020-11-30    300.0                   100.0            NaN   
Ruddy    2020-07-12    95.0                    100.0           20.0   

Key                                    2months_prior  3months_later  
salesman training_date training_month                                 
John     2020-11-30    300.0                   150.0            NaN   
Ruddy    2020-07-12    95.0                    300.0            0.0   

Key                                    3months_prior  4months_later  
salesman training_date training_month                                 
John     2020-11-30    300.0                    30.0            NaN   
Ruddy    2020-07-12    95.0                    225.0           20.0   

Key                                    4months_prior  5months_later  
salesman training_date training_month                                 
John     2020-11-30    300.0                    80.0            NaN   
Ruddy    2020-07-12    95.0                     30.0          100.0   

Key                                    5months_prior  6months_prior  
salesman training_date training_month                                 
John     2020-11-30    300.0                    28.0            0.0   
Ruddy    2020-07-12    95.0                     50.0           90.0   

Key                                    7months_prior  8months_prior  
salesman training_date training_month                                 
John     2020-11-30    300.0                   250.0          200.0   
Ruddy    2020-07-12    95.0                      NaN            NaN   

Key                                    9months_prior  
salesman training_date training_month                 
John     2020-11-30    300.0                    20.0  
Ruddy    2020-07-12    95.0                      NaN 


以上是逐月变换逐月变换的全部内容。
THE END
分享
二维码
< <上一篇
下一篇>>