Applying an if/else function to two string columns in Python/pandas
In my DataFrame I have:
Name Sex Height
Jackie F Small
John M Tall
I wrote the following function to create a new column based on the combination of the two values:
def genderfunc(x,y):
    if x =='Tall' & y=='M':
        return 'T Male'
    elif x =='Medium' & y=='M':
        return 'Male'
    elif x =='Small' & y=='M':
        return 'Male'
    elif x =='Tall' & y=='F':
        return 'T Female'
    elif x =='Medium' & y=='F':
        return 'Female'
    elif x =='Small' & y=='F':
        return 'Female'
    else:
        return y
The line where I apply this function:
df['GenderDetails'] = df.apply(genderfunc(df['Height'],df['Sex']))
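I get the following error:
TypeError: Cannot perform 'rand_' with a dtyped [object] array and scalar of type [bool]
Any ideas about what I am doing wrong here? This is my first time using functions.
Thanks!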
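Answer
Here is another approach, using map. Concatenate the two columns into a single key and look it up in a dictionary: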
map_ = {"TallM": "T Male", "SmallF": "Female"}
df['GenderDetails'] = (df['Height'] + df['Sex']).str.strip().map(map_)
Name Sex Height GenderDetails
0 Jackie F Small Female
1 John M Tall T Male
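Note that the dictionary above only lists the two combinations present in the sample data; any other combination would map to NaN. A minimal sketch of a fuller version, assuming the six Height/Sex combinations from the question and falling back to the Sex value for anything else (the names full_map and key are illustrative, not from the original answer):

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Name": ["Jackie", "John"],
    "Sex": ["F", "M"],
    "Height": ["Small", "Tall"],
})

# Hypothetical full mapping: keys are Height + Sex concatenated.
full_map = {
    "TallM": "T Male", "MediumM": "Male", "SmallM": "Male",
    "TallF": "T Female", "MediumF": "Female", "SmallF": "Female",
}

# Build the lookup key, stripping stray whitespace from each column first.
key = df["Height"].str.strip() + df["Sex"].str.strip()

# Unmapped combinations become NaN; fillna substitutes the Sex value,
# mirroring the else branch of the question's function.
df["GenderDetails"] = key.map(full_map).fillna(df["Sex"])
print(df)
```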
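Answer
Or you can use np.select, if performance is a concern. Each condition in condlist is paired with the value at the same position in the list of choices, and rows that match no condition fall back to the default (here, the Sex column):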
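import numpy as np

# One boolean mask per Height/Sex combination; the parentheses are required
# because & binds more tightly than ==.
condlist = [(df['Height'] == 'Tall') & (df['Sex'] == 'M'),
            (df['Height'] == 'Medium') & (df['Sex'] == 'M'),
            (df['Height'] == 'Small') & (df['Sex'] == 'M'),
            (df['Height'] == 'Tall') & (df['Sex'] == 'F'),
            (df['Height'] == 'Medium') & (df['Sex'] == 'F'),
            (df['Height'] == 'Small') & (df['Sex'] == 'F')]
# Labels in the same order as the conditions above.
choicelist = [
    'T Male',
    'Male',
    'Male',
    'T Female',
    'Female',
    'Female'
]
# Rows matching no condition keep their Sex value.
df['GenderDetails'] = np.select(condlist, choicelist, df['Sex'])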
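For comparison, the row-wise function from the question can also be made to work. Two things stand out in the original. First, & binds more tightly than ==, so x == 'Tall' & y == 'M' is not parsed as two separate comparisons, which is likely where the error comes from. Second, df.apply(genderfunc(df['Height'], df['Sex'])) calls the function once with whole columns instead of passing the function itself to apply. The following is a minimal sketch under those assumptions, using and for scalar comparisons and axis=1 so the function receives one row at a time. It is typically slower than the vectorized approaches above.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Name": ["Jackie", "John"],
    "Sex": ["F", "M"],
    "Height": ["Small", "Tall"],
})

def genderfunc(row):
    # row is a single DataFrame row, so these are plain strings and
    # the ordinary boolean operator `and` applies.
    x, y = row["Height"], row["Sex"]
    if x == "Tall" and y == "M":
        return "T Male"
    elif x in ("Medium", "Small") and y == "M":
        return "Male"
    elif x == "Tall" and y == "F":
        return "T Female"
    elif x in ("Medium", "Small") and y == "F":
        return "Female"
    else:
        return y

# Pass the function itself (not its result); axis=1 applies it per row.
df["GenderDetails"] = df.apply(genderfunc, axis=1)
print(df)
```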