How to read a MultiIndex DataFrame in Python

html5 • November 22, 2022, 9:49 am • Q&A

This is my DataFrame, named df:

University  Subject  Colour
Melb        Math     Red
            English  Blue
Sydney      Math     Green
            Arts     Yellow
            English  Green
Ottawa      Med      Blue
            Math     Yellow

University and Subject are both index levels of this DataFrame.
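For anyone who wants to reproduce the frame, here is a minimal sketch of one way to build it. The construction through MultiIndex.from_tuples is an assumption, since the question only shows the printed frame and not how df was created.

```python
import pandas as pd

# Assumed construction: the question does not show how df was built.
index = pd.MultiIndex.from_tuples(
    [
        ("Melb", "Math"),
        ("Melb", "English"),
        ("Sydney", "Math"),
        ("Sydney", "Arts"),
        ("Sydney", "English"),
        ("Ottawa", "Med"),
        ("Ottawa", "Math"),
    ],
    names=["University", "Subject"],
)
df = pd.DataFrame(
    {"Colour": ["Red", "Blue", "Green", "Yellow", "Green", "Blue", "Yellow"]},
    index=index,
)

print(list(df.index.names))  # names of the index levels
print(list(df.columns))      # the only regular column
```

Colour is the only regular column here, while University and Subject live in the index.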

When I run

print(df.to_dict('index'))

I get

{(Melb, Math): {'Colour': Red}, (Melb, English): {'Colour': Blue}, ...

When I run

print(df["Colour"])

I get

University  Subject  Colour
Melb        Math     Red
            English  Blue
Sydney      Math     Green
            Arts     Yellow
            English  Green
Ottawa      Med      Blue
            Math     Yellow

When I run

print(df["University"])

I get an error

KeyError: 'University'

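What I want is a way to read each set of values separately.

I want to read University first, then Subject, and then Colour.

How can I do this?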

Answer

A quicker way is to use Python's built-in zip function, which is much faster than running a for loop by hand.

Quick answer to your question:

university_list = list(zip(*df.index))[0]
subject_list = list(zip(*df.index))[1]
colour_list = list(df['Colour'])
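As an alternative sketch, not part of the original answer: pandas also exposes each index level by name through Index.get_level_values, which avoids unpacking the index tuples. The small frame below is a stand-in built only for illustration.

```python
import pandas as pd

# Stand-in frame for illustration: two index levels and one regular column.
df = pd.DataFrame(
    {
        "University": ["Melb", "Melb", "Sydney"],
        "Subject": ["Math", "English", "Math"],
        "Colour": ["Red", "Blue", "Green"],
    }
).set_index(["University", "Subject"])

# Index levels are not columns, which is why df["University"] raises KeyError.
# get_level_values returns the values of one level, one entry per row.
university_list = df.index.get_level_values("University").tolist()
subject_list = df.index.get_level_values("Subject").tolist()
colour_list = df["Colour"].tolist()

print(university_list)
print(subject_list)
print(colour_list)
```

Looking levels up by name keeps the code readable if the order of the index levels ever changes.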

Explanation

Getting the index as a list:

index_list = list(zip(*df.index))

Output:

[('Melb', 'Melb', 'Sydney', 'Sydney', 'Sydney', 'Ottawa', 'Ottawa'), ('Math', 'English', 'Math', 'Arts', 'English', 'Med', 'Math')]

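You get a list of tuples, where each tuple corresponds to one index level and holds one value per row.

(The tuples follow the index levels from left to right: the first index level is the first tuple, the second index level is the second tuple, and so on.)

Now, to get a separate list for each index level, you can simply do: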

Universities = list(index_list[0])  # University values, one per row: ['Melb', 'Melb', 'Sydney', ...]
Subjects = list(index_list[1])  # Subject values, one per row: ['Math', 'English', 'Math', 'Arts', ...]
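As a quick check of the unpacking step, this sketch runs it on a small stand-in frame (three rows chosen for illustration). The last line, which removes repeated values while keeping their order, is an addition of mine for the case where each university is wanted only once.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("Melb", "Math"), ("Melb", "English"), ("Sydney", "Math")],
    names=["University", "Subject"],
)
df = pd.DataFrame({"Colour": ["Red", "Blue", "Green"]}, index=index)

# Iterating a MultiIndex yields one (University, Subject) tuple per row;
# zip(*...) transposes those row tuples into one tuple per index level.
index_list = list(zip(*df.index))
universities = list(index_list[0])
subjects = list(index_list[1])

print(index_list)  # [('Melb', 'Melb', 'Sydney'), ('Math', 'English', 'Math')]

# Optional: drop repeated values while preserving first-seen order.
unique_universities = list(dict.fromkeys(universities))
print(unique_universities)  # ['Melb', 'Sydney']
```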

Getting data from a non-index column as a list

You can do this simply with:

column_data = list(df['column_name'])

#which in your case will be

colour_list = list(df['Colour'])
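If the goal is to read the university, subject, and colour of each row together, one possible sketch is to iterate over the column with Series.items(). The three-row frame is a stand-in for illustration, and the variable name rows is hypothetical.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("Melb", "Math"), ("Melb", "English"), ("Sydney", "Math")],
    names=["University", "Subject"],
)
df = pd.DataFrame({"Colour": ["Red", "Blue", "Green"]}, index=index)

# Series.tolist() gives the same result as list(df["Colour"]).
colour_list = df["Colour"].tolist()

# Series.items() yields (index_key, value) pairs; with a two-level
# MultiIndex each key is a (University, Subject) tuple.
rows = [
    (university, subject, colour)
    for (university, subject), colour in df["Colour"].items()
]
print(rows)
```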

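I am extending the answer to address one of the comments.

Now imagine you need the whole DataFrame as a list of tuples, where each tuple holds the data of one column (index columns included).

The list would look like: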

[(Col-1_data, ...), (Col-2_data, ...), ...]

To achieve this, you reset the index, fetch the data, and then set the index again. The code below does the job:

index_names = list(df.index.names)  # save the current index level names so they can be restored later
df.reset_index(inplace=True)  # turn the index levels into ordinary columns
dataframe_raw_list = df.values.tolist()  # list of lists, where each inner list is one row of the DataFrame
df.set_index(index_names, inplace=True)  # restore the original MultiIndex

dataframe_columns_list = list(zip(*dataframe_raw_list))  # list of tuples, where each tuple is one column of the DataFrame

Output:

[(Col-1_data, ...), (Col-2_data, ...), ...]
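A variant sketch that leaves df untouched: calling reset_index() without inplace=True returns a new frame, so there is no index to restore afterwards. The three-row frame and the name flat are assumptions made for illustration.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [("Melb", "Math"), ("Melb", "English"), ("Sydney", "Math")],
    names=["University", "Subject"],
)
df = pd.DataFrame({"Colour": ["Red", "Blue", "Green"]}, index=index)

# reset_index() returns a copy with the index levels as ordinary columns;
# df itself keeps its MultiIndex.
flat = df.reset_index()

# Transpose the row lists into one tuple per column.
dataframe_columns_list = list(zip(*flat.values.tolist()))
print(dataframe_columns_list)

# On the flattened copy, the former index levels can be read like any column.
print(flat["University"].tolist())
```

The flattened copy is also a simple answer to the original KeyError, because University and Subject are regular columns there.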
