The Pandas DataFrame reset_index() Method in Python

Published Wednesday, May 27, 2020, 6:09:44 PM

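The Pandas DataFrame reset_index() method resets the index of a DataFrame. It sets a sequence of integers from 0 to the length of the data minus one as the index. The method is useful when the index needs to be treated as a column, or when the index is meaningless and must be reset to the default before another operation. For a MultiIndex, reset_index() can remove one or more levels, so it can reset either the whole index or only some of its levels. Let's start with the basics.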

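DataFrame in Pandas

A Pandas DataFrame can be thought of as an in-memory representation of an Excel sheet in Python. Its Index object is an immutable array, and the index lets us access rows or columns by label. A DataFrame combines two-dimensional data with its associated labels. DataFrames are widely used in data science, machine learning, scientific computing, and many other data-intensive fields.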
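As a small illustration of label-based access and index immutability, here is a minimal sketch. The two-row DataFrame and the variable name mutated are made up for this example and are not part of the original article.

```python
import pandas as pd

# Hypothetical two-row DataFrame with names as row labels
df = pd.DataFrame({"Marks": [93, 63]}, index=["Rohit", "Mohit"])

# Labels give access to a specific row and column
print(df.loc["Mohit", "Marks"])  # 63

# An Index cannot be changed element by element
try:
    df.index[0] = "Sohit"
    mutated = True
except TypeError:
    mutated = False
print(mutated)  # False
```

To change the labels, you replace the whole index (for example with set_index() or reset_index()) instead of editing it in place.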

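You can create a Pandas DataFrame in several ways. In most cases, you will use the DataFrame constructor and supply the data, labels, and other information. Sometimes you will import the data from a CSV or Excel file. You can pass the data as a two-dimensional list, tuple, or NumPy array. You can also pass it as a dictionary, a Pandas Series instance, or one of many other data types not covered here. Before moving on, let's look at set_index() in Pandas.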
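The construction routes mentioned above can be sketched as follows. The column names and values are invented for illustration only.

```python
import numpy as np
import pandas as pd

columns = ["maths", "science"]  # hypothetical column labels

# From a dictionary: keys become column labels
from_dict = pd.DataFrame({"maths": [93, 63], "science": [88, 55]})

# From a two-dimensional list: labels are passed separately
from_lists = pd.DataFrame([[93, 88], [63, 55]], columns=columns)

# From a NumPy array: labels are passed separately as well
from_array = pd.DataFrame(np.array([[93, 88], [63, 55]]), columns=columns)

print(from_dict.shape, from_lists.shape, from_array.shape)
```

All three calls produce a DataFrame with the same values and the default integer index 0, 1.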

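Pandas DataFrame set_index()

Pandas set_index() sets one or more existing columns (or an array-like such as a list or Series of matching length) as the index of a DataFrame. A Pandas DataFrame is a two-dimensional labeled data structure whose columns can hold different types.

See the following example.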

import pandas as pd

dataset = {
    'Name': ['Rohit', 'Mohit', 'Sohit', 'Arun', 'Shubh'],
    'Roll no': ['01', '03', '04', '05', '09'],
    'Marks in maths': ['93', '63', '74', '94', '83'],
    'Marks in science': ['88', '55', '66', '94', '35'],
    'Marks in english': ['93', '74', '84', '92', '87']}

# Converting dataset dict to a DataFrame
df = pd.DataFrame(dataset)
print('The DataFrame is: ')
print(df.head())

# Appending Name to the existing index (append=True keeps the default integer index)
df.set_index(["Name"], inplace=True,
             append=True, drop=True)
print('After setting Name as an index using set_index()')
print(df.head())

Output

The DataFrame is:
    Name Roll no Marks in maths Marks in science Marks in english
0  Rohit      01             93               88               93
1  Mohit      03             63               55               74
2  Sohit      04             74               66               84
3   Arun      05             94               94               92
4  Shubh      09             83               35               87
After setting Name as an index using set_index()
        Roll no Marks in maths Marks in science Marks in english
  Name
0 Rohit      01             93               88               93
1 Mohit      03             63               55               74
2 Sohit      04             74               66               84
3 Arun       05             94               94               92
4 Shubh      09             83               35               87

You can see that the Name column has been moved into the index. Because append=True is passed, Name is added as a second index level alongside the original integer index instead of replacing it.
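To make the effect of append concrete, the following sketch compares set_index() with and without append=True on a shortened, made-up version of the dataset. The variable names replaced and appended are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Rohit", "Mohit"], "Roll no": ["01", "03"]})

# Default (append=False): Name replaces the integer index
replaced = df.set_index("Name")

# append=True: Name becomes an extra level next to the integer index
appended = df.set_index("Name", append=True)

print(replaced.index.nlevels)        # 1
print(appended.index.nlevels)        # 2
print(list(appended.index.names))    # [None, 'Name']
```

The unnamed first level (None) is the original integer index, which is why the printed table still shows 0, 1, 2, ... to the left of the names.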

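Understanding Pandas DataFrame reset_index

With reset_index(), the index (row labels) of a DataFrame or Series can be reassigned to sequential numbers starting from 0 (row numbers). When row numbers serve as the index, resetting is convenient after sorting has changed the row order or after deleting rows has left gaps in the numbering.

When row names (strings) are used as the index, reset_index() can also drop the current index or move it back into a data column. By combining df.set_index() and df.reset_index(), you can change the index to another column.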
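The renumbering case can be sketched like this. The small DataFrame and the name ranked are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Rohit", "Mohit", "Sohit"], "Marks": [93, 63, 74]})

# Sorting reorders the rows, so the index labels are no longer sequential
ranked = df.sort_values("Marks", ascending=False)
print(list(ranked.index))  # [0, 2, 1]

# drop=True discards the old labels instead of keeping them as a column
ranked = ranked.reset_index(drop=True)
print(list(ranked.index))  # [0, 1, 2]
```

Without drop=True, the old labels would be kept in a new column named index.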

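See the syntax of the Pandas reset_index() function.

Syntax

DataFrame.reset_index(level=None, drop=False, inplace=False, col_level=0, col_fill='')

Parameters

The signature shown above has five parameters, and all of them have default values.

level: An integer, a string, a tuple, or a list. It selects which levels to remove from the index. By default, all levels are removed.
drop: A Boolean. If False (the default), the old index is inserted into the data as a column; if True, the old index is discarded.
inplace: A Boolean. If True, the change is made on the original DataFrame itself instead of on a copy.
col_level: Its default value is 0. When the columns have multiple levels, it selects the column level into which the labels are inserted.
col_fill: An object. When the columns have multiple levels, it determines how the other levels are named.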
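The col_level and col_fill parameters only matter when the columns form a MultiIndex. The sketch below uses a made-up two-level column layout ("marks" over "maths"/"science") and an arbitrary fill label "info" to show where the restored index label lands.

```python
import pandas as pd

# Hypothetical two-level column labels
cols = pd.MultiIndex.from_tuples([("marks", "maths"), ("marks", "science")])
df = pd.DataFrame(
    [[93, 88], [63, 55]],
    index=pd.Index(["Rohit", "Mohit"], name="Name"),
    columns=cols,
)

# Defaults: the label goes into column level 0, other levels are filled with ''
top = df.reset_index()
print(top.columns[0])  # ('Name', '')

# col_level=1 puts the label in the second level; col_fill names the first level
low = df.reset_index(col_level=1, col_fill="info")
print(low.columns[0])  # ('info', 'Name')
```

With single-level columns, as in the examples in this article, these two parameters do not change the result.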

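Return value

If inplace=False (the default), reset_index() returns a new DataFrame with the new index. If inplace=True, it modifies the original DataFrame and returns None.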
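The difference in return values can be checked with a short sketch. The DataFrame and the names new_df and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {"Marks": [93, 63]},
    index=pd.Index(["Rohit", "Mohit"], name="Name"),
)

# inplace=False (default): a new DataFrame is returned, df is unchanged
new_df = df.reset_index()
print(list(new_df.columns))  # ['Name', 'Marks']
print(df.index.name)         # Name

# inplace=True: df itself is modified and the call returns None
result = df.reset_index(inplace=True)
print(result)                # None
print(list(df.columns))      # ['Name', 'Marks']
```

Because the in-place call returns None, writing df = df.reset_index(inplace=True) would replace df with None, which is a common mistake.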

Example of reset_index() on a Pandas DataFrame

Write a program to demonstrate how reset_index() works.

import pandas as pd

dataset = {
    'Name': ['Rohit', 'Mohit', 'Sohit', 'Arun', 'Shubh'],
    'Roll no': ['01', '03', '04', '05', '09'],
    'Marks in maths': ['93', '63', '74', '94', '83'],
    'Marks in science': ['88', '55', '66', '94', '35'],
    'Marks in english': ['93', '74', '84', '92', '87']}

# Creating dataframe from the dict data
df = pd.DataFrame(dataset)
print('The DataFrame is: ')
print(df)

# Setting index on name only
df.set_index(["Name"], inplace=True,
             append=True, drop=True)
print('After setting name as an index using set_index()')
print(df)

# Moving index level 1 (Name) back into the columns
# (col_level has no effect here because the columns have a single level)
df.reset_index(level=1, inplace=True, col_level=1)

# display
print('After resetting the index using reset_index()')
print(df.head())

Output

The DataFrame is:
    Name Roll no Marks in maths Marks in science Marks in english
0  Rohit      01             93               88               93
1  Mohit      03             63               55               74
2  Sohit      04             74               66               84
3   Arun      05             94               94               92
4  Shubh      09             83               35               87
After setting name as an index using set_index()
        Roll no Marks in maths Marks in science Marks in english
  Name
0 Rohit      01             93               88               93
1 Mohit      03             63               55               74
2 Sohit      04             74               66               84
3 Arun       05             94               94               92
4 Shubh      09             83               35               87
After resetting the index using reset_index()
    Name Roll no Marks in maths Marks in science Marks in english
0  Rohit      01             93               88               93
1  Mohit      03             63               55               74
2  Sohit      04             74               66               84
3   Arun      05             94               94               92
4  Shubh      09             83               35               87

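Here we created a dictionary for a small dataset, converted it to a DataFrame, and then used the Pandas set_index() method to add the Name column to the index.

Then reset_index(level=1) moved index level 1 (Name) back into the columns and produced the final output.

As the output shows, after resetting the index the DataFrame returns to its original form.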
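The level argument also accepts a level name, which can be easier to read than a position. A minimal sketch on a shortened, made-up version of the dataset (the name restored is illustrative):

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Rohit", "Mohit"], "Roll no": ["01", "03"]})
df = df.set_index("Name", append=True)  # index levels: (integers, Name)

# Refer to the level by name instead of by position 1
restored = df.reset_index(level="Name")

print(list(restored.columns))  # ['Name', 'Roll no']
print(list(restored.index))    # [0, 1]
```

The restored column is inserted at the front, which is why the result has the same column order as the original DataFrame.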

Write a program that uses reset_index() on an index built from multiple columns (a MultiIndex).

import pandas as pd

dataset = {
    'Name': ['Rohit', 'Mohit', 'Sohit', 'Arun', 'Shubh'],
    'Roll no': ['01', '03', '04', '05', '09'],
    'Marks in maths': ['93', '63', '74', '94', '83'],
    'Marks in science': ['88', '55', '66', '94', '35'],
    'Marks in english': ['93', '74', '84', '92', '87']}

# Converting dataset dict to a DataFrame
df = pd.DataFrame(dataset)
print('The DataFrame is: ')
print(df.head())

# Appending Name and Roll no to the index, giving a MultiIndex
df.set_index(["Name", "Roll no"], inplace=True,
             append=True, drop=True)
print('After setting Name and Roll no as an index using set_index()')
print(df.head())

# Moving index level 2 (Roll no) back into the columns
df.reset_index(level=2, inplace=True, col_level=1)

# Display
print('After resetting index level 2 using reset_index()')
print(df.head())

Output

The DataFrame is:
    Name Roll no Marks in maths Marks in science Marks in english
0  Rohit      01             93               88               93
1  Mohit      03             63               55               74
2  Sohit      04             74               66               84
3   Arun      05             94               94               92
4  Shubh      09             83               35               87
After setting Name and Roll no as an index using set_index()
                Marks in maths Marks in science Marks in english
  Name  Roll no
0 Rohit 01                  93               88               93
1 Mohit 03                  63               55               74
2 Sohit 04                  74               66               84
3 Arun  05                  94               94               92
4 Shubh 09                  83               35               87
After resetting index level 2 using reset_index()
        Roll no Marks in maths Marks in science Marks in english
  Name
0 Rohit      01             93               88               93
1 Mohit      03             63               55               74
2 Sohit      04             74               66               84
3 Arun       05             94               94               92
4 Shubh      09             83               35               87

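First, we defined a dictionary and used it to create a small DataFrame.

Then we used the set_index() function to add the Name and Roll no columns to the index, which gives a MultiIndex.

Finally, reset_index(level=2) removed level 2 from the index. This means that Roll no is no longer part of the index and is a regular column again, while Name remains an index level.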
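Several levels can also be removed in one call by passing a list to level. The sketch below uses a shortened, made-up version of the dataset; the names only_roll and flat are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Rohit", "Mohit"],
    "Roll no": ["01", "03"],
    "Marks in maths": ["93", "63"],
})
df = df.set_index(["Name", "Roll no"], append=True)  # three index levels

# Remove a single level by name
only_roll = df.reset_index(level="Roll no")
print(list(only_roll.index.names))  # [None, 'Name']

# Remove two levels at once by passing a list
flat = df.reset_index(level=["Name", "Roll no"])
print(list(flat.columns))   # ['Name', 'Roll no', 'Marks in maths']
print(flat.index.nlevels)   # 1
```

Levels that are not listed stay in the index, so flat keeps the original integer index.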

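Pandas reset_index() in Jupyter Notebook

Jupyter Notebook is an important tool when working with large datasets.

To work with Pandas, we need a dataset, so I will use the ratings.csv dataset. You can download it here.

Now, let's import Pandas and use the read_csv() function to create a DataFrame from the CSV file.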

import pandas as pd

df = pd.read_csv('ratings.csv')
df.head()

Press Shift + Enter to run the cell and see the output.

Here, DataFrame.head() displays the first five rows of the DataFrame.

Now, set the rating column as the index.

df.set_index(["rating"], inplace=True,
             append=True, drop=True)
df.head()

From the output, you can see that rating is now part of the index.

Now, to reset the index, we will use the reset_index() function.

df.reset_index(inplace = True)
df.head()

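In the output, you can see that the index has been reset.

If you have not set an index on the DataFrame and still call reset_index(), the existing default index is moved into a new column and a fresh zero-based index is created.

In every case, the Pandas reset_index() method sets a sequence of integers from 0 to the length of the data minus one as the index.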
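The case of calling reset_index() without having set an index can be sketched as follows. The small ratings DataFrame is a hypothetical stand-in for ratings.csv, and its column names userId and rating are assumptions.

```python
import pandas as pd

# Hypothetical stand-in for the ratings data
ratings = pd.DataFrame({"userId": [1, 1, 2], "rating": [4.0, 3.5, 5.0]})

# The default integer index is moved into a column named 'index'
out = ratings.reset_index()

print(list(out.columns))  # ['index', 'userId', 'rating']
print(list(out.index))    # [0, 1, 2]
```

Pass drop=True if you do not want the extra index column.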

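Removing the original index: drop

If the drop parameter is set to True, the original index is discarded instead of being kept as a column.

Suppose we set the rating column as the index and then remove that index with reset_index(drop=True).

In the result, the rating column is gone and the DataFrame is left with only the default integer index.

By default, reset_index() does not change the original object and returns a new one, but if the inplace parameter is set to True, the original object is changed.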
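A minimal sketch of the drop behavior, again using a hypothetical stand-in for ratings.csv (the column names and the variables indexed, kept, and dropped are assumptions):

```python
import pandas as pd

# Hypothetical stand-in for the ratings data
ratings = pd.DataFrame({"userId": [1, 1, 2], "rating": [4.0, 3.5, 5.0]})
indexed = ratings.set_index("rating")

# drop=False (default): rating comes back as a column
kept = indexed.reset_index()
print(list(kept.columns))     # ['rating', 'userId']

# drop=True: the rating values are discarded entirely
dropped = indexed.reset_index(drop=True)
print(list(dropped.columns))  # ['userId']
print(list(dropped.index))    # [0, 1, 2]

# Neither call changed the original object
print(indexed.index.name)     # rating
```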

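Changing the index to another column with reset_index() and set_index()

When converting CSV data to a DataFrame, we can pass a parameter named index_col, which indicates which column becomes the index of the DataFrame.

If we pass index_col=0, the first column of the data becomes the index.

We can then use the reset_index() function to reset the DataFrame's index, and set_index() to choose a different column as the new index.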
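The index_col workflow can be sketched without a file on disk by reading CSV text from memory. The CSV content and its column names (userId, movieId, rating) are made up for this example.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file
csv_text = "userId,movieId,rating\n1,10,4.0\n2,20,3.5\n"

# index_col=0 turns the first column (userId) into the index
df = pd.read_csv(io.StringIO(csv_text), index_col=0)
print(df.index.name)     # userId

# Move userId back to a column, then index by movieId instead
df = df.reset_index().set_index("movieId")
print(df.index.name)     # movieId
print(list(df.columns))  # ['userId', 'rating']
```

Chaining the two calls works because both return a new DataFrame when inplace is left at its default.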

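Conclusion

Python is an excellent programming language for data mining and analysis, thanks to its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much more manageable.

The Pandas reset_index() method resets the index of a DataFrame. It sets a sequence of integers from 0 to the length of the data minus one as the index.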

See also

Pandas to_datetime()

Pandas Series to_dataframe()

Pandas unique()

Pandas DataFrame describe()

Pandas ExcelWriter()

Source: compiled by 0x资讯 from APPDIVIDEND. Copyright belongs to the author, Ankit Lathiya. Do not reproduce without permission.