Unsure of how to reindex a Pandas dataframe using integers 0 to n-1

Time:11-28

I need to reset the data frame index so it goes from 0 to n-1, where n is the number of rows in the data frame.

I only get one attempt at this, so I haven't tried anything yet. The resources I've found online (including multiple Stack Overflow answers) show plenty of indexing examples, but none that mention n-1.

df.reset_index()

df.reset_index(drop=True)

df.reset_index(drop=True, inplace=True)

I just want to be sure I'm not missing something, but so far nothing I've seen allows for n-1. I may be overthinking this.

Here's sample data:

`longitude  latitude    housing_median_age  total_rooms total_bedrooms  population  households  median_income   median_house_value  ocean_proximity
0   -122.23 37.88   45.0    884.0   131.0   323.0   130.0   8.3252  4526030.0   NEAR BAY
1   -122.34 37.88   41.0    3063.0  930.0   2560.0  926.0   1.7375  3500040.0   NEAR BAY
2   -122.29 37.88   54.0    1211.0  263.0   525.0   230.0   3.8672  2167030.0   NEAR BAY
3   -122.28 37.88   55.0    1845.0  333.0   772.0   335.0   4.2614  2613030.0   NEAR BAY
4   -122.26 37.88   53.0    2553.0  418.0   898.0   404.0   6.2425  3918030.0   NEAR BAY
`

CodePudding user response:

Commands that you included in your post give the following results:

  • df.reset_index() - creates a new DataFrame with:

    • the new index - consecutive values starting from 0,
    • the old index converted into a regular column, named after the index (or "index" if the index has no name). Whether you actually need the old index is up to you.
  • df.reset_index(drop=True) - creates a new DataFrame:

    • with the new index (as before),
    • but the old index is dropped (not converted to a regular column).

Both of the above commands leave the original DataFrame intact, so you should save the result either under the original name (df) or under any other name of your choice (in which case you have both the "old" and the "new" DataFrame).
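To make the difference between the two calls concrete, here is a minimal sketch. The small frame and its index values (10, 20, 30) are made up for illustration and are not taken from the question's data.

```python
import pandas as pd

# Hypothetical frame with a non-consecutive index, e.g. left over after filtering.
df = pd.DataFrame({"latitude": [37.88, 37.86, 37.85]}, index=[10, 20, 30])

kept = df.reset_index()              # old index becomes a regular column named "index"
dropped = df.reset_index(drop=True)  # old index is discarded

print(list(kept.columns))    # ['index', 'latitude']
print(list(dropped.index))   # [0, 1, 2]
print(list(df.index))        # [10, 20, 30]  (the original is unchanged)
```

Neither call touches df itself, which is why the result has to be assigned to a name.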

But df.reset_index(drop=True, inplace=True) performs the same operation in place: it modifies df directly and returns None, so the result is the same as if you had run df = df.reset_index(drop=True).

I suppose this is what you actually need: in every case the new index runs from 0 to n-1, where n is the number of rows.
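As a quick check that the in-place form really gives 0 to n-1, here is a small sketch. The starting index values (7, 3, 9, 1) are an assumption chosen only to show a scrambled index being replaced.

```python
import pandas as pd

# Hypothetical frame with an arbitrary, out-of-order index.
df = pd.DataFrame({"population": [323.0, 2560.0, 525.0, 772.0]}, index=[7, 3, 9, 1])

# inplace=True modifies df itself; the call returns None, so do not assign it back to df.
result = df.reset_index(drop=True, inplace=True)

n = len(df)
print(result)                            # None
print(list(df.index) == list(range(n)))  # True: the index is 0, 1, ..., n-1
```

There is no need to pass n or n-1 anywhere: the new index always starts at 0 and has one entry per row.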
