Python pandas dataseries slice with label vs index discrepancy

Time:12-30

import pandas as pd
s2 = pd.Series([10, 20, 30, 40, 50, 60], index=['a', 'b', 'c', 'd', 'e', 'f'])

When I select a slice using labels, as below

print(s2['b':'e'])

output is

b    20
c    30
d    40
e    50
dtype: int64

but when I select a slice using integer positions, as below

print(s2[1:4])

output is

b    20
c    30
d    40
dtype: int64

Why is it that slicing by position stops at stop-1, whereas slicing by label includes the final label as well?

CodePudding user response:

s2['b':'e']

This works similarly to .loc[], which is label-based.

The documentation describes this case as a slice object with labels 'a':'f' (Note that contrary to usual Python slices, both the start and the stop are included, when present in the index! See Slicing with labels and Endpoints are inclusive.)

whereas s2[1:4]

works similarly to .iloc[], which is position-based.

Here the general slicing rules of Python apply, so the stop position is excluded. See https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html
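
To make the difference explicit, here is a minimal sketch that uses the .loc and .iloc accessors directly instead of plain square brackets. It reuses the s2 series from the question; the names by_label and by_position are only illustrative.

```python
import pandas as pd

s2 = pd.Series([10, 20, 30, 40, 50, 60], index=['a', 'b', 'c', 'd', 'e', 'f'])

# Label-based slice: both 'b' and 'e' are included.
by_label = s2.loc['b':'e']

# Position-based slice: positions 1, 2, 3 (position 4 is excluded).
by_position = s2.iloc[1:4]

print(by_label.tolist())     # [20, 30, 40, 50]
print(by_position.tolist())  # [20, 30, 40]
```

Writing .loc or .iloc explicitly states which rule is intended, so the reader does not have to infer it from the type of the slice bounds.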

CodePudding user response:

From the documentation: "A slice object with labels 'a':'f' (Note that contrary to usual Python slices, both the start and the stop are included, when present in the index! See Slicing with labels and Endpoints are inclusive.)"

When you slice by label (such as the strings in your example), the result includes both the start and the stop. When you slice by integer position (which is also what s[1:4] does on a Series with the default RangeIndex), the result runs from start to stop-1.
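
As a small illustration of the same rule on an integer index, the sketch below builds a series with the default RangeIndex (the name s3 is hypothetical, not from the question) and compares the two accessors. Even though the labels here are integers, .loc still treats them as labels and includes the stop.

```python
import pandas as pd

# Default RangeIndex: labels 0..5 coincide with positions 0..5.
s3 = pd.Series([10, 20, 30, 40, 50, 60])

# .loc slices by label, so labels 1 through 4 are all included.
loc_slice = s3.loc[1:4]

# .iloc slices by position, so position 4 is excluded.
iloc_slice = s3.iloc[1:4]

print(loc_slice.tolist())   # [20, 30, 40, 50]
print(iloc_slice.tolist())  # [20, 30, 40]
```

So the inclusive endpoint comes from label-based slicing, not from the type of the index.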
