I have a CSV file containing a list of James Bond movies, which I load into a pandas DataFrame. I am practicing the loc method by passing letters to it, but I cannot interpret the results.
import pandas as pd
bond = pd.read_csv("jamesbond.csv", index_col = "Film")
bond.sort_index(inplace = True)
bond.head(3)
bond.loc["A": "I"]
The result for the above code is:
bond.loc["a": "i"]
And the result for the above code is:
What is happening here? Could someone please help me understand how pandas behaves in this case?
Following is the file:
CodePudding user response:
Your DataFrame uses the "Film" column as its index because of the option index_col = "Film" passed on import. That column holds the name of each film as a string, and every name starts with a capital letter. bond.loc["A":"I"] returns all films whose index label is greater than or equal to "A" and less than or equal to "I" (pandas label slices include the upper bound). By the rules of string comparison in Python, that covers every film beginning with "A" through "H", plus a film titled exactly "I"; a longer title beginning with "I" sorts after "I" and falls outside the slice. If you enter e.g. "A" <= "b" <= "I" at the Python prompt you will get False: lower-case letters are not within the range, because ord("b") > ord("I"). For the same reason, bond.loc["a":"i"] finds no films when every title starts with a capital letter.
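The comparison rules described above can be checked on a small frame. This is only a sketch: the four-row DataFrame is a hypothetical stand-in for jamesbond.csv (which is not shown in the thread), and the names upper and lower are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for jamesbond.csv: a handful of titles, indexed by "Film"
bond = pd.DataFrame(
    {"Year": [1985, 1962, 1995, 1989]},
    index=pd.Index(
        ["A View to a Kill", "Dr. No", "GoldenEye", "Licence to Kill"],
        name="Film",
    ),
).sort_index()

# Python compares strings character by character, using code points
print(ord("A"), ord("I"), ord("b"))  # 65 73 98
print("A" <= "b" <= "I")             # False: "b" sorts after "I"
print("A" <= "Dr. No" <= "I")        # True: "D" lies between "A" and "I"

# Label slices on a sorted index keep every label between the two bounds
upper = bond.loc["A":"I"]
lower = bond.loc["a":"i"]

print(list(upper.index))  # titles that sort between "A" and "I"
print(len(lower))         # 0: no capitalised title sorts at or after "a"
```

"Licence to Kill" is left out of the first slice because "L" sorts after "I", and the second slice is empty because every capital letter sorts before "a".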
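If you wrote bond.index = bond.index.str.lower() that would change the index to lower case, and you could then search films using e.g. bond.loc["a":"i"] (but bond.loc["A":"I"] would no longer return any films). Lower-casing can change the sort order, so it is safest to call bond.sort_index(inplace = True) again before slicing.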
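A minimal sketch of the lower-casing approach, again on a hypothetical four-row frame rather than the real CSV. The names lowered, lower_hits and upper_hits are illustrative only, and the extra sort_index() call is a precaution, since label slicing with bounds that are not in the index needs a sorted index.

```python
import pandas as pd

# Hypothetical stand-in for jamesbond.csv
bond = pd.DataFrame(
    {"Year": [1985, 1962, 1995, 1989]},
    index=pd.Index(
        ["A View to a Kill", "Dr. No", "GoldenEye", "Licence to Kill"],
        name="Film",
    ),
)

# Work on a copy so the original capitalised index is kept
lowered = bond.copy()
lowered.index = lowered.index.str.lower()
lowered = lowered.sort_index()  # re-sort: lower-casing may change the order

lower_hits = lowered.loc["a":"i"]  # now matches lower-case titles
upper_hits = lowered.loc["A":"I"]  # empty: no lower-case title sorts that early

print(list(lower_hits.index))
print(len(upper_hits))
```

Keeping the original frame untouched and slicing the lower-cased copy avoids losing the properly capitalised titles.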
CodePudding user response:
DataFrame.loc["A":"I"] returns the rows whose index labels fall within that range, at least from what I can see and from what I tried to reproduce. Could you attach the data?