How can you extract a scalar in a for loop?

I'm using a for loop to slice a dataframe and then extract information from each slice. I store that information in a dict and append the dict to a list for later use. My problem is that the information is not usable: it exists as a pandas Series rather than as the actual scalar value of the cell I'm trying to extract. Below is an example of the process I'm trying to execute:

import numpy as np
import pandas as pd

df = pd.DataFrame({'c1': np.arange(0, 15), 'c2': np.arange(0, 15), 'c3': ['A']*5 + ['B']*5 + ['C']*5})
iterable = ['A', 'B', 'C']
dict_list = []
for i in iterable:
    out_dict = dict()
    data = df[df.c3==i]
    out = data.c1[-1:].iloc[0]
    out_dict['out'] = out
    dict_list.append(out_dict)
out_df = pd.DataFrame.from_records(dict_list)

Bizarrely, the code above works, but when I switch to my real data, I get IndexError: single positional indexer is out-of-bounds at the line out = data.c1[-1:].iloc[0], which I believe means that there is no index. In both my data and the example above, the type of data.c1[-1:] is pandas.core.series.Series and they both have length 1. Even stranger, if I run out = data.c1[-1:] inside the for loop and then run out.iloc[0] outside the for loop, I don't get an error.

Does anyone know why iloc would fail in this case? Is there a way to force out to be indexable?

CodePudding user response：

This happens when you index a row or column by a position that is out of range, i.e. greater than or equal to the number of rows or columns in your dataframe.

dataframe1.fillna("nan") # or whatever you want as a fill value
dataframe2.fillna("nan")

For example, df.iloc[:, 10] refers to the eleventh column, so it raises this error if the dataframe has fewer than eleven columns.
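Here is a minimal sketch of how this can arise in the question's loop. It assumes, which the question does not confirm, that in the real data at least one value in iterable has no matching rows in c3. The filtered slice is then empty, the [-1:] slice of it is an empty Series, and .iloc[0] has nothing to return. The key 'D' is hypothetical, and .iloc[-1:] is used in place of [-1:] to make the positional slice explicit.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'c1': np.arange(0, 15),
    'c2': np.arange(0, 15),
    'c3': ['A'] * 5 + ['B'] * 5 + ['C'] * 5,
})

# 'D' is a hypothetical key with no matching rows in c3.
iterable = ['A', 'B', 'C', 'D']

results = {}
for i in iterable:
    data = df[df.c3 == i]
    tail = data.c1.iloc[-1:]  # slicing does not raise, even when data is empty
    try:
        results[i] = tail.iloc[0]  # raises IndexError if tail has no rows
    except IndexError as err:
        results[i] = f"IndexError: {err}"

print(results)
```

If the real data behaves like this, printing len(data) inside the loop should show a 0 for the failing key. It would also explain why running out.iloc[0] after the loop works: out then holds the slice from whichever iteration ran last.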

CodePudding user response：

Okay, I don't have an answer to the original question, but replacing .iloc[0] with .squeeze() solved my issue, like so: out = data.c1[-1:].squeeze()
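One possible explanation for why .squeeze() makes the error go away, sketched under the assumption that some slices are empty: squeeze() only collapses axes of length 1, so a one-element Series becomes a scalar, while an empty Series is returned unchanged instead of raising. If that is what is happening, those groups would silently end up holding an empty Series rather than a value. An explicit emptiness check makes the missing case visible. The helper name last_or_none is made up for illustration and is not part of pandas.

```python
import pandas as pd

one = pd.Series([14], index=[14])
empty = pd.Series([], dtype='int64')

scalar = one.squeeze()          # length-1 Series collapses to a scalar
still_series = empty.squeeze()  # empty Series comes back as a Series

def last_or_none(series):
    """Return the last value of a Series, or None if it is empty."""
    # Hypothetical helper for illustration only.
    if series.empty:
        return None
    return series.iloc[-1]

print(scalar, type(still_series).__name__)
print(last_or_none(one), last_or_none(empty))
```

In the loop from the question, this would be used as out = last_or_none(data.c1), with None marking the slices that had no rows.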