How do I choose a row based off the index in python/pandas?

Time:11-05

I have a row in my csv file here:

   0   1  2                        3                     4                        5                               
0  I  55  A  2018-03-10 00:00:00.000  username_in_current_row  2012-01-24 00:00:00.000 

I want to write something in Python that selects only the rows that have "I" at index 0, and then counts how many records have "I" there. Then I want to print "X records were inserted successfully", where X is the number of rows that contain an "I".

So in this case I would do something like:

import pandas as pd
df = pd.read_csv('file.csv', header=None)
print(df)

This would print the dataframe.

df_row_count = df.shape[0] 

Here I want a count of how many rows have "I" at index 0.

print("X amount of records inserted successfully.") 

I think I need an f-string here in place of X that would just count the total number of rows in the table with "I"?

CodePudding user response:

It sounds like you want to filter the dataframe. If you want to find where there's an "I" in column 0 of the row whose index is 0 and then print the length, you can do this:

filtered_df = df[(df.index == 0) & (df[0] == "I")]
print(f'{len(filtered_df)} records had an "I" at index 0 for column 0.')

The thing is, only the first row will have an index of 0 after your .read_csv() line, so it's 1 at most.

Did you actually want to count how many rows have "I" in the first column? If so, it's more like this:

filtered_df = df[df[0]=="I"]
print(f'{len(filtered_df)} records had an "I" in the first column.')
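
To make the difference between the two filters concrete, here is a minimal sketch that uses made-up in-memory data in place of file.csv. The sample rows and the names sample_csv, first_row_match, and column_match are assumptions for illustration only, not part of the original question.

```python
import io

import pandas as pd

# Hypothetical stand-in for file.csv: three rows, no header line.
sample_csv = io.StringIO(
    "I,55,A\n"
    "U,56,B\n"
    "I,57,C\n"
)
df = pd.read_csv(sample_csv, header=None)

# With header=None and no index_col, rows are labelled 0, 1, 2, ...
# so at most one row can carry the label 0.
first_row_match = df[(df.index == 0) & (df[0] == "I")]

# Filtering on column 0 checks every row instead.
column_match = df[df[0] == "I"]

print(len(first_row_match))
print(len(column_match))
```

With this sample data the first filter can match only the top row, while the second matches every row whose first column is "I".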

Hope that helps. :)

CodePudding user response:

To filter the rows with "I" in column 0, use boolean indexing:

out = df[df[0].eq('I')]

To count them, sum the boolean Series (each True counts as 1):

count = df[0].eq('I').sum()
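
Putting that count into the message the question asks for could look like the sketch below. The inline DataFrame is invented sample data, and inserted_count and message are illustrative names that do not appear in the original post.

```python
import pandas as pd

# Made-up sample data standing in for the CSV contents.
df = pd.DataFrame({0: ["I", "U", "I", "D"], 1: [55, 56, 57, 58]})

# eq("I") returns a boolean Series; summing it counts the True values.
inserted_count = int(df[0].eq("I").sum())

# The f-string substitutes the count where the question had "X".
message = f"{inserted_count} records were inserted successfully."
print(message)
```

Summing the mask avoids building a filtered copy of the dataframe when only the number is needed.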