Using For Loop to get count of rows for specific columns (Python/Pandas)

Time:10-07

I would like to rework my code to use a for loop to get row counts for specific columns in Python. There are 15 columns in total, and I am looking for row counts for 4 specific ones at this time.

This is the current code:

# Get row count by affiliate, race, ethnicity and abortion type
print('Column_1:', batch_df.groupby('Column_1').size().sum())
print('Column_2:', batch_df.groupby('Column_2').size().sum())
print('Column_3:', batch_df.groupby('Column_3').size().sum())
print('Column_4:', batch_df.groupby('Column_4').size().sum())

The output (which is correct) is below:

Column_1: 468676
Column_2: 465755
Column_3: 468400
Column_4: 468676

Is there a way to rework the code so that it uses a for loop?

CodePudding user response:

This should work if you want to specify the columns by name:

for col in ['Column_1', 'Column_2', 'Column_3', 'Column_4']:
    print('{}:'.format(col), batch_df.groupby(col).size().sum())
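
As a possible alternative: groupby drops rows whose key is NaN by default, so groupby(col).size().sum() amounts to the number of non-null values in that column, which is also what count() returns. That would explain why the totals in the question differ between columns. A minimal sketch, assuming a small made-up batch_df (the data below is hypothetical, not the asker's):

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's batch_df; the values are made up.
batch_df = pd.DataFrame({
    'Column_1': ['a', 'b', 'a', 'c'],
    'Column_2': ['x', np.nan, 'y', 'x'],
    'Column_3': [1, 2, 3, 4],
})

# Only the columns of interest, listed by name.
cols = ['Column_1', 'Column_2']

# count() returns the number of non-null values per column as a Series.
counts = batch_df[cols].count()

for col, n in counts.items():
    print('{}:'.format(col), n)
```

If the goal is the total number of rows regardless of missing values, len(batch_df) gives that directly.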

CodePudding user response:

There is no need to write out all the column names: df.columns returns them, and you can loop through them:

for c in df.columns:
    print(c)

For example, with this dataframe:

df = pd.DataFrame({
   'Column_1': ['1', '2', '3', '4'],  
   'Column_2' : ['11','12','13','14'],                
   'Column_3': ['101', '102','103', '104']})

the loop will print:

Column_1
Column_2
Column_3
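
To get the counts rather than just the names, the same loop can wrap the groupby call from the question. A sketch using the example dataframe above; the results dict is only an illustrative name. Since this visits every column, select a subset first (for example, a list of names) if only some are needed:

```python
import pandas as pd

df = pd.DataFrame({
    'Column_1': ['1', '2', '3', '4'],
    'Column_2': ['11', '12', '13', '14'],
    'Column_3': ['101', '102', '103', '104']})

# Collect the count for each column while printing it.
results = {}
for c in df.columns:
    results[c] = df.groupby(c).size().sum()
    print('{}:'.format(c), results[c])
```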