How to count a value in a CSV file?

The code below reads the CSV files in one folder and writes them to another. Each CSV contains two columns, t and f, which are selected when the dataframe is defined. In column f, I need to count how many times the value is above 50.025 and write that count to a column.

CODE:

import pandas as pd   
import numpy as np       
import glob   
import os  
all_files = glob.glob("C:/Users/Gamer/Documents/Colbun/Saturn/*.csv")   


file_list = []   
for i,f in enumerate(all_files):   
    df = pd.read_csv(f,header=0,usecols=["t","f"])
    df.apply(lambda x: x['f'] > 50.025, axis=1)
    df.to_csv(f'C:/Users/Gamer/Documents/Colbun/Saturn2/{os.path.basename(f).split(".")[0]}_ext.csv')

CodePudding user response：

It's not logical to store it in a column, since it is a summary of the entire table and not specific to any row.

df = pd.read_csv(f,header=0,usecols=["t","f"])
how_many_times = len(df[df['f'] > 50.025])

# you can store it in a separate column, but it still doesn't make much sense

df['newcol'] = how_many_times
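
A minimal sketch of the same count, written as a sum over a boolean mask instead of a filter plus len(). The sample values and the THRESHOLD name are made up for illustration and stand in for one of the CSV files:

```python
import pandas as pd

# Hypothetical sample data standing in for one CSV file (columns t and f).
df = pd.DataFrame({
    "t": [0, 1, 2, 3, 4],
    "f": [50.010, 50.030, 50.025, 50.100, 49.990],
})

THRESHOLD = 50.025  # illustrative name, not from the original code

# Comparing a column to a scalar gives a boolean Series;
# True counts as 1 when summed, so the sum is the number of matches.
above = df["f"] > THRESHOLD
how_many_times = int(above.sum())

# Same result as filtering the rows and taking the length.
assert how_many_times == len(df[above])
print(how_many_times)  # 2 (50.025 itself is not strictly above the threshold)
```

Both forms give the same number; the mask version just avoids building a filtered copy of the dataframe.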

CodePudding user response：

To count the values in a column that match a particular filter and add that count to every row of another column, you can simply do the following:

df['cnt'] = df[df['f'] > 50.025]['f'].count()

If you need to use that count in a later calculation, it is better to store it in a variable and perform the calculation with that variable than to store it in an entire column of your dataframe.
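
As a rough illustration of keeping the count in a variable, here is a sketch that uses it to compute the share of rows above the threshold. The sample values and the names count_above and share_above are assumptions, not part of the original code:

```python
import pandas as pd

# Hypothetical sample values for column f.
df = pd.DataFrame({"f": [50.01, 50.03, 50.05, 49.99]})

# Keep the count as a plain integer instead of a dataframe column.
count_above = int((df["f"] > 50.025).sum())

# Use the variable in a follow-up calculation, here the fraction of rows above the threshold.
share_above = count_above / len(df)

print(count_above, share_above)
```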

In addition, I can see from the comments on your question that you also want to drop the index when writing to CSV. To do that, add index=False to the df.to_csv() call.

Your code should look something like this:

import pandas as pd
import glob
import os

all_files = glob.glob("C:/Users/Gamer/Documents/Colbun/Saturn/*.csv")

for f in all_files:
    df = pd.read_csv(f, header=0, usecols=["t", "f"])
    # number of rows where f is above 50.025, repeated on every row
    df['cnt'] = df[df['f'] > 50.025]['f'].count()
    # index=False keeps the row index out of the output file
    df.to_csv(f'C:/Users/Gamer/Documents/Colbun/Saturn2/{os.path.basename(f).split(".")[0]}_ext.csv', index=False)
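
Since the count is one number per file rather than per row, another option is a small summary table with one row per CSV. The following is only a sketch: the function name summarize_counts, the column names file and count_above, and the idea of a separate summary table are all assumptions, not something the question asked for.

```python
from pathlib import Path

import pandas as pd


def summarize_counts(input_dir, threshold=50.025):
    """Return one row per CSV in input_dir with the number of f values above threshold."""
    rows = []
    # sorted() makes the row order deterministic across runs
    for path in sorted(Path(input_dir).glob("*.csv")):
        df = pd.read_csv(path, usecols=["t", "f"])
        rows.append({
            "file": path.name,
            "count_above": int((df["f"] > threshold).sum()),
        })
    return pd.DataFrame(rows, columns=["file", "count_above"])
```

It could be called with the input folder from the question, for example summarize_counts("C:/Users/Gamer/Documents/Colbun/Saturn"), and the result written out with to_csv(..., index=False).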