I have 500 text files and need to count how many entities of each type appear in each file. This is the code I have so far:
import os
import pandas as pd
path = "newData"
files = [file for file in os.listdir(path) if file.endswith(".txt")]
c=0
for file in files:
    df = pd.read_csv(os.path.join(path, file),
                     sep=' ',engine='python')
    df.columns = ['word','token','?']
    problem = df['token'].tolist().count('B-Problem')
    method = df['token'].tolist().count('B-Method\oTool')
    data = df['token'].tolist().count('B-Dataset')
I need to create an Excel sheet that shows these counts. Expected output:
Filename #ofProblem #ofMethod #ofData
admin.txt {problem} {method} {data}
How can I store the counts for all 500 files in one Excel sheet?
CodePudding user response:
Pandas can write Excel sheets. Collect the counts for each file as a dict in a list, build a DataFrame from that list after the loop, and write it to a file with to_excel:
import os
import pandas as pd
path = "newData"
files = [file for file in os.listdir(path) if file.endswith(".txt")]
out_data = []  # one dict per file; each dict becomes one row of the sheet
for file in files:
    df = pd.read_csv(os.path.join(path, file),
                     sep=' ', engine='python')
    df.columns = ['word', 'token', '?']
    tokens = df['token'].tolist()
    problem = tokens.count('B-Problem')
    # Raw string: the label contains a literal backslash
    method = tokens.count(r'B-Method\oTool')
    data = tokens.count('B-Dataset')
    out_data.append(
        {
            "Filename": file,
            "#ofProblem": problem,
            "#ofMethod": method,
            "#ofData": data,
        }
    )
# Write all rows once, after the loop; index=False omits the row-number column
pd.DataFrame(out_data).to_excel("your_excel_name.xlsx", index=False)
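One thing to watch for: with its default settings, pd.read_csv treats the first line of each file as a header, so if your files have no header row, a tag on that first line would not be counted. As a rough sketch of an alternative, the counting can be done line by line with the standard library. This assumes each line looks like "word token extra" separated by whitespace and that the files are UTF-8 encoded; count_tags, summarize and TAGS are hypothetical names used only for this illustration.

```python
from collections import Counter
from pathlib import Path

# Tags to count, mapped to the column names from the question.
TAGS = {
    "B-Problem": "#ofProblem",
    r"B-Method\oTool": "#ofMethod",
    "B-Dataset": "#ofData",
}


def count_tags(lines):
    """Count the second whitespace-separated field of every line that has one."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:
            counts[parts[1]] += 1
    return counts


def summarize(folder):
    """Return one dict per .txt file in folder, sorted by file name."""
    rows = []
    for path in sorted(Path(folder).glob("*.txt")):
        # Assumption: the files are UTF-8 encoded.
        with path.open(encoding="utf-8") as handle:
            counts = count_tags(handle)
        row = {"Filename": path.name}
        for tag, column in TAGS.items():
            row[column] = counts[tag]  # Counter returns 0 for missing tags
        rows.append(row)
    return rows
```

The list returned by summarize("newData") has the same shape as out_data above, so it can be passed to pd.DataFrame(...).to_excel(...) in the same way.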