How do I read only specific cells in a CSV file, then write the cell before it to a txt file?-CodePudding

So the idea is as follows: Assuming I have a .csv file with two columns (A, B). I want to be able to read all the cells in Column B with the value "3" inside them for example,

Column A    Column B
A      ,     1
B      ,     2
C      ,     3
D      ,     4
E      ,     3
F      ,     3
G      ,     1
K      ,     4
L      ,     4

and then write the corresponding cell in Column A to a txt file, so the output of this would be something like:

tmp.txt:
C
E
F

I hope it makes sense, many thanks in advance!

CodePudding user response：

I added a comma between two column names:

Column A,    Column B
A      ,     1
B      ,     2
C      ,     3
D      ,     4
E      ,     3
F      ,     3
G      ,     1
K      ,     4
L      ,     4

And use pandas to read the csv:

import pandas as pd

df = pd.read_csv('filename.csv', sep='\s*,\s*')
selected_rows = df[df['Column B']==3]
selected_As = selected_rows['Column A']
tmp = list(selected_As)

And you'll get tmp as ['C', 'E', 'F'].

CodePudding user response：

You can open the file as dataframe from its path file_path. Then select based on the condtion df['Column B'] == 3, Column B with value 3. And then select Column A as a dataframe and save it to a csv in the path file_output_name.

import pandas as pd
df=pd.read_csv(file_path)
sub_df = df[df['Column B'] == 3]['Column A']
sub_df.to_csv(file_output_name)