So the idea is as follows: Assuming I have a .csv file with two columns (A, B). I want to be able to read all the cells in Column B with the value "3" inside them for example,
Column A Column B
A , 1
B , 2
C , 3
D , 4
E , 3
F , 3
G , 1
K , 4
L , 4
and then write the corresponding cell in Column A to a txt file, so the output of this would be something like:
tmp.txt:
C
E
F
I hope it makes sense, many thanks in advance!
CodePudding user response:
I added a comma between two column names:
Column A, Column B
A , 1
B , 2
C , 3
D , 4
E , 3
F , 3
G , 1
K , 4
L , 4
And use pandas
to read the csv:
import pandas as pd
df = pd.read_csv('filename.csv', sep='\s*,\s*')
selected_rows = df[df['Column B']==3]
selected_As = selected_rows['Column A']
tmp = list(selected_As)
And you'll get tmp
as ['C', 'E', 'F']
.
CodePudding user response:
You can open the file as dataframe from its path file_path
. Then select based on the condtion df['Column B'] == 3
, Column B with value 3. And then select Column A as a dataframe and save it to a csv in the path file_output_name
.
import pandas as pd
df=pd.read_csv(file_path)
sub_df = df[df['Column B'] == 3]['Column A']
sub_df.to_csv(file_output_name)