I have a csv file and would like to be able to check if a value exists in a column. For example:
Frame,LED,R,G,B,W
2,8,0,0,0,126
3,1,0,0,0,126
17,1,0,0,0,126
19,1,0,0,0,126
20,1,0,0,0,126
I need a line that will give some indication that '17' already exists in the 'Frame' column
This should be really easy and there are already posts and articles about how to achieve this, but I have now tried so many things which makes me think I made a mistake earlier on which is preventing the 'search' functions from working.
import os
import pandas as pd
import csv
# bin/cache/temp.csv
'''
Frame,LED,R,G,B,W
1,1,0,0,0,126
1,1,0,0,0,126
2,1,0,0,0,126
etc
'''
def this_is_the_code_used_to_add_column_names():
if not os.path.exists('bin/cache/temp.csv'):
with open('bin/cache/temp.csv', 'a', encoding='UTF8', newline='') as file:
writer = csv.writer(file)
writer.writerow(['Frame', 'LED', 'R', 'G', 'B', 'W'])
def this_is_the_code_used_to_write_and_sort_the_data():
with open('bin/cache/temp.csv', 'a', encoding='UTF8', newline='') as file:
writer = csv.writer(file)
writer.writerow('var_1', 'var_2', 'ver_3', 'ver_4', 'var_5', 'var_6')
dataFrame = pd.read_csv('bin/cache/temp.csv')
dataFrame.sort_values(["Frame","LED"],axis=0, ascending=True,inplace=True,na_position='first') # Sorting file
dataFrame.to_csv('bin/cache/temp.csv', sep=',', encoding='utf-8', index=False) # Write to file new data
def the_part_that_is_not_working():
df = pd.read_csv('bin/cache/temp.csv')
# Need a line to return True or a list with all entries with the searched-for value
# Just any metric that will indicate if it exists
# Ideally with a way to reference back to it like an index, but I'll cross that bridge
print(df[df['Frame']=='1'].index.tolist())
the_part_that_is_not_working()
The code above shows every interaction with the csv file in my program, on run if the file doesn't exist it is created and the headers added. Then lines can be written with data, followed by a sort of the entire file. I am aiming to add a way to check if 'Frame' '17' already exists, so there are no duplicates.
I have had a plethora of outputs and errors so I didn't include all the solutions I have tried but essentially, the function named the_part_that_is_not_working is what I have been trying to get to work.
Any help will be greatly appreciated.
CodePudding user response:
To check if value 17
in column Frame
exists you can use .any()
:
df = pd.read_csv('your_data.csv')
out = df['Frame'].eq(17).any()
print(out)
Prints:
True
CodePudding user response:
Turns out, the issue was that the data was being read as int not str.
The fix is to change this:
x = df[df['Frame']=='1'].index.tolist()
Into this...
x = df[df['Frame']==1].index.tolist()
This returns a list containing the index of every occurrence
Good to know that csv files don't exclusively return strings.