Home > front end >  Check if value exists in a column in a csv file (Python)
Check if value exists in a column in a csv file (Python)

Time:01-07

I have a csv file and would like to be able to check if a value exists in a column. For example:

Frame,LED,R,G,B,W
2,8,0,0,0,126
3,1,0,0,0,126
17,1,0,0,0,126
19,1,0,0,0,126
20,1,0,0,0,126

I need a line that will give some indication that '17' already exists in the 'Frame' column

This should be really easy and there are already posts and articles about how to achieve this, but I have now tried so many things which makes me think I made a mistake earlier on which is preventing the 'search' functions from working.

import os
import pandas as pd
import csv

# bin/cache/temp.csv
'''
Frame,LED,R,G,B,W
1,1,0,0,0,126
1,1,0,0,0,126
2,1,0,0,0,126
etc
'''

def this_is_the_code_used_to_add_column_names():
    if not os.path.exists('bin/cache/temp.csv'):
        with open('bin/cache/temp.csv', 'a', encoding='UTF8', newline='') as file:
            writer = csv.writer(file)
            writer.writerow(['Frame', 'LED', 'R', 'G', 'B', 'W'])

def this_is_the_code_used_to_write_and_sort_the_data():
    with open('bin/cache/temp.csv', 'a', encoding='UTF8', newline='') as file:
            writer = csv.writer(file)
            writer.writerow('var_1', 'var_2', 'ver_3', 'ver_4', 'var_5', 'var_6')
        
    dataFrame = pd.read_csv('bin/cache/temp.csv')
    dataFrame.sort_values(["Frame","LED"],axis=0, ascending=True,inplace=True,na_position='first') # Sorting file
    dataFrame.to_csv('bin/cache/temp.csv', sep=',', encoding='utf-8', index=False) # Write to file new data

def the_part_that_is_not_working():
    df = pd.read_csv('bin/cache/temp.csv')
    # Need a line to return True or a list with all entries with the searched-for value
    # Just any metric that will indicate if it exists
    # Ideally with a way to reference back to it like an index, but I'll cross that bridge
    print(df[df['Frame']=='1'].index.tolist())

the_part_that_is_not_working()

The code above shows every interaction with the csv file in my program, on run if the file doesn't exist it is created and the headers added. Then lines can be written with data, followed by a sort of the entire file. I am aiming to add a way to check if 'Frame' '17' already exists, so there are no duplicates.

I have had a plethora of outputs and errors so I didn't include all the solutions I have tried but essentially, the function named the_part_that_is_not_working is what I have been trying to get to work.

Any help will be greatly appreciated.

CodePudding user response:

To check if value 17 in column Frame exists you can use .any():

df = pd.read_csv('your_data.csv')

out = df['Frame'].eq(17).any()
print(out)

Prints:

True

CodePudding user response:

Turns out, the issue was that the data was being read as int not str.

The fix is to change this:

x = df[df['Frame']=='1'].index.tolist()

Into this...

x = df[df['Frame']==1].index.tolist()

This returns a list containing the index of every occurrence

Good to know that csv files don't exclusively return strings.

  • Related