Implement a "where" method to select column_name and value from string in Python

Time:11-04

I've got a small issue with a script that takes a CSV string and is supposed to select rows by column name and value based on the input. The CSV string contains the names of NBA players, their universities, etc. When the input is the column "name" and the value "Andre Brown", it should search for that value in the given CSV string. I have rough code laid out, but I am unsure how to implement the where method. Any ideas?

import csv
import pandas as pd
import io

class MySelectQuery:
    def __init__(self, table, columns, where):
        self.table = table
        self.columns = columns
        self.where = where

    def __str__(self):
        return f"SELECT {self.columns} FROM {self.table} WHERE {self.where}"

csvString = "name,year_start,year_end,position,height,weight,birth_date,college\nAlaa Abdelnaby,1991,1995,F-C,6-10,240,'June 24, 1968',Duke University\nZaid Abdul-Aziz,1969,1978,C-F,6-9,235,'April 7, 1946',Iowa State University\nKareem Abdul-Jabbar,1970,1989,C,7-2,225,'April 16, 1947','University of California, Los Angeles\nMahmoud Abdul-Rauf,1991,2001,G,6-1,162,'March 9, 1969',Louisiana State University\n"

df = pd.read_csv(io.StringIO(csvString), error_bad_lines=False)
where = "name = 'Alaa Abdelnaby' AND year_start = 1991"
df = df.query(where)
print(df)

The CSV string is transformed into a pandas DataFrame, which should then be filtered based on the input. However, I get the error "name 'where' not defined". I believe everything up to the df = ... part is correct; now I need help implementing the where method. (I've seen one other solution on SO but wasn't able to understand it.)

CodePudding user response:

# importing pandas
import pandas as pd
  
record = {
  'Name': ['Ankit', 'Amit', 'Aishwarya', 'Priyanka', 'Priya', 'Shaurya' ],
  'Age': [21, 19, 20, 18, 17, 21],
  'Stream': ['Math', 'Commerce', 'Science', 'Math', 'Math', 'Science'],
  'Percentage': [88, 92, 95, 70, 65, 78]}
  
# create a dataframe
dataframe = pd.DataFrame(record, columns = ['Name', 'Age', 'Stream', 'Percentage'])
  
print("Given Dataframe :\n", dataframe) 
  
options = ['Math', 'Science']
  
# selecting rows based on condition
rslt_df = dataframe[(dataframe['Age'] == 21) &
          dataframe['Stream'].isin(options)]
  
print('\nResult dataframe :\n', rslt_df)

Source: https://www.geeksforgeeks.org/selecting-rows-in-pandas-dataframe-based-on-conditions/

Sometimes Googling does the trick ;)
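
Applied to the question's data, the same boolean-mask idea can be wrapped in a where method. This is only a minimal sketch under a few assumptions: the constructor signature (taking the CSV text directly) is hypothetical and differs from the question's class, the malformed Kareem Abdul-Jabbar row (it has an unbalanced quote) is left out of the sample, and quotechar="'" is passed so that the comma inside 'June 24, 1968' stays in one field.

```python
import io

import pandas as pd

# Cleaned-up sample of the question's CSV string (assumption: the row with
# the unbalanced single quote is omitted so every row has eight fields).
CSV_STRING = (
    "name,year_start,year_end,position,height,weight,birth_date,college\n"
    "Alaa Abdelnaby,1991,1995,F-C,6-10,240,'June 24, 1968',Duke University\n"
    "Zaid Abdul-Aziz,1969,1978,C-F,6-9,235,'April 7, 1946',Iowa State University\n"
    "Mahmoud Abdul-Rauf,1991,2001,G,6-1,162,'March 9, 1969',Louisiana State University\n"
)


class MySelectQuery:
    """Hypothetical variant of the question's class that holds a DataFrame."""

    def __init__(self, csv_content):
        # quotechar="'" treats the single-quoted birth dates as one field each
        self.df = pd.read_csv(io.StringIO(csv_content), quotechar="'")

    def where(self, column_name, value):
        # Build a boolean mask (True where the column equals the value)
        # and use it to keep only the matching rows.
        mask = self.df[column_name] == value
        return self.df[mask]


query = MySelectQuery(CSV_STRING)
result = query.where("name", "Alaa Abdelnaby")
print(result)
```

Numeric columns are parsed as integers here, so a call such as query.where("year_start", 1991) should be given a number rather than the string "1991".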

CodePudding user response:

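You need a double = there, because query compares with == and a single = is not a comparison. The operator also has to be a lowercase and (or &), since query parses Python-style expressions rather than SQL. So it should be:

where = "name == 'Alaa Abdelnaby' and year_start == 1991"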
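
For reference, a minimal end-to-end sketch of the query approach. It assumes a cleaned-up sample of the question's data (the row with the unbalanced quote is left out) and parses it with quotechar="'" instead of skipping bad lines; the variable names are illustrative only. Note that the condition string uses == and a lowercase and.

```python
import io

import pandas as pd

# Cleaned-up sample of the question's CSV string (assumption: the row with
# the unbalanced single quote is omitted).
csv_string = (
    "name,year_start,year_end,position,height,weight,birth_date,college\n"
    "Alaa Abdelnaby,1991,1995,F-C,6-10,240,'June 24, 1968',Duke University\n"
    "Zaid Abdul-Aziz,1969,1978,C-F,6-9,235,'April 7, 1946',Iowa State University\n"
    "Mahmoud Abdul-Rauf,1991,2001,G,6-1,162,'March 9, 1969',Louisiana State University\n"
)

# quotechar="'" keeps the comma inside each quoted birth date in one field
df = pd.read_csv(io.StringIO(csv_string), quotechar="'")

# == compares values; lowercase `and` combines the two conditions
where = "name == 'Alaa Abdelnaby' and year_start == 1991"
matches = df.query(where)
print(matches[["name", "year_start", "college"]])
```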
