Please help me with the python script to filter the below CSV. Below is the example of the CSV dump for which I have done the initial filtration.
Last_name | Gender | Name | Phone | city |
---|---|---|---|---|
Ford | Male | Tom | 123 | NY |
Rich | Male | Robert | 21312 | LA |
Ford | Female | Jessica | 123123 | NY |
Ford | Male | John | 3412 | NY |
Rich | Other | Linda | 12312 | LA |
Ford | Other | James | 4321 | NY |
Smith | Male | David | 123123 | TX |
Rich | Female | Mary | 98689 | LA |
Rich | Female | Jennifer | 86860 | LA |
Ford | Male | Richard | 12123 | NY |
Smith | Other | Daniel | 897097 | TX |
Ford | Other | Lisa | 123123123 | NY |
import re
def gather_info (L_name):
dump_filename = "~/Documents/name_report.csv"
LN = []
with open(dump_filename, "r") as FH:
for var in FH.readlines():
if L_name in var
final = var.split(",")
print(final[1], final[2], final[3])
return LN
if __name__ == "__main__":
L_name = input("Enter the Last name: ")
la_name = gather_info(L_name)
By this, I am able to filter by the last name. for example, if I choose L_name as Ford, then I have my output as
Gender | Name | Phone |
---|---|---|
Male | Tom | 123 |
Female | Jessica | 123123 |
Male | John | 3412 |
Other | James | 4321 |
Male | Richard | 12123 |
Other | Lisa | 22412 |
I need help extending the script by selecting each gender and the values in the list to perform other functions, then calling the following gender and the values to achieve the same functions. for example, first, it selects the gender Male [Tom, John] and performs other functions. then selects the next gender Female [Jessica] and performs the same functions and then selects the gender Other [James, Lisa] and performs the same functions.
CodePudding user response:
I would recomend using the pandas module which allows for easy filtering and grouping of data
import pandas as pd
if __name__ == '__main__':
data = pd.read_csv('name_reports.csv')
L_name = input("Enter the last name: ")
by_last_name = data[data['Last_name'] == L_name]
groups = by_last_name.groupby(['Gender'])
for group_name, group_data in groups:
print(group_name)
print(group_data)
Breaking this down into its pieces the first part is
data = pd.read_csv('name_reports.csv')
This reads the data from the csv and places it into a dataframe Second we have
by_last_name = data[data['Last_name'] == L_name]
This filters the dataframe to only have results with Last_name equal to L_name Next we group the data.
groups = by_last_name.groupby(['Gender'])
this groups the filtered data frames by gender then we iterate over this. It returns a tuple with the group name and the dataframe associated with that group.
for group_name, group_data in groups:
print(group_name)
print(group_data)
This loop just prints out the data to access fields from it you can use the iterrows function
for index,row in group_data.iterrows():
print(row['city']
print(row['Phone']
print(row['Name']
And then you can use those for whatever function you want. I would recommend reading on the documentation for pandas since depending on the function you plan on using there may be a better way to do it using the library. Here is the link to the library https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.html
CodePudding user response:
Since you cannot use the pandas module then a method using only the csv module would look like this
import csv
def has_last_name(row,last_name):
return row['Last_name'] == last_name
def has_gender(row,current_gender):
return row['Gender'] == current_gender
if __name__ == '__main__':
data = None
genders = ['Male','Female','Other']
with open('name_reports.csv') as csvfile:
data = list(csv.DictReader(csvfile,delimiter=','))
L_name = input('Enter the Last name: ')
get_by_last_name = lambda row: has_last_name(row,L_name)
filtered_by_last_name = list(filter(get_by_last_name,data))
for gender in genders:
get_by_gender = lambda row: has_gender(row,gender)
filtered_by_gender = list(filter(get_by_gender,filtered_by_last_name))
print(filtered_by_gender)
The important part is the filter built in function. This takes in a function that takes in an item from a list and returns a bool. filter takes this function and an iterable and returns a generator of items that return true for that function. The other important part is the csv.DictReader which returns your csv file as a dictionary which makes allows you to access attributes by key instead of by index.