I have a CSV file named film.csv. Here is the header line, followed by a few rows as an example:
Year;Length;Title;Subject;Actor;Actress;Director;Popularity;Awards;*Image
1990;111;Tie Me Up! Tie Me Down!;Comedy;Banderas, Antonio;Abril, Victoria;Almodóvar, Pedro;68;No;NicholasCage.png
1991;113;High Heels;Comedy;Bosé, Miguel;Abril, Victoria;Almodóvar, Pedro;68;No;NicholasCage.png
1983;104;Dead Zone, The;Horror;Walken, Christopher;Adams, Brooke;Cronenberg, David;79;No;NicholasCage.png
1979;122;Cuba;Action;Connery, Sean;Adams, Brooke;Lester, Richard;6;No;seanConnery.png
1978;94;Days of Heaven;Drama;Gere, Richard;Adams, Brooke;Malick, Terrence;14;No;NicholasCage.png
1983;140;Octopussy;Action;Moore, Roger;Adams, Maud;Glen, John;68;No;NicholasCage.png
I am trying to filter the rows and display the movie titles that match these criteria: first name contains "Richard", Year < 1985, Awards == "Y".
I am able to filter for the award, but not the rest. Can you help?
file_name = "film.csv"
lines = (line for line in open(file_name,encoding='cp1252')) #generator to capture lines
lists = (s.rstrip().split(";") for s in lines) #generators to capture lists containing values from lines
#browse lists and index them per header values, then filter all movies that have been awarded
#using a new generator object
cols=next(lists) #obtains only the header
print(cols)
collections = (dict(zip(cols,data)) for data in lists)
filtered = (col["Title"] for col in collections if col["Awards"][0] == "Y")
for item in filtered:
    print(item)
# input()
This works for the award, but I don't know how to add additional filters. Also, when I try to filter with if col["Year"] < 1985
I get an error message because a string can't be compared with an int. How do I turn the years into numbers?
I believe for the first name I can filter like this:
if col["Actor"].split(", ")[-1] == "Richard"
CodePudding user response:
You already know how to add one filter. There is no such thing as "additional" filters: just add your conditions to the current condition. Since you want all of the conditions to be True
to select a record, you combine them with the Boolean and operator. For example:
filtered = (
    col["Title"]
    for col in collections
    if col["Awards"][0] == "Y"
    and col["Actor"].split(", ")[-1] == "Richard"
    and int(col["Year"]) < 1985
)
Notice I added an int() around the col["Year"] to convert it to an integer.
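To try the combined condition without the CSV file, here is a minimal sketch that runs it against a few in-memory rows. The rows, the titles, and the helper name matches are made up for illustration; only the column names come from the question. It uses startswith("Y") instead of [0] == "Y" so that an empty Awards value does not raise an IndexError, and "Richard" in ... to follow the "contains" wording of the criteria.

```python
# Hypothetical rows shaped like the dicts produced by dict(zip(cols, data)).
# Titles, names, and values are invented for illustration only.
rows = [
    {"Year": "1978", "Title": "Film A", "Actor": "Doe, Richard", "Awards": "Yes"},
    {"Year": "1990", "Title": "Film B", "Actor": "Doe, Richard", "Awards": "Yes"},
    {"Year": "1979", "Title": "Film C", "Actor": "Roe, Sam", "Awards": "Yes"},
    {"Year": "1980", "Title": "Film D", "Actor": "Poe, Richard", "Awards": "No"},
]


def matches(row):
    """Return True only if every criterion holds for this row."""
    first_name = row["Actor"].split(", ")[-1]  # "Doe, Richard" -> "Richard"
    return (
        row["Awards"].startswith("Y")    # awarded
        and "Richard" in first_name      # first name contains "Richard"
        and int(row["Year"]) < 1985      # compare as a number, not a string
    )


titles = [row["Title"] for row in rows if matches(row)]
print(titles)
```

Pulling the condition into a small function is optional; it just keeps the generator or list comprehension short once there are more than two or three criteria.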
You've actually gone and reinvented csv.DictReader in the setup to this problem! Instead of
file_name = "film.csv"
lines = (line for line in open(file_name,encoding='cp1252')) #generator to capture lines
lists = (s.rstrip().split(";") for s in lines) #generators to capture lists containing values from lines
#browse lists and index them per header values, then filter all movies that have been awarded
#using a new generator object
cols=next(lists) #obtains only the header
print(cols)
collections = (dict(zip(cols,data)) for data in lists)
filtered = ...
You could have just done:
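import csv
file_name = "film.csv"
# newline="" is what the csv module recommends when opening files
with open(file_name, encoding="cp1252", newline="") as f:
    # DictReader needs the open file as its first argument
    collections = csv.DictReader(f, delimiter=";")
    # build and consume the filter inside the with block: the reader is lazy
    filtered = ...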
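Putting the two pieces together, a complete version might look like the sketch below. It reads from an io.StringIO stand-in so it runs without film.csv; the header is the one from the question, while the data rows are invented for illustration. With the real file you would replace the StringIO object with open("film.csv", encoding="cp1252", newline="").

```python
import csv
import io

# Stand-in for the real file. Only the header line comes from the question;
# the data rows below are hypothetical.
sample = io.StringIO(
    "Year;Length;Title;Subject;Actor;Actress;Director;Popularity;Awards;*Image\n"
    "1978;94;Film A;Drama;Doe, Richard;Roe, Jane;Smith, Richard;14;Yes;a.png\n"
    "1990;100;Film B;Drama;Doe, Richard;Roe, Jane;Smith, Ann;20;Yes;b.png\n"
    "1979;122;Film C;Action;Poe, Sam;Roe, Jane;Smith, Richard;6;No;c.png\n"
)

with sample as f:
    reader = csv.DictReader(f, delimiter=";")
    # Build the list while the file is still open, because the reader
    # only pulls rows from the file as they are requested.
    titles = [
        row["Title"]
        for row in reader
        if row["Awards"].startswith("Y")
        and "Richard" in row["Actor"].split(", ")[-1]
        and int(row["Year"]) < 1985
    ]

print(titles)
```

A list is used here instead of a generator expression so the results survive after the with block closes the file; a generator would have to be fully consumed inside the block.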