I need to filter the df based on 2 criteria: where Name != jack and ignore all dates for jack where Date <= 2020-04-01
# List of Tuples
df = [ ('jack', 'Apples' , '2020-01-01') ,
('Riti', 'Mangos' , '2020-02-01') ,
('Aadi', 'Grapes' , '2020-03-01') ,
('jack', 'Oranges', '2020-04-01') ,
('Lucy', 'Mangos' , '2020-05-01') ,
('jack', 'Apples' , '2020-12-01')
]
#Create a DataFrame object
df1 = pd.DataFrame(df, columns = ['Name' , 'Product', 'Date'])
df1
Expected result being:
Name Product Date
0 Riti Mangos 2020-02-01
1 Aadi Grapes 2020-03-01
2 Lucy Mangos 2020-05-01
3 jack Apples 2020-12-01
CodePudding user response:
boolean indexing
of multi condition
cond1 = df1['Name'] != 'jack'
cond2 = pd.to_datetime(df1['Date']) > pd.Timestamp('2020-04-01')
df1[cond1 | cond2].reset_index(drop=True)
output:
Name Product Date
0 Riti Mangos 2020-02-01
1 Aadi Grapes 2020-03-01
2 Lucy Mangos 2020-05-01
3 jack Apples 2020-12-01
CodePudding user response:
You could do the below:
import pandas as pd
# List of Tuples
df = [('jack', 'Apples' , '2020-01-01'),
('Riti', 'Mangos' , '2020-02-01'),
('Aadi', 'Grapes' , '2020-03-01'),
('jack', 'Oranges', '2020-04-01'),
('Lucy', 'Mangos' , '2020-05-01'),
('jack', 'Apples' , '2020-12-01')
]
#Create a DataFrame object
df1 = pd.DataFrame(df, columns = ['Name', 'Product', 'Date'])
# Filtering for condition Name = !jack
df1 = df1[df1['Name'] != 'jack']
# Convert the date to datetime64
df1['Date'] = pd.to_datetime(df1['Date'], format='%Y-%m-%d')
# Filter by specified date
filtered_df = df1.loc[(df1['Date'] <= '2020-04-01')]
# print the filtered result
print(filtered_df)