Here is the original data:
Name Wine Year
0 Mark Volnay 1983
1 Mark Volnay 1979
3 Mary Volnay 1979
4 Mary Volnay 1999
5 Mary Champagne 1993
6 Mary Champagne 1989
I would like to get the values of Year as a function of the values of Name and Wine. Given a Name and a Wine, it should return all the values in the Year column for the rows that have those values in the Name and Wine columns.
For example, with the key ['Mark', 'Volnay'] I would get the values [1983, 1979].
I tried manipulating the data and here is the best I could get.
Keep one instance of each key:
Name Wine Year
1 Mark Volnay 1979
4 Mary Volnay 1999
6 Mary Champagne 1989
Remove the Year column:
Name Wine
1 Mark Volnay
4 Mary Volnay
6 Mary Champagne
Get the values in a list
[['Mark', 'Volnay'], ['Mary', 'Volnay'], ['Mary', 'Champagne']]
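For reference, the three steps above can be reproduced with a short pandas sketch. The DataFrame construction is an assumption based on the table at the top, and keep="last" is inferred from the surviving row indices (1, 4, 6); the variable names are illustrative only.

```python
import pandas as pd

# Rebuild the sample data, including the non-contiguous index from the table.
df = pd.DataFrame(
    {
        "Name": ["Mark", "Mark", "Mary", "Mary", "Mary", "Mary"],
        "Wine": ["Volnay", "Volnay", "Volnay", "Volnay", "Champagne", "Champagne"],
        "Year": [1983, 1979, 1979, 1999, 1993, 1989],
    },
    index=[0, 1, 3, 4, 5, 6],
)

# Step 1: keep one row per (Name, Wine) pair; keep="last" retains rows 1, 4, 6.
unique_rows = df.drop_duplicates(subset=["Name", "Wine"], keep="last")

# Step 2: drop the Year column. Step 3: convert the remaining rows to a list of keys.
keys = unique_rows.drop(columns="Year").values.tolist()
print(keys)
```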
I now have the keys I need, but I can't work out how to retrieve the values from the original dataframe based on a key.
CodePudding user response:
You could use set_index and then loc:
key = ['Mark', 'Volnay']
# Pass the key as a tuple: on a MultiIndex, a list is read as several first-level labels
lst = df.set_index(['Name', 'Wine']).loc[tuple(key), 'Year'].tolist()
Output:
>>> lst
[1983, 1979]
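One detail that is easy to miss: on a MultiIndex, loc generally treats a list as a selection of several labels from the first level, while a tuple supplies one label per level. Below is a self-contained sketch of the lookup that converts the key to a tuple first. The sample data is assumed from the question, and years_for is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Sample data assumed from the table in the question.
df = pd.DataFrame(
    {
        "Name": ["Mark", "Mark", "Mary", "Mary", "Mary", "Mary"],
        "Wine": ["Volnay", "Volnay", "Volnay", "Volnay", "Champagne", "Champagne"],
        "Year": [1983, 1979, 1979, 1999, 1993, 1989],
    },
    index=[0, 1, 3, 4, 5, 6],
)

# Build the two-level index once so it can be reused for every lookup.
indexed = df.set_index(["Name", "Wine"])


def years_for(key):
    """Hypothetical helper: all Year values for one [Name, Wine] key."""
    # tuple(key) addresses exactly one label on each index level.
    return indexed.loc[tuple(key), "Year"].tolist()


print(years_for(["Mark", "Volnay"]))  # [1983, 1979]
print(years_for(["Mary", "Volnay"]))  # [1979, 1999]
```

This sketch assumes every key matches more than one row, as in the sample data; a key that matches a single row may come back as a scalar rather than a list.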
CodePudding user response:
You can also use groupby with get_group:
def getyear(dataframe, keys: list):
    # Select the rows whose (Name, Wine) pair matches the key
    values = dataframe.groupby(['Name', 'Wine']).get_group(tuple(keys))['Year']
    # dict.fromkeys drops duplicate years while preserving their order
    dedupvalues = [*dict.fromkeys(values)]
    return dedupvalues

keys = ['Mark', 'Volnay']
print(getyear(df, keys))
Output:
[1983, 1979]
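If every key needs to be looked up, a variation on the groupby idea is to build the whole mapping once instead of calling get_group per key. This is a sketch under the assumption that the data matches the question's table; the name lookup is illustrative, and unlike the function above it keeps duplicate years.

```python
import pandas as pd

# Sample data assumed from the table in the question.
df = pd.DataFrame(
    {
        "Name": ["Mark", "Mark", "Mary", "Mary", "Mary", "Mary"],
        "Wine": ["Volnay", "Volnay", "Volnay", "Volnay", "Champagne", "Champagne"],
        "Year": [1983, 1979, 1979, 1999, 1993, 1989],
    },
    index=[0, 1, 3, 4, 5, 6],
)

# Collect the Year values of each (Name, Wine) group into a list,
# then convert the result to a dict keyed by (Name, Wine) tuples.
lookup = df.groupby(["Name", "Wine"])["Year"].apply(list).to_dict()

key = ["Mark", "Volnay"]
print(lookup[tuple(key)])  # [1983, 1979]
```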