I'm trying to find the names that contain the letters given by user input. In this case, I want the names in the Name
column that contain both 'a' and 'i', but I'm getting an error:
import pandas as pd
data = {'Name': ['Aerial', 'Tom', 'Amie', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['pennsylvania', 'newyork', 'newjersey', 'delaware'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
df = pd.DataFrame(data)
df["Name"] = df["Name"].str.lower()
print(df)
letters_in = input('Words in Name Column that contain these letters: \n ').split()
new_output = df.loc[df['Name'].str.contains(letters_in, case=False)]
Code run:
Words in Name Column that contain these letters:
>? a e
ERROR:
TypeError: unhashable type: 'list'
Ideal Output (as dataframe):
Aerial
Amie
CodePudding user response:
First, to address your error message: the contains() method expects a string as its first argument, not a list.
That string is a character sequence or regular expression (see the pandas documentation for Series.str.contains) that it tries to match against each value. I believe this is different from what you are attempting, which is to find the rows whose Name contains all of the input letters.
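As a rough illustration of that difference, here is a minimal sketch using a small Series of the same names (the variable names are only for this example). A single string matches one pattern, and a regular expression such as 'a|e' matches names containing either letter, which is not the same as containing both:

```python
import pandas as pd

names = pd.Series(['aerial', 'tom', 'amie', 'anuj'])

# A single string pattern: True where the name contains 'e'
has_e = names.str.contains('e')

# A regex alternation: True where the name contains 'a' OR 'e'.
# 'anuj' matches here even though it has no 'e'.
has_a_or_e = names.str.contains('a|e')

print(has_e.tolist())       # [True, False, True, False]
print(has_a_or_e.tolist())  # [True, False, True, True]
```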
To match all of the letters, you can use the following approach, for example:
import pandas as pd
data = {'Name': ['Aerial', 'Tom', 'Amie', 'Anuj'],
        'Age': [27, 24, 22, 32],
        'Address': ['pennsylvania', 'newyork', 'newjersey', 'delaware'],
        'Qualification': ['Msc', 'MA', 'MCA', 'Phd']}
df = pd.DataFrame(data)
df["Name"] = df["Name"].str.lower()
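# letters_in = input('Words in Name Column that contain these letters: \n ').split()
letters_in = ['a', 'i']  # hard-coded here in place of the user input above
# Keep only the rows whose Name contains every letter in letters_in
new_output = df[df['Name'].apply(lambda name: all(letter in name for letter in letters_in))]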
print(new_output)
Output:
     Name  Age       Address Qualification
0  aerial   27  pennsylvania           Msc
2    amie   22     newjersey           MCA
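If you would rather stay with str.contains, one possible alternative is to call it once per letter and combine the results with a logical AND. This is only a sketch: contains_all is a hypothetical helper name, the DataFrame is trimmed to two columns, and the 'a e' string stands in for the user input from the question. Because case=False is passed, the Name column does not need to be lowercased first.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Aerial', 'Tom', 'Amie', 'Anuj'],
                   'Age': [27, 24, 22, 32]})

def contains_all(series, letters):
    """Return a boolean mask that is True where every letter occurs in the value."""
    mask = pd.Series(True, index=series.index)
    for letter in letters:
        # regex=False treats the letter as a literal string;
        # case=False ignores case; na=False treats missing values as no match
        mask = mask & series.str.contains(letter, case=False, regex=False, na=False)
    return mask

letters_in = 'a e'.split()  # stands in for input(...).split()
new_output = df[contains_all(df['Name'], letters_in)]
print(new_output['Name'].tolist())  # ['Aerial', 'Amie']
```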