Input DataFrame:
import pandas as pd
data = {"id": ['[1,2]', '[2,4]', '[4,3]'],
        "name": ['a', 'b', 'c']}
df = pd.DataFrame(data)
filterstr = [1, 2]
Expected Output:
id name
[1,2] a
[2,4] b
Tried code:
df1 = df[df.id.map(lambda x: np.isin(np.array(x), [[str([i]) for i in filterstr]]).all())]
This works when an id holds a single value, but not for two values like '[1,2]'. I'm not sure where I am going wrong.
CodePudding user response:
Ok, well, let's do this :)
Creating the DataFrame:
import pandas as pd
data = {"id": [[1, 2], [2, 4], [4, 3]],  # real lists, not strings: drop the quotes around the brackets
        "name": ['a', 'b', 'c']}
df = pd.DataFrame(data)
df
Result:
index | id | name |
---|---|---|
0 | 1,2 | a |
1 | 2,4 | b |
2 | 4,3 | c |
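To see why the string form of the ids matters, here is a minimal sketch of what the attempted code compares. It assumes the ids are the strings from the question and uses a flat candidate list instead of the nested one, which `np.isin` treats the same way; `candidates`, `single` and `pair` are names made up for the illustration.

```python
import numpy as np

filterstr = [1, 2]

# str([i]) turns each filter value into a one-element list string.
candidates = [str([i]) for i in filterstr]  # ['[1]', '[2]']

# Each id is compared as one whole string, not element by element.
single = np.isin(np.array('[1]'), candidates).all()
pair = np.isin(np.array('[1,2]'), candidates).all()

print(candidates, bool(single), bool(pair))
```

A string such as '[1,2]' never equals '[1]' or '[2]', so the comparison can only succeed for single-value ids.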
Then let's create a boolean mask that is True for every row whose id shares at least one value with reg:
reg = [1, 2]
filter = df.id.apply(lambda x: any(i in reg for i in x))  # test membership, not truthiness, so a matching 0 also counts
Intermediate result:
0 True
1 True
2 False
Name: id, dtype: bool
Then select only "True" values:
df = df[filter]
df
Final result:
index | id | name |
---|---|---|
0 | 1,2 | a |
1 | 2,4 | b |
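The same steps can also be written with a set, which may read more directly as "keep rows that share at least one value". This is only a sketch under the assumption that the ids are real lists; `wanted`, `mask` and `result` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({"id": [[1, 2], [2, 4], [4, 3]],
                   "name": ["a", "b", "c"]})

wanted = {1, 2}

# isdisjoint is True when the two collections share nothing,
# so its negation marks rows with at least one common value.
mask = df["id"].apply(lambda ids: not wanted.isdisjoint(ids))

result = df[mask]
print(result)
```

Because `isdisjoint` checks membership rather than truthiness, a matching value of 0 would still be counted.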
CodePudding user response:
Taking exactly what you've given:
data = { "id" : ['[1,2]','[2,4]','[4,3]'],
"name" : ['a','b','c'] }
df = pd.DataFrame(data)
filterstr = [1,2]
I do:
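import ast
df['id'] = df['id'].apply(ast.literal_eval)  # Convert from string to list without evaluating arbitrary code.
output = df[df.id.map(lambda ids: any(x in filterstr for x in ids))]  # Keep rows sharing at least one value with filterstr.
print(output)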
Output:
id name
0 [1, 2] a
1 [2, 4] b
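If the id column should stay as strings in the result, as in the expected output of the question, one option is to parse into a temporary Series and leave df untouched. A minimal sketch, assuming every id is a valid Python list literal; `parsed` and `mask` are illustrative names.

```python
import ast

import pandas as pd

df = pd.DataFrame({"id": ["[1,2]", "[2,4]", "[4,3]"],
                   "name": ["a", "b", "c"]})
filterstr = [1, 2]

# Parse each string into a list without modifying the original column.
parsed = df["id"].map(ast.literal_eval)

# True where the parsed list shares at least one value with filterstr.
mask = parsed.map(lambda ids: any(x in filterstr for x in ids))

output = df[mask]
print(output)
```

The filtered rows keep their original string ids, since only the temporary `parsed` Series holds the lists.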