I have a DataFrame with a column like this:
column
--------
['getNode', 'getCodec', 'PackStore', 'DownRoute']
['MessageDigest', 'getInstance', 'SecureRandom']
...
I also have a dictionary that looks like this:
{
'getNode': 1,
'getCodec': 2,
'PackStore': 3,
'DownRoute': 4,
'MessageDigest': 5,
'getInstance': 6,
'SecureRandom': 7,
...
}
My goal is to replace each item in the lists in column with its corresponding value from the dictionary, i.e.:
column
--------
[1,2,3,4]
[5,6,7]
...
I have tried calling:
df.column.map(dict)
But I get an error: unhashable type: 'list'
Any additional help would be awesome! Thanks!
CodePudding user response:
Try apply:
df.column.apply(lambda x: pd.Series(x).map(dct).tolist())
Or just:
df.column.apply(lambda x: list(map(dct.get, x)))
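For reference, here is a self-contained sketch of the second variant. The sample frame and mapping are reconstructed from the question, and the -1 default for names missing from the dictionary is an assumption added for illustration, not something the original code does:

```python
import pandas as pd

# Sample data reconstructed from the question (the real frame is larger).
df = pd.DataFrame({
    "column": [
        ["getNode", "getCodec", "PackStore", "DownRoute"],
        ["MessageDigest", "getInstance", "SecureRandom"],
    ]
})
dct = {
    "getNode": 1, "getCodec": 2, "PackStore": 3, "DownRoute": 4,
    "MessageDigest": 5, "getInstance": 6, "SecureRandom": 7,
}

# Look up every name in each list; dct.get returns None for unknown names.
mapped = df["column"].apply(lambda names: [dct.get(name) for name in names])

# Hypothetical variant: use -1 as an explicit default for unknown names.
with_default = df["column"].apply(
    lambda names: [dct.get(name, -1) for name in names]
)

print(mapped.tolist())  # [[1, 2, 3, 4], [5, 6, 7]]
```

Note that dct.get silently yields None for a name that is not in the dictionary, so an explicit default can make missing entries easier to spot.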
CodePudding user response:
Here's another way:
df.explode('column').squeeze().map(dd).groupby(level=0).agg(list)
Output:
0 [1, 2, 3, 4]
1 [5, 6, 7]
Name: column, dtype: object
Option 2:
pd.Series([list(map(dd.get, l)) for l in df['column']])
Output:
0 [1, 2, 3, 4]
1 [5, 6, 7]
dtype: object
Timings:
apply-lambda-map-tolist:
%timeit df.column.apply(lambda x: pd.Series(x).map(dd).tolist())
1.15 ms ± 39.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
explode-squeeze-map-groupby:
%timeit df.explode('column').squeeze().map(dd).groupby(level=0).agg(list)
2.56 ms ± 78.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
pd.Series constructor with list comprehension and map:
%timeit pd.Series([list(map(dd.get, l)) for l in df['column']])
88.7 µs ± 4.58 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
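The numbers above come from the two-row sample, so they may not carry over to a larger frame. Below is a rough sketch for re-running the comparison outside IPython with the standard timeit module. The frame size (200 rows) and the repeat count are arbitrary assumptions, and the absolute timings will differ by machine:

```python
import timeit

import pandas as pd

dd = {
    "getNode": 1, "getCodec": 2, "PackStore": 3, "DownRoute": 4,
    "MessageDigest": 5, "getInstance": 6, "SecureRandom": 7,
}
rows = [
    ["getNode", "getCodec", "PackStore", "DownRoute"],
    ["MessageDigest", "getInstance", "SecureRandom"],
]
# Assumption: repeat the two sample rows to get a 200-row frame.
df = pd.DataFrame({"column": rows * 100})

candidates = {
    "apply-lambda-map-tolist": lambda: df["column"].apply(
        lambda x: pd.Series(x).map(dd).tolist()
    ),
    "explode-squeeze-map-groupby": lambda: (
        df.explode("column").squeeze().map(dd).groupby(level=0).agg(list)
    ),
    "list-comprehension-map": lambda: pd.Series(
        [list(map(dd.get, l)) for l in df["column"]]
    ),
}

# Run each candidate once to check that they agree, then time them.
results = {name: fn().tolist() for name, fn in candidates.items()}
timings = {
    name: timeit.timeit(fn, number=3) / 3 for name, fn in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.3f} ms per call")
```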
CodePudding user response:
Let's use explode:
df.column.explode().map(dd).groupby(level=0).agg(list)
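One thing to watch with the explode approach is how it treats empty lists. The sketch below uses a small made-up frame (the empty row is an assumption, the question does not show one) to compare it with a plain list comprehension:

```python
import pandas as pd

# Hypothetical frame with an empty list in the middle row.
df = pd.DataFrame({"column": [["getNode", "getCodec"], [], ["SecureRandom"]]})
dd = {"getNode": 1, "getCodec": 2, "SecureRandom": 7}

# explode turns the empty list into a single NaN row, which map leaves as NaN.
via_explode = df["column"].explode().map(dd).groupby(level=0).agg(list)

# The list comprehension keeps the empty list as an empty list.
via_comprehension = pd.Series([[dd.get(name) for name in names]
                               for names in df["column"]])

print(via_explode.loc[1])        # one-element list holding NaN
print(via_comprehension.loc[1])  # []
```

So if the column can contain empty lists, the explode version returns a list holding NaN for those rows, and the mapped integers may be upcast to floats as a result, while the list comprehension leaves the empty list empty.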