I have a DataFrame with a column of lists, like this:

column
--------
['getNode', 'getCodec', 'PackStore', 'DownRoute']
['MessageDigest', 'getInstance', 'SecureRandom']
...

I also have a dictionary that looks like this:

{
'getNode': 1,
'getCodec': 2,
'PackStore': 3,
'DownRoute': 4,
'MessageDigest': 5,
'getInstance': 6,
'SecureRandom': 7,
...
}

My goal is to replace each item in the lists in column with its corresponding value from the dictionary, i.e.:

column
--------
[1,2,3,4]
[5,6,7]
...

I have tried calling:

df.column.map(dict)

But I get an error: TypeError: unhashable type: 'list'

Any help would be awesome! Thanks!

CodePudding user response：

Try apply:

df.column.apply(lambda x: pd.Series(x).map(dct).tolist())

Or just:

df.column.apply(lambda x: list(map(dct.get, x)))
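
For reference, here is a minimal self-contained sketch of the second variant, built on the sample data from the question. The DataFrame construction is my assumption about how the data is laid out. Note that dct.get returns None for any name missing from the dictionary, so it may be worth checking for None afterwards if your real data is not fully covered.

```python
import pandas as pd

# Sample data copied from the question (the construction itself is assumed).
df = pd.DataFrame({
    "column": [
        ["getNode", "getCodec", "PackStore", "DownRoute"],
        ["MessageDigest", "getInstance", "SecureRandom"],
    ]
})
dct = {
    "getNode": 1, "getCodec": 2, "PackStore": 3, "DownRoute": 4,
    "MessageDigest": 5, "getInstance": 6, "SecureRandom": 7,
}

# Look up every name in each row's list; names not in dct become None.
df["column"] = df["column"].apply(lambda names: [dct.get(name) for name in names])
print(df["column"].tolist())
```

The original df.column.map(dict) fails because map tries to use each whole list as a dictionary key, and lists are not hashable. Here the lookup happens per element instead.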

CodePudding user response：

Here's another way:

df.explode('column').squeeze().map(dd).groupby(level=0).agg(list)

Output:

0    [1, 2, 3, 4]
1       [5, 6, 7]
Name: column, dtype: object

Option 2:

pd.Series([list(map(dd.get, l)) for l in df['column']])

Output:

0    [1, 2, 3, 4]
1       [5, 6, 7]
dtype: object

Timings:

%timeit df.column.apply(lambda x: pd.Series(x).map(dd).tolist())

1.15 ms ± 39.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

%timeit df.explode('column').squeeze().map(dd).groupby(level=0).agg(list)

2.56 ms ± 78.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit pd.Series([list(map(dd.get, l)) for l in df['column']])

88.7 µs ± 4.58 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
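
If you want to reproduce the comparison outside IPython, something like the sketch below should work with the standard timeit module. The frame size (the two sample rows repeated 200 times) and the function names are my own choices for illustration, not part of the measurements above, so expect different absolute numbers.

```python
import timeit

import pandas as pd

dd = {
    "getNode": 1, "getCodec": 2, "PackStore": 3, "DownRoute": 4,
    "MessageDigest": 5, "getInstance": 6, "SecureRandom": 7,
}
rows = [
    ["getNode", "getCodec", "PackStore", "DownRoute"],
    ["MessageDigest", "getInstance", "SecureRandom"],
]
# Hypothetical size: repeat the two sample rows 200 times (400 rows).
df = pd.DataFrame({"column": rows * 200})


def via_apply():
    return df["column"].apply(lambda x: pd.Series(x).map(dd).tolist())


def via_explode():
    return df["column"].explode().map(dd).groupby(level=0).agg(list)


def via_comprehension():
    return pd.Series([list(map(dd.get, l)) for l in df["column"]])


# Check that all three approaches agree before comparing speed.
results_match = (
    via_apply().tolist() == via_explode().tolist() == via_comprehension().tolist()
)
print("results match:", results_match)

for fn in (via_apply, via_explode, via_comprehension):
    seconds = timeit.timeit(fn, number=3)
    print(f"{fn.__name__}: {seconds / 3 * 1000:.2f} ms per call")
```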

CodePudding user response：

Let us use explode:

df.column.explode().map(dd).groupby(level=0).agg(list)
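
To see what each step of that chain does, here is a small sketch that splits it into intermediate variables. The sample frame and the variable names are assumed for illustration. One caveat: the regrouping relies on the original index labels being unique, since groupby(level=0) would merge rows that share a label.

```python
import pandas as pd

# Sample data from the question (the construction itself is assumed).
df = pd.DataFrame({
    "column": [
        ["getNode", "getCodec", "PackStore", "DownRoute"],
        ["MessageDigest", "getInstance", "SecureRandom"],
    ]
})
dd = {
    "getNode": 1, "getCodec": 2, "PackStore": 3, "DownRoute": 4,
    "MessageDigest": 5, "getInstance": 6, "SecureRandom": 7,
}

# One row per name; each row keeps the index label of the list it came from.
exploded = df["column"].explode()

# The values are now plain strings, which are hashable, so map works.
mapped = exploded.map(dd)

# Collect the mapped values back into one list per original index label.
result = mapped.groupby(level=0).agg(list)
print(result)
```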