Assuming a dataframe where the content of a column is a list of 0 to n strings
import pandas as pd
df = pd.DataFrame({'col_w_list': [
    ['c/100/a/111', 'c/100/a/584', 'c/100/a/324'],
    ['c/100/a/327'],
    ['c/100/a/324', 'c/100/a/327'],
    ['c/100/a/111', 'c/100/a/584', 'c/100/a/999'],
    ['c/100/a/584', 'c/100/a/327', 'c/100/a/999'],
]})
How would I go about transforming the column (either in place or into a new one) if all I want is the last set of digits of each string, as in:
| | target_still_list |
|--|-----------------------|
|0 | ['111', '584', '324'] |
|1 | ['327'] |
|2 | ['324', '327'] |
|3 | ['111', '584', '999'] |
|4 | ['584', '327', '999'] |
I know how to handle this one list at a time
from os import path
ls = ['c/100/a/111','c/100/a/584','c/100/a/324']
new_ls = [path.split(x)[1] for x in ls]
# or, alternatively
new_ls = [x.split('/')[3] for x in ls]
But I have failed to do the same over a dataframe. For instance:
df['target_still_list'] = df['col_w_list'].apply([lambda x: x.split('/')[3] for x in df['col_w_list']])
This throws an AttributeError at me.
Answer:
How to apply transformation to each element?
For a DataFrame, you can use pandas.DataFrame.applymap (deprecated since pandas 2.1 in favor of pandas.DataFrame.map).
For a Series, you can use pandas.Series.map or pandas.Series.apply, which is what your attempt uses.
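Your error is caused by how the lambda is written. apply expects a single function and calls it with one element x of the column at a time. Here each x is a list, not a string, so x.split fails. Your attempt also wraps the list comprehension around the lambda, which hands apply a list of lambdas instead of one function. Put the comprehension inside the lambda and iterate over the items of x there.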
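To see this for yourself, here is a small sketch that records what apply hands to the function. It uses a shortened copy of the question's data, and the variable names seen_types and has_split are only illustrative.

```python
import pandas as pd

# Shortened copy of the question's data: each cell holds a list of strings.
df = pd.DataFrame({'col_w_list': [['c/100/a/111', 'c/100/a/584'],
                                  ['c/100/a/327']]})

# Record the type of each value that apply passes to the function.
seen_types = df['col_w_list'].apply(lambda x: type(x).__name__).tolist()
print(seen_types)

# A list has no .split method, which is why x.split('/') fails.
has_split = hasattr(df['col_w_list'].iloc[0], 'split')
print(has_split)
```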
The corrected code is:
df['target_still_list'] = df['col_w_list'].apply(lambda x: [item.split('/')[-1] for item in x])
# or
# df['target_still_list'] = df['col_w_list'].map(lambda x: [item.split('/')[-1] for item in x])
# or (NOTE: This assignment works only if df has only one column.)
# df['target_still_list'] = df.applymap(lambda x: [item.split('/')[-1] for item in x])
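For reference, here is a self-contained sketch of the same approach that can be run as-is against the question's data. The helper name last_segments is hypothetical, and it uses rsplit('/', 1)[-1] instead of split('/')[-1]. Both return the text after the final slash, and rsplit only splits once.

```python
import pandas as pd

df = pd.DataFrame({'col_w_list': [
    ['c/100/a/111', 'c/100/a/584', 'c/100/a/324'],
    ['c/100/a/327'],
    ['c/100/a/324', 'c/100/a/327'],
    ['c/100/a/111', 'c/100/a/584', 'c/100/a/999'],
    ['c/100/a/584', 'c/100/a/327', 'c/100/a/999'],
]})


def last_segments(paths):
    """Return the part after the final '/' for every string in paths.

    Hypothetical helper; an empty list simply yields an empty list.
    """
    return [p.rsplit('/', 1)[-1] for p in paths]


# apply calls last_segments once per cell, passing that cell's list.
df['target_still_list'] = df['col_w_list'].apply(last_segments)

result = df['target_still_list'].tolist()
print(result)
```

A named function behaves the same as the lambda version. It is just easier to test on a single list before applying it to the whole column.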