I have a column of values where some rows have a range of values. I want to keep the second value of the range.
[10%,15%,30%,16.1% - 22%,16.1% - 22%,16.189% - 20.67%,16.189% - 20.67%]
The code I use to loop through the column and act on that condition is:
for i in df.column:
    if ' - ' in str(i):
        split = i.replace(' ', '').split('-')[1]
        print(split)
When I print split, I get the value that I want. However, I can't figure out how to update the value in the column to keep the value I want.
When I do this:
for i in df.column:
    if ' - ' in str(i):
        split = i.replace(' ', '').split('-')[1]
        df['column'][i] = split
I get:
C:\Users\lukem\AppData\Local\Temp\ipykernel_15752\833050674.py:4: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df['column'][i] =split
Any advice is appreciated.
CodePudding user response:
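You can use apply with a lambda to transform the whole column at once. Splitting on ' - ' (with the spaces) avoids leaving a leading space in the result:
import pandas as pd
df = pd.DataFrame(['10%','15%','30%','16.1% - 22%','16.1% - 22%','16.189% - 20.67%','16.189% - 20.67%'], columns=['column'])
# Keep the part after ' - ' for ranges; leave single values unchanged
df['column'] = df['column'].apply(lambda x: x.split(' - ')[1] if ' - ' in x else x)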
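As for the warning in the question: it comes from the chained indexing in df['column'][i], and i there is a cell value rather than a row label. A minimal sketch of an in-place update that avoids both problems, assuming every entry is a string, uses a boolean mask with a single .loc call (this is a different technique from apply, and the name is_range is only illustrative):

```python
import pandas as pd

df = pd.DataFrame({'column': ['10%', '15%', '30%', '16.1% - 22%', '16.189% - 20.67%']})

# Boolean mask marking the rows that hold a range (literal match, not a regex)
is_range = df['column'].str.contains(' - ', regex=False)

# One .loc call selects rows and column together, so the assignment
# targets df itself instead of a temporary intermediate object
df.loc[is_range, 'column'] = df.loc[is_range, 'column'].str.split(' - ').str[1]

print(df['column'].tolist())
```

Rows that are not ranges are never touched, which can matter if the column is large or mixed.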
CodePudding user response:
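Instead of the loop in your first snippet, you can pass a function to apply and assign the result back to the column (function_name is a placeholder for your own function):
df['column'] = df['column'].apply(function_name)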
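A sketch of what such a function might look like, reusing the logic from the loop in the question. The name keep_second_value is hypothetical, and the function assumes a range is always written with ' - ' between its two values:

```python
import pandas as pd

def keep_second_value(value):
    """Return the part after ' - ' for a range; return anything else unchanged."""
    text = str(value)
    if ' - ' in text:
        return text.split(' - ')[1]
    return value

df = pd.DataFrame({'column': ['10%', '15%', '16.1% - 22%', '16.189% - 20.67%']})

# apply calls the function once per cell and returns a new Series,
# which then replaces the original column
df['column'] = df['column'].apply(keep_second_value)

print(df['column'].tolist())
```

A named function is easier to test and extend than a lambda, for example if missing values need special handling later.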
CodePudding user response:
You can use str.extract for this:
import pandas as pd
data = {'column': ['10%','15%','30%','16.1% - 22%','16.1% - 22%',
'16.189% - 20.67%','16.189% - 20.67%']}
df = pd.DataFrame(data)
df['column'] = df.column.str.extract(r'(\d+\.?\d+%$)')
print(df)
column
0 10%
1 15%
2 30%
3 22%
4 22%
5 20.67%
6 20.67%
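Note that a pattern of the form \d+\.?\d+ needs at least two digits, so a value such as 5% would come back as NaN. A possible variant, offered as an assumption rather than part of the answer above, makes the decimal part optional and passes expand=False so that a Series is returned:

```python
import pandas as pd

df = pd.DataFrame({'column': ['5%', '10%', '16.1% - 22%', '16.189% - 20.67%']})

# \d+          one or more digits
# (?:\.\d+)?   optional decimal part (non-capturing group)
# %$           a percent sign at the very end of the string
pattern = r'(\d+(?:\.\d+)?%)$'

# expand=False with a single capture group yields a Series, not a DataFrame
df['column'] = df['column'].str.extract(pattern, expand=False)

print(df['column'].tolist())
```

Anchoring on the end of the string is what selects the second value of a range: the first value is never followed by the end of the string, so it cannot match.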
Or you could use str.split with expand=True, apply ffill on axis=1, and select only the last column:
df['column'] = df.column.str.split(' - ', expand=True).ffill(axis=1).iloc[:,-1]
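To see why this works, the chain can be broken into its intermediate steps. This is a sketch on a shortened version of the data; the variable names parts and filled are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'column': ['10%', '16.1% - 22%', '16.189% - 20.67%']})

# Step 1: split into separate columns. Rows without ' - ' get a
# missing value in the second column.
parts = df['column'].str.split(' - ', expand=True)

# Step 2: forward-fill along each row, so a missing second column
# is filled with the value from the first column.
filled = parts.ffill(axis=1)

# Step 3: the last column now holds the single value, or the second
# value of a range.
df['column'] = filled.iloc[:, -1]

print(df['column'].tolist())
```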