I'm trying to change a DataFrame column using:
df.loc[df['xxx'].notna(), 'xxx'] = df.loc[df['xxx'].notna(), 'xxx'].astype(str).str[:10].str.replace('-','')
This does not seem to have any effect on the column's values. When I run it without the .loc[condition, 'xxx'] selection, it does seem to work:
df['xxx'] = df['xxx'].astype(str).str[:10].str.replace('-','')
This challenges my core understanding of pandas, since I always use .loc to change a subset of rows in a column.
I'm using pandas 1.2.4.
CodePudding user response:
The assignment does take effect in my test (code below), although my pandas version is 1.0.4.
import pandas as pd
print(pd.__version__)
df = pd.DataFrame(
{'xxx': ['AABBCC-DDEEE', 'DIs-sssssssssssP', 'KKK', 'A', 'A'],
'tmp': [1, 2, 3, 4, 5]})
print(df)
df.loc[df['xxx'].notna(), 'xxx'] = df.loc[df['xxx'].notna(), 'xxx'].astype(str).str[:10].str.replace('-','')
print(df)
The result is as follows:
1.0.4
                xxx  tmp
0      AABBCC-DDEEE    1
1  DIs-sssssssssssP    2
2               KKK    3
3                 A    4
4                 A    5
         xxx  tmp
0  AABBCCDDE    1
1  DIsssssss    2
2        KKK    3
3          A    4
4          A    5
CodePudding user response:
Your solution works correctly for me. Here is an alternative solution:
import numpy as np
import pandas as pd
df = pd.DataFrame({'xxx': ['AABBCC-DDEEE', 'AABBCC-DDEEE', np.nan, np.nan]})
print(df)
            xxx
0  AABBCC-DDEEE
1  AABBCC-DDEEE
2           NaN
3           NaN
df.update(df.loc[df['xxx'].notna(), 'xxx'].astype(str).str[:10].str.replace('-',''))
print(df)
         xxx
0  AABBCCDDE
1  AABBCCDDE
2        NaN
3        NaN
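One detail of DataFrame.update that is easy to trip over: it modifies the frame in place and returns None, and it only overwrites positions where the passed Series has a non-missing value. The following is a small sketch of that behaviour. The two-row sample frame and the names `cleaned` and `result` are assumptions made for illustration, and `regex=False` is added so that '-' is treated as a literal.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one string value, one missing value.
df = pd.DataFrame({'xxx': ['AABBCC-DDEEE', np.nan], 'tmp': [1, 2]})

# Clean only the non-missing rows; the Series keeps the name 'xxx',
# which is how update() knows which column to write to.
mask = df['xxx'].notna()
cleaned = df.loc[mask, 'xxx'].str[:10].str.replace('-', '', regex=False)

# update() aligns on the index, changes df in place and returns None,
# so assigning its return value back to df would discard the frame.
result = df.update(cleaned)

print(result)
print(df)
```

Rows that are absent from `cleaned` (here, the missing one) are left as they were, and the 'tmp' column is not touched.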
Your second solution converts missing values to the string 'nan':
df['xxx'] = df['xxx'].astype(str).str[:10].str.replace('-','')
print(df)
         xxx
0  AABBCCDDE
1  AABBCCDDE
2        nan
3        nan
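If the non-missing values are already strings, one possible way to avoid both the mask and the 'nan' strings is to drop astype(str) and rely on the .str accessor, which passes missing values through unchanged. This is only a sketch under that assumption; the helper name `clean_column` is hypothetical and not part of the original question.

```python
import numpy as np
import pandas as pd


def clean_column(s: pd.Series) -> pd.Series:
    """Keep the first 10 characters and drop hyphens; NaN stays NaN.

    Assumes every non-missing value is already a string.
    """
    # regex=False treats '-' as a literal character regardless of the
    # pandas version's default for the regex parameter.
    return s.str[:10].str.replace('-', '', regex=False)


df = pd.DataFrame({'xxx': ['AABBCC-DDEEE', 'AABBCC-DDEEE', np.nan, np.nan]})
df['xxx'] = clean_column(df['xxx'])

print(df)
print(df['xxx'].isna().tolist())
```

If the column holds non-string objects such as dates, the .str methods would return missing values for them, so in that case the masked .loc assignment or df.update with astype(str) remains the safer route.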