Is there an efficient way to add leading zeroes to the values in a DataFrame column if:
- The value is an integer, and
- Another column in the same row has a given value
E.g. for the following DataFrame, how do I create a new DataFrame that has leading zeroes in the values of col_2 if:
- the col_2 values are integers (i.e. not "text" nor "text2"), and
- col_1 == "A"
Initial DataFrame:
  col_1   col_2
0     A   12345
1     B     863
2     A    text
3     C  893423
4     D   text2
Desired output DataFrame:
  col_1     col_2
0     A  00012345
1     B       863
2     A      text
3     C    893423
4     D     text2
CodePudding user response:
You can use pd.to_numeric with errors="coerce" to check which values are numeric (non-numeric values become NaN):
# is the value numeric?
m1 = pd.to_numeric(df['col_2'], errors='coerce').notna()
# is col_1 equal to "A"?
m2 = df['col_1'].eq('A')
# pick the rows matching both conditions
# and do something with it
df[m1&m2]
If you want to match only integers rather than any numeric value (e.g. floating-point numbers), you can use:
s = pd.to_numeric(df['col_2'], errors='coerce')
# is a numeric value and an integer
m1 = s.notna() & s.eq(s.round())
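
As a quick illustration of the integer check above, here is a minimal sketch on a made-up column (the sample values are an assumption, not taken from the question). Numeric strings are parsed as numbers by pd.to_numeric, so they count as integers here too:

```python
import pandas as pd

# Hypothetical sample: an int, a float, plain text, and a numeric string
col = pd.Series([12345, 863.5, "text", "893423"])

# Non-numeric entries become NaN instead of raising an error
s = pd.to_numeric(col, errors="coerce")

# Keep values that are numeric and equal to their rounded value
is_integer = s.notna() & s.eq(s.round())
print(is_integer.tolist())
```

The float 863.5 and the string "text" are rejected, while 12345 and "893423" pass the check.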
Zero-filling the matching rows:
m1 = pd.to_numeric(df['col_2'], errors='coerce').notna()
m2 = df['col_1'].eq('A')
df.loc[m1&m2, 'col_2'] = df.loc[m1&m2, 'col_2'].astype(str).str.zfill(8)
output:
  col_1     col_2
0     A  00012345
1     B       863
2     A      text
3     C    893423
4     D     text2
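
Since the question asks for a new DataFrame, the steps above could be wrapped roughly as follows. This is only a sketch: the helper name zfill_integers and its width and key parameters are hypothetical, and the sample data assumes col_2 holds a mix of Python ints and strings.

```python
import pandas as pd

def zfill_integers(df, width=8, key="A"):
    """Hypothetical helper: pad integer values of col_2 where col_1 == key."""
    out = df.copy()  # leave the caller's DataFrame untouched
    numeric = pd.to_numeric(out["col_2"], errors="coerce")
    # numeric, integer-valued, and col_1 matches the key
    mask = numeric.notna() & numeric.eq(numeric.round()) & out["col_1"].eq(key)
    out.loc[mask, "col_2"] = out.loc[mask, "col_2"].astype(str).str.zfill(width)
    return out

# Assumed reconstruction of the question's data
df = pd.DataFrame({
    "col_1": ["A", "B", "A", "C", "D"],
    "col_2": [12345, 863, "text", 893423, "text2"],
})

result = zfill_integers(df)
print(result)
```

Only the padded cells become strings; the other values in col_2 keep their original types, and the original df is not modified.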
CodePudding user response:
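One idea is to filter on col_1 and test the type with isinstance. Note that DataFrame.apply passes each column to the function as a whole Series, so the check below returns False for every column instead of testing the individual values: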
df[df['col_1']=='A'].apply(lambda x: isinstance(x, int) )
Output :
col_1 False
col_2 False
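
For a per-value check, one option is to select col_2 as a Series first, since Series.apply calls the function once per element. The sketch below assumes col_2 is an object column holding Python ints and strings; values stored as NumPy integers or as numeric strings would not pass isinstance(x, int).

```python
import pandas as pd

# Assumed reconstruction of the question's data
df = pd.DataFrame({
    "col_1": ["A", "B", "A", "C", "D"],
    "col_2": [12345, 863, "text", 893423, "text2"],
})

# Select col_2 for the rows where col_1 == "A", then test each value
is_int = df.loc[df["col_1"] == "A", "col_2"].apply(lambda x: isinstance(x, int))
print(is_int)
```

The result is a boolean Series indexed by the matching rows (0 and 2 here), which can then be used to pick the values to pad.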