I have a DataFrame in which I am changing row values based on conditions.
Current Dataframe:
import pandas as pd
import re
data = [['ACK_ID','TEXT',30],
['TOT_ACTIVE_PARTCP_CNT','NUMERIC'],
['ADMIN_SIGNED_DATE', "TEXT", 30],
['BENEF_RCVG_BNFT_CNT','NUMERIC'],
['SPONS_SIGNED_DATE','TEXT',30]]
df = pd.DataFrame(data, columns=['FIELD_NAME', 'TYPE','SIZE (only for text fields)'])
#Change all "NUMERIC" to "FLOAT" in ['TYPE'] column.
df.loc[df["TYPE"] == "NUMERIC", "TYPE"] = "FLOAT"
I also want to set ['TYPE'] to 'DATE' for every row whose ['FIELD_NAME'] entry contains 'DATE'. I want to use a regex to capture 'DATE'.
Code attempt with regex:
df.loc[df["FIELD_NAME"] == r'^.*DATE $', "TYPE"] = "DATE"
This code does not change the dataframe at all.
The desired output is:
data = [['ACK_ID','TEXT',30],
['TOT_ACTIVE_PARTCP_CNT','FLOAT'],
['ADMIN_SIGNED_DATE', "DATE", 30],
['BENEF_RCVG_BNFT_CNT','FLOAT'],
['SPONS_SIGNED_DATE','DATE',30]]
df = pd.DataFrame(data, columns=['FIELD_NAME', 'TYPE','SIZE (only for text fields)'])
CodePudding user response:
You can simply use .str.contains:
df.loc[df["FIELD_NAME"].str.contains("DATE"), "TYPE"] = "DATE"
print(df)
Prints:
              FIELD_NAME   TYPE  SIZE (only for text fields)
0                 ACK_ID   TEXT                         30.0
1  TOT_ACTIVE_PARTCP_CNT  FLOAT                          NaN
2      ADMIN_SIGNED_DATE   DATE                         30.0
3    BENEF_RCVG_BNFT_CNT  FLOAT                          NaN
4      SPONS_SIGNED_DATE   DATE                         30.0
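As for why the original attempt changed nothing: `==` compares each value to the pattern as a literal string, so no regex matching ever happens. The following is a minimal sketch of that difference on a shortened copy of the question's data (the three-row frame and the mask names are just for illustration):

```python
import pandas as pd

# Shortened, illustrative copy of the question's data.
df = pd.DataFrame({
    "FIELD_NAME": ["ACK_ID", "TOT_ACTIVE_PARTCP_CNT", "ADMIN_SIGNED_DATE"],
    "TYPE": ["TEXT", "FLOAT", "TEXT"],
})

# == tests for equality with the literal text "^.*DATE$"; nothing is parsed as a regex.
eq_mask = df["FIELD_NAME"] == r"^.*DATE$"

# .str.contains interprets its argument as a regex by default (regex=True).
contains_mask = df["FIELD_NAME"].str.contains("DATE")

df.loc[contains_mask, "TYPE"] = "DATE"

print(eq_mask.tolist())        # every entry is False
print(contains_mask.tolist())  # True only for ADMIN_SIGNED_DATE
print(df["TYPE"].tolist())
```

Since no field name is literally equal to the pattern text, the `==` mask selects no rows and the assignment is a no-op.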
CodePudding user response:
You can use str.contains with a regular expression.
df.loc[df['FIELD_NAME'].str.contains(r'^.*DATE$'), 'TYPE'] = 'DATE'
print(df)
              FIELD_NAME   TYPE  SIZE (only for text fields)
0                 ACK_ID   TEXT                         30.0
1  TOT_ACTIVE_PARTCP_CNT  FLOAT                          NaN
2      ADMIN_SIGNED_DATE   DATE                         30.0
3    BENEF_RCVG_BNFT_CNT  FLOAT                          NaN
4      SPONS_SIGNED_DATE   DATE                         30.0
If DATE is always at the end, you could also just use str.endswith:
df.loc[df['FIELD_NAME'].str.endswith('DATE'), 'TYPE'] = 'DATE'
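The choice between these options matters once a field name contains DATE somewhere other than the end. Here is a small sketch comparing them; the name `DATE_FORMAT_CODE` and the missing value are made up for illustration and are not in the question's data:

```python
import pandas as pd

# Hypothetical field names: one ends in DATE, one starts with it, one is missing.
names = pd.Series(["ADMIN_SIGNED_DATE", "DATE_FORMAT_CODE", None])

# na=False makes missing values count as "no match", so the mask stays boolean.
anywhere = names.str.contains("DATE", na=False)        # DATE at any position
at_end_regex = names.str.contains(r"DATE$", na=False)  # regex anchored at the end
at_end_plain = names.str.endswith("DATE", na=False)    # plain suffix test, no regex

print(anywhere.tolist())
print(at_end_regex.tolist())
print(at_end_plain.tolist())
```

On this input the two end-anchored masks agree with each other, and only the unanchored `contains("DATE")` also picks up `DATE_FORMAT_CODE`. Passing `na=False` is worth considering if the column may contain missing values, since a mask containing NaN cannot be used for boolean indexing with `.loc`.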