I have a column with data like 9 years 3 months 22 days. How do I convert it to a decimal number of years, i.e. 9 + (3/12) + (22/365) = 9.31?
I want the output to be 9.31.
How can I do this in Python? Can anyone help me out?
CodePudding user response:
If leap years are not important, extract the values with Series.str.extract,
then build the new column by dividing months by 12 and days by 365 and summing the parts:
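import pandas as pd

df = pd.DataFrame({'col': ['9 years 3 months 22 days',
                           '3 YEARS 6 MONTHS',
                           '4 YEARS',
                           '3 YEARS',
                           '3.6']})

# Rows that are already plain numbers (e.g. '3.6'); everything else becomes NaN
numeric = pd.to_numeric(df['col'], errors='coerce')
print(numeric)

# Extract each unit case-insensitively and convert it to a fraction of a year
y = df['col'].str.extract(r'(?i)(\d+)\s*year', expand=False).astype(float)
m = df['col'].str.extract(r'(?i)(\d+)\s*month', expand=False).astype(float).div(12)
d = df['col'].str.extract(r'(?i)(\d+)\s*day', expand=False).astype(float).div(365)

# Sum the parts, treating a missing unit as 0; fall back to the plain number if no unit matched
df['float'] = y.add(m, fill_value=0).add(d, fill_value=0).fillna(numeric)
print(df)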
                        col     float
0  9 years 3 months 22 days  9.310274
1          3 YEARS 6 MONTHS  3.500000
2                   4 YEARS  4.000000
3                   3 YEARS  3.000000
4                       3.6  3.600000
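The three extract calls above follow one pattern, so they can be driven from a dictionary of divisors. This is only a sketch of that idea, not the answerer's code: the names `DIVISORS` and `to_years` are made up for illustration, and it keeps the same assumptions (365 days per year, 12 equal months, leap years ignored).

```python
import pandas as pd

# Hypothetical mapping: how many of each unit make up one year (leap years ignored)
DIVISORS = {"year": 1, "month": 12, "day": 365}

def to_years(col: pd.Series) -> pd.Series:
    """Convert strings like '9 years 3 months 22 days' to decimal years."""
    # One column per unit, each already expressed as a fraction of a year
    parts = pd.DataFrame({
        unit: col.str.extract(rf"(?i)(\d+)\s*{unit}", expand=False).astype(float) / div
        for unit, div in DIVISORS.items()
    })
    # min_count=1 keeps the total as NaN when no unit matched at all
    total = parts.sum(axis=1, min_count=1)
    # Fall back to values that are already plain numbers, e.g. '3.6'
    return total.fillna(pd.to_numeric(col, errors="coerce"))

result = to_years(pd.Series(["9 years 3 months 22 days", "3 YEARS 6 MONTHS",
                             "4 YEARS", "3 YEARS", "3.6"]))
print(result.round(2).tolist())
```

Adding another unit (weeks, for instance) would then only need one more dictionary entry, assuming you pick a divisor for it.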
CodePudding user response:
If you want to update the same column in place, you can loop over the rows:
import pandas as pd

df = pd.DataFrame({'time': ['9 years 3 months 22 days',
                            '9 years 3 months 21 days',
                            '9 years 3 months 20 days']})
for idx in range(len(df)):
    # Skip missing values
    if pd.isna(df.loc[idx, 'time']):
        continue
    # Extract values from the string
    val = df.loc[idx, 'time'].lower()
    if ' ' in val:
        val = val.split()
        # Pair each number with the unit that follows it, e.g. {'years': 9, 'months': 3, 'days': 22}
        t = {val[i + 1]: int(val[i]) for i in range(0, len(val), 2)}
        # Calculate the value in years, formatted as a string with two decimal places
        float_val = "{:.2f}".format(t.get("years", 0) + (t.get("months", 0) / 12) + (t.get("days", 0) / 365))
        # Update the table
        df.loc[idx, 'time'] = float_val
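The loop relies on the exact keys "years", "months" and "days", so a singular unit such as "1 year" would be silently counted as 0. A regex-based helper is one way around that. The following is a sketch under the same 12-month, 365-day assumption; `parse_duration`, `TOKEN` and `UNITS_PER_YEAR` are illustrative names, not part of the answer above.

```python
import re

# Assumed conversion factors: 12 months and 365 days per year (leap years ignored)
UNITS_PER_YEAR = {"year": 1, "month": 12, "day": 365}

# A number followed by a unit, singular or plural, in any letter case
TOKEN = re.compile(r"(\d+)\s*(year|month|day)s?", re.IGNORECASE)

def parse_duration(text):
    """Return the duration in years rounded to 2 places, or None if no unit is found."""
    pairs = TOKEN.findall(str(text))
    if not pairs:
        return None
    total = sum(int(number) / UNITS_PER_YEAR[unit.lower()] for number, unit in pairs)
    return round(total, 2)

print(parse_duration("9 years 3 months 22 days"))
print(parse_duration("1 year 6 months"))
```

With the DataFrame from this answer, something like df['time'] = df['time'].map(parse_duration) should replace the column with floats instead of formatted strings, with rows that contain no recognizable unit left as missing values.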