I have the following sample data:
| | date | value |
|:---- |:------:| -----:|
| 0 | 2021/05| 50 |
| 1 | 2021/06| 60 |
| 2 | 2021/07| 70 |
| 3 | 2021/08| 80 |
| 4 | 2021/09| 90 |
| 5 | 2021/10| 100 |
I want to update the data in the 'date' column, where for example '2021/05' becomes '05/10/2021', '2021/06' becomes '06/12/2021' and so long (I have to choose the new date manually for every row).
Is there a better/more clever way to do it instead of:
for i in df.index:
if df['date'][i] == '2021/05':
df['date'][i] = '05/10/2021'
elif df['date'][i] == '2021/06':
df['date'][i] = '06/12/2021'
The problem is that there are more than hundred rows that have to be updated and the code above will be tremendously long.
CodePudding user response:
I would use an interactive approach, saving the amended DataFrame
to a file at the end:
import pandas as pd
dt = pd.DataFrame({"date":["2021/05", "2021/06", "2021/07", "2021/08", "2021/09", "2021/10"], "value": [50, 60, 70, 80, 90, 100]})
for n, i in enumerate(dt.loc[:,"date"]):
to_be_parsed = True
while parsed:
day = input("What is the day for {:s}?".format(i))
date_str = "{:s}/{:0>2s}".format(i, day)
try:
dt.loc[n,"date"] = pd.to_datetime("{:s}/{:0>2s}".format(i, day)).strftime("%m/%d/%Y")
to_be_parsed = False
except:
print("Invalid date: {:s}. Try again".format(date_str))
output_path = input("Save amended dataframe to path (no input to skip): ")
if len(output_path) > 0:
dt.to_csv(output_path, index=False)
CodePudding user response:
We can use the select
method from numpy
like so :
import numpy as np
condlist = [df['date'] == '2021/05',
df['date'] == '2021/06']
choicelist = ['05/10/2021',
'06/12/2021']
df['date'] = np.select(condlist, choicelist, default=np.nan)