I have a dataframe and I want to insert a record with the input date and leave the remaining columns NA if input date does not exist. If input date does exist do nothing
InputDate="5/1/2022"
if inputdate does not exist as a record insert new record
In
Invoice Date ... Check
1 2022-04-01 ... 1.30
2 2022-03-01 ... 1.19
Out
Invoice Date ... Check
0 2022-05-01 ... NaN
1 2022-04-01 ... 1.30
2 2022-03-01 ... 1.19
CodePudding user response:
You can use:
import pandas as pd
import numpy as np
df = pd.DataFrame({'Invoice Date': ['2022-04-01', '2022-03-01'], 'Check': [1.30, 1.19]})
df['Invoice Date'] = pd.to_datetime(df['Invoice Date'])
input_date="5/1/2022"
if pd.to_datetime(input_date) not in df['Invoice Date']:
row = {'Invoice Date': pd.to_datetime(input_date)}
row.update({col: np.nan for col in df.columns if col not in row})
df.loc[len(df)] = row
Output
Invoice Date Check
0 2022-04-01 1.30
1 2022-03-01 1.19
2 2022-05-01 NaN
If you want them sorted in reverse chronological order like shown above you can do:
df.sort_values(by='Invoice Date', ascending=False).reset_index(drop=True)
Output
Invoice Date Check
0 2022-05-01 NaN
1 2022-04-01 1.30
2 2022-03-01 1.19
CodePudding user response:
You could also set the date column as index and then use append
method with verify_integrity=True
.
If the index exists, it will throw an error, if not then add NaN values to the remaining columns.
import pandas as pd
import numpy as np
df = pd.DataFrame({'Invoice Date': ['2022-04-01', '2022-03-01'], 'Check': [1.30, 1.19], 'Check2': [1.30, 1.19]})
df = df.set_index('Invoice Date')
new_row = pd.DataFrame(np.nan, columns=df.columns, index=['2023-04-01'])
df.append(new_row, verify_integrity=True)
Output:
Check Check2
2022-04-01 1.30 1.30
2022-03-01 1.19 1.19
2023-04-01 NaN NaN
The following code will raise an error ValueError: Indexes have overlapping values: Index(['2022-03-01'], dtype='object')
. So you can catch it!
new_row = pd.DataFrame(np.nan, columns=df.columns, index=['2022-03-01'])
df.append(new_row, verify_integrity=True)