I have a dataset where I would like to append rows if this value is missing from a specific column.
Data
ID Date type energy
AA Q1 2022 a 1
AA Q2 2022 ok 1
AA Q3 2022 yes 1
AA Q4 2022 yes 4
FC Q1 2022 no 4
FC Q2 2022 no 4
FC Q3 2022 yes 45
FC Q4 2022 yes 5
Desired
ID Date type energy
AA Q1 2022 a 1
AA Q2 2022 ok 1
AA Q3 2022 yes 1
AA Q4 2022 yes 4
FC Q1 2022 no 4
FC Q2 2022 no 4
FC Q3 2022 yes 45
FC Q4 2022 yes 5
BC Q1 2022 0
DA Q1 2022 0
BC Q1 2023 0
DA Q1 2023 0
Doing
stat = {
'BC':'Q1 2022', Q1 2023
'DA':'Q1 2022', Q1 2023 }
df0 = df.append(pd.DataFrame({'ID':[x for x in stat], 'Date':[stat[x] for x in stat]}))
df1 = df.merge(df0, how='left').fillna({'energy': 0})
However, this is not appending, not sure how to attach the values dictionary within this script. I am still researching, any suggestion is appreciated.
CodePudding user response:
You can append new rows to a dataframe
in various ways. One of which is to use lists
and loc
method:
import pandas as pd
data = pd.read_csv('data.csv', sep=';')
new_data = [
['BC', 'Q1 2022', '', 0],
['DA', 'Q1 2022', '', 0],
['BC', 'Q1 2023', '', 0],
['DA', 'Q1 2023', '', 0]
]
for row in new_data:
data.loc[len(data)] = row
print(data)
Which will give you the following output:
ID Date type energy
0 AA Q1 2022 a 1
1 AA Q2 2022 ok 1
2 AA Q3 2022 yes 1
3 AA Q4 2022 yes 4
4 FC Q1 2022 no 4
5 FC Q2 2022 no 4
6 FC Q3 2022 yes 45
7 FC Q4 2022 yes 5
8 BC Q1 2022 0
9 DA Q1 2022 0
10 BC Q1 2023 0
11 DA Q1 2023 0
From the other hand, if you want to use dictonaries
specifically, then you can use append
method:
data = pd.read_csv('data.csv', sep=';')
new_data = {
'ID': ['BC', 'DA', 'BC', 'DA'],
'Date': ['Q1 2022', 'Q1 2022', 'Q1 2023', 'Q1 2023'],
'type': ['', '', '', ''],
'energy': [0, 0, 0, 0]
}
data = data.append(pd.DataFrame(new_data), ignore_index=True)
print(data)
It will give you the same output as previously.
Here you can find more information on how to append data to dataframe
: Quick Examples of Append Row to DataFrame