Adding 2 rows with 0s at the start and end of pandas dataframe

Time:12-07

I have a pandas DataFrame named df_x. I want to add two rows of 0s to it: one at the start and one at the end.

import pandas as pd

# Create the DataFrame
df_x = pd.DataFrame({'logvalue': ['20', '20.5', '18.5', '2', '10'],
                     'ID': ['1', '2', '3', '4', '5']})

Output should look like below.

logvalue  ID  violatedInstances
0         0   0
20        1   0
20.5      2   1
18.5      3   0
2         4   1
10        5   1
0         0   0

The index of the output should be renumbered as well. How can I do this in pandas?

CodePudding user response:

You can use concat:

  • First create a new dataframe (df_y) that contains a single row of zeros
  • Use the concat function to join this dataframe to both ends of the original
  • Use the reset_index(drop=True) method to renumber the index from 0.

Code:

import pandas as pd

df_x = pd.DataFrame({'logvalue': [20.0, 20.5, 18.5, 2.0, 10.0],
                     'ID': [1, 2, 3, 4, 5],
                     'violatedInstances': [0, 1, 0, 1, 1]})

# Extract the column names from the original dataframe
column_names = df_x.columns
number_of_columns = len(column_names)
row_of_zeros = [0]*number_of_columns

# Create a new dataframe that has a single row of zeros
df_y = pd.DataFrame([row_of_zeros], columns=column_names)

# Put the zero row before and after the original, then renumber the index
output = pd.concat([df_y, df_x, df_y]).reset_index(drop=True)

print(output)

Output:

   logvalue  ID  violatedInstances
0       0.0   0                  0
1      20.0   1                  0
2      20.5   2                  1
3      18.5   3                  0
4       2.0   4                  1
5      10.0   5                  1
6       0.0   0                  0
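
The question's frame stores its values as strings, while the code above uses numbers. A minimal sketch of bridging the two, assuming every string parses as a number: convert the columns with pd.to_numeric first, then pad with concat. The helper name pad_with_zeros is hypothetical and not part of pandas.

```python
import pandas as pd


def pad_with_zeros(df):
    """Return a copy of df with one all-zero row added at the top and bottom.

    pad_with_zeros is a hypothetical helper name, not a pandas function.
    """
    zero_row = pd.DataFrame([[0] * len(df.columns)], columns=df.columns)
    # Zero row first, original rows in the middle, zero row last;
    # reset_index(drop=True) renumbers the rows from 0
    return pd.concat([zero_row, df, zero_row]).reset_index(drop=True)


# The question's data, stored as strings
df_x = pd.DataFrame({'logvalue': ['20', '20.5', '18.5', '2', '10'],
                     'ID': ['1', '2', '3', '4', '5']})

# Assumption: every value parses as a number
numeric = df_x.apply(pd.to_numeric)

padded = pad_with_zeros(numeric)
print(padded)
```

Converting first keeps the columns numeric after padding. Without it, the integer zeros would sit alongside strings in object-typed columns.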

CodePudding user response:

Example

df_x = pd.DataFrame({'logvalue': ['20', '20.5', '18.5', '2', '10'],
                     'ID': ['1', '2', '3', '4', '5']})

df_x

    logvalue    ID
0   20          1
1   20.5        2
2   18.5        3
3   2           4
4   10          5

Code

Use reindex with fill_value:

idx = ['start'] + df_x.index.tolist() + ['end']
df_x.reindex(idx, fill_value=0).reset_index(drop=True)

result:

    logvalue    ID
0   0           0
1   20          1
2   20.5        2
3   18.5        3
4   2           4
5   10          5
6   0           0

'start' and 'end' in idx are placeholders: any labels will do, as long as they are not already in the index of df_x.
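
To see why the placeholder labels must be new, here is a small sketch on a hypothetical two-row frame (the frame and its column name a are made up for illustration). reindex applies fill_value only to labels that are missing from the original index, so reusing an existing label copies that row instead of producing zeros.

```python
import pandas as pd

# Hypothetical two-row frame used only for illustration; its index is [0, 1]
df = pd.DataFrame({'a': [10, 20]})

# 'start' and 'end' are not in the index, so those rows get fill_value
idx = ['start'] + df.index.tolist() + ['end']
padded = df.reindex(idx, fill_value=0).reset_index(drop=True)

# Reusing an existing label (0) copies that row instead of adding zeros
repeated = df.reindex([0] + df.index.tolist(), fill_value=0)

print(padded['a'].tolist())
print(repeated['a'].tolist())
```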
