I want to add multiple empty rows at the start of my dataframe. I have tried using a list, but it doesn't seem to give the desired result:
Example df:
Col1 | col2 | col3 | col4 |
---|---|---|---|
One | Two | Three | four |
2 | 4 | 5 | 8 |
Desired df:
Col1 | col2 | col3 | col4 |
One | Two | Three | four |
2 | 4 | 5 | 8 |
I want to add n empty rows at the beginning of my dataframe, so the column names should also start from the nth row.
CodePudding user response:
I'm not sure why you would want to do this, but I did it by splitting the original dataframe into a one-row dataframe holding the column names and a separate dataframe holding the data. I then created a dataframe of NaNs to serve as the blank rows and joined the three together. You will need to import numpy (as np) and pandas (as pd) for this.
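For a reproducible starting point, the example dataframe from the question can be built roughly like this. This is only a sketch: the values are copied from the question's table, and storing the second row as integers is an assumption.

```python
import numpy as np
import pandas as pd

# Example data copied from the question; integer storage of row 2 is assumed.
df = pd.DataFrame(
    [["One", "Two", "Three", "four"], [2, 4, 5, 8]],
    columns=["Col1", "col2", "col3", "col4"],
)
```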
To simplify the code, I created a variable no_cols for the number of columns in the dataframe and no_empty_rows for the number of empty rows to add:
no_cols = len(df.columns)
no_empty_rows = 6
Then I turned the columns into their own dataframe, with 1 row which is the column names, and headers as np.nan:
cols = pd.DataFrame([df.columns], columns = [np.nan]*no_cols)
NaN NaN NaN NaN
0 Col1 col2 col3 col4
Next I renamed the columns in the original dataframe to nan:
df.columns = [np.nan]*no_cols
NaN NaN NaN NaN
0 One Two Three four
1 2 4 5 8
Then I created a new dataframe of nans, with 6 blank rows (this can be changed):
df_empty_rows = pd.DataFrame(data=[[np.nan]*no_cols]*no_empty_rows,
                             columns=[np.nan]*no_cols,
                             index=[np.nan]*no_empty_rows)
NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
You can then concatenate all three. First I put the column-name row and the data of df back together and reset their index, then joined that onto df_empty_rows (DataFrame.append was removed in pandas 2.0, so pd.concat is used here):
df_out = pd.concat([df_empty_rows, pd.concat([cols, df]).reset_index(drop=True)])
NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
NaN NaN NaN NaN NaN
0.0 Col1 col2 col3 col4
1.0 One Two Three four
2.0 2 4 5 8
Full code:
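import numpy as np
import pandas as pd

no_cols = len(df.columns)
no_empty_rows = 6  # number of blank rows to add; change as needed
# One-row dataframe holding the original column names
cols = pd.DataFrame([df.columns], columns=[np.nan]*no_cols)
df.columns = [np.nan]*no_cols
df_empty_rows = pd.DataFrame(data=[[np.nan]*no_cols]*no_empty_rows,
                             columns=[np.nan]*no_cols,
                             index=[np.nan]*no_empty_rows)
# DataFrame.append was removed in pandas 2.0, so use pd.concat
df_out = pd.concat([df_empty_rows, pd.concat([cols, df]).reset_index(drop=True)])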
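A more compact variant of the same idea, offered only as a sketch: instead of building three dataframes with NaN labels, it stacks plain Python lists and constructs one dataframe with the default integer labels. The helper name prepend_empty_rows is hypothetical, and the example data is assumed from the question's table.

```python
import numpy as np
import pandas as pd

def prepend_empty_rows(df, n):
    """Return a new frame: n all-NaN rows, then the header row, then the data."""
    empty = [[np.nan] * df.shape[1] for _ in range(n)]
    header = [df.columns.tolist()]
    body = df.to_numpy().tolist()
    # No column names are passed, so pandas assigns default labels 0..k-1.
    return pd.DataFrame(empty + header + body)

# Example data assumed from the question's table.
df = pd.DataFrame(
    [["One", "Two", "Three", "four"], [2, 4, 5, 8]],
    columns=["Col1", "col2", "col3", "col4"],
)
df_out = prepend_empty_rows(df, 6)
```

Unlike the step-by-step version, this leaves df itself unchanged and gives the result a plain 0..n index. If the goal is a file with blank leading lines, writing it with df_out.to_csv(path, header=False, index=False) should keep the default labels out of the output.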