Creating an empty dataframe with 50 columns with only 5 specific columns filled

Time:06-02

I have a pandas dataframe A that has 5 columns and a few hundred thousand rows. What I need is to create a dataframe B that has 50 columns, with 45 of them empty and the other 5 filled with the data I have in dataframe A.

The reason I need it in this format is that I eventually want to convert it to a CSV file with a comma (,) delimiter and most of the columns empty.

My Dataframe A looks like this:

id order first last type
1 111 Johnny Depp type1
2 222 Amber Heard type2

My Dataframe B should look something like this, with more empty columns at the end:

x order first last x x x x x x x type x x x x
empty 111 Johnny Depp empty empty empty empty empty empty empty type1 empty empty empty empty
empty 222 Amber Heard empty empty empty empty empty empty empty type2 empty empty empty empty

As you can see, I need to specify the position of the type column. This is because I eventually want to convert B to CSV with to_csv(sep=','), which should look like this:

,111,Johnny,Depp,,,,,,,,,type1,,,,,
,222,Amber,Heard,,,,,,,,,type2,,,,,

CodePudding user response:

import pandas as pd

a = pd.DataFrame({"id": [1, 2], "order": [111, 222], "first": ["Johnny", "Amber"], "last": ["Depp", "Heard"], "type": ["type1", "type2"]})
push = ["x", "order", "first", "last"] + list("x" * 7) + ["type"] + list("x" * 4)
cols = [f"x{num}" if value == "x" else value for num, value in enumerate(push)]
b = pd.DataFrame({col: a[col] if col in a.columns.to_list() else None for col in cols})
print(b)

This seems like a fairly arbitrary problem, but I think this solves your specific request. Feel free to change the "x" * 7 value to suit your layout. You can also replace None with np.nan (after import numpy as np), or with "" to insert empty strings. Your question is a bit vague about what "empty" means.

Output:

     x0  order   first   last    x4    x5    x6    x7    x8    x9   x10   type   x12   x13   x14   x15
0  None    111  Johnny   Depp  None  None  None  None  None  None  None  type1  None  None  None  None
1  None    222   Amber  Heard  None  None  None  None  None  None  None  type2  None  None  None  None
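
To check that this layout produces the kind of CSV the question asks for, here is a minimal sketch that writes b with to_csv. It assumes the placeholder headers (x0, x4, ...) and the row index should not appear in the file, so header=False and index=False are passed. By default pandas writes missing values (None/NaN) as empty fields, because na_rep defaults to "".

```python
import pandas as pd

a = pd.DataFrame({
    "id": [1, 2],
    "order": [111, 222],
    "first": ["Johnny", "Amber"],
    "last": ["Depp", "Heard"],
    "type": ["type1", "type2"],
})

# Same layout as in the answer above: "x" marks a column that stays empty.
push = ["x", "order", "first", "last"] + ["x"] * 7 + ["type"] + ["x"] * 4
cols = [f"x{num}" if value == "x" else value for num, value in enumerate(push)]
b = pd.DataFrame({col: a[col] if col in a.columns else None for col in cols})

# index=False drops the row labels; header=False drops the placeholder names.
# Missing values become empty fields because na_rep defaults to "".
csv_text = b.to_csv(index=False, header=False)
print(csv_text)
```

Calling to_csv without a path returns the CSV as a string, which is handy for checking the layout on a few rows before writing the full file.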

CodePudding user response:

OK, so I am assuming dataframe B already has the first 5 columns filled with the data you need.

You can then just make a loop to add however many blank columns you want:

i = 5  # however many columns the df started with

while i < 50:  # or however many columns you want in total
    df[f'column_{i}'] = ''
    i += 1
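
The loop above only appends blank columns at the end, whereas the question also needs gaps between the existing columns. One possible way to get both, sketched here under the assumption that the full 50-column layout is known up front, is DataFrame.reindex(columns=...): any column name not present in the original frame is created and filled with fill_value (NaN if omitted). The blank_N names and the positions chosen for order, first, last and type are hypothetical, taken from the example in the question.

```python
import pandas as pd

a = pd.DataFrame({
    "id": [1, 2],
    "order": [111, 222],
    "first": ["Johnny", "Amber"],
    "last": ["Depp", "Heard"],
    "type": ["type1", "type2"],
})

n_cols = 50
# Hypothetical layout: zero-based position -> column taken from a.
positions = {1: "order", 2: "first", 3: "last", 11: "type"}
# Every other position gets a placeholder name such as "blank_0".
layout = [positions.get(i, f"blank_{i}") for i in range(n_cols)]

# reindex keeps the listed columns that exist in a (in the given order),
# drops the rest ("id" here), and creates the missing ones with fill_value.
b = a.reindex(columns=layout, fill_value="")
print(b.shape)
```

Because the whole layout is built in one call, this avoids inserting columns one at a time, which tends to matter on frames with a few hundred thousand rows.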