I have a pandas dataframe A that has 5 columns and a few 100 thousand rows. What I need is to create a dataframe B that has 50 columns with 45 of them empty and the other 5 filled with the data I have in dataframe A.
The reason I need it in this format is because I want to eventually covert to a csv file with a (,) delimiter and most of the columns empty.
My Dataframe A looks like this:
id | order | first | last | type |
---|---|---|---|---|
1 | 111 | Johnny | Depp | type1 |
2 | 222 | Amber | Heard | type2 |
my Dataframe B should look something like this with more empty columns at the end:
x | order | first | last | x | x | x | x | x | x | x | type | x | x | x | x |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
empty | 111 | Johnny | Depp | empty | empty | empty | empty | empty | empty | empty | type1 | empty | empty | empty | empty |
empty | 222 | Amber | Heard | empty | empty | empty | empty | empty | empty | empty | type2 | empty | empty | empty | empty |
As you can see I need to specify the position of the column for the type
column. This is because I eventually want to convert to CSV with the function
to_csv(delimiter=',')
which will eventually looks like this:
,111,Johnny,Depp,,,,,,,,,type1,,,,,
,222,Amber,Heard,,,,,,,,,type2,,,,,
CodePudding user response:
import pandas as pd
a = pd.DataFrame({"id": [1, 2], "order": [111, 222], "first": ["Johnny", "Amber"], "last": ["Depp", "Heard"], "type": ["type1", "type2"]})
push = ["x", "order", "first", "last"] list("x" * 7) ["type"] list("x" * 4)
cols = [f"x{num}" if value == "x" else value for num, value in enumerate(push)]
b = pd.DataFrame({col: a[col] if col in a.columns.to_list() else None for col in cols})
print(b)
Seems like a fairly arbitrary problem, but I think this solves your specific request. Feel free to change the "x" * 7
value to reflect your wishes. Also you can replace None
with np.nan
if you import numpy as np
. Or you could replace None
with ""
to insert empty strings. Your questions is a bit vague by stating "empty".
Output:
x0 order first last x4 x5 x6 x7 x8 x9 x10 type x12 x13 x14 x15
0 None 111 Johnny Depp None None None None None None None type1 None None None None
1 None 222 Amber Heard None None None None None None None type2 None None None None
CodePudding user response:
Ok, so I am assuming dataframe B already has the the first 5 columns filled with the data you need.
You can then just make a loop to add however many blank columns you want:
i=4 # However many columns the df started with
while i < 50: # or however many blank columns you want to add
df[f'column_{i}'] = ''
i =1