How to append lists of different sizes to each column of an empty pandas DataFrame in a loop?

Hi guys, this is getting frustrating! After long hours of searching online, I cannot find a single source that helps here. How can I append lists of different sizes to the columns of an empty pandas DataFrame? For instance, I have these three variables:

var1 = ['BBCL15', 'KL12TT', 'TMAA03', '1523FR']
var2 = [253, 452, 16]
var3 = ['23n2', 'akg_9', '12.3bl', '30x2', 'dd91']

And I would like to append them to an empty pandas DataFrame in a loop:

df = pd.DataFrame(columns=['col1', 'col2', 'col3'])

# something like this.
for x in var1:
    df['col1'].append(pd.Series(x), ignore_index=True)

for x in var2:
    df['col2'].append(pd.Series(x), ignore_index=True)

for x in var3:
    df['col3'].append(pd.Series(x), ignore_index=True)

Each variable corresponds to a single column, and the empty cells should be filled with NaN because the variables do not all have the same length. Can someone help with this?

CodePudding user response：

>>> import pandas as pd
>>> cols = ['col1', 'col2', 'col3']
>>> df = pd.DataFrame(columns=cols)
>>> max_len = max([len(var1), len(var2), len(var3)])
>>> for col, var in zip(cols, [var1, var2, var3]):
...     # pad each list with None up to the length of the longest one
...     df[col] = var + [None] * (max_len - len(var))
>>> df
     col1   col2    col3
0  BBCL15  253.0    23n2
1  KL12TT  452.0   akg_9
2  TMAA03   16.0  12.3bl
3  1523FR    NaN    30x2
4    None    NaN    dd91
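
If the manual padding feels like too much bookkeeping, one alternative (not part of the answer above, just a minimal sketch) is to wrap each list in a pd.Series and build the frame from a dict. pandas aligns the Series on their index and fills the missing cells with NaN. The variable and column names are the ones from the question.

```python
import pandas as pd

var1 = ['BBCL15', 'KL12TT', 'TMAA03', '1523FR']
var2 = [253, 452, 16]
var3 = ['23n2', 'akg_9', '12.3bl', '30x2', 'dd91']

cols = ['col1', 'col2', 'col3']

# Each list becomes a Series with a default 0..n-1 index. The DataFrame
# constructor takes the union of those indexes and leaves NaN where a
# shorter Series has no value.
df = pd.DataFrame({col: pd.Series(var) for col, var in zip(cols, [var1, var2, var3])})
print(df)
```

No explicit loop over the rows and no max_len calculation are needed with this variant.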

CodePudding user response：

Create a list of lists so that list comprehensions can be used:

lists = [var1, var2, var3]

Get the length of the longest list:

longest_length = max([len(v) for v in lists])

Pad the lists as required:

padded_lists = [v + [float("NaN")] * (longest_length - len(v)) for v in lists]

Create the data frame:

pd.DataFrame(padded_lists).T
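
The pad-then-transpose steps above can probably be collapsed with itertools.zip_longest from the standard library, which is a different technique from the one in this answer: it walks the lists in parallel and substitutes a fill value once a shorter list runs out. A minimal sketch, assuming the variables and column names from the question:

```python
from itertools import zip_longest

import pandas as pd

var1 = ['BBCL15', 'KL12TT', 'TMAA03', '1523FR']
var2 = [253, 452, 16]
var3 = ['23n2', 'akg_9', '12.3bl', '30x2', 'dd91']

# zip_longest yields one tuple per row; fillvalue is used wherever a
# shorter list has already been exhausted.
rows = list(zip_longest(var1, var2, var3, fillvalue=float("nan")))

# The tuples are already rows, so no transpose is needed.
df = pd.DataFrame(rows, columns=['col1', 'col2', 'col3'])
print(df)
```

Because the tuples are rows rather than columns, the column names can be passed directly to the constructor.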

CodePudding user response：

Here is another solution using pd.concat:

var1 = ['BBCL15', 'KL12TT', 'TMAA03', '1523FR']
var2 = [253, 452, 16]
var3 = ['23n2', 'akg_9', '12.3bl', '30x2', 'dd91']


df = pd.DataFrame()

for i in [var1, var2, var3]:
    # each list is added as a new column; shorter lists are padded with NaN
    df = pd.concat([df, pd.Series(i)], axis=1, ignore_index=True)

df.columns = ['col1', 'col2', 'col3']

Note: with this solution, create the data frame without column names and assign them only after the loop, as shown above.
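
If the loop itself is not a hard requirement, the same pd.concat idea can be written as a single call, which avoids rebuilding the frame on every iteration. This is only a sketch under the question's variable names; giving each Series a name is my own addition so that the columns are labelled without a separate assignment.

```python
import pandas as pd

var1 = ['BBCL15', 'KL12TT', 'TMAA03', '1523FR']
var2 = [253, 452, 16]
var3 = ['23n2', 'akg_9', '12.3bl', '30x2', 'dd91']

cols = ['col1', 'col2', 'col3']

# Build one named Series per list, then concatenate them side by side
# in one call; the Series names become the column labels.
series = [pd.Series(var, name=col) for col, var in zip(cols, [var1, var2, var3])]
df = pd.concat(series, axis=1)
print(df)
```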