Joining variable sized lists as columns to a dataframe

Time:06-25

I would like to add several lists of different lengths as columns to an empty dataframe, joining from the right. The code looks like this:

import pandas as pd

L1 = [1,2,3,4]
L2 = [5,1,7,10,8,2,3]
cols = ['L1', 'L2']
df = pd.DataFrame(columns=cols)
df = df.join(pd.DataFrame(L1), how="right")
df = df.join(pd.DataFrame(L2), how="right")
print(df)

But I get this error:

    df = df.join(pd.DataFrame(L2), how="left")
  File "/home/mnaderan/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 8110, in join
    return self._join_compat(
  File "/home/mnaderan/.local/lib/python3.8/site-packages/pandas/core/frame.py", line 8135, in _join_compat
    return merge(
  File "/home/mnaderan/.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 89, in merge
    return op.get_result()
  File "/home/mnaderan/.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 686, in get_result
    llabels, rlabels = _items_overlap_with_suffix(
  File "/home/mnaderan/.local/lib/python3.8/site-packages/pandas/core/reshape/merge.py", line 2178, in _items_overlap_with_suffix
    raise ValueError(f"columns overlap but no suffix specified: {to_rename}")
ValueError: columns overlap but no suffix specified: Index([0], dtype='object')

How can I fix that?

P.S.: In the real code, the lists are generated by a function, but the number of lists is known.

cols = ['L1', 'L2']
df = pd.DataFrame(columns=cols)
for i in range(0,2):
    L = get_list()
    df = df.join(pd.DataFrame(L), how="right")
print(df)
df.boxplot(column=cols) 

CodePudding user response:

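Use a dict comprehension that wraps each list in a pd.Series. pandas aligns the Series on their index and fills the shorter columns with NaN, so lists of different sizes can be combined into one dataframe: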

L = [get_list() for i in range(0,2)]

df = pd.DataFrame({a: pd.Series(b) for a, b in zip(cols, L)})
print(df)
    L1  L2
0  1.0   5
1  2.0   1
2  3.0   7
3  4.0  10
4  NaN   8
5  NaN   2
6  NaN   3
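
For reference, here is a self-contained sketch of the same idea that can be run as-is. The get_list(i) below is only a hypothetical stand-in for the function in the question (it returns the two example lists), and the pd.concat variant is an alternative I am adding, not part of the answer above. The original error most likely comes from pd.DataFrame(L) always naming its single column 0, so the second join sees that column twice.

```python
import pandas as pd


def get_list(i):
    # Hypothetical stand-in for the question's get_list():
    # returns lists of different lengths.
    data = [[1, 2, 3, 4], [5, 1, 7, 10, 8, 2, 3]]
    return data[i]


cols = ['L1', 'L2']
lists = [get_list(i) for i in range(len(cols))]

# Each list becomes a Series; the DataFrame constructor aligns them on
# the index and pads the shorter column with NaN.
df = pd.DataFrame({name: pd.Series(values) for name, values in zip(cols, lists)})

# Alternative: name each Series and concatenate along the column axis.
df_concat = pd.concat(
    [pd.Series(values, name=name) for name, values in zip(cols, lists)],
    axis=1,
)

print(df)
```

Because the missing entries are NaN, the L1 column is stored as float. The df.boxplot(column=cols) call from the question should then work on this df.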