Concatenate Pandas array-type columns element-wise

I want to concatenate two Pandas columns that hold arrays (lists of strings), element by element.

Input

Year    Month
['2021','2020','']  ['11','12','']
['2019','2020','']  ['11','12','']

Output

Output
['202111','202012','']
['201911','202012','']

CodePudding user response:

Use a list comprehension if the lists may have different lengths from row to row:

df['Output'] = [[c + d for c, d in zip(a, b)] for a, b in zip(df['Year'], df['Month'])]
print (df)
             Year       Month              Output
0  [2021, 2020, ]  [11, 12, ]  [202111, 202012, ]
1  [2019, 2020, ]  [11, 12, ]  [201911, 202012, ]
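
One thing to keep in mind with the list comprehension: zip() stops at the shorter of the two lists, so if Year and Month differ in length within a single row, the unmatched trailing elements are silently dropped. The following is a minimal sketch, assuming you would rather keep them, that swaps in itertools.zip_longest with an empty-string fill value. The sample data and the OutputPadded column name are made up for illustration.

```python
from itertools import zip_longest

import pandas as pd

# Hypothetical sample data: in the last row, Year has one more element than Month.
df = pd.DataFrame({
    "Year": [["2021", "2020", ""], ["2019", "2020", ""], ["2018", "2017"]],
    "Month": [["11", "12", ""], ["11", "12", ""], ["07"]],
})

# zip() stops at the shorter list, so the unmatched '2017' is dropped.
df["Output"] = [[y + m for y, m in zip(ys, ms)]
                for ys, ms in zip(df["Year"], df["Month"])]

# zip_longest() pads the shorter list with '' so no element is lost.
df["OutputPadded"] = [[y + m for y, m in zip_longest(ys, ms, fillvalue="")]
                      for ys, ms in zip(df["Year"], df["Month"])]

print(df[["Output", "OutputPadded"]])
```

Which behaviour is right depends on what a missing month should mean in your data; truncating may be the safer default if a year without a month is invalid.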

If every list has the same length in both columns and in all rows (here 3), use:

df1 = pd.DataFrame(df['Year'].tolist()) + pd.DataFrame(df['Month'].tolist())
print (df1)
        0       1 2
0  202111  202012  
1  201911  202012  


df['Output'] = df1.to_numpy().tolist()
print (df)
             Year       Month              Output
0  [2021, 2020, ]  [11, 12, ]  [202111, 202012, ]
1  [2019, 2020, ]  [11, 12, ]  [201911, 202012, ]
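
The DataFrame-based approach only makes sense when every list really has the same length. A small self-contained sketch, assuming the question's sample data, that checks this before concatenating could look as follows; the guard and the names lengths and wide are illustrative additions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "Year": [["2021", "2020", ""], ["2019", "2020", ""]],
    "Month": [["11", "12", ""], ["11", "12", ""]],
})

# Guard (an added assumption check): every list in both columns must have the same length.
lengths = pd.concat([df["Year"].map(len), df["Month"].map(len)])
if lengths.nunique() != 1:
    raise ValueError("lists differ in length; use the list comprehension instead")

# Expand each list column into its own frame, add the frames cell by cell
# (string + string concatenates), then pack each row back into a list.
wide = pd.DataFrame(df["Year"].tolist()) + pd.DataFrame(df["Month"].tolist())
df["Output"] = wide.to_numpy().tolist()

print(df)
```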

CodePudding user response:

You can try explode (passing a list of columns requires pandas 1.3 or later):

df['Output'] = np.sum(df.explode(['Year', 'Month']), axis=1) \
                 .groupby(level=0).apply(list)

For roughly 5,000,000 rows (see the setup below), the above operation took 1min 2s.

Setup:

data = {'Year': [['2021', '2020', ''], ['2019', '2020', ''], ['2018']],
        'Month': [['11', '12', ''], ['11', '12', ''], ['07']]}
df = pd.DataFrame(data)
df1 = df.reindex(df.index.repeat(1666666)).reset_index(drop=True)

In [721]: %timeit -n 1 np.sum(df1.explode(['Year', 'Month']), axis=1).groupby(level=0).apply(list)
1min 2s ± 998 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
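
To compare the approaches on your own machine, a rough timing sketch along the following lines may help. It assumes the same three-row setup, uses a much smaller, hypothetical repeat count so it finishes quickly, and spells the explode step out as a plain string addition instead of np.sum. The function names are made up for illustration, and no timings are claimed here.

```python
import time

import pandas as pd

data = {"Year": [["2021", "2020", ""], ["2019", "2020", ""], ["2018"]],
        "Month": [["11", "12", ""], ["11", "12", ""], ["07"]]}
df = pd.DataFrame(data)

repeats = 1_000  # hypothetical value; the benchmark above used 1666666
big = df.reindex(df.index.repeat(repeats)).reset_index(drop=True)


def with_comprehension(frame):
    """Pair up the two lists of each row and concatenate the strings."""
    return [[y + m for y, m in zip(ys, ms)]
            for ys, ms in zip(frame["Year"], frame["Month"])]


def with_explode(frame):
    """Explode both columns, concatenate, then regroup by the original row index."""
    exploded = frame.explode(["Year", "Month"])
    joined = exploded["Year"] + exploded["Month"]
    return joined.groupby(level=0).agg(list).tolist()


for func in (with_comprehension, with_explode):
    start = time.perf_counter()
    result = func(big)
    elapsed = time.perf_counter() - start
    print(f"{func.__name__}: {elapsed:.4f} s for {len(result)} rows")
```

A single perf_counter measurement is noisy; for numbers you intend to report, %timeit or the timeit module with several runs is the better tool.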