Interleave a list and distribute to two columns of a dataframe by pandas?

Time:04-21

I have a list, and I want to generate every ordered pair of its elements (including each element paired with itself) and put the pairs into two columns of a pandas DataFrame, like this:

import pandas as pd

df = pd.DataFrame(columns=["pair1", "pair2"])
mylist = ["a", "b", "c"]

for i in mylist:
    for j in mylist:
        # append one row at the next integer label
        df.loc[df.shape[0]] = [i, j]

which outputs:

  pair1 pair2
0     a     a
1     a     b
2     a     c
3     b     a
4     b     b
5     b     c
6     c     a
7     c     b
8     c     c

However, assigning rows one at a time like this is slow.

Is there a faster method?

CodePudding user response:

For a pandas solution, you could use pd.MultiIndex:

df[['pair1','pair2']] = pd.MultiIndex.from_product([mylist]*2).tolist()

or you could also cross-merge (if you have pandas>=1.2.0):

df = pd.merge(pd.Series(mylist, name='pair1'), pd.Series(mylist, name='pair2'), how='cross')

Output:

  pair1 pair2
0     a     a
1     a     b
2     a     c
3     b     a
4     b     b
5     b     c
6     c     a
7     c     b
8     c     c
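
As a self-contained variant of the MultiIndex idea, here is a minimal sketch that builds the frame directly rather than assigning into an existing df. It assumes you are free to create a new DataFrame, and it relies on MultiIndex.to_frame(index=False) to turn the index levels into ordinary columns:

```python
import pandas as pd

mylist = ["a", "b", "c"]

# Cartesian product of the list with itself; the level names become column names
pairs = pd.MultiIndex.from_product([mylist, mylist], names=["pair1", "pair2"])

# index=False gives a plain 0..n-1 row index instead of reusing the MultiIndex
df = pairs.to_frame(index=False)
print(df)
```

This avoids depending on a df defined earlier, so the snippet can be run on its own.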

CodePudding user response:

You can use itertools.product() to generate the data ahead of time, rather than repeatedly appending to the end of the dataframe:

import pandas as pd
from itertools import product

mylist = ["a", "b", "c"]
df = pd.DataFrame(product(mylist, repeat=2), columns = ["pair1","pair2"])
print(df)

This outputs:

  pair1 pair2
0     a     a
1     a     b
2     a     c
3     b     a
4     b     b
5     b     c
6     c     a
7     c     b
8     c     c
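
To check that building the data up front gives the same rows as the loop in the question, a small sketch like the following may help. The helper names pairs_frame and pairs_frame_loop are made up for illustration and are not part of pandas:

```python
import pandas as pd
from itertools import product


def pairs_frame(items):
    """Hypothetical helper: build all ordered pairs in a single constructor call."""
    return pd.DataFrame(list(product(items, repeat=2)), columns=["pair1", "pair2"])


def pairs_frame_loop(items):
    """The row-by-row approach from the question, kept only for comparison."""
    df = pd.DataFrame(columns=["pair1", "pair2"])
    for i in items:
        for j in items:
            df.loc[df.shape[0]] = [i, j]
    return df


mylist = ["a", "b", "c"]
fast = pairs_frame(mylist)
slow = pairs_frame_loop(mylist)

# Compare the cell values row by row; both should list the pairs in the same order
print(fast.values.tolist() == slow.values.tolist())
```

The loop version enlarges the frame once per row, which is why it gets slow as the list grows; the product version hands pandas all the rows in one go.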