How to create a data frame from two lists of unequal length

Time:11-05

I have two lists, shown below:

column1= [30, 40, 50, 60,90,20,30,20,30]
column2= ['bat', 'ball','tent']

I want to create a dataframe from these two lists in which each value of column2 is repeated an equal number of times, so that column2 fills the same number of rows as column1. I tried to implement this, but I am getting the output below:

import pandas as pd
import numpy as np

column1= [30, 40, 50, 60,90,20,30,20,30]
column2= ['bat', 'ball','tent']
table1 = np.tile(column1, len(column2))
table2 = np.repeat(column2, len(column1))
df = pd.DataFrame({"table1": table1, "table2": table2})
print(df)

output received:-

 table1 table2
0   30  bat
1   40  bat
2   50  bat
3   60  bat
4   90  bat
5   20  bat
6   30  bat
7   20  bat
8   30  bat
9   30  ball
10  40  ball
11  50  ball
12  60  ball
13  90  ball
14  20  ball
15  30  ball
16  20  ball
17  30  ball
18  30  tent
19  40  tent
20  50  tent
21  60  tent
22  90  tent
23  20  tent
24  30  tent
25  20  tent
26  30  tent

Expected output:-

table1  table2
0   30  bat
1   40  bat
2   50  bat
3   60  ball
4   90  ball
5   20  ball
6   30  tent
7   20  tent
8   30  tent

Is there a better way to get the expected output?

CodePudding user response:

Use column1 as it is and apply np.repeat only to column2, repeating each value len(column1) // len(column2) times:

df = pd.DataFrame({'table1': column1,
                   'table2': np.repeat(column2, len(column1) // len(column2))})
print(df)

# Output:
   table1 table2
0      30    bat
1      40    bat
2      50    bat
3      60   ball
4      90   ball
5      20   ball
6      30   tent
7      20   tent
8      30   tent
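
To see why the original attempt produced 27 rows, it can help to compare the lengths of the arrays involved. The following is only a small sketch using the lists from the question; the variable names tiled, repeated_full, per_label and repeated are made up for illustration.

```python
import numpy as np

column1 = [30, 40, 50, 60, 90, 20, 30, 20, 30]
column2 = ['bat', 'ball', 'tent']

# Original attempt: both arrays grow to len(column1) * len(column2) elements.
tiled = np.tile(column1, len(column2))            # whole list repeated 3 times
repeated_full = np.repeat(column2, len(column1))  # each label repeated 9 times

# Fix: repeat each label only len(column1) // len(column2) times.
per_label = len(column1) // len(column2)
repeated = np.repeat(column2, per_label)

print(len(tiled), len(repeated_full), len(repeated))  # 27 27 9
```

np.tile repeats the whole sequence, while np.repeat repeats each element in place, which is why only np.repeat with the smaller count lines up with column1.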

CodePudding user response:

The code you are using is almost there. There is no need to tile column1; use it as it is and repeat only the values of column2, each one len(column1) divided by len(column2) times. You can try this, which returns the desired output:

import pandas as pd
import numpy as np

column1 = [30, 40, 50, 60, 90, 20, 30, 20, 30]
column2 = ['bat', 'ball', 'tent']
# No need to tile column1: use it as-is and repeat only column2.
# Use integer division (//): np.repeat rejects a float repeat count.
table2 = np.repeat(column2, len(column1) // len(column2))
df = pd.DataFrame({"table1": column1, "table2": table2})
print(df)

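Note that with this approach the length of column1 must be a multiple of the length of column2; otherwise the two columns end up with different lengths and pandas cannot build the DataFrame.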
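
If the lengths do not divide evenly, one possible workaround is to give np.repeat a separate count for each label. This is only a sketch: the helper name spread_labels is hypothetical, the 10-element column1 is made-up example data, and handing the leftover rows to the first labels is an assumption about what "equally repeated" should mean in that case.

```python
import numpy as np
import pandas as pd

def spread_labels(values, labels):
    """Repeat each label over consecutive rows; leftover rows go to the first labels."""
    base, extra = divmod(len(values), len(labels))
    # The first `extra` labels each get one additional row.
    counts = [base + 1 if i < extra else base for i in range(len(labels))]
    return np.repeat(labels, counts)

# Made-up data: 10 values and 3 labels, so 10 is not a multiple of 3.
column1 = [30, 40, 50, 60, 90, 20, 30, 20, 30, 10]
column2 = ['bat', 'ball', 'tent']

df = pd.DataFrame({"table1": column1, "table2": spread_labels(column1, column2)})
print(df)
```

Here 'bat' covers four rows while 'ball' and 'tent' cover three each. When the lengths do divide evenly, this gives the same result as the single-count version above.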
