Can anybody help me create a DataFrame in Python with a for loop? I want to create a DataFrame from two lists: list1 and a list of lists (list2), where the length of list1 equals the number of sublists in list2. An example:
list1 = ["A", "B", "C", "D"]
list2 = [[1, 2, 3, 4], [1, 3, 4, 6, 7], [2, 3, 4, 5, 6], [2, 4, 5, 7, 8]]
My goal is to pair each value in list1 with every value of the matching sublist in list2. My target table is shown below:
| Col1 | Col2 |
| -------- | -------- |
| A | 1 |
| A | 2 |
| A | 3 |
| A | 4 |
| B | 1 |
| B | 3 |
| B | 4 |
| B | 6 |
| B | 7 |
Thank you in advance for answering my questions.
CodePudding user response:
Try:
import pandas as pd
list1 = ["A", "B", "C", "D"]
list2 = [[1, 2, 3, 4], [1, 3, 4, 6, 7], [2, 3, 4, 5, 6], [2, 4, 5, 7, 8]]
df = pd.DataFrame(zip(list1, list2), columns=["Col1", "Col2"]).explode("Col2")
print(df)
Prints:
Col1 Col2
0 A 1
0 A 2
0 A 3
0 A 4
1 B 1
1 B 3
1 B 4
1 B 6
1 B 7
2 C 2
2 C 3
2 C 4
2 C 5
2 C 6
3 D 2
3 D 4
3 D 5
3 D 7
3 D 8
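Note that explode keeps the original row labels, which is why the index above repeats (0, 0, 0, 0, 1, ...), and the exploded column usually ends up with object dtype. If a fresh 0..n-1 index and integer values are wanted, a minimal sketch could look like this (the name `tidy` is only illustrative, not part of the answer above):

```python
import pandas as pd

list1 = ["A", "B", "C", "D"]
list2 = [[1, 2, 3, 4], [1, 3, 4, 6, 7], [2, 3, 4, 5, 6], [2, 4, 5, 7, 8]]

# Same construction as above: one row per label, then one row per sublist element
df = pd.DataFrame(zip(list1, list2), columns=["Col1", "Col2"]).explode("Col2")

# `tidy` is a hypothetical name: renumber the rows and cast Col2 to integers
tidy = df.reset_index(drop=True).astype({"Col2": int})
print(tidy.head())
```

This assumes every sublist is non-empty and holds only integers; an empty sublist would explode to NaN and the integer cast would fail.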
CodePudding user response:
Using numpy.repeat and a list comprehension:
import numpy as np
import pandas as pd
# repeat each label once per element of its sublist, then flatten list2
df = pd.DataFrame({'Col1': np.repeat(list1, list(map(len, list2))),
                   'Col2': [x for l in list2 for x in l]})
Output:
Col1 Col2
0 A 1
1 A 2
2 A 3
3 A 4
4 B 1
5 B 3
6 B 4
7 B 6
8 B 7
9 C 2
10 C 3
11 C 4
12 C 5
13 C 6
14 D 2
15 D 4
16 D 5
17 D 7
18 D 8
CodePudding user response:
Creating a dataframe from lists can be done this way:
df = pd.DataFrame(list(zip(list1, list2)),
                  columns=['list1_name', 'list2_name'])
Or this way, from a dictionary:
df = pd.DataFrame(data={"list1": list1, "list2": list2})
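Neither of these two forms splits the sublists into separate rows on its own: each sublist stays in a single cell. Since the question asked for a for loop, here is a rough sketch of one way to build the long table directly. The names `rows`, `label`, `values` and `value` are just placeholders chosen for this example:

```python
import pandas as pd

list1 = ["A", "B", "C", "D"]
list2 = [[1, 2, 3, 4], [1, 3, 4, 6, 7], [2, 3, 4, 5, 6], [2, 4, 5, 7, 8]]

# Collect one (label, value) pair per element of each sublist
rows = []
for label, values in zip(list1, list2):
    for value in values:
        rows.append((label, value))

# Build the DataFrame once at the end, rather than appending row by row
df = pd.DataFrame(rows, columns=["Col1", "Col2"])
print(df)
```

Collecting plain tuples first and constructing the DataFrame in one call tends to be simpler and faster than growing a DataFrame inside the loop.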