I have the following df:
df = pd.DataFrame({'A': ['foo', 'bar', 'dex', 'tru'],
                   'B': ['abc', 'def', 'ghi', 'jkl']})
Which looks like:
     A    B
0  foo  abc
1  bar  def
2  dex  ghi
3  tru  jkl
I also have the following list:
block_number = [1000, 2000]
I want the following output, new_df:
new_df = pd.DataFrame({'A': ['foo', 'bar', 'dex', 'tru', 'foo', 'bar', 'dex', 'tru'],
                       'B': ['abc', 'def', 'ghi', 'jkl', 'abc', 'def', 'ghi', 'jkl'],
                       'block_number': [1000, 1000, 1000, 1000, 2000, 2000, 2000, 2000]})
Which looks like:
     A    B  block_number
0  foo  abc          1000
1  bar  def          1000
2  dex  ghi          1000
3  tru  jkl          1000
4  foo  abc          2000
5  bar  def          2000
6  dex  ghi          2000
7  tru  jkl          2000
Basically, I need every row of df paired with each item in block_number, so that the rows of df are repeated once for every item in the list.
How should I proceed?
CodePudding user response:
Try itertools.product:
from itertools import product
df_out = pd.DataFrame(
    product(df.A, block_number), columns=["A", "block_number"]
).sort_values(by="block_number")
print(df_out)
Prints:
     A  block_number
0  foo          1000
2  bar          1000
4  dex          1000
6  tru          1000
1  foo          2000
3  bar          2000
5  dex          2000
7  tru          2000
EDIT: With new input:
df_out = pd.DataFrame(
    product(zip(df.A, df.B), block_number),
    columns=["tmp", "block_number"],
).sort_values(by="block_number")
df_out[["A", "B"]] = df_out.pop("tmp").apply(pd.Series)
print(df_out)
Prints:
   block_number    A    B
0          1000  foo  abc
2          1000  bar  def
4          1000  dex  ghi
6          1000  tru  jkl
1          2000  foo  abc
3          2000  bar  def
5          2000  dex  ghi
7          2000  tru  jkl
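As an alternative sketch (not part of the answer above), a cross merge pairs every row of df with every value in block_number without packing and unpacking tuples. This assumes pandas 1.2 or later, where merge accepts how="cross"; the helper frame name blocks is introduced here only for illustration.

```python
import pandas as pd

# Input from the question.
df = pd.DataFrame({'A': ['foo', 'bar', 'dex', 'tru'],
                   'B': ['abc', 'def', 'ghi', 'jkl']})
block_number = [1000, 2000]

# Hypothetical helper frame: one row per block number.
blocks = pd.DataFrame({"block_number": block_number})

# Cross merge: each block number is combined with every row of df.
# Putting blocks on the left groups the result by block number.
new_df = blocks.merge(df, how="cross")

# Reorder the columns to match the layout requested in the question.
new_df = new_df[["A", "B", "block_number"]]
print(new_df)
```

Because the merge builds a fresh frame, the index runs from 0 to 7 and no sort_values call is needed.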
CodePudding user response:
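Putting the block numbers first in the product yields rows that are already grouped by block_number, so no sort is needed:
import itertools
pd.DataFrame(itertools.product([1000, 2000], df.A), columns=["block_number", "A"])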
Out[18]:
   block_number    A
0          1000  foo
1          1000  bar
2          1000  dex
3          1000  tru
4          2000  foo
5          2000  bar
6          2000  dex
7          2000  tru
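The itertools.product call above carries only column A. A minimal sketch of the same idea extended to every column of df follows, assuming the df and block_number from the question; the name rows is illustrative and not from the original answer.

```python
import itertools

import pandas as pd

# Input from the question.
df = pd.DataFrame({'A': ['foo', 'bar', 'dex', 'tru'],
                   'B': ['abc', 'def', 'ghi', 'jkl']})
block_number = [1000, 2000]

# Iterate block numbers in the outer position so the output is already
# grouped by block; each df row is a plain tuple of its column values.
rows = [
    (*row, block)
    for block, row in itertools.product(
        block_number, df.itertuples(index=False, name=None)
    )
]

# Rebuild a frame with the original columns plus block_number.
new_df = pd.DataFrame(rows, columns=[*df.columns, "block_number"])
print(new_df)
```

This keeps the column order A, B, block_number from the question and works for any number of columns in df, at the cost of looping over rows in Python.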