Multiply and repeat every row of df by n-item list

Time:10-15

I have the following df:

df = pd.DataFrame({'A': ['foo', 'bar', 'dex', 'tru'],
                   'B': ['abc', 'def', 'ghi', 'jkl']})

Which looks like:

     A    B
0  foo  abc
1  bar  def
2  dex  ghi
3  tru  jkl

I also have the following list:

block_number = [1000, 2000]

I want the following output new_df:

new_df = pd.DataFrame({'A': ['foo', 'bar', 'dex', 'tru', 'foo', 'bar', 'dex', 'tru'],
                       'B': ['abc', 'def', 'ghi', 'jkl', 'abc', 'def', 'ghi', 'jkl'],
                       'block_number': [1000, 1000, 1000, 1000, 2000, 2000, 2000, 2000]})

Which looks like:

     A    B  block_number
0  foo  abc          1000
1  bar  def          1000
2  dex  ghi          1000
3  tru  jkl          1000
4  foo  abc          2000
5  bar  def          2000
6  dex  ghi          2000
7  tru  jkl          2000

Basically, I need every row of df repeated once for each item in block_number, with that item stored in a new block_number column.

How should I proceed?

CodePudding user response:

Try itertools.product:

from itertools import product

df_out = pd.DataFrame(
    product(df.A, block_number), columns=["A", "block_number"]
).sort_values(by="block_number")

print(df_out)

Prints:

     A  block_number
0  foo          1000
2  bar          1000
4  dex          1000
6  tru          1000
1  foo          2000
3  bar          2000
5  dex          2000
7  tru          2000

EDIT: With new input:

df_out = pd.DataFrame(
    product(zip(df.A, df.B), block_number),
    columns=["tmp", "block_number"],
).sort_values(by="block_number")
df_out[["A", "B"]] = df_out.pop("tmp").apply(pd.Series)

print(df_out)

Prints:

   block_number    A    B
0          1000  foo  abc
2          1000  bar  def
4          1000  dex  ghi
6          1000  tru  jkl
1          2000  foo  abc
3          2000  bar  def
5          2000  dex  ghi
7          2000  tru  jkl
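
The output above has the right rows, but it keeps the interleaved index (0, 2, 4, ...) and puts block_number first, so it does not match the requested new_df exactly. If an exact match matters, one possible alternative (a sketch, not part of the original answer) is to tag a copy of df with each block number via DataFrame.assign and stack the copies with pd.concat. The name new_df is taken from the question; everything else is standard pandas.

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'dex', 'tru'],
                   'B': ['abc', 'def', 'ghi', 'jkl']})
block_number = [1000, 2000]

# One copy of df per block number, each tagged with that number,
# stacked in list order. ignore_index=True renumbers the rows from 0.
new_df = pd.concat(
    [df.assign(block_number=n) for n in block_number],
    ignore_index=True,
)

print(new_df)
```

This keeps every column of df without naming them, at the cost of building one intermediate frame per list item.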

CodePudding user response:

import itertools

pd.DataFrame(itertools.product([1000, 2000], df.A), columns=["block_number", "A"])
Out[18]: 
   block_number    A
0          1000  foo
1          1000  bar
2          1000  dex
3          1000  tru
4          2000  foo
5          2000  bar
6          2000  dex
7          2000  tru
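
The one-liner above only carries column A. A possible extension to all columns, assuming the same trick of passing the block numbers as the first argument to product so that they vary slowest, is sketched below. The names rows and new_df are illustrative and not from the original answer.

```python
from itertools import product

import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'dex', 'tru'],
                   'B': ['abc', 'def', 'ghi', 'jkl']})
block_number = [1000, 2000]

# product varies its last argument fastest, so with block_number first
# we get every row of df for 1000, then every row of df for 2000.
rows = [
    (*row, n)
    for n, row in product(block_number, df.itertuples(index=False, name=None))
]

# Original columns first, then the new block_number column.
new_df = pd.DataFrame(rows, columns=[*df.columns, "block_number"])

print(new_df)
```

Because the rows are generated in the desired order, no sort_values call is needed and the index runs from 0 to 7.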