How do I repeat a set of values across all entries in a dataframe?

Time:12-02

I apologize if this question has already been asked; I don't know how to phrase it properly, so I couldn't find an answer.

I have a dataframe:

val1 val2
val1 val3
val2 val1
val2 val3

I want to append a set of years to every entry:

val1 val2 1990
val1 val2 1991
val1 val2 1992
val1 val3 1990
val1 val3 1991
val1 val3 1992
etc.

I figured out how to do this with a single column of values, but I have since added a second column and cannot replicate the process. There must be an easy way to do this, but I cannot find one. How can I do this?

CodePudding user response:

You can use a cross join in pandas (how='cross' requires pandas 1.2 or later).

>>> import pandas as pd
>>> df1 = pd.DataFrame({
...     'col1': ['val1', 'val1', 'val2', 'val2'],
...     'col2': ['val2', 'val3', 'val1', 'val3']
... })
>>> df1
   col1  col2
0  val1  val2
1  val1  val3
2  val2  val1
3  val2  val3
>>> df2 = pd.DataFrame({'col3': [1990, 1991, 1992]})
>>> df2
   col3
0  1990
1  1991
2  1992
>>> pd.merge(df1, df2, how='cross')
    col1  col2  col3
0   val1  val2  1990
1   val1  val2  1991
2   val1  val2  1992
3   val1  val3  1990
4   val1  val3  1991
5   val1  val3  1992
6   val2  val1  1990
7   val2  val1  1991
8   val2  val1  1992
9   val2  val3  1990
10  val2  val3  1991
11  val2  val3  1992
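
On pandas versions older than 1.2, where how='cross' is not available, the same result can probably be obtained by merging on a temporary constant key. The following is a minimal sketch of that workaround; the column name `_key` is a hypothetical placeholder chosen for illustration, not something from the answer above.

```python
import pandas as pd

df1 = pd.DataFrame({
    "col1": ["val1", "val1", "val2", "val2"],
    "col2": ["val2", "val3", "val1", "val3"],
})
df2 = pd.DataFrame({"col3": [1990, 1991, 1992]})

# Give both frames the same constant key (hypothetical name "_key"),
# merge on it so every row of df1 pairs with every row of df2,
# then drop the helper column again.
result = (
    df1.assign(_key=1)
    .merge(df2.assign(_key=1), on="_key")
    .drop(columns="_key")
)
print(result)
```

Pick a helper column name that does not clash with a real column in either frame.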

CodePudding user response:

Another way is to assign the list of years to every row and then explode that column (here df holds the two original columns, named A and B):

df["Year"] = [[1990, 1991, 1992]]*df.shape[0]
df = df.explode("Year")

>>> df
      A     B  Year
0  val1  val2  1990
0  val1  val2  1991
0  val1  val2  1992
1  val1  val3  1990
1  val1  val3  1991
1  val1  val3  1992
2  val2  val1  1990
2  val2  val1  1991
2  val2  val1  1992
3  val2  val3  1990
3  val2  val3  1991
3  val2  val3  1992
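
Note that after explode the index values repeat (0, 0, 0, 1, ...) and the Year column has object dtype. If that matters, a cleanup step along these lines may help. This is a sketch that rebuilds the example frame with the same A and B column names; the variable `years` is introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["val1", "val1", "val2", "val2"],
    "B": ["val2", "val3", "val1", "val3"],
})
years = [1990, 1991, 1992]  # illustrative name for the list of years

# Put the same list of years in every row, then expand it to one row per year.
df["Year"] = [years] * len(df)
df = (
    df.explode("Year")
    .astype({"Year": "int64"})   # explode leaves the column as object dtype
    .reset_index(drop=True)      # replace the repeated index with 0..n-1
)
print(df)
```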