Join two columns of integers in a pandas DataFrame into a column of tuples

I want to combine two columns of features into one column, where each row will represent a data point as a tuple.

For example, here is my data frame:

      Weather  Temp  Play
0         2     1     0
1         2     1     0
2         0     1     1
3         1     2     1
4         1     0     1
5         1     0     0

I want it to look something like this:

                 x     Play
0              (2,1)     0
1              (2,1)     0
2              (0,1)     1
3              (1,2)     1
4              (1,0)     1
5              (1,0)     0

I then want to use this in model.fit(df['x'], df['Play']) for Bernoulli Naive Bayes.

Is this at all possible? I am trying to avoid using lists. How can I do this for n columns next time?

CodePudding user response:

df.apply() with axis=1 calls a function once per row, so it handles unusual row-wise cases such as this one:

df['x'] = df.apply(lambda row: (row.Weather, row.Temp), axis=1)

Output:

   Weather  Temp  Play       x
0        2     1     0  (2, 1)
1        2     1     0  (2, 1)
2        0     1     1  (0, 1)
3        1     2     1  (1, 2)
4        1     0     1  (1, 0)
5        1     0     0  (1, 0)
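
For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the sample in the question; the name `row` for the lambda argument is only an illustrative choice.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Weather": [2, 2, 0, 1, 1, 1],
    "Temp":    [1, 1, 1, 2, 0, 0],
    "Play":    [0, 0, 1, 1, 1, 0],
})

# axis=1 passes each row to the lambda; returning a tuple
# gives one tuple per row in the new column.
df["x"] = df.apply(lambda row: (row["Weather"], row["Temp"]), axis=1)

print(df)
```

Note that apply with axis=1 runs Python code for every row, so it tends to be slow on large frames.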

CodePudding user response:

You can use zip, which pairs up the values of the two columns row by row:

df['x'] = list(zip(df.Weather, df.Temp))

Output (on a different, smaller sample DataFrame than the one in the question):

   Weather  Temp  Play       x
0        1     1     4  (1, 1)
1        2     1     5  (2, 1)
2        3     1     6  (3, 1)
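
The question also asks how to do this for n columns. One possible sketch, assuming the columns to combine are listed in a variable (called `feature_cols` here, a made-up name), is to unpack them into zip, or to use itertuples on the selected columns:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Weather": [2, 2, 0, 1, 1, 1],
    "Temp":    [1, 1, 1, 2, 0, 0],
    "Play":    [0, 0, 1, 1, 1, 0],
})

# Hypothetical list of the columns to combine; it can have any length.
feature_cols = ["Weather", "Temp"]

# Option 1: unpack the selected columns into zip.
df["x"] = list(zip(*(df[col] for col in feature_cols)))

# Option 2: itertuples yields plain tuples when name=None.
x_alt = list(df[feature_cols].itertuples(index=False, name=None))

print(df)
```

Both options avoid the per-row Python function call that apply makes. If the end goal is scikit-learn's BernoulliNB, its fit method expects a 2-D feature array, so passing df[feature_cols] directly is likely simpler than building a tuple column.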