I want to combine two columns of features into one column, where each row will represent a data point as a tuple.
For example, here is my data frame:
   Weather  Temp  Play
0        2     1     0
1        2     1     0
2        0     1     1
3        1     2     1
4        1     0     1
5        1     0     0
I want it to look something like this:
       x  Play
0  (2,1)     0
1  (2,1)     0
2  (0,1)     1
3  (1,2)     1
4  (1,0)     1
5  (1,0)     0
I then want to use this in model.fit(df['x'], df['Play']) for Bernoulli Naive Bayes.
Is this possible at all? I am trying to avoid using lists. How can I do the same for n columns next time?
CodePudding user response:
df.apply() can handle a variety of unusual cases such as this one. With axis=1 the function is called once per row:
df['x'] = df.apply(lambda row: (row['Weather'], row['Temp']), axis=1)
Output:
   Weather  Temp  Play       x
0        2     1     0  (2, 1)
1        2     1     0  (2, 1)
2        0     1     1  (0, 1)
3        1     2     1  (1, 2)
4        1     0     1  (1, 0)
5        1     0     0  (1, 0)
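For the "n columns" part of the question, one possible generalization is to select the columns by name and let apply build the tuple from the whole row. This is only a sketch: the feature_cols list is a hypothetical name introduced here, and the data is copied from the question.

```python
import pandas as pd

# Data copied from the question.
df = pd.DataFrame({
    "Weather": [2, 2, 0, 1, 1, 1],
    "Temp":    [1, 1, 1, 2, 0, 0],
    "Play":    [0, 0, 1, 1, 1, 0],
})

# Hypothetical list of feature columns; extend it to any number of columns.
feature_cols = ["Weather", "Temp"]

# tuple() is called on each row (axis=1), turning that row's values into one tuple.
df["x"] = df[feature_cols].apply(tuple, axis=1)

print(df)
```

Only the list of column names changes when more features are added. Note that scikit-learn estimators such as BernoulliNB generally expect a 2D array of features, so passing df[feature_cols] to fit directly may be a better match than a column of tuples.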
CodePudding user response:
You can use zip, which pairs the two columns element by element:
df['x'] = list(zip(df.Weather, df.Temp))
Output (on a different sample frame than the one in the question):
   Weather  Temp  Play       x
0        1     1     4  (1, 1)
1        2     1     5  (2, 1)
2        3     1     6  (3, 1)
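The zip approach can also be extended to any number of columns by unpacking a list of columns into zip. The helper below is a minimal sketch: combine_columns is a hypothetical name, not a pandas function, and the data is copied from the question.

```python
import pandas as pd

def combine_columns(frame, cols):
    """Return a list with one tuple per row, built from the given columns."""
    # zip(*...) unpacks the selected columns so zip walks them in parallel.
    return list(zip(*(frame[c] for c in cols)))

# Data copied from the question.
df = pd.DataFrame({
    "Weather": [2, 2, 0, 1, 1, 1],
    "Temp":    [1, 1, 1, 2, 0, 0],
    "Play":    [0, 0, 1, 1, 1, 0],
})

# Two columns, as in the question.
df["x"] = combine_columns(df, ["Weather", "Temp"])

# The same helper with three columns.
df["all"] = combine_columns(df, ["Weather", "Temp", "Play"])

print(df)
```

This avoids calling a Python function once per row through apply, which tends to matter on larger frames.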