I have a DataFrame named data:
Cluster OsId BrowserId PageId VolumePred ConversionPred
255 7 11 17 1149582 4.0
607 18 99 16 917224 8.0
22 0 12 14 1073848 4.0
I would like to add a new column, "OSBROWSER", that combines the values of two columns, OsId and BrowserId, into a pair.
The result should look like this:
Cluster OsId BrowserId PageId VolumePred ConversionPred OSBROWSER
255 7 11 17 1149582 4.0 (7, 11)
607 18 99 16 917224 8.0 (18, 99)
22 0 12 14 1073848 4.0 (0, 12)
I tried this:
data['OSBrowser'] = data["OsId"] + data["BrowserId"]
But it gave me the sum of the two columns' values.
Any ideas? Thank you.
SOLUTION:
data['OSBrowser'] = list(zip(data.OsId, data.BrowserId))
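To make the difference concrete, here is a minimal sketch that rebuilds the sample data from the question (assuming Cluster is an ordinary column rather than the index) and compares the + attempt with the zip approach. The variable name summed is only for illustration.

```python
import pandas as pd

# Sample data copied from the question (Cluster treated as a regular column)
data = pd.DataFrame({
    "Cluster": [255, 607, 22],
    "OsId": [7, 18, 0],
    "BrowserId": [11, 99, 12],
    "PageId": [17, 16, 14],
    "VolumePred": [1149582, 917224, 1073848],
    "ConversionPred": [4.0, 8.0, 4.0],
})

# On two numeric columns, + is element-wise addition, hence the sums
summed = data["OsId"] + data["BrowserId"]

# zip pairs the values row by row; list() turns the pairs into a column of tuples
data["OSBrowser"] = list(zip(data["OsId"], data["BrowserId"]))

print(data)
```

The new column has dtype object, since each cell holds a Python tuple.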
CodePudding user response:
This is a possible solution:
data['OSBrowser'] = data[["OsId", "BrowserId"]].apply(tuple, axis=1)
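One caveat worth checking if you use apply with axis=1 on columns of different dtypes: each row is passed as a Series with a single common dtype, so integers may come back as floats. The sketch below uses a hypothetical two-row frame (mixed, with one integer and one float column) to show the effect next to the zip approach.

```python
import pandas as pd

# Hypothetical frame mixing an integer column and a float column
mixed = pd.DataFrame({"OsId": [7, 18], "ConversionPred": [4.5, 8.0]})

# Row-wise apply: each row is first converted to one common dtype (float here)
applied = mixed[["OsId", "ConversionPred"]].apply(tuple, axis=1)

# zip iterates each column separately, so each value keeps its column's dtype
zipped = list(zip(mixed["OsId"], mixed["ConversionPred"]))

print(type(applied[0][0]).__name__)
print(type(zipped[0][0]).__name__)
```

For the question's data both columns are integers, so the two approaches give the same pairs.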
CodePudding user response:
I would convert the columns to strings; I think that's what you're looking to do.
import pandas as pd
df = pd.DataFrame(((123, 456, 789), (98, 765, 432)), columns=('a', 'b', 'c'))
df['a_str'] = df['a'].astype(str)
df['b_str'] = df['b'].astype(str)
df['ab'] = df['a_str'] + df['b_str']
df then looks like this:
     a    b    c a_str b_str      ab
0  123  456  789   123   456  123456
1   98  765  432    98   765   98765
Then you can drop a_str and b_str by selecting only the columns you want to keep:
df = df[['a', 'b', 'c', 'ab']]
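If the helper columns are not needed, the string version can probably be written in one step. This is a sketch rather than part of the answer above; the "_" separator is my own assumption, added because plain concatenation is ambiguous (1 and 23 would give the same string as 12 and 3).

```python
import pandas as pd

df = pd.DataFrame(((123, 456, 789), (98, 765, 432)), columns=('a', 'b', 'c'))

# Convert both columns to strings and join them with a separator (assumed: "_")
# so that the two original values can still be told apart
df['ab'] = df['a'].astype(str) + '_' + df['b'].astype(str)

print(df)
```

This avoids creating a_str and b_str, so there is nothing to drop afterwards.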