I need to convert a part of my data to make it compatible with this solution: https://stackoverflow.com/a/64854873
The data is a pandas.core.frame.DataFrame with:
   result  data_1  data_2
1   1.523       4    1223
3   1.33       84    1534
Some index values might have been removed, which is why the index reads 1, 3, ...
The output should pair a tuple of the data values with the result. The type in the solution was scipy.sparse._coo.coo_matrix, like:
(4, 1223) 1.523
(84, 1534) 1.33
Just calling scipy.sparse.coo_matrix(df.values) seems to mix the data:
(0, 0) 1.523
(0, 1) 1.53
(0, 24) 1.92
: :
(2, 151) 123.0
(2, 142) 834.0
How can I generate a compatible matrix?
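For reference, the mixing can be reproduced with a small sketch. It rebuilds the two sample rows shown above (the frame construction is an assumption about how the data looks) and shows that coo_matrix treats df.values as a dense 2-D array, so each cell is stored under its (row position, column position) rather than under (data_1, data_2):

```python
import pandas as pd
from scipy.sparse import coo_matrix

# Sample frame rebuilt from the question; index 1 and 3 mimic removed rows.
df = pd.DataFrame(
    {"result": [1.523, 1.33], "data_1": [4, 84], "data_2": [1223, 1534]},
    index=[1, 3],
)

# df.values is a dense 2x3 float array, so every non-zero cell becomes
# an entry addressed by (row position, column position).
dense_as_sparse = coo_matrix(df.values)

print(dense_as_sparse.shape)  # (2, 3)
print(dense_as_sparse.nnz)    # 6
```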
CodePudding user response:
You can filter out the data columns, then apply tuple on axis=1, which creates a tuple from each row's values. I'm assigning it as a new column because it's not clear from the output you've shown whether you want an array or a DataFrame, but you should be able to get to the outcome you need from here.
>>> df.assign(data=df.filter(like='data').apply(tuple, axis=1))
   result  data_1  data_2        data
1   1.523       4    1223   (4, 1223)
3   1.330      84    1534  (84, 1534)
CodePudding user response:
Try this:
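# Pair the two data columns row by row into a column of tuples
df['tuple'] = list(zip(df.data_1, df.data_2))
# Keep only the tuple and result columns as a NumPy object array
result = df[['tuple', 'result']].to_numpy()
print(result)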
Result:
[[(4, 1223) 1.523]
 [(84, 1534) 1.33]]
Source:
How to form tuple column from two columns in Pandas
Convert pandas dataframe to NumPy array
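If what is needed is an actual scipy.sparse.coo_matrix rather than a column of tuples, one possible approach is the (data, (row, col)) constructor form, with data_1 as row coordinates, data_2 as column coordinates, and result as the stored values. This is only a sketch: the sample frame is rebuilt from the question, the shape is an assumption (just large enough to hold the largest coordinates), and whether the linked solution expects exactly this layout has not been checked.

```python
import pandas as pd
from scipy.sparse import coo_matrix

# Sample frame rebuilt from the question.
df = pd.DataFrame(
    {"result": [1.523, 1.33], "data_1": [4, 84], "data_2": [1223, 1534]},
    index=[1, 3],
)

rows = df["data_1"].to_numpy()   # row coordinates
cols = df["data_2"].to_numpy()   # column coordinates
vals = df["result"].to_numpy()   # values stored at those coordinates

# Assumption: the matrix only needs to be big enough for the largest
# coordinates; pass the real dimensions instead if they are known.
shape = (int(rows.max()) + 1, int(cols.max()) + 1)

matrix = coo_matrix((vals, (rows, cols)), shape=shape)
print(matrix)
```

The DataFrame index (1, 3, ...) plays no role here, so removed rows do not matter. If the same (data_1, data_2) pair appears more than once, the duplicate entries are summed when the matrix is converted to CSR or to a dense array.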