How to strip each array of the square brackets and add a prefix

Time:05-03

I have a dataframe with the top 12 predictions my kNN made for each ID. It looks like this:

customer_id prediction
00000dbacae5abe5e2 [677530001, 677515001, 677511001, 677506003, 677501001, 677490001, 677478006, 677478003, 677478002, 677546006, 949551001, 903049003]
0000423b00ade9141 [677511001, 677506003, 677501001, 677490001, 677478006, 677478003, 677478002, 677386001, 677385001, 677760003, 949551001, 826674001]

Is it possible to remove the square brackets from each row (the values are arrays) and also prefix each prediction with a zero, like this:

customer_id prediction
00000dbacae5abe5e2 0677530001, 0677515001, 0677511001.....
0000423b00ade9141 0677511001, 0677506003, 0677501001.....

My code for generating these predictions and the table:

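import numpy as np
import pandas as pd

n = 12
probas = kNN.predict_proba(X.head())  # class probabilities for the first 5 rows
top_n_idx = np.argsort(probas, axis=1)[:, -n:]  # indices of the n most probable classes per row
top_n = [kNN.classes_[i] for i in top_n_idx]  # map those indices to class labels
results = list(zip(top_n))  # one 1-tuple per row, each holding an array of labels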

results = pd.DataFrame(results)
ids_test.reset_index(drop=True, inplace=True)
results.reset_index(drop=True, inplace=True)
y_test.reset_index(drop=True, inplace=True)
knn_table = pd.concat([ids_test, results], axis=1, ignore_index=True)  # assuming ids_test (reset above) was meant here
knn_table = knn_table.rename(columns={0: 'customer_id', 1: 'prediction'})

Answer:

Try:

df["prediction"] = ("0" + df["prediction"].explode().astype(str)).groupby(level=0).agg(", ".join)
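
To see what that one-liner does, here is a rough step-by-step sketch on made-up sample rows. The shortened prediction lists and the intermediate names (exploded, prefixed, joined) are only for illustration and are not part of the question's code:

```python
import pandas as pd

# Hypothetical two-row sample shaped like the table in the question.
df = pd.DataFrame({
    "customer_id": ["00000dbacae5abe5e2", "0000423b00ade9141"],
    "prediction": [[677530001, 677515001, 677511001],
                   [677511001, 677506003, 677501001]],
})

# 1. explode() gives every list element its own row and repeats the index label.
exploded = df["prediction"].explode()

# 2. Cast each element to str and prepend "0" element-wise.
prefixed = "0" + exploded.astype(str)

# 3. Group on the repeated index (level=0) and join each group into one string.
joined = prefixed.groupby(level=0).agg(", ".join)

print(joined.tolist())
# ['0677530001, 0677515001, 0677511001', '0677511001, 0677506003, 0677501001']
```

Because the grouping key is the original row index, the joined strings line up with the original rows and can be assigned straight back to df["prediction"].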

Alternatively with apply:

df["prediction"] = df["prediction"].apply(lambda x: "0" + ", 0".join(map(str, x)))
Output:
>>> df

          customer_id                                         prediction
0  00000dbacae5abe5e2  0677530001, 0677515001, 0677511001, 0677506003...
1   0000423b00ade9141  0677511001, 0677506003, 0677501001, 0677490001...
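
If you want to check the apply variant yourself, here is a minimal self-contained sketch. It assumes each cell holds a NumPy array (which is what indexing kNN.classes_ would give you). The helper name format_predictions and the shortened sample data are hypothetical, chosen just for this example:

```python
import numpy as np
import pandas as pd


def format_predictions(values, prefix="0"):
    """Join one row's predictions as 'prefix+id, prefix+id, ...' (hypothetical helper)."""
    return ", ".join(prefix + str(v) for v in values)


# Made-up sample: each cell is a NumPy array rather than a Python list.
df = pd.DataFrame({
    "customer_id": ["00000dbacae5abe5e2", "0000423b00ade9141"],
    "prediction": [np.array([677530001, 677515001]),
                   np.array([677511001, 677506003])],
})

# Replace each array with a single comma-separated string.
df["prediction"] = df["prediction"].apply(format_predictions)

print(df["prediction"].tolist())
# ['0677530001, 0677515001', '0677511001, 0677506003']
```

A named function does the same job as the lambda above; it is just easier to test and to reuse with a different prefix.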