I have a dataframe that looks like
   RMSE       SELECTED DATA        information
0   100    [12, 15, 19, 13]  (arr1, str1, fl1)
1   200          [7, 12, 3]  (arr2, str2, fl2)
2   300  [5, 9, 3, 3, 3, 3]  (arr3, str3, fl3)
Here, I want to break up the information column into three distinct columns: the first containing the arrays, the second containing the strings, and the last containing the floats. Thus the new dataframe would look like
   RMSE       SELECTED DATA  ARRAYS  STRING  FLOAT
0   100    [12, 15, 19, 13]    arr1    str1    fl1
1   200          [7, 12, 3]    arr2    str2    fl2
2   300  [5, 9, 3, 3, 3, 3]    arr3    str3    fl3
I thought one way would be to isolate the information column and then slice it using .apply like so:
df['arrays'] = df['information'].apply(lambda row : row[0])
and repeat this for each entry. But I was curious whether there is a better way, since with many more entries this may become tedious, or slow with a for loop.
CodePudding user response:
Pop the information column, expand its tuples into a new dataframe, and join that back onto df:
tojoin = pd.DataFrame(df.pop('information').to_numpy().tolist(),
                      index=df.index,
                      columns=['ARRAYS', 'STRING', 'FLOAT'])
df = df.join(tojoin)
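For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same approach. The values inside the information tuples are hypothetical stand-ins for arr1..arr3, str1..str3 and fl1..fl3, since the question does not show the real ones.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: each "information" entry is a tuple of
# (array, string, float), mirroring the layout described in the question.
df = pd.DataFrame({
    "RMSE": [100, 200, 300],
    "SELECTED DATA": [[12, 15, 19, 13], [7, 12, 3], [5, 9, 3, 3, 3, 3]],
    "information": [
        (np.array([1, 2]), "a", 0.1),
        (np.array([3, 4, 5]), "b", 0.2),
        (np.array([6]), "c", 0.3),
    ],
})

# pop() removes the column from df and returns it as a Series;
# tolist() turns it into a list of tuples, one tuple per row.
expanded = pd.DataFrame(
    df.pop("information").to_numpy().tolist(),
    index=df.index,  # keep the original index so join() aligns rows
    columns=["ARRAYS", "STRING", "FLOAT"],
)

# Attach the three new columns to what is left of the original frame.
df = df.join(expanded)
print(df.columns.tolist())
```

Passing index=df.index matters when the original index is not the default 0..n-1 range, because join() aligns rows on the index rather than on position.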