I have a list of 8 dataframes named all_df. This is only part of the list:
[     CloneID  P1Sig  P1STB  P1Cov  ...  P2Cov  P2Sig*S1/S2          r   r>=1
0      849492   1268    167     88  ...     88  1300.556505  -0.025675  False
1      847936    707     92    120  ...    120   926.126468  -0.309938  False
2      848434    608     78     94  ...     94   654.800354  -0.076974  False
3      849038   4374    507    110  ...    110  3860.066177   0.133141   True
4      845994    796    103     71  ...     71   756.095437   0.052777   True
...       ...    ...    ...    ...  ...    ...          ...        ...    ...
9591   833817   1444    164     94  ...     94  1428.984199   0.010508   True
9592   834712    664     83    105  ...    105   640.329628   0.036966   True
9593   760753   1512    168    127  ...    127  1416.322313   0.067554   True
9594   834148    403     53    100  ...    100   472.107438  -0.171482  False
9595   833601    574     72     72  ...     72   537.225705   0.068452   True
[9596 rows x 10 columns]
For each dataframe in the list, column r should be extracted and added to a new dataframe called df_relative_expression, with CloneID as the index.
I only seem to get the last object in the list added to the new dataframe with
r_column = df_relatieve_expression[["r"]]
df_relatieve_expression[file] = r_column
The column name should be the name of the file from which this r value was computed.
All filenames are in a tuple:
all_files = ("day1.txt", "day2.txt", "day4.txt", "day7.txt", "day14.txt", "day21.txt", "day45.txt", "day90.txt")
CodePudding user response:
You can iterate over all_df with enumerate, which gives you an index and a dataframe on each pass: the index looks up the file name in the all_files tuple, and df holds the corresponding dataframe from all_df.
import pandas as pd

df_relative = pd.DataFrame()
all_files = ("day1.txt", "day2.txt", "day4.txt", "day7.txt", "day14.txt", "day21.txt", "day45.txt", "day90.txt")
for idx, df in enumerate(all_df):
    # one column per file, named after the file
    df_relative[all_files[idx]] = df["r"]
# as mentioned in comments, 'CloneID' is the same across all dataframes
# set_index returns a new dataframe, so the result has to be assigned back
df_relative = df_relative.set_index(all_df[0]["CloneID"])  # use 'CloneID' as index for the new dataframe
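The loop above relies on every dataframe listing the CloneIDs in the same row order. If that were not guaranteed, one possible variation is to index each r column by CloneID first and let pandas align the rows. The following is only a sketch: the two small frames are made-up stand-ins for the real files, and only the column names CloneID and r are taken from the question.

```python
import pandas as pd

# Hypothetical toy data standing in for the real files: the same CloneIDs,
# deliberately listed in a different row order in each frame.
all_files = ("day1.txt", "day2.txt")
all_df = [
    pd.DataFrame({"CloneID": [849492, 847936, 848434], "r": [-0.03, -0.31, -0.08]}),
    pd.DataFrame({"CloneID": [848434, 849492, 847936], "r": [0.10, 0.20, 0.30]}),
]

# Index each "r" column by CloneID so rows are matched by clone, not by position.
r_columns = [df.set_index("CloneID")["r"] for df in all_df]

# Place the Series side by side; keys supplies the column names (the file names).
df_relative_expression = pd.concat(r_columns, axis=1, keys=list(all_files))
print(df_relative_expression)
```

Because the rows are matched on CloneID, a clone that is missing from one file would show up as NaN in that file's column instead of shifting the other values.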
CodePudding user response:
Please post your code here. If I understand you correctly, you can probably use the code below to get what you are expecting:
# all_df is a list, so the column has to be selected from each dataframe in turn
for file, df in zip(all_files, all_df):
    df_relative_expression[file] = df['r']