How to select one column in a dataframe and add it to an empty dataframe

I have a list of 8 dataframes named all_df. The output below shows only part of the list.

[      CloneID  P1Sig  P1STB  P1Cov  ...  P2Cov  P2Sig*S1/S2         r   r>=1
 0      849492   1268    167     88  ...     88  1300.556505 -0.025675  False
 1      847936    707     92    120  ...    120   926.126468 -0.309938  False
 2      848434    608     78     94  ...     94   654.800354 -0.076974  False
 3      849038   4374    507    110  ...    110  3860.066177  0.133141   True
 4      845994    796    103     71  ...     71   756.095437  0.052777   True
 ...       ...    ...    ...    ...  ...    ...          ...       ...    ...
 9591   833817   1444    164     94  ...     94  1428.984199  0.010508   True
 9592   834712    664     83    105  ...    105   640.329628  0.036966   True
 9593   760753   1512    168    127  ...    127  1416.322313  0.067554   True
 9594   834148    403     53    100  ...    100   472.107438 -0.171482  False
 9595   833601    574     72     72  ...     72   537.225705  0.068452   True

 [9596 rows x 10 columns]

For each dataframe in the list, column r should be extracted and added to a new dataframe called df_relatieve_expression, with CloneID as the index.

With the code below, only the last dataframe in the list seems to end up in the new dataframe:

r_column = df_relatieve_expression[["r"]]
df_relatieve_expression[file] = r_column

The name of each column should be the name of the file from which that r value was computed.

All filenames are in a tuple:

all_files = ("day1.txt", "day2.txt", "day4.txt", "day7.txt", "day14.txt", "day21.txt", "day45.txt", "day90.txt")

CodePudding user response：

You can iterate over all_df with enumerate, which yields an index and a dataframe on each pass. The index looks up the matching file name in the all_files tuple, and df is the corresponding dataframe from all_df.

import pandas as pd

df_relative = pd.DataFrame()
all_files = ("day1.txt", "day2.txt", "day4.txt", "day7.txt", "day14.txt", "day21.txt", "day45.txt", "day90.txt")
for idx, df in enumerate(all_df):
    # name each new column after the file its r values came from
    df_relative[all_files[idx]] = df["r"]

# as mentioned in the comments, 'CloneID' is the same across all dataframes
# set_index returns a new dataframe, so assign the result back
df_relative = df_relative.set_index(all_df[0]["CloneID"])
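As an alternative to the loop above, the same table can be built in one step with pd.concat. This is only a sketch: the two small dataframes and their values are made-up stand-ins for the real all_df (two files instead of eight), and it assumes every dataframe has a CloneID column. Because each r column is indexed by CloneID before concatenation, rows are aligned on CloneID rather than on row position.

```python
import pandas as pd

# Hypothetical toy data standing in for the real list of 8 dataframes.
all_files = ("day1.txt", "day2.txt")
all_df = [
    pd.DataFrame({"CloneID": [849492, 847936, 848434], "r": [-0.03, -0.31, -0.08]}),
    pd.DataFrame({"CloneID": [849492, 847936, 848434], "r": [0.13, 0.05, 0.01]}),
]

# One Series of r values per file, indexed by CloneID and keyed by file name.
r_by_file = {
    name: df.set_index("CloneID")["r"]
    for name, df in zip(all_files, all_df)
}

# Concatenate side by side; the dict keys become the column names.
df_relative_expression = pd.concat(r_by_file, axis=1)
print(df_relative_expression)
```

Using zip pairs each file name with its dataframe directly, so no positional lookup into all_files is needed.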

CodePudding user response：

Please post your code here. If I understand you correctly, you can probably use something like the code below to get what you expect:

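df_relative_expression = pd.DataFrame()
col_r = all_df[0]["r"]  # all_df is a list, so pick one dataframe before selecting column r
df_relative_expression["Column name"] = col_r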