Is there a way to do these steps more quickly (in fewer steps), and how can I re-organise a string of

I have the code below and would like to know whether I can do this in fewer steps. The aim is to create a list of labels to use as a dataframe's column labels. Note: new_df is a dictionary of lists and contains five lists.

new_df = 

{0: [            PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.004747
  2019-10-09               -0.000298
  2019-10-10                0.014256
  2019-10-11                0.012584
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.004921
  2019-10-09               -0.000263
  2019-10-10                0.014631
  2019-10-11                0.012907
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.004929
  2019-10-09               -0.000287
  2019-10-10                0.014268
  2019-10-11                0.012907
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns]],
 1: [            PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.002732
  2019-10-09               -0.000125
  2019-10-10                0.009697
  2019-10-11                0.007894
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.002736
  2019-10-09               -0.000009
  2019-10-10                0.009225
  2019-10-11                0.007721
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.002682
  2019-10-09                0.000069
  2019-10-10                0.008937
  2019-10-11                0.007704
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns]],
 2: [            PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.002303
  2019-10-09                0.002377
  2019-10-10                0.005691
  2019-10-11                0.005406
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.002254
  2019-10-09                0.002206
  2019-10-10                0.006091
  2019-10-11                0.005411
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.002064
  2019-10-09                0.001719
  2019-10-10                0.006165
  2019-10-11                0.005435
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns]],
 3: [            PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.001983
  2019-10-09                0.001143
  2019-10-10                0.005741
  2019-10-11                0.004969
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.001983
  2019-10-09                0.001198
  2019-10-10                0.005779
  2019-10-11                0.004922
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.001927
  2019-10-09                0.001765
  2019-10-10                0.005745
  2019-10-11                0.004922
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns]],
 4: [            PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.003854
  2019-10-09               -0.000561
  2019-10-10                0.012138
  2019-10-11                0.010996
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.003849
  2019-10-09               -0.000558
  2019-10-10                0.012435
  2019-10-11                0.010720
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns],
              PCA Weighted Portfolio
  2019-10-07                0.000000
  2019-10-08               -0.003850
  2019-10-09               -0.000592
  2019-10-10                0.012169
  2019-10-11                0.010770
  ...                            ...
  2021-09-29                0.000000
  2021-09-30                0.000000
  2021-10-01                0.000000
  2021-10-04                0.000000
  2021-10-05                0.000000
  
  [518 rows x 1 columns]]}

and concat_new_df looks like:

concat_new_df = 


        PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio  PCA Weighted Portfolio
date    pca_component                                                       
2019-10-07  weights_1   0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000
weights_2   -0.003885   -0.003905   -0.003904   -0.003907   -0.003907   -0.003906   -0.003905   -0.003908   -0.003912   -0.003912   -0.003911   -0.003911   -0.003912   -0.003916
weights_3   -0.000656   -0.000656   -0.000651   -0.000670   -0.000669   -0.000670   -0.000668   -0.000671   -0.000671   -0.000671   -0.000671   -0.000670   -0.000672   -0.000671
weights_4   0.012639    0.012638    0.012633    0.012636    0.012635    0.012627    0.012636    0.012667    0.012667    0.012664    0.012664    0.012668    0.012681    0.012690
weights_5   0.011109    0.011091    0.011089    0.011087    0.011080    0.011080    0.011119    0.011118    0.011118    0.011117    0.011123    0.011139    0.011141    0.011145
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2020-02-28  weights_4   0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000
weights_5   0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000
2020-03-02  weights_1   0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000
weights_2   0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000
weights_3   0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000    0.000000
518 rows × 14 columns

The code in question:

import numpy as np

# One label per returns window (7, 8, 9), then repeated once per column
new_df_cols = [concat_new_df.columns[0] + " " + "returns window " + str(i) for i in range(7, 10, 1)]
new_df_cols = new_df_cols * len(concat_new_df.columns)

list_of_new_cols = []
# set() drops the duplicates but does not preserve the original order
for col in list(set(new_df_cols)):
    new_cols = [col + " " + "weights " + str(i + 1) for i in range(len(new_df))]
    list_of_new_cols.append(new_cols)

final_col_labels = np.concatenate(list_of_new_cols)

outputs:

array(['PCA Weighted Portfolio returns window 7 weights 1',
       'PCA Weighted Portfolio returns window 7 weights 2',
       'PCA Weighted Portfolio returns window 7 weights 3',
       'PCA Weighted Portfolio returns window 7 weights 4',
       'PCA Weighted Portfolio returns window 7 weights 5',
       'PCA Weighted Portfolio returns window 9 weights 1',
       'PCA Weighted Portfolio returns window 9 weights 2',
       'PCA Weighted Portfolio returns window 9 weights 3',
       'PCA Weighted Portfolio returns window 9 weights 4',
       'PCA Weighted Portfolio returns window 9 weights 5',
       'PCA Weighted Portfolio returns window 8 weights 1',
       'PCA Weighted Portfolio returns window 8 weights 2',
       'PCA Weighted Portfolio returns window 8 weights 3',
       'PCA Weighted Portfolio returns window 8 weights 4',
       'PCA Weighted Portfolio returns window 8 weights 5'], dtype='<U49')

Also, after getting the final list of labels, I want to re-organise them from the above output to:

array(['PCA Weighted Portfolio returns window 7 weights 1',
       'PCA Weighted Portfolio returns window 8 weights 1',
       'PCA Weighted Portfolio returns window 9 weights 1',
       'PCA Weighted Portfolio returns window 7 weights 2',
       'PCA Weighted Portfolio returns window 8 weights 2',
       'PCA Weighted Portfolio returns window 9 weights 2',
       'PCA Weighted Portfolio returns window 7 weights 3',
       'PCA Weighted Portfolio returns window 8 weights 3',
       'PCA Weighted Portfolio returns window 9 weights 3',
       'PCA Weighted Portfolio returns window 7 weights 4',
       'PCA Weighted Portfolio returns window 8 weights 4',
       'PCA Weighted Portfolio returns window 9 weights 4',
       'PCA Weighted Portfolio returns window 7 weights 5',
       'PCA Weighted Portfolio returns window 8 weights 5',
       'PCA Weighted Portfolio returns window 9 weights 5'], dtype='<U49')

How would I go about doing this?

CodePudding user response:

I might be misunderstanding some of your data structure, but I think there's a bit of unnecessary assignment here.

Consider this:

column_prefix = concat_new_df.columns[0]
final_col_labels = []
# Outer loop over weights, inner loop over windows, so the windows vary fastest
for i in range(len(new_df)):
    for j in range(7, 10):
        final_col_labels.append(f"{column_prefix} returns window {j} weights {i + 1}")

Convert to a numpy array if necessary with np.array(final_col_labels).
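
As a minimal, self-contained sketch of the same idea, the nested loops can be written as a single list comprehension. For labels that have already been built in the wrong order, one option is to sort them by the numbers embedded in each label. The values of column_prefix and n_weights below are stand-ins for concat_new_df.columns[0] and len(new_df), and label_key is a hypothetical helper name; none of these come from the original code.

```python
import numpy as np

# Stand-ins for the question's objects (assumed values, for illustration only).
column_prefix = "PCA Weighted Portfolio"  # stands in for concat_new_df.columns[0]
n_weights = 5                             # stands in for len(new_df)
windows = range(7, 10)

# Outer loop over weights, inner loop over windows, so the windows vary fastest.
final_col_labels = np.array([
    f"{column_prefix} returns window {w} weights {k}"
    for k in range(1, n_weights + 1)
    for w in windows
])


def label_key(label):
    """Hypothetical helper: sort key (weights number, window number)."""
    parts = label.split()
    return int(parts[-1]), int(parts[-3])


# Re-ordering labels that were built in some other order gives the same result.
shuffled = final_col_labels[::-1]
reordered = np.array(sorted(shuffled, key=label_key))
```

The sort key relies on every label ending in "window <n> weights <m>", so it would need adjusting if the label format changes.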