I have the code below and want to know whether I can do this in fewer steps. The aim is to create a list of labels to use as a dataframe's column labels. Note: new_df is a dictionary of five lists, and each list holds single-column dataframes.
new_df =
{0: [ PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.004747
2019-10-09 -0.000298
2019-10-10 0.014256
2019-10-11 0.012584
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.004921
2019-10-09 -0.000263
2019-10-10 0.014631
2019-10-11 0.012907
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.004929
2019-10-09 -0.000287
2019-10-10 0.014268
2019-10-11 0.012907
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns]],
1: [ PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.002732
2019-10-09 -0.000125
2019-10-10 0.009697
2019-10-11 0.007894
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.002736
2019-10-09 -0.000009
2019-10-10 0.009225
2019-10-11 0.007721
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.002682
2019-10-09 0.000069
2019-10-10 0.008937
2019-10-11 0.007704
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns]],
2: [ PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.002303
2019-10-09 0.002377
2019-10-10 0.005691
2019-10-11 0.005406
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.002254
2019-10-09 0.002206
2019-10-10 0.006091
2019-10-11 0.005411
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.002064
2019-10-09 0.001719
2019-10-10 0.006165
2019-10-11 0.005435
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns]],
3: [ PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.001983
2019-10-09 0.001143
2019-10-10 0.005741
2019-10-11 0.004969
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.001983
2019-10-09 0.001198
2019-10-10 0.005779
2019-10-11 0.004922
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.001927
2019-10-09 0.001765
2019-10-10 0.005745
2019-10-11 0.004922
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns]],
4: [ PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.003854
2019-10-09 -0.000561
2019-10-10 0.012138
2019-10-11 0.010996
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.003849
2019-10-09 -0.000558
2019-10-10 0.012435
2019-10-11 0.010720
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns],
PCA Weighted Portfolio
2019-10-07 0.000000
2019-10-08 -0.003850
2019-10-09 -0.000592
2019-10-10 0.012169
2019-10-11 0.010770
... ...
2021-09-29 0.000000
2021-09-30 0.000000
2021-10-01 0.000000
2021-10-04 0.000000
2021-10-05 0.000000
[518 rows x 1 columns]]}
and concat_new_df looks like:
concat_new_df =
PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio PCA Weighted Portfolio
date pca_component
2019-10-07 weights_1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
weights_2 -0.003885 -0.003905 -0.003904 -0.003907 -0.003907 -0.003906 -0.003905 -0.003908 -0.003912 -0.003912 -0.003911 -0.003911 -0.003912 -0.003916
weights_3 -0.000656 -0.000656 -0.000651 -0.000670 -0.000669 -0.000670 -0.000668 -0.000671 -0.000671 -0.000671 -0.000671 -0.000670 -0.000672 -0.000671
weights_4 0.012639 0.012638 0.012633 0.012636 0.012635 0.012627 0.012636 0.012667 0.012667 0.012664 0.012664 0.012668 0.012681 0.012690
weights_5 0.011109 0.011091 0.011089 0.011087 0.011080 0.011080 0.011119 0.011118 0.011118 0.011117 0.011123 0.011139 0.011141 0.011145
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
2020-02-28 weights_4 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
weights_5 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
2020-03-02 weights_1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
weights_2 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
weights_3 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
518 rows × 14 columns
The code in question:
new_df_cols = [concat_new_df.columns[0] + " " + "returns window " + str(i) for i in range(7, 10, 1)]
new_df_cols = new_df_cols * len(concat_new_df.columns)
list_of_new_cols = []
for col in list(set(new_df_cols)):
    new_cols = [col + " " + "weights " + str(i + 1) for i in range(len(new_df))]
    list_of_new_cols.append(new_cols)
final_col_labels = np.concatenate(list_of_new_cols)
outputs:
array(['PCA Weighted Portfolio returns window 7 weights 1',
'PCA Weighted Portfolio returns window 7 weights 2',
'PCA Weighted Portfolio returns window 7 weights 3',
'PCA Weighted Portfolio returns window 7 weights 4',
'PCA Weighted Portfolio returns window 7 weights 5',
'PCA Weighted Portfolio returns window 9 weights 1',
'PCA Weighted Portfolio returns window 9 weights 2',
'PCA Weighted Portfolio returns window 9 weights 3',
'PCA Weighted Portfolio returns window 9 weights 4',
'PCA Weighted Portfolio returns window 9 weights 5',
'PCA Weighted Portfolio returns window 8 weights 1',
'PCA Weighted Portfolio returns window 8 weights 2',
'PCA Weighted Portfolio returns window 8 weights 3',
'PCA Weighted Portfolio returns window 8 weights 4',
'PCA Weighted Portfolio returns window 8 weights 5'], dtype='<U49')
Also, after getting the final list of labels, I want to re-organise them from the above output to:
array(['PCA Weighted Portfolio returns window 7 weights 1',
'PCA Weighted Portfolio returns window 8 weights 1',
'PCA Weighted Portfolio returns window 9 weights 1',
'PCA Weighted Portfolio returns window 7 weights 2',
'PCA Weighted Portfolio returns window 8 weights 2',
'PCA Weighted Portfolio returns window 9 weights 2',
'PCA Weighted Portfolio returns window 7 weights 3',
'PCA Weighted Portfolio returns window 8 weights 3',
'PCA Weighted Portfolio returns window 9 weights 3',
'PCA Weighted Portfolio returns window 7 weights 4',
'PCA Weighted Portfolio returns window 8 weights 4',
'PCA Weighted Portfolio returns window 9 weights 4',
'PCA Weighted Portfolio returns window 7 weights 5',
'PCA Weighted Portfolio returns window 8 weights 5',
'PCA Weighted Portfolio returns window 9 weights 5'], dtype='<U49')
How would I go about doing this?
CodePudding user response:
I might be misunderstanding some of your data structure, but I think there's a bit of unnecessary assignment here.
Consider this:
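column_prefix = concat_new_df.columns[0]
final_col_labels = []
# Outer loop over weights, inner loop over windows, so the window number varies fastest
for i in range(len(new_df)):
    for j in range(7, 10):
        final_col_labels.append(f"{column_prefix} returns window {j} weights {i + 1}")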
Convert to a NumPy array if necessary with np.array(final_col_labels).
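The nested loop above can also be collapsed into a single list comprehension. The following is only a sketch: column_prefix, n_weights and windows are hypothetical stand-ins for concat_new_df.columns[0], len(new_df) and the 7 to 9 window range from the question, hard-coded here so the snippet runs on its own.

```python
import numpy as np

# Hypothetical stand-ins for the question's objects (assumptions, not the real data)
column_prefix = "PCA Weighted Portfolio"  # stands in for concat_new_df.columns[0]
n_weights = 5                             # stands in for len(new_df)
windows = range(7, 10)                    # returns windows 7, 8 and 9

# The first "for" is the outer loop (weights) and the second is the inner loop
# (windows), so the window number varies fastest within each weight.
final_col_labels = [
    f"{column_prefix} returns window {w} weights {k}"
    for k in range(1, n_weights + 1)
    for w in windows
]

# Optional: convert to a NumPy array, as in the question
final_col_labels = np.array(final_col_labels)
```

The 7, 9, 8 ordering in the question's output most likely comes from list(set(new_df_cols)): a set does not preserve insertion order. Iterating in an explicit order, as here, builds the labels already grouped by weight, so no re-ordering step is needed afterwards.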