I have a large dataset (a pandas DataFrame) with the following headers:
RAM = [f"RUT1_Azi_{i}" for i in range(10)]
RDP = [f"RUT1_Dtctn_Probb_{i}" for i in range(10)]
RDI = [f"RUT1_Dtctn_ID_{i}" for i in range(10)]
REM = [f"RUT1_Elev_{i}" for i in range(10)]
RCC = ['RUT1_Cycle_Counter']
Now I want to create several subsets of the original DataFrame, as shown below.
subset_0
index,RUT1_Cycle_Counter, RUT1_Azi_0, RUT1_Dtctn_Probb_0, RUT1_Dtctn_ID_0, RUT1_Elev_0
subset_1
index,RUT1_Cycle_Counter, RUT1_Azi_1, RUT1_Dtctn_Probb_1, RUT1_Dtctn_ID_1, RUT1_Elev_1
.
.
.
subset_9
index,RUT1_Cycle_Counter, RUT1_Azi_9, RUT1_Dtctn_Probb_9, RUT1_Dtctn_ID_9, RUT1_Elev_9
How can I do this in Python? I am a beginner in Python.
Thank you very much in advance.
CodePudding user response:
Here is an example:
import numpy as np
import pandas as pd

RAM = [f"RUT1_Azi_{i}" for i in range(10)]
RDP = [f"RUT1_Dtctn_Probb_{i}" for i in range(10)]
RDI = [f"RUT1_Dtctn_ID_{i}" for i in range(10)]
REM = [f"RUT1_Elev_{i}" for i in range(10)]
# made up example with the columns above
cols = RAM + RDP + RDI + REM
nrows = 10
df = pd.DataFrame(np.arange(nrows * len(cols)).reshape(nrows, -1), columns=cols)
Now:
subsets = [df[list(subcols)] for subcols in zip(RAM, RDP, RDI, REM)]
For example:
>>> subsets[5]
RUT1_Azi_5 RUT1_Dtctn_Probb_5 RUT1_Dtctn_ID_5 RUT1_Elev_5
0 5 15 25 35
1 45 55 65 75
2 85 95 105 115
3 125 135 145 155
4 165 175 185 195
5 205 215 225 235
6 245 255 265 275
7 285 295 305 315
8 325 335 345 355
9 365 375 385 395
Edit: modified answer to include a common list of columns for all subsets (RCC = ['RUT1_Cycle_Counter']). This assumes df also contains that column, otherwise pandas raises a KeyError:
subsets = [df[RCC + list(subcols)] for subcols in zip(RAM, RDP, RDI, REM)]
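Putting the pieces together, here is a minimal end-to-end sketch. It assumes the made-up DataFrame also gets a RUT1_Cycle_Counter column (the example data above does not have one), and it stores the results in a dict keyed by the subset_0 ... subset_9 names from the question; the dict is just one illustrative choice, a plain list works as well.

```python
import numpy as np
import pandas as pd

RAM = [f"RUT1_Azi_{i}" for i in range(10)]
RDP = [f"RUT1_Dtctn_Probb_{i}" for i in range(10)]
RDI = [f"RUT1_Dtctn_ID_{i}" for i in range(10)]
REM = [f"RUT1_Elev_{i}" for i in range(10)]
RCC = ["RUT1_Cycle_Counter"]

# Made-up data: the counter column first, then the 40 signal columns.
cols = RCC + RAM + RDP + RDI + REM
nrows = 10
df = pd.DataFrame(np.arange(nrows * len(cols)).reshape(nrows, -1), columns=cols)

# One DataFrame per index i, each with the shared counter column in front.
subsets = {
    f"subset_{i}": df[RCC + list(subcols)]
    for i, subcols in enumerate(zip(RAM, RDP, RDI, REM))
}

print(subsets["subset_5"].columns.tolist())
```

Each value is a regular DataFrame that keeps the original index, so subsets["subset_5"] can be used like any other frame.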
CodePudding user response:
With pandas you can select a subset of a DataFrame's columns directly.
As long as list_of_subset_headers
contains only column names that exist in your DataFrame, just write:
sub_df=df[list_of_subset_headers]
Or, in this case:
sub_df0 = df[['RUT1_Azi_0', 'RUT1_Dtctn_Probb_0', 'RUT1_Dtctn_ID_0', 'RUT1_Elev_0']]
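To avoid typing the header list by hand for every index, the same selection can be wrapped in a small helper. This is only a sketch: subset_for is a hypothetical name, the tiny DataFrame is made-up data, and the helper assumes the column names follow the RUT1_<name>_<i> pattern from the question.

```python
import pandas as pd

def subset_for(df, i, prefix="RUT1"):
    """Return the counter column plus the four signal columns for index i."""
    names = ("Azi", "Dtctn_Probb", "Dtctn_ID", "Elev")
    headers = [f"{prefix}_Cycle_Counter"] + [f"{prefix}_{name}_{i}" for name in names]
    return df[headers]  # raises KeyError if any header is missing

# Tiny made-up frame with only the columns for index 0.
df = pd.DataFrame(
    {
        "RUT1_Cycle_Counter": [1, 2],
        "RUT1_Azi_0": [0.1, 0.2],
        "RUT1_Dtctn_Probb_0": [0.9, 0.8],
        "RUT1_Dtctn_ID_0": [7, 8],
        "RUT1_Elev_0": [1.5, 1.6],
    }
)

sub_df0 = subset_for(df, 0)
print(sub_df0)
```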