How to dynamically cut filtered data from two columns and paste them into new columns in pandas

I have numerous data structures that take the form of the following table:

    traces  length   
0   domain      10
1  length1       1
2  length2       2
3  length3       3
4   width1       4
5   width2       5
6   width3       6

And I need it to look like this:

    traces  measurements new_traces new_measurements
0   domain      10    
1  length1       1       width1         4
2  length2       2       width2         5
3  length3       3       width3         6

Since my data structures contain various lengths and widths, I can't hard code range information and need something that can sort the traces column regardless of how many lengths and widths there are. I've been using pandas to try to coax this into shape, but I'm having a lot of issues (because I'm new to this and very naive) and haven't made much progress. Any help would be so much appreciated!

CodePudding user response：

Assuming the name of your original dataframe is df, I would first rename the length column and then split the dataframe in two parts, making sure to rename the indices of the second part such that they align with those in the first part the way you want. Then pd.concat will do the rest:

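import pandas as pd

df.rename({'length': 'measurements'},
          axis=1, inplace=True)

# Row labels 0-3 and 4+ are hard-coded for the sample table above
part_1 = df.loc[:3]
# Shift the width rows up by 3 so they line up with the length rows
part_2 = df.loc[4:].rename(index=lambda x: x - 3,
                           columns=lambda s: 'new_' + s)

new_df = pd.concat([part_1, part_2], axis=1)
new_df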

    traces  measurements    new_traces  new_measurements
0   domain            10           NaN               NaN
1  length1             1        width1               4.0
2  length2             2        width2               5.0
3  length3             3        width3               6.0
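Since the question asks for something that works without hard-coded row ranges, here is a minimal sketch of the same split-and-concat idea driven by a mask instead of fixed labels. It assumes that width rows are exactly those whose trace name starts with "width", that length rows start with "length", and that widths and lengths appear in matching order; the names is_width and offset are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "traces": ["domain", "length1", "length2", "length3",
               "width1", "width2", "width3"],
    "length": [10, 1, 2, 3, 4, 5, 6],
})

df = df.rename(columns={"length": "measurements"})

# Assumption: width rows are the ones whose name starts with "width"
is_width = df["traces"].str.startswith("width")

part_1 = df.loc[~is_width].reset_index(drop=True)
part_2 = df.loc[is_width].reset_index(drop=True).add_prefix("new_")

# Position of the first "length" row in part_1; the width rows are
# shifted down by this amount so that width1 sits next to length1
offset = int(part_1["traces"].str.startswith("length").to_numpy().argmax())
part_2.index = part_2.index + offset

new_df = pd.concat([part_1, part_2], axis=1)
print(new_df)
```

Because the alignment here is purely positional, it relies on the widths being listed in the same order as the lengths.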

CodePudding user response：

Is this what you are looking for?

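# Index rows by the number in each trace name ('domain' has none, so its label is NaN)
df = df.set_index(df['traces'].str.extract(r'(\d+)', expand=False))
mask = df['traces'].str.contains('width')
# Join widths onto lengths by that number; the rename turns 'traces_new' into 'new_traces'
df.loc[~mask].join(df.loc[mask], rsuffix='_new').rename(lambda x: '_'.join(x.split('_')[::-1]), axis=1).reset_index(drop=True)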

Output:

    traces  length new_traces  new_length
0   domain      10        NaN         NaN
1  length1       1     width1         4.0
2  length2       2     width2         5.0
3  length3       3     width3         6.0
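For applying this to many tables, the number-keyed join could be wrapped in a function along the following lines. This is only a sketch: the function name widths_beside_lengths and the helper column num are hypothetical, and it assumes every width name ends in a number that also appears on the matching length row.

```python
import pandas as pd

def widths_beside_lengths(df):
    """Place each widthN row beside the lengthN row with the same number."""
    # Trailing number of each trace name; names without one get NaN
    num = df["traces"].str.extract(r"(\d+)$", expand=False)
    keyed = df.assign(num=num)

    # Assumption: width rows are the ones whose name starts with "width"
    is_width = keyed["traces"].str.startswith("width")
    left = keyed.loc[~is_width]
    right = (keyed.loc[is_width]
             .dropna(subset=["num"])
             .set_index("num")
             .add_prefix("new_"))

    # Left join on the number, so rows such as 'domain' are kept with NaN
    return (left.join(right, on="num")
                .drop(columns="num")
                .reset_index(drop=True))

# Sample data copied from the question
df = pd.DataFrame({
    "traces": ["domain", "length1", "length2", "length3",
               "width1", "width2", "width3"],
    "length": [10, 1, 2, 3, 4, 5, 6],
})
out = widths_beside_lengths(df)
print(out)
```

Because rows are matched on the number rather than on position, the widths do not need to be listed in the same order as the lengths.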