Home > Back-end >  Copy one column header to other column header in Pandas
Copy one column header to other column header in Pandas

Time:10-07

I have the following small data set. It is a part of a big data set.

The header changes each month based on the months I am dealing with.

Items Jan-sale Feb-sale Mar-sale Jan-price Feb-price Mar-price
A 10 25 0 3 3 5
B 15 23 10 2 3 10
C 20 21 5 5 10 15

I used the following code to find out the total sale

import pandas as pd
df= pd.read_excel('Sale.xlsx')

df.loc[:,7] =  df['Jan-sale']*df['Jan-price'] 
df.loc[:,8] =  df['Feb-sale']*df['Feb-price']
df.loc[:,9] =  df['Mar-sale']*df['Mar-price']
df.head()

I got the following output

Items Jan-sale Feb-sale Mar-sale Jan-price Feb-price Mar-price 7 8 9
A 10 25 0 3 3 5 30 75 0
B 15 23 10 2 3 10 30 69 100
C 20 21 5 5 10 15 100 210 75

I can use the following code to assign the header

df['Jan'] =   df['Jan-sale']*df['Jan-price']

Since my header changes every month. is there a way I can use a code to copy the headers(from column 2,3 and 4 ) over to columns 7, 8, and 9 as follow

Items Jan-sale Feb-sale Mar-sale Jan-price Feb-price Mar-price Jan Feb Mar
A 10 25 0 3 3 5 30 75 0
B 15 23 10 2 3 10 30 69 100
C 20 21 5 5 10 15 100 210 75

Thank you.

CodePudding user response:

With a slightly different approach we can create a MultiIndex columns and use alignment to handle the computation. Then join the DataFrame back together:

# Columns that don't get split
df = df.set_index('Items')
# Save To Restore Later
og_cols = df.columns
# Create MultiIndex
df.columns = df.columns.str.split('-', expand=True).swaplevel()
# Compute New Cols and put DataFrame back together
df = (
    df.set_axis(og_cols, axis=1)  # Restore Original Columns
        .join(df['sale'] * df['price'])  # Add new columns
        .reset_index()  # Restore Index
)
Items Jan-sale Feb-sale Mar-sale Jan-price Feb-price Mar-price Jan Feb Mar
A 10 25 0 3 3 5 30 75 0
B 15 23 10 2 3 10 30 69 100
C 20 21 5 5 10 15 100 210 75

Explanation:

set_index str.split creates a MultiIndex

df = df.set_index('Items')
df.columns = df.columns.str.split('-', expand=True).swaplevel()

df:

      sale         price        
       Jan Feb Mar   Jan Feb Mar
Items                           
A       10  25   0     3   3   5
B       15  23  10     2   3  10
C       20  21   5     5  10  15

These individual levels can now be multiplied as a group and the Headers are correct because they align on the lower level:

df['sale'] * df['price']

       Jan  Feb  Mar
Items               
A       30   75    0
B       30   69  100
C      100  210   75

Then we just put the DataFrame restore the original column names set_axis and join together:

# Step 1
df.set_axis(og_cols, axis=1) 

       Jan-sale  Feb-sale  Mar-sale  Jan-price  Feb-price  Mar-price
Items                                                               
A            10        25         0          3          3          5
B            15        23        10          2          3         10
C            20        21         5          5         10         15

# Step 2
df.set_axis(og_cols, axis=1).join(df['sale'] * df['price'])

       Jan-sale  Feb-sale  Mar-sale  Jan-price  ...  Mar-price  Jan  Feb  Mar
Items                                           ...                          
A            10        25         0          3  ...          5   30   75    0
B            15        23        10          2  ...         10   30   69  100
C            20        21         5          5  ...         15  100  210   75

[3 rows x 9 columns]

# Step 3
df.set_axis(og_cols, axis=1).join(df['sale'] * df['price']).reset_index()

  Items  Jan-sale  Feb-sale  Mar-sale  ...  Mar-price  Jan  Feb  Mar
0     A        10        25         0  ...          5   30   75    0
1     B        15        23        10  ...         10   30   69  100
2     C        20        21         5  ...         15  100  210   75

[3 rows x 10 columns]

If we really want to copy values, we can update the underlying column values if needed:

df.loc[:, 7] = df['Jan-sale'] * df['Jan-price']
df.loc[:, 8] = df['Feb-sale'] * df['Feb-price']
df.loc[:, 9] = df['Mar-sale'] * df['Mar-price']

# Overwrite last 3 values with the first value from columns 1:4 split on '-'
df.columns.values[-3:] = df.columns[1:4].str.split('-').str[0]

However this approach would require manually creating the initial columns anyway so iterating over a list of months might be more straightforward:

# Iterate to select and create columns
for c in ['Jan', 'Feb', 'Mar']:
    df[c] = df[f'{c}-sale'] * df[f'{c}-price']

Either option results in df:

Items Jan-sale Feb-sale Mar-sale Jan-price Feb-price Mar-price Jan Feb Mar
A 10 25 0 3 3 5 30 75 0
B 15 23 10 2 3 10 30 69 100
C 20 21 5 5 10 15 100 210 75

Setup and imports:

import pandas as pd

df = pd.DataFrame({
    'Items': ['A', 'B', 'C'], 'Jan-sale': [10, 15, 20],
    'Feb-sale': [25, 23, 21], 'Mar-sale': [0, 10, 5], 'Jan-price': [3, 2, 5],
    'Feb-price': [3, 3, 10], 'Mar-price': [5, 10, 15]
})
  • Related