Duplicating columns in pandas dataframe

Time:01-05

I'm looking for a way to duplicate all columns in a dataframe, and have the duplicated column as the original name with a '_2' on the end.

Example:

import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
start_df = pd.DataFrame(data=d)

d2 = {'col1':[1,2],'col1_2':[1,2],'col2':[3,4],'col2_2':[3,4]}
end_df = pd.DataFrame(data=d2)

Thanks.

CodePudding user response:

Try this:

d = {'col1': [1, 2], 'col2': [3, 4]}
start_df = pd.DataFrame(data = d)

# Add a copy of each original column under the name '<column>_2'
for column in start_df.columns:
    start_df[column + '_2'] = start_df[column]
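
Note that a loop like the one above appends each copy at the end of the frame, so the columns come out as col1, col2, col1_2, col2_2 rather than interleaved as in the question. A minimal sketch of one way to reorder them afterwards; the names `original` and `interleaved` are only illustrative:

```python
import pandas as pd

start_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
original = list(start_df.columns)  # snapshot of the names before adding copies

# Append the copies; they land after all the original columns.
for column in original:
    start_df[column + '_2'] = start_df[column]

# Build the wanted order: each original followed directly by its copy.
interleaved = [name for column in original for name in (column, column + '_2')]
end_df = start_df[interleaved]
print(end_df)
```

Selecting with a list of labels returns a new frame in that order, so `start_df` itself keeps the appended layout.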
    

CodePudding user response:

NB: this answer also shows how to generalize the process.

Without writing a loop to build the dataframe, you can simply use the repeat method of the columns index.

Then you can set the column names programmatically with a list comprehension.

For 2 repeats:

end_df = start_df[start_df.columns.repeat(2)]
end_df.columns = [f'{a}{b}' for a in start_df for b in ('', '_2')]

output:

   col1  col1_2  col2  col2_2
0     1       1     3       3
1     2       2     4       4
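
To make the two steps easier to follow, here is the same approach with the intermediate values shown. This is only an illustrative breakdown; the variable names `repeated` and `new_names` are not part of the original answer.

```python
import pandas as pd

start_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})

# Step 1: repeat every column label twice, keeping the original order.
repeated = start_df.columns.repeat(2)
print(list(repeated))  # ['col1', 'col1', 'col2', 'col2']

# Selecting with duplicated labels yields duplicated columns (same names for now).
end_df = start_df[repeated]

# Step 2: iterating over a DataFrame yields its column labels,
# so this pairs each label with '' and '_2' in turn.
new_names = [f'{a}{b}' for a in start_df for b in ('', '_2')]
end_df.columns = new_names
print(end_df)
```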

Generalization:

n = 5

end_df = start_df[start_df.columns.repeat(n)]
end_df.columns = [f'{a}{b}' for a in start_df
                            for b in [''] + [f'_{x+1}' for x in range(1, n)]]

Example n=5:

   col1  col1_2  col1_3  col1_4  col1_5  col2  col2_2  col2_3  col2_4  col2_5
0     1       1       1       1       1     3       3       3       3       3
1     2       2       2       2       2     4       4       4       4       4
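
If this is needed in more than one place, the generalization could be wrapped in a small helper. The function name `duplicate_columns` is hypothetical, and the sketch assumes the column labels are strings (other labels would be converted to strings by the f-string):

```python
import pandas as pd

def duplicate_columns(df, n):
    """Return a new frame with every column of df repeated n times.

    The first copy keeps the original name; later copies get '_2', '_3', ...
    """
    suffixes = [''] + [f'_{k}' for k in range(2, n + 1)]
    out = df[df.columns.repeat(n)]  # new frame, df is left untouched
    out.columns = [f'{col}{suffix}' for col in df.columns for suffix in suffixes]
    return out

start_df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
end_df = duplicate_columns(start_df, 3)
print(end_df)
```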

CodePudding user response:

Use the DataFrame.insert() method:

import pandas as pd

d = {'col1': [1, 2], 'col2': [3, 4]}
start_df = pd.DataFrame(data=d)

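# Each earlier insert shifts the later columns one place to the right,
# so the copy of the i-th original column belongs at position 2*i + 1.
for i, col in enumerate(start_df.columns):
    start_df.insert(2*i + 1, col + '_2', start_df[col])
start_df

output:

Out[1]:
   col1  col1_2  col2  col2_2
0     1       1     3       3
1     2       2     4       4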