Appending columns to other columns in Pandas

Given the dataframe:


d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11], 'col4': [12, 13, 14, 15, 16]}

What is the easiest way to append the third column to the first and the fourth column to the second?

The result should look like this:


d = {'col1': [1, 2, 3, 4, 7, 7, 8, 12, 1, 11], 'col2': [4, 5, 6, 9, 5, 12, 13, 14, 15, 16]}

I need to use this in a script where the column names vary, so referencing columns by name is not possible. I have tried something along the lines of df.iloc[:, x] to achieve this.

CodePudding user response：

You can change the column names and concat:

pd.concat([df[['col1', 'col2']],
           df[['col3', 'col4']].set_axis(['col1', 'col2'], axis=1)])

Pass ignore_index=True to pd.concat to get a fresh 0 to 9 index instead of the repeated 0 to 4 index shown below.

Output:

   col1  col2
0     1     4
1     2     5
2     3     6
3     4     9
4     7     5
0     7    12
1     8    13
2    12    14
3     1    15
4    11    16
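
Since the question rules out referencing columns by name, the same concat can also be written purely by position. A minimal sketch, assuming the frame has an even number of columns and that the second half should go under the first half; half, top and bottom are names made up for this illustration:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5],
                   'col3': [7, 8, 12, 1, 11], 'col4': [12, 13, 14, 15, 16]})

half = df.shape[1] // 2        # number of columns in each half
top = df.iloc[:, :half]        # first half, selected by position
bottom = df.iloc[:, half:]     # second half, selected by position

# Give the second half the labels of the first so concat lines them up
bottom = bottom.set_axis(top.columns, axis=1)

out = pd.concat([top, bottom], ignore_index=True)
print(out)
```

The labels of the result come from whatever the first half happens to be called, so no column name is hard-coded.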

Or, using numpy:

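N = 2  # number of columns in the result
pd.DataFrame(
    df
    # split every row into blocks of N values: shape (rows, n_blocks, N)
    .values.reshape((-1, df.shape[1]//N, N))
    # Fortran order stacks the blocks on top of each other
    .reshape(-1, N, order='F'),
    columns=df.columns[:N]
)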
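
To see why the two reshapes work, it can help to run the numpy version step by step. This is only a sketch, assuming the column count is a multiple of N and that all columns share one dtype; n_blocks, blocks and stacked are names introduced here:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5],
                   'col3': [7, 8, 12, 1, 11], 'col4': [12, 13, 14, 15, 16]})

N = 2                          # columns in the result
n_blocks = df.shape[1] // N    # how many column blocks get stacked

# Step 1: view each row as n_blocks groups of N values,
# giving shape (rows, n_blocks, N)
blocks = df.to_numpy().reshape(-1, n_blocks, N)

# Step 2: a Fortran-order reshape walks the rows fastest, then the blocks,
# so block 0 ends up on top of block 1, giving shape (rows * n_blocks, N)
stacked = blocks.reshape(-1, N, order='F')

out = pd.DataFrame(stacked, columns=df.columns[:N])
print(out)
```

With mixed dtypes the intermediate array would fall back to object dtype, so the concat-based answers are probably the safer choice in that case.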

CodePudding user response：

You can use:

out = pd.concat([subdf.set_axis(['col1', 'col2'], axis=1)
                for _, subdf in df.groupby(pd.RangeIndex(df.shape[1]) // 2, axis=1)])
print(out)

# Output
   col1  col2
0     1     4
1     2     5
2     3     6
3     4     9
4     7     5
0     7    12
1     8    13
2    12    14
3     1    15
4    11    16
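
Note that recent pandas releases deprecate axis=1 in DataFrame.groupby (since 2.1, as far as I am aware), so the snippet above may warn or fail depending on your version. A possible alternative sketch slices the columns by position in steps of n; n, names and chunks are names introduced here, and the column count is assumed to be a multiple of n:

```python
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5],
                   'col3': [7, 8, 12, 1, 11], 'col4': [12, 13, 14, 15, 16]})

n = 2                          # width of each column block
names = df.columns[:n]         # labels taken from the first block

# One sub-frame per block of n adjacent columns, all relabelled alike
chunks = [df.iloc[:, start:start + n].set_axis(names, axis=1)
          for start in range(0, df.shape[1], n)]

out = pd.concat(chunks, ignore_index=True)
print(out)
```

The same loop should also handle six or more columns, stacking each further pair under the previous ones.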

CodePudding user response：

This may not be the most efficient solution, but you can do it with pd.concat().

First convert your initial dict d into a pandas DataFrame, then apply concat to the columns selected by position.

import pandas as pd
d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5], 'col3': [7, 8, 12, 1, 11], 'col4': [12, 13, 14, 15, 16]}
df = pd.DataFrame(d)
# stack column 2 under column 0 and column 3 under column 1, by position
d_2 = {'col1': pd.concat([df.iloc[:, 0], df.iloc[:, 2]]), 'col2': pd.concat([df.iloc[:, 1], df.iloc[:, 3]])}

d_2 is the required dict (its values are pandas Series). Convert it to a DataFrame if you need one:

df_2 = pd.DataFrame(d_2)
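
If the goal is a plain dict of lists, as written in the question, one option is to concatenate with ignore_index=True and then call to_dict('list'). A small sketch that takes the labels from the frame by position instead of hard-coding them; result is a name introduced here, and this is not necessarily what the answer's author had in mind:

```python
import pandas as pd

d = {'col1': [1, 2, 3, 4, 7], 'col2': [4, 5, 6, 9, 5],
     'col3': [7, 8, 12, 1, 11], 'col4': [12, 13, 14, 15, 16]}
df = pd.DataFrame(d)

# Stack column 2 under column 0 and column 3 under column 1, by position
df_2 = pd.DataFrame({
    df.columns[0]: pd.concat([df.iloc[:, 0], df.iloc[:, 2]], ignore_index=True),
    df.columns[1]: pd.concat([df.iloc[:, 1], df.iloc[:, 3]], ignore_index=True),
})

result = df_2.to_dict('list')  # plain Python lists keyed by column label
print(result)
```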