A pattern I run into fairly often is building a DataFrame that stacks another DataFrame on top of a copy of itself with the columns swapped. Here's a small example:
import pandas as pd
df = pd.DataFrame({"a": range(5), "b": range(6, 1, -1)})
combined = pd.concat([df, df.rename(columns={"a": "b", "b": "a"})], ignore_index=True)
Is there a more efficient approach to achieving this operation (esp. with many, many rows)?
CodePudding user response:
Convert the values to a NumPy array, use np.concatenate with the column order reversed by indexing a[:, ::-1], and finally pass the result to the DataFrame constructor:
import numpy as np
a = df.to_numpy()
combined = pd.DataFrame(np.concatenate([a, a[:, ::-1]]), columns=df.columns)
print(combined)
   a  b
0  0  6
1  1  5
2  2  4
3  3  3
4  4  2
5  6  0
6  5  1
7  4  2
8  3  3
9  2  4
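As a sanity check, the NumPy result can be compared against the pd.concat version from the question. This is only a minimal sketch; the names via_concat and via_numpy are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": range(5), "b": range(6, 1, -1)})

# Approach from the question: swap the column names, then concatenate.
via_concat = pd.concat(
    [df, df.rename(columns={"a": "b", "b": "a"})], ignore_index=True
)

# NumPy approach: stack the array on top of its column-reversed view.
a = df.to_numpy()
via_numpy = pd.DataFrame(np.concatenate([a, a[:, ::-1]]), columns=df.columns)

# Raises an AssertionError if the two frames differ in values, dtypes, or index.
pd.testing.assert_frame_equal(via_concat, via_numpy)
```

Two caveats: a[:, ::-1] reverses all columns, which is the same as swapping only for a two-column frame, and to_numpy() on a frame with mixed dtypes returns a single common dtype (often object), so this shortcut fits best when the columns share a dtype.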
CodePudding user response:
You can use the underlying NumPy array and np.vstack on the array and its column-reversed version, then generate a new DataFrame:
import numpy as np
a = df.to_numpy()
pd.DataFrame(np.vstack([a, a[:, ::-1]]), columns=df.columns)
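Since the question is about efficiency with many rows, it may be worth timing both approaches on your own data rather than assuming one wins. The sketch below is one rough way to do that; the row count n, the random test frame big, and the helper names with_concat and with_vstack are all hypothetical choices for illustration.

```python
import timeit

import numpy as np
import pandas as pd

n = 200_000  # hypothetical row count; adjust to match your data
rng = np.random.default_rng(0)
big = pd.DataFrame({"a": rng.integers(0, 100, n), "b": rng.integers(0, 100, n)})


def with_concat(df):
    # Approach from the question: rename the columns and concatenate.
    return pd.concat(
        [df, df.rename(columns={"a": "b", "b": "a"})], ignore_index=True
    )


def with_vstack(df):
    # Approach from this answer: stack the array on its column-reversed view.
    arr = df.to_numpy()
    return pd.DataFrame(np.vstack([arr, arr[:, ::-1]]), columns=df.columns)


repeats = 5
for fn in (with_concat, with_vstack):
    seconds = timeit.timeit(lambda: fn(big), number=repeats)
    print(f"{fn.__name__}: {seconds / repeats:.4f} s per call")
```

The numbers will depend on the pandas and NumPy versions, the dtypes, and the frame size, so treat any single run as indicative only.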