I hope you can give me a hand here. So, I have this dataframe:
| | a | b | a.1 | b.1 | a.2 | b.2|
|--:|-----:|-----:|-----:|-----:|-----:|------|
| 0 | a111 | b111 | c222 | d222 | e333 | f333 |
Make a subset is not an option because is too many columns.
The output I expect is like this:
| | a | b |
|---|------|------|
| 0 | a111 | b111 |
| 1 | c222 | d222 |
| 2 | e333 | f333 |
Thanks in advance.
The code to replicate the dataframe:
list_demo = []
a = "a111"
b = "b111"
c = "c222"
d = 'd222'
e = 'e333'
f = "f333"
list_demo.append([a,b,c,d,e,f])
df = pd.DataFrame(list_demo)
df.columns = ['a', 'b', 'a.1', 'b.1', 'a.2', 'b.2']
CodePudding user response:
You can use:
out = (df
.set_axis(df.columns.str.split('.', expand=True), axis=1)
.stack()
.droplevel(1)
)
Output:
a b
0 a111 b111
0 c222 d222
0 e333 f333
Or, if you have a single row:
(df.set_axis(df.columns.str.split('.', expand=True), axis=1)
.stack()
.reset_index(drop=True)
)
Output:
a b
0 a111 b111
1 c222 d222
2 e333 f333