I have a pandas DataFrame that looks like the following:
Name | Col 1 | Col 2 |
---|---|---|
A | 1 | 2 |
B | 3 | 4 |
I'd like to reshape the dataset so that it has two columns, Name and Val, with one new row for each combination of an existing row and column.
Like this:
Name | Val |
---|---|
A-Col1 | 1 |
A-Col2 | 2 |
B-Col1 | 3 |
B-Col2 | 4 |
CodePudding user response:
Here is one way to do it, using melt:
df2 = df.melt(id_vars='Name')
df2['Name'] = df2['Name'] + '-' + df2['variable']
df2 = df2.drop(columns='variable')
df2
Name value
0 A-Col 1 1
1 B-Col 1 3
2 A-Col 2 2
3 B-Col 2 4
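The melt approach above can be sketched end to end as follows. This is a minimal sketch, not the answerer's exact code: the sample frame is rebuilt from the question, and the `value_name="Val"` argument, the space stripping, and the final sort are additions assumed here to match the requested output.

```python
import pandas as pd

# Sample frame reconstructed from the table in the question.
df = pd.DataFrame({"Name": ["A", "B"], "Col 1": [1, 3], "Col 2": [2, 4]})

# Reshape to long form: one row per (Name, original column) pair.
long_df = df.melt(id_vars="Name", var_name="variable", value_name="Val")

# Build the combined label; stripping the space is an assumption
# made to match the "A-Col1" style shown in the question.
long_df["Name"] = (
    long_df["Name"] + "-" + long_df["variable"].str.replace(" ", "", regex=False)
)

# Drop the helper column and sort so each original Name's rows sit together.
result = (
    long_df.drop(columns="variable")
    .sort_values("Name", kind="stable")
    .reset_index(drop=True)
)
print(result)
```

Sorting on the combined label works here because the labels share a prefix per name; with many columns (for example Col 10 before Col 2) a different sort key may be needed.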
CodePudding user response:
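Try this one:
def combine_columns(df):
    # Assumes df already has columns named 'name' and 'value';
    # str() is needed because 'value' holds numbers.
    df['combined'] = df.apply(lambda row: row['name'] + ' ' + str(row['value']), axis=1)
    return df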
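As a rough illustration of how this row-wise approach behaves, here is a small sketch. The input frame and its column names `name` and `value` are hypothetical (the question's frame does not have them), so the data would have to be reshaped to long form first.

```python
import pandas as pd

# Hypothetical long-format input; "name" and "value" are assumed column names.
long_df = pd.DataFrame({"name": ["A-Col1", "A-Col2"], "value": [1, 2]})


def combine_columns(df):
    # Work on a copy so the caller's frame is not modified in place.
    out = df.copy()
    # Format each row as "<name> <value>"; the f-string handles the int value.
    out["combined"] = out.apply(lambda row: f"{row['name']} {row['value']}", axis=1)
    return out


combined = combine_columns(long_df)
print(combined)
```

Row-wise `apply` is usually slower than vectorized string concatenation, so it is probably best kept for small frames.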
CodePudding user response:
You can get pretty close very quickly using unstack and setting the index to Name:
df.set_index("Name").unstack().swaplevel()
gives
Name
A Col 1 1
B Col 1 3
A Col 2 2
B Col 2 4
Now that you have the right structure, you just need to sort the rows and merge the two index levels into one label. Combining the sort and merge steps, you can use:
tmp = (
    df.set_index("Name")
    .unstack()
    .swaplevel()
    .sort_index()
    .reset_index()
    .rename({0: "val"}, axis=1)
)
tmp["Name"] = tmp["Name"] + "-" + tmp["level_1"]
df = tmp.drop("level_1", axis=1)
to get
Name val
0 A-Col 1 1
1 A-Col 2 2
2 B-Col 1 3
3 B-Col 2 4
To match your expected output, you could replace the Name line with:
tmp["Name"] = tmp["Name"] + "-" + tmp["level_1"].str.replace(" ", "")
which removes the space to give
Name val
0 A-Col1 1
1 A-Col2 2
2 B-Col1 3
3 B-Col2 4
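Put together, the unstack steps above might look like the sketch below. The sample frame is rebuilt from the question; the helper column name `column` and the output column name `Val` are choices made here for readability, not part of the original answer.

```python
import pandas as pd

# Sample frame reconstructed from the table in the question.
df = pd.DataFrame({"Name": ["A", "B"], "Col 1": [1, 3], "Col 2": [2, 4]})

tmp = (
    df.set_index("Name")
    .unstack()        # Series indexed by (original column, Name)
    .swaplevel()      # reorder the index to (Name, original column)
    .sort_index()     # group rows by Name, then by column
    .reset_index()    # yields columns "Name", "level_1" and 0
    .rename(columns={"level_1": "column", 0: "Val"})
)

# Merge the two label columns, removing the space inside "Col 1" / "Col 2".
tmp["Name"] = tmp["Name"] + "-" + tmp["column"].str.replace(" ", "", regex=False)
out = tmp.drop(columns="column")
print(out)
```

A new variable `out` is used instead of overwriting `df`, so the original frame stays available if the steps need to be rerun.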