I have one Dataframe1
from which I need to form new Dataframe2
as given below. Column n1
will pick the value from Status
column in Dataframe1
if Dataframe1.name = A
and similarly column n2
will take the value from column Status
if Dataframe1.Name = B
. Also, Timestamp
and id
will have unique values. Can anybody please help?
Input Dataframe1
:
id | Timestamp | Name | Status |
---|---|---|---|
1 | 02:15:00 | A | FALSE |
1 | 02:15:00 | B | TRUE |
2 | 03:00:00 | A | TRUE |
2 | 03:00:00 | B | FALSE |
Output Dataframe2
:
id | Timestamp | n1 | n2 |
---|---|---|---|
1 | 02:15:00 | FALSE | TRUE |
2 | 03:00:00 | TRUE | FALSE |
CodePudding user response:
What you are trying to do is taking a pivot of the data with special names. If you rename A
and b
values as you want to n1
and n2
only thing you have to do is to use the pandas.pivot_table
function.
because as its aggregation function it uses mean strings don't work out of the box. You have to provide your own aggregation function. Because in our situation every row is unique we can just give the aggregation function to take the value of that row.
dataframe1['Name'] = dataframe1['Name'].replace({'A': 'n1', 'b': 'n2'})
dataframe1.pivot_table(index=['id', 'Timestamp'],
columns='Name',
values='Status',
aggfunc=lambda x:x).reset_index()
CodePudding user response:
You can use pandas.pivot_table
:
df2 = df.pivot_table(index=['id','Timestamp'], columns='Name', values='Status').reset_index().set_index('id')
df2.columns = ['Timestamp','n1','n2']
Output:
>>> df2
Timestamp n1 n2
id
1 02:15:00 FALSE TRUE
2 03:00:00 TRUE FALSE
CodePudding user response:
using pivot_table and then adjusting the result header.
import pandas as pd
df = pd.read_excel('test.xls', index_col = False)
df2 = df.pivot_table(index = ['id', 'Timestamp'], columns = 'Name', values = 'Status').reset_index().rename_axis(None, axis=1).rename(columns = {'A': 'n1', 'B': 'n2'})
print(df2)