Home > Blockchain >  How to make new column based on value of raws
How to make new column based on value of raws

Time:10-06

I have a csv file like the list below (which has thousands of rows);

name,location,time
james,ond,5
conor,aas,2
james,jja,3
elisa,aab,1
mike,sjs,1
elisa,ond,5
elisa,mmm,2

How can I turn this into (basicly make the values in second column (location) as new columns, search in the names for each and make the names don't repeat and put 0 as values on new columns if it is not there. The desired outcome with the code is:

name,ond,aas,jja,aab,sjs,mmm
james,5,0,3,0,0,0
conor,0,2,0,0,0,0
elisa,5,0,0,1,0,2
mike,0,0,0,0,1,0

What I tried so far to do this is to load it twice as two different databases and get a value but it fails all the time giving exactly the same result before.

import pandas as pd
df1 = pd.read_csv("file.csv")
df2 = pd.read_csv("file.csv")

df1['time'] = df2['time'].where(df1[['name','location']].isin(df2).all(axis=1)).fillna('0')

or the below code to extract the values but still does not work:

df1.merge(df2, on=['name','location'], how='left').fillna(0)

CodePudding user response:

You can use pd.crosstab:

pd.crosstab(index=df['name'], columns=df['location'], values=df['time'], aggfunc=lambda x: x).\
            fillna(0).reset_index()

Output:

location    name    aab aas jja mmm ond sjs
0          conor    0.0 2.0 0.0 0.0 0.0 0.0
1          elisa    1.0 0.0 0.0 2.0 5.0 0.0
2          james    0.0 0.0 3.0 0.0 5.0 0.0
3           mike    0.0 0.0 0.0 0.0 0.0 1.0

Here location is the column name, you can get rid of it via df.columns.name = None

CodePudding user response:

Maybe this one is the solution you are looking for:

Given the dataframe df

    name location  time
0  james      ond     5
1  conor      aas     2
2  james      jja     3
3  elisa      aab     1
4   mike      sjs     1
5  elisa      ond     5
6  elisa      mmm     2

import pandas as pd
pivot = pd.pivot_table(df, values='time', index='name', columns='location', aggfunc='sum', fill_value=0)

Output:

location  aab  aas  jja  mmm  ond  sjs
name                                  
conor       0    2    0    0    0    0
elisa       1    0    0    2    5    0
james       0    0    3    0    5    0
mike        0    0    0    0    0    1

Note that if you have different values for the same combination of name and location, you have to select an appropriate aggfunc, to properly display the value you need. Ypu can also pass a dictionary to the aggfunc value, as explained in Discussion on aggfunc

  • Related