How to change values using multiple columns in pandas?

Time:10-06

I have 2 columns

Platform = [ Wii,Ds,Wii,3ds,GBA,GBA,3ds]

Year = [2006,2004,NaN,2011,1986,NaN,2012]

How do I change the NaN value in Year to 2008, but only for rows where Platform is Wii?

CodePudding user response:

There are multiple solutions. If you have a large dataset I would suggest a vectorized NumPy approach, but here I am using the pandas apply method:

  1. find out the rows where platform is "Wii"
  2. find out the column name that needs to be changed
  3. check each value in that column and replace it with the new value (2008) if it is NaN

import pandas as pd
import numpy as np

Platform = ['Wii', 'Ds', 'Wii', '3ds', 'GBA', 'GBA', '3ds']
Year = [2006, 2004, np.nan, 2011, 1986, np.nan, 2012]
df = pd.DataFrame({'platform': Platform, 'year': Year})

print(df)

# Boolean mask that is True for the rows where platform is "Wii"
wii = df["platform"] == "Wii"
# For those rows only, replace NaN in "year" with 2008 and keep other values
df.loc[wii, "year"] = df.loc[wii, "year"].apply(lambda x: 2008 if pd.isna(x) else x)

print(df)

Output:

  platform    year
0      Wii  2006.0
1       Ds  2004.0
2      Wii     NaN
3      3ds  2011.0
4      GBA  1986.0
5      GBA     NaN
6      3ds  2012.0
  platform    year
0      Wii  2006.0
1       Ds  2004.0
2      Wii  2008.0
3      3ds  2011.0
4      GBA  1986.0
5      GBA     NaN
6      3ds  2012.0
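
For a larger dataset, the same update can be done without apply. Here is a minimal sketch of one vectorized way to do it, assuming the same df as above: build a single boolean mask that combines both conditions (platform is "Wii" and year is NaN) and assign through .loc. The name mask is only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "platform": ["Wii", "Ds", "Wii", "3ds", "GBA", "GBA", "3ds"],
    "year": [2006, 2004, np.nan, 2011, 1986, np.nan, 2012],
})

# True only where the platform is "Wii" AND the year is missing
mask = (df["platform"] == "Wii") & df["year"].isna()

# Assign 2008 to the "year" column of exactly those rows
df.loc[mask, "year"] = 2008

print(df)
```

Only the Wii row with a missing year is touched. The GBA row keeps its NaN because it does not match the platform condition.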

CodePudding user response:

If your DataFrame is stored in a variable named df, then this will work (note that the years in this example are stored as strings):

import pandas as pd
import numpy as np

data = [['Wii', '2006'], ['Ds', '2004'], ['Wii', np.nan], ['3ds', '2011'], ['GBA', '1986'], ['GBA', np.nan], ['3ds', '2012'], ['Wii', np.nan]]

df = pd.DataFrame(data, columns=['Platform', 'Year'])
# Where Platform is 'Wii' and Year is missing, use '2008'; otherwise keep the existing Year
df['Year'] = np.where((df['Platform'] == 'Wii') & (df['Year'].isna()), '2008', df['Year'])

print(df)

Output:

  Platform  Year
0      Wii  2006
1       Ds  2004
2      Wii  2008
3      3ds  2011
4      GBA  1986
5      GBA   NaN
6      3ds  2012
7      Wii  2008
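
If more than one platform needs its own fill value, one possible variation is to map each platform to a default year and pass the result to fillna. This is only a sketch under assumptions: the years are numeric as in the question, and the default_year dictionary is a hypothetical name and mapping that is not part of the original question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Platform": ["Wii", "Ds", "Wii", "3ds", "GBA", "GBA", "3ds"],
    "Year": [2006, 2004, np.nan, 2011, 1986, np.nan, 2012],
})

# Hypothetical lookup table: platform -> year to use when Year is missing
default_year = {"Wii": 2008}

# map() gives 2008 for Wii rows and NaN for platforms not in the dictionary;
# fillna() then uses those values only where Year is currently NaN
df["Year"] = df["Year"].fillna(df["Platform"].map(default_year))

print(df)
```

Platforms that are not in the dictionary (GBA here) keep their NaN, so extra entries can be added to default_year later without changing the rest of the code.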