I have a dataset that looks like this:
| year | state |district| Party|
|---------------|-------------|--------|------|
| 2010 | haryana |kaithal | INC|
| 2010 | haryana |kaithal | bjp|
| 2010 | haryana |kaithal |NOTA|
| 2010 | goa |panji | AAP|
| 2010 | goa |panji | INC|
| 2010 | goa |panji | BJP|
| 2013 | up |meerut | INC|
| 2013 | up |meerut | SP |
| 2015 | haryana |kaithal |INC |
| 2015 | haryana |kaithal |BJP |
| 2015 | haryana |kaithal |AAP |
I want to rename INC to major for the year 2010 and BJP to major for 2015. I need the data in the following form:
year | state | district | Party |
---|---|---|---|
2010 | haryana | kaithal | major |
2010 | haryana | kaithal | bjp |
2010 | haryana | kaithal | NOTA |
2010 | goa | panji | AAP |
2010 | goa | panji | major |
2010 | goa | panji | BJP |
2013 | up | meerut | major |
2013 | up | meerut | SP |
2015 | haryana | kaithal | INC |
2015 | haryana | kaithal | major |
2015 | haryana | kaithal | AAP |
I am using this code:
    for state in df['state']:
        if state=='haryana':
            for year in df['year']:
                if year==2010:
                    df['party'].replace('INC','major',inplace=True)
                else:
                    continue
                if year==2015:
                    df['party'].replace('BJP','major',inplace=True)
                else:
                    continue
But this code takes a long time to run and does not give the desired result: it replaces INC with major in all years and never replaces BJP.
CodePudding user response:
You can use boolean indexing with pandas.DataFrame.loc:
m1= (df["year"].eq(2010)) & (df["Party"].eq("INC"))
m2= (df["year"].eq(2015)) & (df["Party"].eq("BJP"))
df.loc[m1|m2, "Party"] = "major"
# Output:
print(df.to_string())
year state district Party
0 2010 haryana kaithal major
1 2010 haryana kaithal bjp
2 2010 haryana kaithal NOTA
3 2010 goa panji AAP
4 2010 goa panji major
5 2010 goa panji BJP
6 2013 up meerut INC
7 2013 up meerut SP
8 2015 haryana kaithal INC
9 2015 haryana kaithal major
10 2015 haryana kaithal AAP
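For anyone who wants to reproduce this, here is a minimal self-contained sketch. The DataFrame is rebuilt by hand from the table in the question, and the column name `Party` is assumed to match that table. It also helps explain the original loop: `Series.replace` acts on the whole column no matter which row the loop is visiting, and the first `else: continue` skips the 2015 check for every year other than 2010, so BJP is never touched.

```python
import pandas as pd

# Sample data copied by hand from the question's table.
df = pd.DataFrame({
    "year": [2010] * 6 + [2013] * 2 + [2015] * 3,
    "state": ["haryana"] * 3 + ["goa"] * 3 + ["up"] * 2 + ["haryana"] * 3,
    "district": ["kaithal"] * 3 + ["panji"] * 3 + ["meerut"] * 2 + ["kaithal"] * 3,
    "Party": ["INC", "bjp", "NOTA", "AAP", "INC", "BJP",
              "INC", "SP", "INC", "BJP", "AAP"],
})

# Each mask is a boolean Series, True only where both conditions hold.
m1 = df["year"].eq(2010) & df["Party"].eq("INC")
m2 = df["year"].eq(2015) & df["Party"].eq("BJP")

# Assign only to the selected rows of the "Party" column.
df.loc[m1 | m2, "Party"] = "major"

print(df["Party"].tolist())
```

The comparison is case-sensitive, so the lowercase `bjp` row is left alone, as in the printed output above.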
CodePudding user response:
Chain the three conditions, one per column, and set the new value with DataFrame.loc:
m1 = (df.state=='haryana') & (df['year'] == 2010) & (df['party'] == 'INC')
m2 = (df.state=='haryana') & (df['year'] == 2015) & (df['party'] == 'BJP')
m = m1 | m2
Or:
m = (df.state=='haryana') & ((df['year'] == 2010) & (df['party'] == 'INC') |
                             (df['year'] == 2015) & (df['party'] == 'BJP'))
df.loc[m, 'party'] = 'major'
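Note that restricting the mask to one state changes which rows are renamed. Below is a small sketch on a hypothetical four-row frame (not the asker's data) using the lowercase `party` column from this answer. It swaps in `Series.mask` instead of `df.loc`, which returns a new Series rather than modifying `df` in place; that is just an alternative, not something the question requires.

```python
import pandas as pd

# Hypothetical rows for illustration; column names follow this answer.
df = pd.DataFrame({
    "year": [2010, 2010, 2015, 2015],
    "state": ["haryana", "goa", "haryana", "haryana"],
    "party": ["INC", "INC", "BJP", "INC"],
})

in_haryana = df["state"] == "haryana"
m = in_haryana & (
    ((df["year"] == 2010) & (df["party"] == "INC"))
    | ((df["year"] == 2015) & (df["party"] == "BJP"))
)

# Series.mask replaces values where the condition is True and keeps the rest.
result = df["party"].mask(m, "major")

print(result.tolist())
```

Here the goa row keeps INC because of the state condition; drop `in_haryana` from the mask if the rename should apply to every state.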
EDIT: You can check the output of the masks to verify that they work as expected:
m1=(df['STATE']=='BIHAR') & (df['YEAR']==2010) & ((df['PARTY']=='BJP')|(df['PARTY']=='JD(U)'))
m2=(df['STATE']=='BIHAR') & (df['YEAR']==2015) & ((df['PARTY']=='RJD')|(df['PARTY']=='JD(U)'))
m3=(df['STATE']=='BIHAR') & (df['YEAR']==2020) & ((df['PARTY']=='BJP')|(df['PARTY']=='JD(U)'))
m=m1|m2|m3
df.loc[m, 'PARTY']= 'MAJOR'
print (df.assign(m1=m1, m2=m2, m3=m3,triple= m1 | m2 | m3,
                 BIHAR = (df['STATE']=='BIHAR'),
                 Y2010 = (df['YEAR']==2010)))
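If the list of year/party combinations keeps growing, the masks above could also be built from a lookup table. This is only a sketch: the `rules` dict and the sample rows are hypothetical, while the column names and party lists are taken from the masks above. `Series.isin` stands in for the chained `==` / `|` comparisons.

```python
import pandas as pd

# Hypothetical sample rows using the upper-case column names from the EDIT.
df = pd.DataFrame({
    "STATE": ["BIHAR", "BIHAR", "BIHAR", "BIHAR", "GOA"],
    "YEAR": [2010, 2010, 2015, 2020, 2010],
    "PARTY": ["BJP", "RJD", "RJD", "JD(U)", "BJP"],
})

# Year -> parties to relabel (same values as m1, m2, m3 above).
rules = {
    2010: ["BJP", "JD(U)"],
    2015: ["RJD", "JD(U)"],
    2020: ["BJP", "JD(U)"],
}

# Start from an all-False mask and OR in one rule per year.
m = pd.Series(False, index=df.index)
for year, parties in rules.items():
    m |= (df["YEAR"] == year) & df["PARTY"].isin(parties)

# Restrict the combined mask to a single state.
m &= df["STATE"] == "BIHAR"

df.loc[m, "PARTY"] = "MAJOR"
print(df["PARTY"].tolist())
```

The loop runs once per rule, not once per row, so each step is still a vectorized comparison.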