I am a beginner in Python.
I have two dataframes, gb and new_df, each with 651 rows. I would like to replace the "MP" column in gb with the "MP" column in new_df.
I have tried both of the following:
gb["MP"] = new_df["MP"]
gb["MP"] = [x for x in new_df["MP"]]
But then the values are out of alignment with the other columns in the dataframe. Here is the full code:
import pandas as pd
gb = pd.read_csv("https://www.electoralcalculus.co.uk/electdata_1992ob.txt", sep=";")
gb = gb.sort_values(by="Name", ascending=True)
gb = gb.reset_index(drop=True)
new_df = pd.read_html('https://en.wikipedia.org/wiki/List_of_MPs_elected_in_the_1992_United_Kingdom_general_election',
match='Constituency')
new_df = new_df[0]
new_df.columns = ["Name", "MP", "Party"]
new_df = new_df[~new_df["MP"].str.contains(r"\[edit\]")]
new_df = new_df[~new_df["MP"].str.contains("MP")]
new_df = new_df.sort_values(by="Name", ascending=True)
new_df = new_df.reset_index(drop=True)
gb["MP"] = new_df["MP"]
CodePudding user response:
Your indices are probably not aligned. Consider this example:
import numpy as np
import pandas as pd

# Prepare sample data
string = """A B C D
1 a a 1
1 a a 2
1 a a 3
2 a a 1
2 a a -1
3 a a -1
3 a a -2
3 a a -3"""
data = [x.split() for x in string.split('\n')]
df = pd.DataFrame(np.array(data[1:]), columns=data[0])
df['D'] = df['D'].astype(int)

# Make new_df as a copy of df
new_df = df.copy()

# Shuffle the index of new_df
s = df.index.to_list()
np.random.shuffle(s)
new_df.index = s

# Assign back: rows are matched by index label, not by row order
df['new'] = new_df['A']
Output (the order of the new column varies between runs because of the shuffle):
   A  B  C   D  new
0  1  a  a   1    2
1  1  a  a   2    1
2  1  a  a   3    3
3  2  a  a   1    2
4  2  a  a  -1    3
5  3  a  a  -1    3
6  3  a  a  -2    1
7  3  a  a  -3    1
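The shuffled column appears because pandas matches a Series to the frame by index label during column assignment, not by row order. A minimal deterministic sketch of the difference, using made-up names (left, right) and values that are not part of the original example:

```python
import pandas as pd

# Hypothetical data: same length, but the Series carries different index labels.
left = pd.DataFrame({"A": ["x", "y", "z"]})            # index labels 0, 1, 2
right = pd.Series(["p", "q", "r"], index=[2, 0, 1])    # labels in a different order

# Assigning a Series aligns on index labels.
left["by_label"] = right

# Assigning a plain array ignores the index and uses row order.
left["by_position"] = right.to_numpy()

print(left)
```

Positional assignment only helps if both frames really are in the same row order.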
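The solution is to give new_df the same index labels as df before assigning. Sorting the index alone is not enough, because the assignment matches on labels, so reset it instead:
new_df = new_df.reset_index(drop=True)
Then try the assignment again.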
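If the rows still do not line up after that (for example because the two sources spell or order the constituency names differently), matching on a shared key may be more robust than relying on row order. A minimal sketch, assuming both frames have a "Name" column whose values agree; the small frames and placeholder MP values below are hypothetical stand-ins for the data in the question:

```python
import pandas as pd

# Hypothetical stand-ins for gb and new_df from the question.
gb = pd.DataFrame({
    "Name": ["Aberavon", "Bath", "Cardiff West", "Devizes"],
    "MP": ["old 1", "old 2", "old 3", "old 4"],
})
new_df = pd.DataFrame({
    "Name": ["Cardiff West", "Aberavon", "Bath"],
    "MP": ["MP of Cardiff West", "MP of Aberavon", "MP of Bath"],
})

# Build a Name -> MP lookup; this assumes each Name occurs only once in new_df.
mp_by_name = new_df.set_index("Name")["MP"]

# Replace gb["MP"] by looking up each row's Name, independent of row order.
gb["MP"] = gb["Name"].map(mp_by_name)

# Names in gb with no match in new_df end up as NaN and can be inspected.
unmatched = gb.loc[gb["MP"].isna(), "Name"].tolist()
print(gb)
print("Unmatched names:", unmatched)
```

Checking the unmatched list shows which names differ between the two sources, which is often the real cause of misaligned values.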