i have two data frame , A and B . In A Data frame two column value and filed . and in B data frame have also value and filed column. i want to match the 'value' column of B to A of 'Filed' column, replace the Filed of A to value of B. A=
Value Filed
valid username username
valid username input_txtuserid
Password input_txtpassword
Password txtPassword
Login input_submit_log_in
LOG IN SIGNIN
B=
Value Filed
input_txtuserid "JOHN"
input_txtpassword "78945"
input_submit_log_in "Sucessfully"
Password txtPassword
City "London"
PLACE "4-A avenue Street"
i want my dataframe C to look like this
C=
Value Filed
valid username "JOHN"
Password "78945"
Login "Sucessfully"
i written below code , but i getting KeyError: 'City',
_map = dict(zip(A.Filed.values, A.Value.values))
def get_correct_value(row, _map=_map):
new_value = _map[row.Value]
filed = row.Filed
return new_value, filed
C = B.apply(get_correct_value, axis=1, result_type='expand')
C.columns = ['Value','Filed']
i want to ignore the keyword which is not available in A dataframe. '
CodePudding user response:
I'm going to assume that in the DataFrames are strings as we typically don't use Dataframes to carry variables. With this I created a sample with your dataframe values.
data_a = {"Value": ["valid username", "valid username", "Password", "Password", "Login", "LOG IN"],
"Filed": ["username", "input_txtuserid", "input_txtpassword", "txtPassword", "input_submit_log_in", "SIGNIN"]}
data_b = {"Value": ["input_txtuserid", "input_txtpassword", "input_submit_log_in", "Password", "City", "PLACE"],
"Filed": ["JOHN", "78945", "Sucessfully", "txtPassword", "London", "4-A avenue Street"]}
A = pd.DataFrame(data_a)
B = pd.DataFrame(data_b)
Below the code to create C:
# Merging A and B, using a left join on the columns Filed for A and Value for B. Creatingg an indicator where exists
C = pd.merge(A, B, left_on=['Filed'], right_on=['Value'], how='left', indicator='Exist')
# If exists put true, otherwise false
C['Exist'] = np.where(C.Exist == 'both', True, False)
# Dropping all False so those that dont exist in both dataframes
C.drop(C[C['Exist'] == False].index, inplace=True)
# Making sure C has the right column and column names.
C = C[['Value_y', 'Filed_y']]
C.rename(columns = {"Value_y": "Value",
"Filed_y": "Filed"}, inplace = True)
Hope that helps! Please Mark this as answer if it does :)