I want to find matching values from two data frames and return a third value.
For example, if cpg_symbol["Gene_Symbol"]
corresponds with diff_meth_kirp_symbol.index
, I want to assign cpg_symbol.loc["Composite_Element_REF"]
as index.
My code returned an empty dataframe.
diff_meth_kirp.index = diff_meth_kirp.merge(cpg_symbol, left_on=diff_meth_kirp.index, right_on="Gene_Symbol")[["Composite_Element_REF"]]
Example:
diff_meth_kirp
Hello | My | name | is | |
---|---|---|---|---|
First | 0 | 1 | 2 | 3 |
Second | 4 | 5 | 6 | 7 |
Third | 8 | 9 | 10 | 11 |
Fourth | 12 | 13 | 14 | 15 |
Fifth | 16 | 17 | 18 | 19 |
Sixth | 20 | 21 | 22 | 23 |
cpg_symbol
Composite_Element_REF | Gene_Symbol | |
---|---|---|
cg1 | First | |
cg2 | Third | |
cg3 | Fifth | |
cg4 | Seventh | |
cg5 | Ninth | |
cg6 | First |
Expected output:
Hello | My | name | is | |
---|---|---|---|---|
cg1 | 0 | 1 | 2 | 3 |
cg2 | 8 | 9 | 10 | 11 |
cg3 | 16 | 17 | 18 | 19 |
cg6 | 0 | 1 | 2 | 3 |
CodePudding user response:
Your code works well for me but you can try this version:
out = (diff_meth_kirp.merge(cpg_symbol.set_index('Gene_Symbol'),
left_index=True, right_index=True)
.set_index('Composite_Element_REF')
.rename_axis(None).sort_index())
print(out)
# Output
Hello My name is
cg1 0 1 2 3
cg2 8 9 10 11
cg3 16 17 18 19
cg6 0 1 2 3
Input dataframes:
data1 = {'Hello': {'First': 0, 'Second': 4, 'Third': 8, 'Fourth': 12, 'Fifth': 16, 'Sixth': 20},
'My': {'First': 1, 'Second': 5, 'Third': 9, 'Fourth': 13, 'Fifth': 17, 'Sixth': 21},
'name': {'First': 2, 'Second': 6, 'Third': 10, 'Fourth': 14, 'Fifth': 18, 'Sixth': 22},
'is': {'First': 3, 'Second': 7, 'Third': 11, 'Fourth': 15, 'Fifth': 19, 'Sixth': 23}}
diff_meth_kirp = pd.DataFrame(data1)
data2 = {'Composite_Element_REF': {0: 'cg1', 1: 'cg2', 2: 'cg3', 3: 'cg4', 4: 'cg5', 5: 'cg6'},
'Gene_Symbol': {0: 'First', 1: 'Third', 2: 'Fifth', 3: 'Seventh', 4: 'Ninth', 5: 'First'}}
cpg_symbol = pd.DataFrame(data2)