I have a dictionary that I want to apply to a DataFrame column to create a new column. I made the dictionary from another DataFrame that has columns named 'ID' and 'SMILES', like this:
dictionary = smiles_df.set_index('ID').T.to_dict()
The dictionary looks like this:
'N[C@@H](Cc1c[nH]c2ccccc12)C(=O)NO': {'SMILES': '[NH3+]C(Cc1c[nH]c2ccccc12)C(O)=NO'},
'Clc1ccc(Nc2nnc(Cc3ccncc3)c3ccccc23)cc1': {'SMILES': 'Clc1ccc(Nc2nnc(Cc3ccncc3)c3ccccc23)cc1'},
'Oc1ccc(cc1)-c1nc(c([nH]1)-c1ccc(F)cc1)-c1ccncc1': {'SMILES': '[O-]c1ccc(-c2nc(-c3ccncc3)c(-c3ccc(F)cc3)[n-]2)cc1'},
I apply the dictionary like this:
df['processed_SMILES'] = df['SMILES'].map(dictionary)
The output in df['processed_SMILES'] looks like this:
{'SMILES': 'CC(=O)CCCCn1c(=O)c2c(ncn2C)n(C)c1=O'}
But I want it to look like this:
'CC(=O)CCCCn1c(=O)c2c(ncn2C)n(C)c1=O'
How do I correct this?
CodePudding user response:
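Calling .T.to_dict() on the whole DataFrame builds a nested dictionary: each ID maps to an inner dict keyed by column name, and map returns that inner dict. Select the 'SMILES' column first so that each ID maps directly to a string: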
dictionary = smiles_df.set_index('ID')['SMILES'].to_dict()
Or:
dictionary = dict(zip(smiles_df['ID'], smiles_df['SMILES']))
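To see the difference side by side, here is a minimal sketch. The small smiles_df and df below are made-up stand-ins for the real data (the strings are placeholders, not taken from the question), so treat it as an illustration rather than the exact setup:

```python
import pandas as pd

# Hypothetical toy data standing in for the question's smiles_df and df.
smiles_df = pd.DataFrame({
    "ID": ["OCC", "C1=CC=CC=C1"],
    "SMILES": ["CCO", "c1ccccc1"],
})
df = pd.DataFrame({"SMILES": ["C1=CC=CC=C1", "OCC", "CC"]})

# Original approach: transposing the whole frame gives one inner dict per ID.
nested = smiles_df.set_index("ID").T.to_dict()
# nested["OCC"] is {'SMILES': 'CCO'}, which is what map() was returning.

# Suggested approach: select the column first, so each ID maps to a plain string.
flat = smiles_df.set_index("ID")["SMILES"].to_dict()
# flat["OCC"] is 'CCO'

# Keys missing from the dictionary (here "CC") become NaN in the new column.
df["processed_SMILES"] = df["SMILES"].map(flat)
print(df)
```

If some values in df['SMILES'] have no entry in the dictionary, the new column holds NaN for those rows, so it may be worth checking df['processed_SMILES'].isna().sum() after mapping.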