I have table with two columns:
Index | Col_1 | Key |
---|---|---|
1 | 3,5 | 1001 |
2 | 4,5,6 | 2004 |
3 | 7845 | |
4 | 1,6 | 9658 |
5 | 2,3 | 5236 |
6 | 6066 |
Need to check Col_1 and save the Key for each index number mentioned in the Col_1.
For Example: in Index 1, Col_1 has values 3,5 so need to save the Key of Index 3,5 in Key_Checked column. The Key saved will be 7845,5236
The output table should be:
Index | Col_1 | Key | Key_Checked |
---|---|---|---|
1 | 3,5 | 1001 | 7845,5236 |
2 | 4,5,6 | 2004 | 9658,5236,6066 |
3 | 7845 | ||
4 | 1,6 | 9658 | 1001 |
5 | 2,3 | 5236 | 2004,7845 |
6 | 6066 |
CodePudding user response:
You can use:
# ensure we have strings everywhere
df = df.astype(str)
df['Key_Checked'] = (df['Col_1'].str.split(',').explode()
.map(df.set_index('Index')['Key']).dropna()
.groupby(level=0).agg(','.join)
.reindex(df.index, fill_value='')
)
output:
Index Col_1 Key Key_Checked
0 1 3,5 1001 7845,5236
1 2 4,5,6 2004 9658,5236,6066
2 3 7845
3 4 1,6 9658 1001,6066
4 5 2,3 5236 2004,7845
5 6 6066
CodePudding user response:
In the case if "Index" starts from 1 and has the type of "int64".
import pandas as pd
import numpy as np
df = pd.DataFrame({"Col_1": ["3,5", "4,5,6", np.NAN, "1,6", "2,3", np.NAN], "Key": ["1001", "2004", "7845", "9658", "5236", "6066"]}, index=range(1, 7))
df.fillna("", inplace=True)
df["Key_Checked"] = df.apply(lambda x: "" if not x["Col_1"] else ",".join(df.loc[[int(i) for i in x["Col_1"].split(",")], "Key"]), axis=1)