How to make an if/else chain with lists more efficient

I have a huge dataframe with ~100,000 rows. I have this code but it is taking way too long to execute. Is there any way to make this more efficient?

df["Grade Band"] = ""
k5 = ["0","1","2","3","4","5"]
ms = ["6", "7" ,"8"]
hs = ["9","10","11","12"]

for x in df["Grade Roll"]:
    if x == "Other":
        df["Grade Band"] == "Undefined"
    elif x in k5:
        df["Grade Band"] == "K5"
    elif x in ms:
        df["Grade Band"] == "MS"
    elif x in hs:
        df["Grade Band"] == "HS"

CodePudding user response：

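This should be much faster, because each line is a single vectorized assignment instead of a Python-level loop over the rows. Note that it writes to "Grade Band" and leaves "Grade Roll" untouched:

df.loc[df["Grade Roll"] == "Other", "Grade Band"] = "Undefined"
df.loc[df["Grade Roll"].isin(k5), "Grade Band"] = "K5"
df.loc[df["Grade Roll"].isin(ms), "Grade Band"] = "MS"
df.loc[df["Grade Roll"].isin(hs), "Grade Band"] = "HS"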
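
Since the question is really an if/elif chain, another option is `numpy.select`, which takes a list of conditions and a matching list of results and, for each row, uses the first condition that holds. This is only a sketch and not part of the answer above: the sample frame and the "PK" value are made up for illustration, and `default=""` is an assumption about what unmatched rows should get.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real ~100,000-row frame.
df = pd.DataFrame({"Grade Roll": ["0", "7", "12", "Other", "5", "10", "PK"]})

k5 = ["0", "1", "2", "3", "4", "5"]
ms = ["6", "7", "8"]
hs = ["9", "10", "11", "12"]

roll = df["Grade Roll"]

# Conditions are checked in order, like if / elif / elif / elif.
conditions = [roll == "Other", roll.isin(k5), roll.isin(ms), roll.isin(hs)]
choices = ["Undefined", "K5", "MS", "HS"]

# Rows matching none of the conditions get the default (assumed to be "").
df["Grade Band"] = np.select(conditions, choices, default="")

print(df)
```

The whole column is computed in one call, so the chain reads top to bottom like the original loop but without iterating over rows in Python.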

If you want the .loc version to be less repetitive, you could store the lists in a dict keyed by band name:

d = {
    "K5": ["0","1","2","3","4","5"],
    "MS": ["6", "7" ,"8"],
    "HS": ["9","10","11","12"],
    "Undefined": ["Other"]
}

for k, v in d.items():
    df.loc[df["Grade Roll"].isin(v), "Grade Band"] = k
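
If the loop over the dict still feels like too many passes, one possible variation is to invert the dict of lists into a flat grade-to-band lookup and apply it once with `Series.map`. This is a sketch under assumptions: the name `grade_to_band`, the sample frame, and the choice to fill unmatched grades with "" are all illustrative, not something the answer specifies.

```python
import pandas as pd

# Hypothetical sample data; "PK" is deliberately not in any list.
df = pd.DataFrame({"Grade Roll": ["3", "8", "9", "Other", "PK"]})

d = {
    "K5": ["0", "1", "2", "3", "4", "5"],
    "MS": ["6", "7", "8"],
    "HS": ["9", "10", "11", "12"],
    "Undefined": ["Other"],
}

# Invert {band: [grades]} into {grade: band} so each row needs one lookup.
grade_to_band = {grade: band for band, grades in d.items() for grade in grades}

# Grades missing from the lookup come back as NaN; fall back to "" here.
df["Grade Band"] = df["Grade Roll"].map(grade_to_band).fillna("")

print(df)
```

This makes a single pass over the column no matter how many bands there are, whereas the loop above makes one `isin` pass per band.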

CodePudding user response：

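Use a map (a dict from each grade to its band), so every lookup is a single hash lookup instead of a scan through several lists.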

gradeMap = { "Other": "Undefined" }
gradeMap.update(dict.fromkeys(k5, "K5"))
gradeMap.update(dict.fromkeys(ms, "MS"))
gradeMap.update(dict.fromkeys(hs, "HS"))

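# Collect the mapped value for every row, then assign the column once.
gradeBands = []
for x in df["Grade Roll"]:
    if x in gradeMap:
        gradeBands.append(gradeMap[x])
    else:
        raise Exception('The grade ' + str(x) + ' was not found in the grade map.')
df["Grade Band"] = gradeBands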
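
The same map can also be applied without an explicit loop. The sketch below keeps the "fail on unknown grades" behaviour but uses `Series.map` and checks for unmapped values afterwards. The helper name `add_grade_band`, the use of `ValueError`, and the sample frames are assumptions made for illustration.

```python
import pandas as pd

k5 = ["0", "1", "2", "3", "4", "5"]
ms = ["6", "7", "8"]
hs = ["9", "10", "11", "12"]

grade_map = {"Other": "Undefined"}
grade_map.update(dict.fromkeys(k5, "K5"))
grade_map.update(dict.fromkeys(ms, "MS"))
grade_map.update(dict.fromkeys(hs, "HS"))


def add_grade_band(df, grade_map):
    """Return a copy of df with a "Grade Band" column; raise on unknown grades."""
    # map() returns NaN for any grade that is not a key of grade_map.
    bands = df["Grade Roll"].map(grade_map)
    unknown = df.loc[bands.isna(), "Grade Roll"].unique()
    if len(unknown) > 0:
        names = sorted(map(str, unknown))
        raise ValueError(f"Grades not found in the grade map: {names}")
    return df.assign(**{"Grade Band": bands})


# Hypothetical sample data.
good = add_grade_band(pd.DataFrame({"Grade Roll": ["1", "6", "11", "Other"]}), grade_map)
print(good)
```

Unlike the loop, this reports every unknown grade at once rather than stopping at the first one.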