How do I recode a character variable containing certain letters?

Time:04-29

I have a dataframe in which I have merged rows according to certain variables. This has worked well, but for some character variables the values were concatenated during the merge and are now duplicated.

Each value should be either "Con" or "Lab", but the merged rows now show values such as "ConCon" or "LabLabLab".

How do I recode these values? Ideally I need a command that turns any value containing "Lab" (e.g. "LabLabLabLab") into "Lab", and likewise for "Con".

Any input would be greatly appreciated. Thank you!

CodePudding user response:

In R, gsub() with a capture group keeps only the leading "Con" or "Lab" and drops whatever follows it:

df <- data.frame(id = 1:5, party = c("Con", "ConCon", "LabLabLab", "LabLabLabLab", "ConConCon"))
df$party <- gsub("^(Con|Lab).*", "\\1", df$party)
df
##   id party
## 1  1   Con
## 2  2   Con
## 3  3   Lab
## 4  4   Lab
## 5  5   Con

CodePudding user response:

Assuming a mixed value such as "LabCon" cannot occur, you can do this in Python:

legal_words = ["Con", "Lab"]
to_change_words = ["Con", "ConCon", "LabLabLab", "LabLab", "Lab"]

# Replace each entry with the first legal word it contains
for i, word in enumerate(to_change_words):
    for legal in legal_words:
        if legal in word:
            to_change_words[i] = legal
            break  # no need to check the remaining legal words

print(to_change_words)

This outputs:

['Con', 'Con', 'Lab', 'Lab', 'Lab']
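
If mixed values such as "LabCon" might show up after all, a stricter variant could collapse only values that are a single label repeated and leave everything else untouched so it can be inspected later. This is a minimal sketch using the standard re module; the helper name collapse_repeats is hypothetical, and the pattern assumes "Con" and "Lab" are the only legal labels.

```python
import re

# Assumption: the only legal labels are "Con" and "Lab".
# The backreference \1 matches further copies of whichever label was captured.
REPEATED_LABEL = re.compile(r"^(Con|Lab)\1*$")

def collapse_repeats(value):
    """Return the single label if value is one label repeated; otherwise return value unchanged."""
    return REPEATED_LABEL.sub(r"\1", value)

values = ["Con", "ConCon", "LabLabLab", "LabCon"]
print([collapse_repeats(v) for v in values])
# → ['Con', 'Con', 'Lab', 'LabCon']
```

Because "LabCon" does not match the pattern, it is returned as-is rather than being silently recoded to one of the two parties.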