Separate row in dataframe from format with four numbers to two

Time:07-25

I have been trying to solve this for a while without success. I would like to transform the first data frame below into the second one. Each four-digit code such as "0176" should become "01", which means I want to drop the last two digits.

#1
Code    Region   Party
0176      US       M
0176      US       A
0176      US       L
0176      US       T
0176      US       S
#With 8 000 more rows
#2
Code    Region   Party
01        US       M
01        US       A
01        US       L
01        US       T
01        US       S
#With 8 000 more rows

I believe separate(Code) might be part of the solution. It is worth mentioning that there are more regions than just US.

CodePudding user response:

You could use the following code to remove the last two characters from every element:

df$Code <- substr(df$Code,1,nchar(df$Code)-2)

Output:

  Code Region Party
1   01     US     M
2   01     US     A
3   01     US     L
4   01     US     T
5   01     US     S

CodePudding user response:

Using the base R gsub function, you can remove the last two digits:

df$Code <- gsub("\\d{2}$" , "" , df$Code)
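The regex approach differs slightly from substr(): the pattern only matches when the last two characters are digits, so other values are left untouched. A small sketch of that difference follows; the codes vector, including the non-numeric "01AB", is made up for illustration. Because the pattern is anchored with $, it can match at most once per string, so sub() gives the same result as gsub() here.

```r
# Hypothetical codes; the last one does not end in two digits
codes <- c("0176", "1280", "01AB")

# Regex: strips the trailing two characters only if both are digits
by_regex <- sub("\\d{2}$", "", codes)

# substr: strips the trailing two characters unconditionally
by_substr <- substr(codes, 1, nchar(codes) - 2)

# gsub behaves identically for this anchored pattern
by_gsub <- gsub("\\d{2}$", "", codes)
```

Which behaviour is preferable depends on whether unexpected code formats should be trimmed anyway or left as they are for later inspection.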

CodePudding user response:

Using {tidyverse} packages (dplyr for mutate and stringr for str_sub), you can do this with mutate(df1, Code = str_sub(Code, 1, 2)), which keeps the first two characters of each code.
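Written out in full, the tidyverse version might look like the sketch below. It assumes dplyr and stringr are installed; df1 is a made-up two-row sample, and df2 and df3 are names chosen for illustration. The second call uses a negative end position, which counts from the end of the string, so it drops the last two characters even if the codes are not all four characters long.

```r
library(dplyr)
library(stringr)

# Hypothetical sample of the question's data
df1 <- data.frame(
  Code   = c("0176", "0176"),
  Region = "US",
  Party  = c("M", "A"),
  stringsAsFactors = FALSE
)

# Keep the first two characters (matches the answer above)
df2 <- mutate(df1, Code = str_sub(Code, 1, 2))

# Alternative: keep everything up to the third-last character,
# i.e. drop the last two characters whatever the string length
df3 <- mutate(df1, Code = str_sub(Code, 1, -3))
```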

Tags: r