I have a data set that looks like this
ID a b c d source file
1 3 4 7 23 feb2010.txt
2 2 1 2 47 feb2010.txt
1 3 4 7 26 march2010.txt
2 2 1 2 33 march2010.txt
1 3 4 7 28 april2010.txt
2 2 1 2 32 april2010.txt
I'd like the column names to read
ID a b c Feb10 March10 April10
1 3 4 7 23 26 28
2 2 1 2 47 33 32
My actual data set has more than just 2 unique ids. It has thousands of unique ids. Any help as to how to change this would be very much appreciated as most of the code I have tried hasn't worked yet.
Thank you!
CodePudding user response:
You can use pivot_wider()
from the tidyverse.
library(dplyr)
library(tidyr)
library(lubridate)
df %>%
mutate(source_file = tools::file_path_sans_ext(source_file),
source_file = format(my(source_file), format = "%b%y")) %>%
pivot_wider(names_from = "source_file", values_from = "d")
Which gives you the following:
# A tibble: 2 x 7
ID a b c Feb10 Mar10 Apr10
<int> <int> <int> <int> <int> <int> <int>
1 1 3 4 7 23 26 28
2 2 2 1 2 47 33 32
Data:
df <- read.table(textConnection("ID a b c d source_file
1 3 4 7 23 feb2010.txt
2 2 1 2 47 feb2010.txt
1 3 4 7 26 march2010.txt
2 2 1 2 33 march2010.txt
1 3 4 7 28 april2010.txt
2 2 1 2 32 april2010.txt"), header = TRUE)
CodePudding user response:
Using dcast
from data.table
library(data.table)
dcast(setDT(df1), ID a b c ~ trimws(source_file,
whitespace = "\\.txt"), value.var = "d")
Key: <ID, a, b, c>
ID a b c april2010 feb2010 march2010
<int> <int> <int> <int> <int> <int> <int>
1: 1 3 4 7 28 23 26
2: 2 2 1 2 32 47 33