I am trying to create a variable that indicates the market sentiment for different date ranges in my data set. The date ranges (given as the breaks in the code below) should be labelled in a dedicated column, tripsentiment$sentiment.
My approach was the following code, since I did not manage to use ifelse for multiple categories and could not find a better solution:
tripsentiment$sentiment <- cut.POSIXt(tripsentiment$date,
breaks = as.POSIXct(as.Date(c("10-03-2016", "13-06-2016",
"16-07-2017", "01-09-2017",
"14-09-2017", "13-01-2018",
"01-04-2018", "05-05-2018",
"02-04-2019"))),
labels = c("BULL", "BEAR", "BULL", "BEAR", "BULL", "BEAR",
"BULL", "BEAR"))
ethamount date dollvalue id ispurchase dollarcum purchcost ROI sentiment
1 8.7796312554873e-5 2016-03-11 01:00:00 -0.0010491659350307322 883 1 0 1.049166e-03 0.0 <NA>
2 0.001 2016-03-18 01:00:00 -0.010740000000000001 36927 1 0 1.074000e-02 0.0 <NA>
3 75.4154 2016-03-25 01:00:00 -804.682318 2637 1 0 8.046823e+02 0.0 <NA>
4 0.10662867986198896 2016-05-02 02:00:00 -1.0662867986198896 72274 1 0 1.066287e+00 0.0 <NA>
5 0.01 2016-05-02 02:00:00 -0.1 94359 1 0.010899999999999993 1.000000e-01 10.9 <NA>
6 0.1 2016-05-04 02:00:00 -0.9460000000000002 3083 1 0 9.460000e-01 0.0 <NA>
However, the result is a column of NAs and I simply cannot figure out why.
The dput output looks as follows:
structure(list(ethamount = c("8.7796312554873e-5", "0.001", "75.4154",
"0.10662867986198896", "0.01", "0.1"), date = structure(c(1457654400,
1458259200, 1458864000, 1462147200, 1462147200, 1462320000), class = c("POSIXct",
"POSIXt")), dollvalue = c("-0.0010491659350307322", "-0.010740000000000001",
"-804.682318", "-1.0662867986198896", "-0.1", "-0.9460000000000002"
), id = c("883", "36927", "2637", "72274", "94359", "3083"),
ispurchase = c("1", "1", "1", "1", "1", "1"), dollarcum = c("0",
"0", "0", "0", "0.010899999999999993", "0"), purchcost = c(0.00104916593503073,
0.01074, 804.682318, 1.06628679861989, 0.1, 0.946), ROI = c(0,
0, 0, 0, 10.9, 0), sentiment = structure(c(NA_integer_, NA_integer_,
NA_integer_, NA_integer_, NA_integer_, NA_integer_), .Label = c("1",
"2", "3", "4", "5", "6", "7", "8"), class = "factor")), row.names = c(NA,
6L), class = "data.frame")
I assume it is some sort of formatting issue with as.Date, but I cannot quite figure it out, so I am asking for help here and would appreciate any hint. Thank you very much in advance.
CodePudding user response:
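as.Date requires a format argument when the input is not in one of the default formats (%Y-%m-%d or %Y/%m/%d). Without it, day-first strings such as "10-03-2016" are not parsed as intended, so the breaks do not cover the dates in the data and cut returns NA for every row. Here, the format is %d-%m-%Y: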
brks <- as.Date(c("10-03-2016", "13-06-2016",
"16-07-2017", "01-09-2017",
"14-09-2017", "13-01-2018",
"01-04-2018", "05-05-2018",
"02-04-2019"), format = '%d-%m-%Y')
-output
> brks
[1] "2016-03-10" "2016-06-13" "2017-07-16" "2017-09-01" "2017-09-14" "2018-01-13" "2018-04-01" "2018-05-05" "2019-04-02"
> cut(tripsentiment$date, breaks = as.POSIXct(brks), labels = c("BULL", "BEAR", "BULL", "BEAR", "BULL", "BEAR",
"BULL", "BEAR"))
[1] BULL BULL BULL BULL BULL BULL
Levels: BULL BEAR
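For completeness, here is a minimal end-to-end sketch of the same fix that can be run without the original data frame. The dates are the six values from the dput output in the question; the names raw, bad, good, labs and idx are only illustrative, and the UTC time zone is an assumption. Instead of passing repeated labels to cut, it asks for the integer interval codes (labels = FALSE) and looks the label up afterwards, which should avoid depending on how factor handles duplicated labels.

```r
# Dates copied from the dput output in the question (seconds since the epoch).
# tz = "UTC" is an assumption; adjust it to the time zone of your data.
dates <- as.POSIXct(c(1457654400, 1458259200, 1458864000,
                      1462147200, 1462147200, 1462320000),
                    origin = "1970-01-01", tz = "UTC")

raw <- c("10-03-2016", "13-06-2016", "16-07-2017", "01-09-2017",
         "14-09-2017", "13-01-2018", "01-04-2018", "05-05-2018",
         "02-04-2019")

# Without `format`, the day-first strings are not read as intended;
# print the result to see what the breaks actually were.
bad <- as.Date(raw)
print(bad)

# With an explicit day-month-year format the breaks are parsed correctly.
good <- as.Date(raw, format = "%d-%m-%Y")
print(good)

# One label per interval between consecutive breaks (9 breaks, 8 intervals).
labs <- c("BULL", "BEAR", "BULL", "BEAR", "BULL", "BEAR", "BULL", "BEAR")

# labels = FALSE makes cut() return the interval index instead of a factor.
idx <- cut(dates, breaks = as.POSIXct(good), labels = FALSE)

# Map each interval index to its label; dates outside the breaks stay NA.
sentiment <- factor(labs[idx], levels = c("BULL", "BEAR"))
print(sentiment)
```

With the real data this would be assigned as tripsentiment$sentiment <- factor(labs[idx], levels = c("BULL", "BEAR")), where idx is computed from tripsentiment$date.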