There are some mistakes in the Yahoo finance historical price for the Brazilian stock "PFRM3.SA". I was trying to conditionally change a value using dplyr
. I managed to do it using case_when()
but I couldn't make it work using replace()
, which I think should be a more efficient solution.
library(tidyverse)
library(tidyquant)
price <- tq_get("PFRM3.SA", from = "2020-01-01", get = "stock.prices") #Data
#visual inspection
price %>%
ggplot(aes(x = date, y = close))
geom_line()
#the decimal digit was misplaced a few times
#this works fine
price %>%
mutate(close = case_when(close > 100 ~ close/100, TRUE ~ close))
#However I don't understand why this do not work
price %>%
mutate(close = replace(close, close > 100, close/100))
The last line of code generates this error:
Warning messages:
1: Problem with `mutate()` input `close`.
ℹ number of items to replace is not a multiple of replacement length
ℹ Input `close` is `replace(close, close > 100, close/100)`.
2: In x[list] <- values :
number of items to replace is not a multiple of replacement length
I don't understand why replace()
doesn't return the original close value when the condition isn't met. Also, I don't know how to make it work. Thanks
CodePudding user response:
- The
replace()
function in R syntax is very simple and easy to implement. - It includes the vector, index vector, and the replacement values as well as shown below.
replace(x, list, values)
x = vector having some values -> close
list = this can be an index vector -> close > 100
Values = the replacement values -> close[close>100]/100
- Explanation:
You are conditionally indexing the close
column, therefore in the Values
argument of replace
(= the replacement values) you have to conditionally index the replacement value; here close/100
with close[close>100]/100
:
price %>%
mutate(close = replace(close, close > 100, close[close>100]/100)) %>%
data.frame()
symbol date open high low close volume adjusted
1 PFRM3.SA 2020-01-02 6.59 6.67 6.49 6.60 549100 6.263065
2 PFRM3.SA 2020-01-03 6.50 6.99 6.40 6.96 1202000 6.604687
3 PFRM3.SA 2020-01-06 7.07 7.71 7.07 7.69 2692600 7.297420
4 PFRM3.SA 2020-01-07 7.42 7.50 7.05 7.05 2325900 6.690092
5 PFRM3.SA 2020-01-08 7.01 7.28 6.80 7.13 1588900 6.766008
6 PFRM3.SA 2020-01-09 7.21 7.22 6.98 7.12 638800 6.756518
7 PFRM3.SA 2020-01-10 7.12 7.23 6.96 6.96 658000 6.604687
8 PFRM3.SA 2020-01-13 7.01 7.17 6.96 7.10 586700 6.737539
9 PFRM3.SA 2020-01-14 7.12 7.15 6.95 7.00 663700 6.642644
10 PFRM3.SA 2020-01-15 7.02 7.39 7.02 7.27 1173000 6.898861
11 PFRM3.SA 2020-01-16 7.35 7.45 7.09 7.16 816300 6.794476
12 PFRM3.SA 2020-01-17 7.24 7.29 7.07 7.17 669900 6.803966
13 PFRM3.SA 2020-01-20 7.20 7.29 7.10 7.23 548300 6.860903
14 PFRM3.SA 2020-01-21 7.23 7.40 7.13 7.17 722500 6.803966
15 PFRM3.SA 2020-01-22 7.23 7.29 6.88 7.09 697400 6.728050
16 PFRM3.SA 2020-01-23 7.09 7.15 6.95 7.04 670500 6.680603
17 PFRM3.SA 2020-01-24 7.07 7.29 7.02 7.15 552000 6.784987
18 PFRM3.SA 2020-01-27 7.08 7.08 6.70 6.95 681300 6.595197
19 PFRM3.SA 2020-01-28 6.94 7.06 6.61 6.90 703200 6.547750
20 PFRM3.SA 2020-01-29 6.92 7.03 6.72 6.72 406900 6.376938
21 PFRM3.SA 2020-01-30 6.60 6.65 6.40 6.64 786600 6.301023
22 PFRM3.SA 2020-01-31 6.51 6.83 6.49 6.49 379200 6.158680
23 PFRM3.SA 2020-02-03 6.45 6.69 6.37 6.66 491100 6.320002
24 PFRM3.SA 2020-02-04 6.71 6.90 6.68 6.90 388800 6.547750
25 PFRM3.SA 2020-02-05 6.85 6.94 6.64 6.81 499100 6.462344
26 PFRM3.SA 2020-02-06 6.89 6.89 6.72 6.83 302400 6.481323
27 PFRM3.SA 2020-02-07 6.66 6.82 6.57 6.60 280100 6.263065
28 PFRM3.SA 2020-02-10 6.60 6.60 6.23 6.30 537700 5.978380
29 PFRM3.SA 2020-02-11 6.35 6.52 6.26 6.50 455100 6.168170
30 PFRM3.SA 2020-02-12 6.51 6.57 6.37 6.45 361200 6.120722
31 PFRM3.SA 2020-02-13 6.35 6.38 6.18 6.30 338800 5.978380
32 PFRM3.SA 2020-02-14 6.29 6.35 6.12 6.26 366100 5.940423
33 PFRM3.SA 2020-02-17 6.32 6.44 6.20 6.20 582900 5.883485
34 PFRM3.SA 2020-02-18 6.20 6.50 6.15 6.48 341200 6.149191
35 PFRM3.SA 2020-02-19 6.50 6.83 6.40 6.67 722800 6.329491
36 PFRM3.SA 2020-02-20 6.70 6.78 6.62 6.66 206200 6.320002
37 PFRM3.SA 2020-02-21 6.61 6.62 6.41 6.52 261500 6.187149
38 PFRM3.SA 2020-02-26 6.24 6.24 5.89 5.99 514500 5.684206
39 PFRM3.SA 2020-02-27 5.85 6.04 5.61 5.75 654500 5.456458
40 PFRM3.SA 2020-02-28 5.70 5.76 5.51 5.76 538900 5.465948
41 PFRM3.SA 2020-03-02 5.82 6.10 5.71 6.05 294000 5.741143
42 PFRM3.SA 2020-03-03 6.16 6.25 5.92 6.05 415000 5.741143
43 PFRM3.SA 2020-03-04 6.09 6.16 5.90 6.05 465100 5.741143
44 PFRM3.SA 2020-03-05 6.05 6.06 5.60 5.75 415300 5.456458
45 PFRM3.SA 2020-03-06 5.46 5.60 5.29 5.60 609400 5.314116
46 PFRM3.SA 2020-03-09 4.89 5.16 4.63 4.75 920400 4.507509
47 PFRM3.SA 2020-03-10 5.12 5.32 4.77 5.32 1060000 5.048410
48 PFRM3.SA 2020-03-11 5.28 5.41 4.69 5.21 870500 4.944026
49 PFRM3.SA 2020-03-12 4.50 4.52 4.02 4.02 515900 3.814776
50 PFRM3.SA 2020-03-13 4.50 4.68 4.10 4.49 706000 4.260782
51 PFRM3.SA 2020-03-16 4.01 4.65 3.90 4.35 564900 4.127929
52 PFRM3.SA 2020-03-17 4.50 4.52 3.96 4.24 672800 4.023544
53 PFRM3.SA 2020-03-18 3.80 3.95 3.29 3.50 870200 3.321322
54 PFRM3.SA 2020-03-19 3.20 3.58 2.89 3.58 1378800 3.397238
55 PFRM3.SA 2020-03-20 3.67 3.96 3.22 3.22 1539200 3.055617
56 PFRM3.SA 2020-03-23 3.23 3.34 2.87 3.10 1183900 2.941742
57 PFRM3.SA 2020-03-24 3.22 3.42 3.13 3.13 1064000 2.970211
58 PFRM3.SA 2020-03-25 3.21 3.66 3.15 3.52 890600 3.340301
59 PFRM3.SA 2020-03-26 3.59 4.30 3.50 4.18 1004100 3.966608
60 PFRM3.SA 2020-03-27 3.90 4.33 3.90 4.25 572100 4.033034
61 PFRM3.SA 2020-03-30 4.26 4.68 4.17 4.27 771200 4.052013
62 PFRM3.SA 2020-03-31 4.20 4.38 4.07 4.30 371300 4.080482
63 PFRM3.SA 2020-04-01 4.11 4.25 3.98 4.11 411100 3.900181
64 PFRM3.SA 2020-04-02 415.00 425.00 402.00 4.18 282600 396.660767
65 PFRM3.SA 2020-04-03 410.00 420.00 383.00 4.00 351100 379.579681
66 PFRM3.SA 2020-04-06 421.00 423.00 395.00 3.99 639500 378.630737
67 PFRM3.SA 2020-04-07 418.00 434.00 397.00 4.05 455800 384.324432
68 PFRM3.SA 2020-04-08 410.00 430.00 400.00 4.28 443600 406.150269
69 PFRM3.SA 2020-04-09 431.00 447.00 424.00 4.29 417900 407.099213
70 PFRM3.SA 2020-04-13 429.00 445.00 423.00 4.39 287500 416.588715
71 PFRM3.SA 2020-04-14 450.00 455.00 442.00 4.43 273200 420.384491
72 PFRM3.SA 2020-04-15 432.00 452.00 421.00 4.44 278800 421.333466
73 PFRM3.SA 2020-04-16 4.50 4.53 4.26 4.34 530800 4.118440
74 PFRM3.SA 2020-04-17 4.54 4.78 4.44 4.64 908400 4.403124
75 PFRM3.SA 2020-04-20 4.46 4.76 4.43 4.60 447200 4.365166
76 PFRM3.SA 2020-04-22 4.60 5.04 4.57 4.89 788600 4.640362
77 PFRM3.SA 2020-04-23 4.97 5.17 4.63 4.77 576500 4.526488
78 PFRM3.SA 2020-04-24 4.62 4.72 4.22 4.49 703200 4.260782
79 PFRM3.SA 2020-04-27 4.61 4.65 4.35 4.63 296400 4.393635
80 PFRM3.SA 2020-04-28 4.65 4.82 4.41 4.58 583300 4.346187
81 PFRM3.SA 2020-04-29 4.62 4.75 4.50 4.74 391800 4.498019
82 PFRM3.SA 2020-04-30 4.75 4.79 4.47 4.59 288300 4.355677
83 PFRM3.SA 2020-05-04 4.41 4.63 4.33 4.63 288400 4.393635
84 PFRM3.SA 2020-05-05 4.63 4.76 4.54 4.70 389700 4.460061
85 PFRM3.SA 2020-05-06 4.69 5.26 4.57 5.22 1389800 4.953515
86 PFRM3.SA 2020-05-07 5.35 5.37 4.61 4.82 1877800 4.573936
87 PFRM3.SA 2020-05-08 4.93 4.98 4.50 4.72 1040300 4.479040
88 PFRM3.SA 2020-05-11 4.70 4.74 4.40 4.49 532600 4.260782
89 PFRM3.SA 2020-05-12 4.50 4.58 4.22 4.36 585100 4.137419
90 PFRM3.SA 2020-05-13 4.40 4.40 4.13 4.19 456100 3.976097
91 PFRM3.SA 2020-05-14 4.13 4.27 3.97 4.27 504600 4.052013
92 PFRM3.SA 2020-05-15 4.24 4.36 4.13 4.19 255500 3.976097
93 PFRM3.SA 2020-05-18 4.30 4.32 4.12 4.19 351000 3.976097
94 PFRM3.SA 2020-05-19 4.19 4.30 4.14 4.18 335200 3.966608
95 PFRM3.SA 2020-05-20 4.21 4.25 4.10 4.20 347500 3.985587
96 PFRM3.SA 2020-05-21 4.16 4.23 4.10 4.20 234300 3.985587
97 PFRM3.SA 2020-05-22 4.20 4.39 4.11 4.39 368500 4.165887
98 PFRM3.SA 2020-05-25 4.42 4.66 4.42 4.64 680200 4.403124
99 PFRM3.SA 2020-05-26 4.74 4.84 4.49 4.60 631300 4.365166
100 PFRM3.SA 2020-05-27 4.60 4.74 4.55 4.66 422400 4.422103
101 PFRM3.SA 2020-05-28 4.66 4.78 4.57 4.60 451200 4.365166
102 PFRM3.SA 2020-05-29 4.65 4.65 4.44 4.56 387400 4.327209
103 PFRM3.SA 2020-06-01 4.50 4.93 4.46 4.74 962300 4.498019
104 PFRM3.SA 2020-06-02 4.77 5.05 4.77 5.00 1063000 4.744746
105 PFRM3.SA 2020-06-03 5.05 5.31 4.86 4.86 924300 4.611893
106 PFRM3.SA 2020-06-04 4.85 5.17 4.71 4.86 768600 4.611893
107 PFRM3.SA 2020-06-05 4.94 5.12 4.85 4.85 661700 4.602404
108 PFRM3.SA 2020-06-08 4.95 5.86 4.95 5.79 2498400 5.494416
109 PFRM3.SA 2020-06-09 5.64 5.93 5.50 5.66 1288500 5.371052
110 PFRM3.SA 2020-06-10 5.76 5.81 5.35 5.35 1140600 5.076878
111 PFRM3.SA 2020-06-12 5.25 5.36 4.92 5.27 894600 5.000962
112 PFRM3.SA 2020-06-15 5.06 5.30 5.00 5.22 618700 4.953515
113 PFRM3.SA 2020-06-16 5.31 5.55 5.25 5.41 656000 5.133815
114 PFRM3.SA 2020-06-17 5.41 5.61 5.37 5.59 597600 5.304626
115 PFRM3.SA 2020-06-18 5.61 5.74 5.48 5.65 440600 5.361563
116 PFRM3.SA 2020-06-19 5.71 5.74 5.47 5.50 558500 5.219221
117 PFRM3.SA 2020-06-22 5.52 5.75 5.50 5.72 554600 5.427989
118 PFRM3.SA 2020-06-23 5.75 6.25 5.74 6.00 1652300 5.693695
119 PFRM3.SA 2020-06-24 6.00 6.15 5.77 5.99 830800 5.684206
120 PFRM3.SA 2020-06-25 5.99 6.04 5.79 5.94 827100 5.636758
121 PFRM3.SA 2020-06-26 5.99 6.03 5.59 5.59 767000 5.304626
122 PFRM3.SA 2020-06-29 5.69 5.74 5.33 5.58 739600 5.295136
123 PFRM3.SA 2020-06-30 5.60 5.70 5.45 5.65 457900 5.361563
124 PFRM3.SA 2020-07-01 5.70 5.75 5.49 5.53 634200 5.247689
125 PFRM3.SA 2020-07-02 5.61 5.64 5.41 5.42 448700 5.143305
[ reached 'max' / getOption("max.print") -- omitted 369 rows ]
benchmark:
CodePudding user response:
You just need to add in rowwise
before replace
. Essentially, in your replace
statement, close/100
is returning a new calculated value for every row. When in reality, you only want to replace the values greater than 100. So, with using rowwise
, then it will only calculate per row, rather than for the whole cost
column for the values
parameter.
library(tidyverse)
price %>%
rowwise %>%
mutate(close = replace(close, close > 100, close/100))
Output
# A tibble: 493 × 8
# Rowwise:
symbol date open high low close volume adjusted
<chr> <date> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 PFRM3.SA 2020-01-02 6.59 6.67 6.49 6.6 549100 6.26
2 PFRM3.SA 2020-01-03 6.5 6.99 6.4 6.96 1202000 6.60
3 PFRM3.SA 2020-01-06 7.07 7.71 7.07 7.69 2692600 7.30
4 PFRM3.SA 2020-01-07 7.42 7.5 7.05 7.05 2325900 6.69
5 PFRM3.SA 2020-01-08 7.01 7.28 6.8 7.13 1588900 6.77
6 PFRM3.SA 2020-01-09 7.21 7.22 6.98 7.12 638800 6.76
7 PFRM3.SA 2020-01-10 7.12 7.23 6.96 6.96 658000 6.60
8 PFRM3.SA 2020-01-13 7.01 7.17 6.96 7.1 586700 6.74
9 PFRM3.SA 2020-01-14 7.12 7.15 6.95 7 663700 6.64
10 PFRM3.SA 2020-01-15 7.02 7.39 7.02 7.27 1173000 6.90
# … with 483 more rows
Another option (provided by @DonaldSeinen) is to use transform
, which I've added to the Benchmark below.
transform(price, close = ifelse(close > 100, close / 100, close))
Benchmark
So, it looks like transform
(provided by @DonaldSeinen) is by far the fastest. @TarJae provides a nice visual of this!
library(microbenchmark)
microbenchmark(
replace = price %>%
rowwise %>%
mutate(close = replace(close, close > 100, close / 100)),
case_when = price %>%
mutate(close = case_when(close > 100 ~ close / 100, TRUE ~ close)),
times = 1000L,
control = list(order = 'block')
)
# Unit: microseconds
# expr min lq mean median uq max neval
# replace 4149.300 4997.8325 5785.2692 5469.816 5993.926 28077.15 1000
# case_when 1410.959 1514.8980 1906.9804 1662.030 2138.092 11863.49 1000
# DonaldSeinen 247.421 271.7735 410.3955 315.499 472.813 11546.28 1000
# TarJae 1536.112 1755.8475 2254.4749 2086.729 2560.107 12802.07 1000