Hi, I'm new here and also quite new to R. It would be great if anyone could help me. I'm trying to write a for loop to get the desired output, but I'm struggling a little at the moment.
Let's suppose I have the table below:
> d
names variables value
1 colour c(red, blue) 10
2 colour c(yellow, blue) 32
3 colour c(green, red, pink) 81
4 colour c(pink, purple) 14
5 shape c(circle, triangle) 5
6 shape c(rectangle) 31
7 .....
What I'm trying to do is create a for loop that goes over the variables for each name. If the target variable for a name exists in a row, then set the original value to 0 and add a duplicated row whose value is the negative of the original value.
As an example, let's say our target variable for colour is 'red'. What I want the output to look like is:
> d1
names variables value
1 colour c(red, blue) 0
2 colour c(yellow, blue) 32
3 colour c(green, red, pink) 0
4 colour c(pink, purple) 14
5 colour1 c(red, blue) -10
6 colour2 c(green, red, pink) -81
7 shape c(circle, triangle) 5
8 shape c(rectangle) 31
7 .....
I hope I'm making sense. Any help or comments would be appreciated.
Thanks!!
CodePudding user response:
You can do this with `tidyverse`:
- First we use `str_detect` from the `stringr` package (it is in tidyverse) to identify those rows with `red` in `variables`.
- Then we add 1 and 2 (= `row_number`) to `names` and multiply `value` by -1 to get the negative values.
- We use `bind_rows` from the `dplyr` package (it is in tidyverse) to bind to the original dataframe.
- Then we use an `ifelse` statement to set `value` to 0 in the rows containing `red` (assuming the original values are > 0).
- We use `as_tibble()` to remove the rownames and finally `arrange`.
library(tidyverse)
df %>%
filter(str_detect(variables, "red")) %>%
mutate(names = paste0(names, row_number()),
value = value*-1) %>%
bind_rows(df) %>%
mutate(value = ifelse(str_detect(variables, "red") &
value > 0, 0, value)) %>%
as_tibble() %>%
arrange(names)
names variables value
<chr> <chr> <dbl>
1 colour c(red, blue) 0
2 colour c(yellow, blue) 32
3 colour c(green, red, pink) 0
4 colour c(pink, purple) 14
5 colour1 c(red, blue) -10
6 colour2 c(green, red, pink) -81
7 shape c(circle, triangle) 5
8 shape c(rectangle) 31
# Data used above:
df <- structure(list(names = c("colour", "colour", "colour", "colour",
"shape", "shape"), variables = c("c(red, blue)", "c(yellow, blue)",
"c(green, red, pink)", "c(pink, purple)", "c(circle, triangle)",
"c(rectangle)"), value = c(10L, 32L, 81L, 14L, 5L, 31L)), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6"))
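The `ifelse` step above relies on the original values being positive, and `str_detect(variables, "red")` is a substring match. A minimal sketch of a variant that avoids both assumptions is given here: it flags the matching rows once in a helper column and uses a word-boundary pattern. The column name `hit` and the objects `flagged`, `negated` and `result` are made up for illustration and are not part of the answer above.

```r
library(dplyr)
library(stringr)

# Same sample data as in the answer above
df <- data.frame(
  names = c("colour", "colour", "colour", "colour", "shape", "shape"),
  variables = c("c(red, blue)", "c(yellow, blue)", "c(green, red, pink)",
                "c(pink, purple)", "c(circle, triangle)", "c(rectangle)"),
  value = c(10L, 32L, 81L, 14L, 5L, 31L)
)

# Flag the target rows once; \\b limits the match to the whole word "red"
# (`hit` is a hypothetical helper column)
flagged <- df %>%
  mutate(hit = str_detect(variables, "\\bred\\b"))

# Copies of the flagged rows: numbered names, negated values
negated <- flagged %>%
  filter(hit) %>%
  mutate(names = paste0(names, row_number()),
         value = -value)

# Zero the flagged originals, append the copies, drop the helper column
result <- flagged %>%
  mutate(value = if_else(hit, 0L, value)) %>%
  bind_rows(negated) %>%
  select(-hit) %>%
  arrange(names)

result
```

Because the rows are flagged before anything is negated, this version does not need the `value > 0` condition.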
CodePudding user response:
Here is a solution with base R only, using TarJae's construction of the data, for a dataframe `x`:
# Create a vector with the row indices that contain the target string:
target <- grep( 'red', x[ , 2 ] )
# Put these rows away in a separate data.frame:
stash <- x[ target, ]
# Set the value of the original rows to `0`
x[ target, 3 ] <- 0
# Set the values in the separated rows to their negative value
stash[ , 3 ] <- stash[ , 3 ] * -1
# Modify `names` as desired:
# (seq_len() also copes with the case where no row matched)
for( i in seq_len( nrow( stash ) ) )
stash[ i, 1 ] <- paste( stash[ i, 1 ], i, sep = "" )
# Insert the modified data (supposes that the dataframe is expected
# to be sorted on the first column):
x <- rbind( x, stash )
x <- x[ order( x[ , 1 ] ), ]
That gives you
> x
names variables value
1 colour c(red, blue) 0
2 colour c(yellow, blue) 32
3 colour c(green, red, pink) 0
4 colour c(pink, purple) 14
11 colour1 c(red, blue) -10
31 colour2 c(green, red, pink) -81
5 shape c(circle, triangle) 5
6 shape c(rectangle) 31
This works as well if you replace `'red'` with `'blue'` or with `'rectangle'`. For the extended question in your comment: one could create a vector with the targets (such as `c( "red", "triangle" )`), then loop over the code above, writing each output into a list. More sample data and information about your desired output would be needed, though.
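A rough sketch of that loop, under the assumption that each target is applied to the original data independently and that one data frame per target is wanted. The helper name `apply_target` and the list `results` are hypothetical; if the targets should instead accumulate in a single data frame, the loop would have to feed each output into the next step.

```r
# Same sample data as above
x <- data.frame(
  names = c("colour", "colour", "colour", "colour", "shape", "shape"),
  variables = c("c(red, blue)", "c(yellow, blue)", "c(green, red, pink)",
                "c(pink, purple)", "c(circle, triangle)", "c(rectangle)"),
  value = c(10L, 32L, 81L, 14L, 5L, 31L),
  stringsAsFactors = FALSE
)

# Hypothetical helper wrapping the steps above for a single target string
apply_target <- function(x, target) {
  # fixed = TRUE treats the target as a literal substring, not a regex
  hits  <- grep(target, x$variables, fixed = TRUE)
  stash <- x[hits, ]                       # rows to duplicate
  x$value[hits] <- 0                       # zero the originals
  stash$value   <- -stash$value            # negate the copies
  stash$names   <- paste0(stash$names, seq_along(hits))
  out <- rbind(x, stash)
  out[order(out$names), ]                  # sort on the first column
}

targets <- c("red", "triangle")

# One output per target, stored in a named list
results <- vector("list", length(targets))
names(results) <- targets
for (target in targets) {
  results[[target]] <- apply_target(x, target)
}

results[["triangle"]]
```

A target that matches no row simply returns the data unchanged, since `seq_along(hits)` is empty in that case.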