I have run a survey and would like to remove all answers from people who answered "no" to being a parent.
The dataset is called "tillid".
The variable is called "Er du forældre" ("Are you a parent") and the answer is either "Ja" (yes) or "Nej" (no).
CodePudding user response:
Question
Welcome to SO. You should read the "How do I ask a good question" advice shared by @user2974951 to learn how to ask questions in a way that helps the community respond.
From the [r] tag guidance:
Please use minimal reproducible example(s) others can run using copy & paste. Show desired output. Use dput() for data & specify all non-base packages with library(). Don't embed pictures for data or code, use indented code blocks instead.
In this case you have a data.frame that looks something like this:
> Tillid
  Id ... Er du forældre
1  1                 Ja
2  2                Nej
3  3
4  4                Nej
5  5                 Ja
...
To create a minimal reproducible example use dput on a subset of rows and columns:
> dput(Tillid[1:5, c('Id', 'Er du forældre')])
structure(list(Id = 1:5, `Er du forældre` = c("Ja", "Nej", "",
"Nej", "Ja")), class = "data.frame", row.names = c(NA, -5L))
Anyone can copy this line of code and create a dataset that looks like yours.
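As a sketch of how a reader would use that output: the dput() result can be pasted directly into an assignment. The name `Tillid` follows the example above, and the data are only the five illustrative rows, not the real survey.

```r
# Recreate the example data by assigning the dput() output to a name
Tillid <- structure(list(Id = 1:5, `Er du forældre` = c("Ja", "Nej", "",
"Nej", "Ja")), class = "data.frame", row.names = c(NA, -5L))

# Check that the structure matches what was described
str(Tillid)
```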
Next you want to show what you are trying to achieve (based on the minimal example):
> <insert code here>
  Id ... Er du forældre
1  1                 Ja
3  3
5  5                 Ja
Answer
In this case, the code to remove rows with the value 'Nej' is as follows (you might want to assign the result to a new variable using <-):
> Tillid[Tillid$`Er du forældre` != 'Nej', ]
  Id Er du forældre
1  1             Ja
3  3
5  5             Ja
If you also want to exclude missing answers (of which you have 114), you could slice to only those rows with the value "Ja":
> Tillid[Tillid$`Er du forældre` == 'Ja', ]
  Id Er du forældre
1  1             Ja
5  5             Ja
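A minimal sketch of storing the filtered rows and checking the counts, assuming the five example rows from the question. The name `parents_only` is hypothetical, and table() is used only to show how many answers of each kind exist before filtering.

```r
# Example data; check.names = FALSE keeps the column name with its space
Tillid <- data.frame(Id = 1:5,
                     `Er du forældre` = c("Ja", "Nej", "", "Nej", "Ja"),
                     check.names = FALSE, stringsAsFactors = FALSE)

# Count each answer, including empty strings, before filtering
table(Tillid$`Er du forældre`)

# Keep only the rows answered "Ja" and store them under a new name
parents_only <- Tillid[Tillid$`Er du forældre` == "Ja", ]
nrow(parents_only)
```

Keeping the original data.frame untouched makes it easy to compare row counts before and after the filter.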
As mentioned by @mat in his answer, it's good practice to avoid special characters, and spaces, in column names.
CodePudding user response:
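tillid <- tillid[tillid$`Er du forældre` != "Nej", ]
The != operator means "not equal to". The comma before the closing bracket is needed so that rows, not columns, are selected.
Alternatively:
tillid <- tillid[tillid$`Er du forældre` == "Ja", ]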
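Note that if you have missing answers stored as empty strings (""), the first alternative will preserve them, whereas the second option will exclude everything that is not equal to Ja. If the missing answers are true NA values, both comparisons return NA for those rows and the subset will contain rows filled with NA; wrapping the condition in which() drops them.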
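To illustrate the point about missing values, here is a sketch that assumes the missing answer is a true NA rather than an empty string. The data and the names `without_nej` and `only_ja` are hypothetical.

```r
# Hypothetical data where the third answer is a true NA
tillid <- data.frame(Id = 1:5,
                     `Er du forældre` = c("Ja", "Nej", NA, "Nej", "Ja"),
                     check.names = FALSE, stringsAsFactors = FALSE)
answer <- tillid$`Er du forældre`

# Drop only "Nej": keep missing answers explicitly with is.na()
without_nej <- tillid[is.na(answer) | answer != "Nej", ]

# Keep only "Ja": which() returns the positions where the test is TRUE,
# so rows where the comparison is NA are left out
only_ja <- tillid[which(answer == "Ja"), ]
```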
I would suggest avoiding special characters (e.g., æ) and spaces in your variable names, as they force you to wrap the name in backticks and can cause encoding problems in R.
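One possible way to follow that advice is to rename the column once and filter on the new name. This is a sketch on the example rows; the replacement name `is_parent` is only an illustration.

```r
# Example data with the original column name
tillid <- data.frame(Id = 1:5,
                     `Er du forældre` = c("Ja", "Nej", "", "Nej", "Ja"),
                     check.names = FALSE, stringsAsFactors = FALSE)

# Rename the column to a plain ASCII name without spaces
names(tillid)[names(tillid) == "Er du forældre"] <- "is_parent"

# The filter no longer needs backticks
tillid <- tillid[tillid$is_parent != "Nej", ]
```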