I'm working on a large data frame of original voting ballots. For each ballot, the data frame contains a unique ID and the last names of the candidates. I would like to create a dummy variable that indicates whether the names are in alphabetical order.
I have already coded a new variable that holds the initial of the last name (Alfonso = A) and a second variable that holds the position of that initial in the alphabet (A = 1).
Does anyone have an idea how to write a function that checks, for each ballot, whether the names are in alphabetical order and then returns 1 = alphabetical and 0 = not? I don't want to sort the names on the ballots, only to check whether they are in alphabetical order.
CodePudding user response:
If you take the names column as a vector (df$names), you can test whether it is unsorted, and therefore whether it is not (!) unsorted, as
!is.unsorted(df$names)
The is.unsorted() function asks exactly that question of the vector it receives, and the ! operator reverses the logic. The result is TRUE or FALSE; you can convert that to 1 or 0 if you really want with as.numeric(!is.unsorted(df$names)).
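Note that the call above checks the whole column at once. To get one value per ballot, the same check can be applied within each ballot ID. Here is a minimal sketch, assuming the data is in long format with one row per candidate, in ballot order; the column name ballot_id and the example data are made up for illustration:

```r
# Hypothetical example data: one row per candidate, listed in ballot order.
df <- data.frame(
  ballot_id = c(1, 1, 1, 2, 2, 2),
  names     = c("Alfonso", "Berger", "Meier", "Meier", "Alfonso", "Berger"),
  stringsAsFactors = FALSE
)

# Run the check separately on the names of each ballot.
alphabetical <- tapply(df$names, df$ballot_id, function(x) !is.unsorted(x))

# One row per ballot, with the dummy coded as 1 (alphabetical) or 0 (not).
result <- data.frame(
  ballot_id    = names(alphabetical),
  alphabetical = as.integer(alphabetical)
)
```

Nothing is reordered here; is.unsorted() only inspects the vector. String comparison in R generally follows the collation rules of the current locale, so results for accented or mixed-case names may differ between machines.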
Edit: I reread the question and you didn't need this bit, but I'll leave it here anyway, it may be useful to someone some day.
You can get the position in the alphabet of the first letter of each name with
match(toupper(substring(df$names, 1, 1)), LETTERS)
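As a small illustration of what that line returns, here is a sketch on a made-up vector (names_vec is hypothetical). The resulting numbers can be passed to is.unsorted() as well, with the caveat that comparing initials alone cannot tell apart names that share a first letter:

```r
# Hypothetical names; toupper() makes the lookup case-insensitive.
names_vec <- c("Alfonso", "berger", "Meier")

# Position of each initial in the alphabet (A = 1, B = 2, ...).
pos <- match(toupper(substring(names_vec, 1, 1)), LETTERS)

# TRUE if the initials never decrease from one name to the next.
initials_in_order <- !is.unsorted(pos)
```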
CodePudding user response:
I don't see any useful solution. If you don't want to sort the names, you basically don't want to check any condition between them. You can write a function or a loop that checks whether each value in a group is "<" the next one, but this is a kind of sorting anyway.
Please explain what you mean by:
I don't want to sort the names on the ballots, only to check for the alphabetical order
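For what it's worth, the pairwise comparison mentioned above can be written without reordering anything. This is only a sketch; the function name is_alphabetical is made up, and it uses "<=" rather than "<" so that repeated names still count as ordered:

```r
# Returns TRUE if every name is <= the name that follows it.
is_alphabetical <- function(x) {
  n <- length(x)
  if (n < 2) return(TRUE)   # zero or one name is trivially in order
  all(x[-n] <= x[-1])       # compare each element with its successor
}

ordered_ballot   <- is_alphabetical(c("Alfonso", "Berger", "Meier"))
unordered_ballot <- is_alphabetical(c("Meier", "Alfonso", "Berger"))
```

The input vector is only read, never rearranged, so this checks the order without sorting.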