Home > database >  How do you find all first letter capitalized words in a string? Like a name ("My Name and anoth
How do you find all first letter capitalized words in a string? Like a name ("My Name and anoth

Time:11-08

GOAL: I want to have a regex that searches a string for all capitalized first letters of words/names in a string. Then replace those capitalized words with "Whatever", if the word is 4 or more characters long. Then replace words 4 or more letters that are lowercased with "whatever".

String Input:

let myString = "This is a regular string.  How does this Work?  Does any Name know How?"

Desired String Output:

let myString = "Whatever is a whatever whatever. How whatever this Whatever? Whatever any Whatever whatever whatever How?"

What I've tried:

let myString = "This is a regular string.  How does this Work?  Does any Name know How?"

let x = myString.replacingOccurrences(of: "\\b\\p{Lu}{4,}\\b",
                                                    with: "Whatever",
                                                    options: .regularExpression)
let finalX = x.replacingOccurrences(of: "\\b\\p{L}{4,}\\b",
                                    with: "whatever",
                                    options: .regularExpression)
print(finalX)

Problem:

The first check is for capitalized letters, the second is for lowercased letters, but it still returns all lowercased.

Would anyone know how to go about this with what I have?

CodePudding user response:

The regular expression you want is the following:

\b[A-Z][a-z]{3,}\b

That will only work with the basic letters A-Z and a-z. If you want to fully support any alphabet then you would need something like this:

\b\p{Lu}\p{Ll}{3,}\b

The \b means "word boundary". The \p{Lu} means "uppercase letters". The \p{Ll} means "lowercase letters".

You only want to use this in the call to replacingOccurrences. There's little reason to do the initial check using contains. That will be looking for the literal text you pass in and of course the regular expression you are using won't be found as literal text in the string.

let myString = "This is a regular string.  How does this Work?  Does any Name know How?"
let x = myString.replacingOccurrences(of: "\\b\\p{Lu}\\p{Ll}{3,}\\b",
                                                    with: "Whatever",
                                                    options: .regularExpression)
print(x)

Output:

Whatever is a regular string. How does this Whatever? Whatever any Whatever know How?

To do both you also need "\\b\\p{Ll}{4,}\\b".

The following changes both sets.

let myString = "This is a regular string.  How does this Work?  Does any Name know How?"
let x = myString
    .replacingOccurrences(of: "\\b\\p{Ll}{4,}\\b",
                                                    with: "whatever",
                                                    options: .regularExpression)
    .replacingOccurrences(of: "\\b\\p{Lu}\\p{Ll}{3,}\\b",
                                                    with: "Whatever",
                                                    options: .regularExpression)
print(x)

Output:

Whatever is a whatever whatever. How whatever whatever Whatever? Whatever any Whatever whatever How?

  • Related