I'm starting a new project, and I have usually kept a constants class for symbols, like:
object Symbols {
    const val CHAR_COLON = ":"
    const val CHAR_COMMA = ","
    const val CHAR_SEMICOLON = ";"
    // etc.
}
It would be great if a class like this already existed in Java or Kotlin, but I can't seem to find one. Does anyone know of such a thing? Thanks.
CodePudding user response:
I don't think that is necessary.
CodePudding user response:
I'm not sure why you want to define constants for individual characters (just use the char, it's a constant!) but the Unicode standard defines a bunch of categories for characters, and you can access those through the Java Character class, which also has some methods for getting a character's type.
Kotlin gives you a CharCategory enum class which contains all these categories, and some functions that make it a little easier - for example you can do this:
println(CharCategory.OTHER_PUNCTUATION.contains(','))
>> true
But for example, '-' is not in the OTHER_PUNCTUATION category, aka category Po. That comes under DASH_PUNCTUATION, Pd. If you notice, all the punctuation category codes start with P, so you could do this kind of thing:
val punctuationCategories = CharCategory.values().filter { it.code.first() == 'P' }
val Char.isPunctuation: Boolean get() = punctuationCategories.any { it.contains(this) }
println('-'.isPunctuation)
>> true
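For what it's worth, a possibly shorter route is to ask the character for its own category through the Char.category extension property, instead of testing it against every punctuation category in turn. This is only a minimal sketch, assuming the Kotlin/JVM standard library; isPunctuationByCategory is a made-up name for illustration, not something the standard library provides:

```kotlin
// Hypothetical alternative to the isPunctuation property above: look up the
// character's own Unicode category, then check whether its code starts with 'P'.
val Char.isPunctuationByCategory: Boolean
    get() = category.code.startsWith('P')

fun main() {
    println(','.category)                 // OTHER_PUNCTUATION
    println('-'.category.code)            // Pd
    println('-'.isPunctuationByCategory)  // true
    println('a'.isPunctuationByCategory)  // false

    // The Java route: Character.getType returns an Int that matches
    // one of the Byte constants defined on java.lang.Character.
    println(Character.getType('-') == Character.DASH_PUNCTUATION.toInt()) // true
}
```

The behaviour should match the filter-based version; it just skips building the list of categories first.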
That's just a basic overview, pointing out that this stuff is baked into the Unicode standard. I don't really know a lot about it (I can't see a more convenient way to do this, but I could be wrong) and I'm not sure if it's helpful, but there it is!
Kotlin does have an object full of constants for "Unicode symbols used in proper Typography" for some reason though (the Typography object in kotlin.text)! A couple of them also had typos in their names and had to be deprecated, now that's what I call typography.
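The object being described appears to be kotlin.text.Typography, going by the quoted documentation string. A small sketch of how its constants can be used, assuming that is the right object; the quoted helper is a hypothetical name added purely for illustration:

```kotlin
// kotlin.text is imported by default, so Typography needs no import.
// Hypothetical helper: wrap a string in typographic double quotes.
fun quoted(text: String): String =
    "${Typography.leftDoubleQuote}$text${Typography.rightDoubleQuote}"

fun main() {
    println(quoted("hello"))
    println("Wait${Typography.ellipsis}")

    // The constants are plain Char values, so the category checks apply to them too.
    println(Typography.ellipsis == '\u2026')  // true
    println(Typography.ndash.category)        // DASH_PUNCTUATION
}
```

Note that these are typographic symbols (quotes, dashes, ellipsis and so on), so the object does not cover everyday ASCII characters like the colon or semicolon from the question.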
edit: I should point out that Unicode aims to represent every writing system that ever existed, so its scope of what counts as "punctuation" might be a bit broader than what you're looking for! It depends if you're just trying to check a character, or if you want to create a collection of punctuation characters. If you just want to limit things to a particular Charset, like US_ASCII or whatever, I'm not sure how you can get access to all the characters that it covers.
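One way this might be approached, as a sketch rather than a definitive answer: walk every Char value and keep those that the charset's encoder reports it can encode and whose category code starts with P. The punctuationIn function is a hypothetical helper, it only covers characters that fit in a single Char (the Basic Multilingual Plane), and it assumes the charset supports encoding at all:

```kotlin
import java.nio.charset.Charset

// Hypothetical helper: all single-Char punctuation characters that the
// given charset can encode, using the Unicode P* categories as the test.
fun punctuationIn(charset: Charset): List<Char> {
    val encoder = charset.newEncoder()
    return (Char.MIN_VALUE..Char.MAX_VALUE).filter {
        encoder.canEncode(it) && it.category.code.startsWith('P')
    }
}

fun main() {
    val asciiPunctuation = punctuationIn(Charsets.US_ASCII)
    println(asciiPunctuation.joinToString(" "))

    // Characters such as '$' and '+' are classed as symbols (Sc, Sm),
    // so they are not part of the result.
    println('$' in asciiPunctuation)  // false
    println(';' in asciiPunctuation)  // true
}
```

If the symbol characters should count too, the filter could also accept category codes starting with S, depending on what "punctuation" needs to mean for the project.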