How does the CLDR define its rules for month abbreviations - specifically for en_CA?

Time:09-20

The CLDR provides a repository of locale data. In Java, you can access a list of CLDR month abbreviations for a given locale, as follows:

String[] usMonthAbbrevs = new DateFormatSymbols(Locale.US).getShortMonths();
System.out.println(usMonthAbbrevs[0]);

The above (in Java 17, using the default CLDR provider) prints Jan.

I have been assuming (maybe wrongly) that this list of abbreviations is defined in the relevant CLDR XML file for the en locale:

cldr > common > main > en.xml > <month type="1">Jan</month>

But for Canadian English, the same month abbreviations all have periods at the end:

String[] canMonthAbbrevs = new DateFormatSymbols(Locale.CANADA).getShortMonths();
System.out.println(canMonthAbbrevs[0]);

The above prints "Jan." with a period at the end.
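To see the two locales side by side, the snippets above can be combined into one small program. This is only a sketch: the class name ShortMonthComparison and the helper shortMonths are made up for illustration, and the output depends on which CLDR version your JDK bundles, so no expected output is shown.

```java
import java.text.DateFormatSymbols;
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper class, not part of the JDK.
class ShortMonthComparison {

    // getShortMonths() returns 13 entries; the last one is reserved for
    // 13-month calendars, so only the first 12 are kept here.
    static String[] shortMonths(Locale locale) {
        String[] all = new DateFormatSymbols(locale).getShortMonths();
        return Arrays.copyOf(all, 12);
    }

    public static void main(String[] args) {
        // Print the runtime version, since the bundled CLDR data differs per release.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("en_US: " + String.join(" ", shortMonths(Locale.US)));
        System.out.println("en_CA: " + String.join(" ", shortMonths(Locale.CANADA)));
    }
}
```

Running this on different JDK releases is a quick way to check whether the abbreviations changed between the CLDR versions they ship.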


Question: Where is the CLDR list, or rule, which specifies this extra period for the en_CA locale?

I did not see anything obvious in the en_CA.xml file. Maybe I missed it, or I did not understand what I was looking at.


To research this, I tried to trace the code back through Java's resource bundle classes.

This led me to the following class:

sun.text.resources.cldr.ext.FormatData_en_CA

I found this class in my Java installation in jmods/jdk.localedata.jmod. After unzipping that jmod file, I opened the following class file:

classes/sun/text/resources/cldr/ext/FormatData_en_CA.class

In there, I saw bytecode which contains the month abbreviations with periods - for example:

7: ldc           #9                  // String Jan.

My guess is that all the classes in jdk.localedata.jmod are pre-built (presumably from CLDR data/rules). But I was not able to see how that process happens. Maybe a tool which is part of the JDK build process?

So, my question remains: Where is the CLDR list, or rule, which specifies this extra period for the en_CA locale's month abbreviations?

Why am I asking? Just because I got curious about this, after noticing the difference between the US and Canadian abbreviations.

CodePudding user response:

You are looking at the wrong CLDR version. Java 17 uses CLDR 39, not 41. If you look at the en_CA.xml file from CLDR 39, you can see the abbreviations laid out there, periods included:

<monthWidth type="abbreviated">
    <month type="1">Jan.</month>
    <month type="2">Feb.</month>
    <month type="3">Mar.</month>
    <month type="4">Apr.</month>
    <month type="5">May</month>
    <month type="6">Jun.</month>
    <month type="7">Jul.</month>
    <month type="8">Aug.</month>
    <month type="9">Sep.</month>
    <month type="10">Oct.</month>
    <month type="11">Nov.</month>
    <month type="12">Dec.</month>
</monthWidth>
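If you want to check a given CLDR release yourself, a fragment like the one above can be read with the JDK's built-in XML parser. This is a minimal sketch under some assumptions: CldrMonthReader and abbreviatedMonths are hypothetical names, and the input is a standalone fragment. A real en_CA.xml declares a DOCTYPE and contains both "format" and "stand-alone" month contexts, so this code would need adjusting before it could be pointed at the full file.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class for illustration only.
class CldrMonthReader {

    // Collects the text of every <month> inside <monthWidth type="abbreviated">.
    static List<String> abbreviatedMonths(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            List<String> result = new ArrayList<>();
            NodeList widths = doc.getElementsByTagName("monthWidth");
            for (int i = 0; i < widths.getLength(); i++) {
                Element width = (Element) widths.item(i);
                if (!"abbreviated".equals(width.getAttribute("type"))) {
                    continue; // skip "wide", "narrow", etc.
                }
                NodeList months = width.getElementsByTagName("month");
                for (int j = 0; j < months.getLength(); j++) {
                    result.add(months.item(j).getTextContent());
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse CLDR fragment", e);
        }
    }

    public static void main(String[] args) {
        // Shortened copy of the fragment quoted above.
        String xml = "<monthWidth type=\"abbreviated\">"
                + "<month type=\"1\">Jan.</month>"
                + "<month type=\"5\">May</month>"
                + "</monthWidth>";
        System.out.println(abbreviatedMonths(xml)); // prints [Jan., May]
    }
}
```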

Evidently, newer CLDR versions dropped this trailing dot :)
