Locale
.
Any single character can be used to denote an extension, but there
are two predefined extension codes:
'u'
specifies a Unicode locale extension,
and 'x'
specifies a private use extension.
Unicode locale extensions are defined by the Unicode Common Locale Data Repository (CLDR) project. They are used to specify information that is non-language-specific such as calendars or currency. A private use extension may be used to specify any other information, such as platform (for example, Windows, UNIX, or Linux), or release information (for example, 6u23 or JDK 7).
An extension is specified as a key/value pair, where the key
is a single character (typically 'u'
or 'x'
).
A well-formed value has the following format:
SUBTAG ('-' SUBTAG)*
SUBTAG = [0-9a-zA-Z]{2,8} For key='u' SUBTAG = [0-9a-zA-Z]{1,8} For key='x'
Note that a single-character value is allowed for the private use extension. However, there is a 2-character minimum for values in the Unicode locale extension.
Extension strings are case-insensitive, but the Locale
class maps all keys and values to lowercase.
The
getExtensionKeys()
method returns the set of extension keys, if any, for a
Locale
. The
getExtension(key)
method returns the value string for the specified key, if any.
Unicode Locale Extensions
As previously mentioned, a Unicode locale extension is specified by
the 'u'
key code or the UNICODE_LOCALE_EXTENSION
constant. The value itself is also specified by a key/type pair.
Legal values are defined in the
Key/Type
Definitions table on the Unicode
website. A key code is specified by two alphabetic characters.
The following table lists the Unicode locale extension keys:
Key Code | Description |
---|---|
ca | calendar algorithm |
co | collation type |
ka | collation parameters |
cu | currency type |
nu | number type |
va | common variant type |
The following table shows some examples of key/type pairs for a Unicode locale extension.
Key/Type pair | Description |
---|---|
ca-buddhist | Thai Buddhist calendar |
co-pinyin | Pinyin ordering for Latin |
cu-usd | U.S. dollars |
nu-jpanfin | Japanese financial numerals |
tz-aldav | Europe/Andorra |
The following string represents
the German language locale for the country of Germany using a phonebook
style of ordering for the Linux platform. This example also contains
an attribute named "email"
.
de-DE-u-email-co-phonebk-x-linux
The following Locale
methods can be used to access
information about the Unicode locale extensionss. These methods
are described using the previous German locale example.
getUnicodeLocaleKeys()
– Returns the Unicode locale key codes or an empty set if the
locale has none. For the German example, this would return
a set containing the single string "co"
.
getUnicodeLocaleType(String)
– Returns the Unicode locale type associated with the Unicode locale
key code.
Invoking getUnicodeLocaleType("co")
for the German example would return the string "phonebk"
.
getUnicodeLocaleAttributes()
– Returns the set of Unicode locale attributes associated
with this locale, if any. In the German example, this would return a set
containing the single string "email"
.
'x'
key code or
the PRIVATE_USE_EXTENSION
constant, can be
anything, as long as the value is well formed.
The following are examples of possible private use extensions:
x-jdk-1-7 x-linux