List:       opensolaris-i18n-discuss
Subject:    [i18n-discuss] EOF of short form locales [PSARC/2009/528 Self
From:       Ienup Sung <is () sac ! sfbay ! sun ! com>
Date:       2009-10-03 2:39:55
Message-ID: 200910030239.n932dtFq005110 () sac ! sfbay ! sun ! com


I'm filing this case for Jan "Honza" Hnatek, Jan "Jenda" Lana, and
Ning "Harry" Fu as a self-review and automatic approval case, since
it introduces no new architectural change and is one more locale EOF
like similar cases in the past.

If you disagree, let me know and I'll promote this case to a fast track.

Ienup


Template Version: @(#)sac_nextcase 1.68 02/23/09 SMI
This information is Copyright 2009 Sun Microsystems
1. Introduction
    1.1. Project/Component Working Name:
	 EOF of short form locales
    1.2. Name of Document Author/Supplier:
	 Author:  Jan Hnatek, Jan Lana, Ning Harry Fu
    1.3. Date of This Document:
	 02 October, 2009
4. Technical Description

OVERVIEW

Solaris provides and supports 113 so-called short form locale names,
such as "ar", "de", and "fr_CH", that are a constant source of customer
confusion (and thus of service requests and customer inquiries) and,
for us, a maintenance headache.

The reason for the customer confusion is that such locale names are not
fully qualified with the other possible attributes and are therefore
inherently ambiguous about the intended territory, codeset, or both.

As background, all modern POSIX-like operating systems, including
Solaris, AIX, HP-UX, and various Linux distributions, provide and
support locale names with the following convention [1]:

	language_territory.codeset[@modifier]

where the "language" is a 2-letter language code from ISO 639 [2],
the "territory" is a 2-letter uppercase territory code from ISO 3166 [3],
the "codeset" is one of the widely accepted codeset names in the industry,
and the optional "@modifier" is an arbitrary string denoting an extra
attribute that is specific to the locale such as a collation order.
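
As an illustration only, the convention above can be sketched as a small
parser that also reports which attributes a short form name leaves out.
The function names and the regular expression are assumptions made for
this sketch; they are not part of Solaris or of this case, and names such
as "C" or "POSIX" are deliberately out of scope.

```python
import re

# Pattern for language[_territory][.codeset][@modifier]. The optional groups
# are an assumption of this sketch so that short forms such as "de" or
# "de.UTF-8" parse as well as fully qualified names.
_LOCALE_RE = re.compile(
    r"^(?P<language>[a-z]{2})"        # 2-letter language code (ISO 639)
    r"(?:_(?P<territory>[A-Z]{2}))?"  # 2-letter uppercase territory (ISO 3166)
    r"(?:\.(?P<codeset>[^@]+))?"      # codeset name, e.g. ISO8859-1
    r"(?:@(?P<modifier>.+))?$"        # optional modifier, e.g. bokmal
)


def parse_locale(name):
    """Split a locale name into its parts; absent parts are None."""
    match = _LOCALE_RE.match(name)
    if match is None:
        raise ValueError("not a language[_territory][.codeset][@modifier] name: %r" % name)
    return match.groupdict()


def missing_parts(name):
    """Return the attributes (territory, codeset) that the name does not state."""
    parts = parse_locale(name)
    return [key for key in ("territory", "codeset") if parts[key] is None]
```

With this sketch, "de" is missing both the territory and the codeset,
"fr_CH" is missing only the codeset, and "de.UTF-8" is missing only the
territory, which is the ambiguity described above.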

The reason this is a maintenance headache for us is that, for almost all
of them except the "ar" and "ja" locales, we already support corresponding
fully qualified long form locale names in Solaris and, in fact, the short
form locales are implemented and supported simply as numerous symbolic
links to the long form locales. (There are a few hundred such symbolic
links.)

There are also six long form locale names that are no longer correct
because of changes in ISO 639 and ISO 3166 and that require new locale
names going forward [2, 3, 4, 5, 6]:

	Locales with deprecated codes	Locales with new codes
	-----------------------------	----------------------
	no_NO.ISO8859-1@bokmal		nb_NO.ISO8859-1
	no_NO.ISO8859-1@nynorsk		nn_NO.ISO8859-1
	sh_BA.UTF-8			bs_BA.UTF-8
	sh_BA.ISO8859-2@bosnia		bs_BA.ISO8859-2
	sr_CS.UTF-8			sr_ME.UTF-8, sr_RS.UTF-8
	sr_YU.ISO8859-5			sr_ME.ISO8859-5, sr_RS.ISO8859-5

In addition, the long form locale name th_TH.ISO8859-11 is merely a symbolic
link and should be mapped to th_TH.TIS620.
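
For illustration, the old-to-new mappings above can be written as a
lookup table. This is a sketch only: the dictionary and helper names are
hypothetical, and only the locale names themselves come from this case.
The two Serbian entries map to two successors each, so a one-to-one
rename is not possible for them.

```python
# Old-to-new locale names transcribed from the case text. Values are lists
# because sr_CS and sr_YU each have two successors (sr_ME and sr_RS).
REPLACEMENTS = {
    "no_NO.ISO8859-1@bokmal": ["nb_NO.ISO8859-1"],
    "no_NO.ISO8859-1@nynorsk": ["nn_NO.ISO8859-1"],
    "sh_BA.UTF-8": ["bs_BA.UTF-8"],
    "sh_BA.ISO8859-2@bosnia": ["bs_BA.ISO8859-2"],
    "sr_CS.UTF-8": ["sr_ME.UTF-8", "sr_RS.UTF-8"],
    "sr_YU.ISO8859-5": ["sr_ME.ISO8859-5", "sr_RS.ISO8859-5"],
    "th_TH.ISO8859-11": ["th_TH.TIS620"],
}


def replacements_for(name):
    """Return the successor locale names for an old name, or [] if none is listed."""
    return REPLACEMENTS.get(name, [])


def needs_manual_choice(name):
    """True when an old name has more than one successor and a person must pick."""
    return len(replacements_for(name)) > 1
```

A migration script built on such a table could rename the single-successor
entries automatically and flag the Serbian ones for a manual decision.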

With this project, the project team would like to EOF the 120 locales
described above (113 short form names, six names with deprecated codes,
and th_TH.ISO8859-11). The actual list is shown in the "PROPOSED
INTERFACE CHANGES AND RELEASE BINDING" subsection below.


DETAILS ON THE EOF

Once approved by ARC and the Solaris P-Team, we will make the EOF
announcement in an S10 UR Release Notes using [7] and carry out the EOF in
OpenSolaris and Solaris Next.

The EOF announcement text will describe our intention that the locale names
will be obsoleted in future releases of Solaris and OpenSolaris and provide
a mapping table which will list the locale names that should be used instead.

Since we don't have an "ar_EG.ISO8859-6" locale at this point, for a proper
migration of the "ar" locale we will create the "ar_EG.ISO8859-6" locale as
soon as possible, as recorded in [8].

We think that "ja" can be mapped to the "ja_JP.eucJP" locale we already
have. The two differ slightly, however: the "ja" locale conforms to the
traditional specification from earlier Solaris releases, while the
"ja_JP.eucJP" locale conforms to "UI-OSF Japanese Environment
Implementation Agreement Version 1.1". If we get customer escalations,
we may need to revert the EOF and re-deliver the "ja" locale.

(A side note: as a separate project, to aid customer transitions, we will
also try to provide a transparent locale mapping and alias support mechanism
in libc so that, even after the EOF is carried out, customers using old
locale names will still be supported as much as possible.)
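
The alias mechanism mentioned in the side note is a separate project and
is not specified here. Purely as a sketch of the general idea, and not of
the eventual libc design, a lookup could prefer an installed locale and
fall back to an alias only when the requested name is gone. The function
name, its arguments, and the choice to return None on failure are all
assumptions of this sketch; the two alias pairs are the ones named in
this case.

```python
def resolve_locale(requested, installed, aliases):
    """Return the locale to use for a requested name, or None if unresolvable.

    requested: locale name asked for by the user, e.g. "ja"
    installed: set of locale names present on the system
    aliases:   dict mapping obsolete names to their replacements
    """
    # A name that is still installed is used unchanged.
    if requested in installed:
        return requested
    # Otherwise try the alias, but only if its target is actually installed.
    target = aliases.get(requested)
    if target in installed:
        return target
    return None


# Alias pairs taken from this case; the installed set is example data only.
ALIASES = {"ja": "ja_JP.eucJP", "ar": "ar_EG.ISO8859-6"}
INSTALLED = {"ja_JP.eucJP", "de_DE.ISO8859-1"}
```

Here a request for "ja" resolves to "ja_JP.eucJP", while a request for "ar"
stays unresolved until "ar_EG.ISO8859-6" is delivered.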


PROPOSED INTERFACE CHANGES AND RELEASE BINDING

The project team proposes classifying the following locale names and their
corresponding locale directories on the system as Obsolete Committed:

	Interface		Proposed Classification
	---------		-----------------------
	ar			Obsolete Committed
	bg_BG			Obsolete Committed
	ca			Obsolete Committed
	ca_ES			Obsolete Committed
	cs			Obsolete Committed
	cs_CZ			Obsolete Committed
	da			Obsolete Committed
	da_DK			Obsolete Committed
	da.ISO8859-15		Obsolete Committed
	de			Obsolete Committed
	de_AT			Obsolete Committed
	de_CH			Obsolete Committed
	de_DE			Obsolete Committed
	de.ISO8859-15		Obsolete Committed
	de.UTF-8		Obsolete Committed
	el			Obsolete Committed
	el_GR			Obsolete Committed
	el.sun_eu_greek		Obsolete Committed
	el.UTF-8		Obsolete Committed
	en_AU			Obsolete Committed
	en_CA			Obsolete Committed
	en_GB			Obsolete Committed
	en_IE			Obsolete Committed
	en_NZ			Obsolete Committed
	en_US			Obsolete Committed
	es			Obsolete Committed
	es_AR			Obsolete Committed
	es_BO			Obsolete Committed
	es_CL			Obsolete Committed
	es_CO			Obsolete Committed
	es_CR			Obsolete Committed
	es_EC			Obsolete Committed
	es_ES			Obsolete Committed
	es_GT			Obsolete Committed
	es.ISO8859-15		Obsolete Committed
	es_MX			Obsolete Committed
	es_NI			Obsolete Committed
	es_PA			Obsolete Committed
	es_PE			Obsolete Committed
	es_PY			Obsolete Committed
	es_SV			Obsolete Committed
	es.UTF-8		Obsolete Committed
	es_UY			Obsolete Committed
	es_VE			Obsolete Committed
	et			Obsolete Committed
	et_EE			Obsolete Committed
	fi			Obsolete Committed
	fi_FI			Obsolete Committed
	fi.ISO8859-15		Obsolete Committed
	fr			Obsolete Committed
	fr_BE			Obsolete Committed
	fr_CA			Obsolete Committed
	fr_CH			Obsolete Committed
	fr_FR			Obsolete Committed
	fr.ISO8859-15		Obsolete Committed
	fr.UTF-8		Obsolete Committed
	he			Obsolete Committed
	he_IL			Obsolete Committed
	hr_HR			Obsolete Committed
	hu			Obsolete Committed
	hu_HU			Obsolete Committed
	is_IS			Obsolete Committed
	it			Obsolete Committed
	it.ISO8859-15		Obsolete Committed
	it_IT			Obsolete Committed
	it.UTF-8		Obsolete Committed
	ja			Obsolete Committed
	ko			Obsolete Committed
	ko.UTF-8		Obsolete Committed
	lt			Obsolete Committed
	lt_LT			Obsolete Committed
	lv			Obsolete Committed
	lv_LV			Obsolete Committed
	mk_MK			Obsolete Committed
	nl			Obsolete Committed
	nl_BE			Obsolete Committed
	nl.ISO8859-15		Obsolete Committed
	nl_NL			Obsolete Committed
	no			Obsolete Committed
	no_NO			Obsolete Committed
	no_NO.ISO8859-1@bokmal	Obsolete Committed
	no_NO.ISO8859-1@nynorsk	Obsolete Committed
	no_NY			Obsolete Committed
	pl			Obsolete Committed
	pl_PL			Obsolete Committed
	pl.UTF-8		Obsolete Committed
	pt			Obsolete Committed
	pt_BR			Obsolete Committed
	pt.ISO8859-15		Obsolete Committed
	pt_PT			Obsolete Committed
	ro_RO			Obsolete Committed
	ru			Obsolete Committed
	ru.koi8-r		Obsolete Committed
	ru_RU			Obsolete Committed
	ru.UTF-8		Obsolete Committed
	sh			Obsolete Committed
	sh_BA			Obsolete Committed
	sh_BA.ISO8859-2@bosnia	Obsolete Committed
	sh_BA.UTF-8		Obsolete Committed
	sk_SK			Obsolete Committed
	sl_SI			Obsolete Committed
	sq_AL			Obsolete Committed
	sr_CS			Obsolete Committed
	sr_CS.UTF-8		Obsolete Committed
	sr_SP			Obsolete Committed
	sr_YU			Obsolete Committed
	sr_YU.ISO8859-5		Obsolete Committed
	sv			Obsolete Committed
	sv.ISO8859-15		Obsolete Committed
	sv_SE			Obsolete Committed
	sv.UTF-8		Obsolete Committed
	th			Obsolete Committed
	th_TH			Obsolete Committed
	th_TH.ISO8859-11	Obsolete Committed
	tr			Obsolete Committed
	tr_TR			Obsolete Committed
	zh			Obsolete Committed
	zh.GBK			Obsolete Committed
	zh_TW			Obsolete Committed
	zh.UTF-8		Obsolete Committed

The project team proposes Micro/Patch release binding for the EOF
announcement and Minor release binding for the EOF execution.


REFERENCES

[1] The Open Group, UNIX Internationalization Guide, Sep. 2003.
    http://www.opengroup.org/bookstore/catalog/g032.htm
[2] ISO 639-1 and ISO 639-2 code lists:
    http://www.loc.gov/standards/iso639-2/php/code_list.php
[3] ISO 3166 code lists:
    http://www.iso.org/iso/country_codes/iso_3166_code_lists.htm
[4] Changes to ISO 639-1 and ISO 639-2:
    http://www.loc.gov/standards/iso639-2/php/code_changes.php
[5] Updates on ISO 3166:
    http://www.iso.org/iso/country_codes/updates_on_iso_3166.htm
[6] Current Codes section of ISO 3166-3 at wikipedia.org:
    http://en.wikipedia.org/wiki/ISO_3166-3
[7] EOF-announcement.txt file at the materials directory of the case.
[8] CR 6884493 Add ar_EG.ISO8859-6 locale

6. Resources and Schedule
    6.4. Steering Committee requested information
   	6.4.1. Consolidation C-team Name:
		G11N
    6.5. ARC review type: Automatic
    6.6. ARC Exposure: open
