Internationalization and Localization with PHP
by Adam Trachtenberg, coauthor of PHP Cookbook
11/28/2002
While everyone who programs in PHP eventually has to learn some English to get a handle on its function names and language constructs, PHP can create applications in just about any human language. Some applications need to be used by speakers of many different languages. PHP's internationalization and localization support makes it easier to adapt an application written for French speakers for use by German speakers.
Internationalization (often abbreviated I18N--there are 18 letters between the first "i" and the last "n") is the process of taking an application designed for just one locale and restructuring it so that it can be used in many different locales. Localization (often abbreviated L10N--there are 10 letters between the first "l" and the "n") is the process of adding support for a new locale to an internationalized application.
Localizing different kinds of content requires different techniques. This article covers an object-oriented method for localizing plain text messages and images. The PHP Cookbook contains additional recipes for dates, times, and currency. There are also recipes on using GNU gettext and other I18N and L10N topics.
Locales
A locale is a group of settings that describe text formatting and language
customs in a particular area of the world. A locale name generally has three
components. The first, an abbreviation that indicates a language, is mandatory.
For example, "en" stands for English and "pt" for Portuguese. An optional
country specifier comes next, after an underscore, to distinguish between
different versions of the same language spoken in different countries. For
example, "en_US" and "en_GB" specify U.S. and British English respectively,
while "pt_BR" and "pt_PT" identify Brazilian and European Portuguese.
Finally, after a period, comes an optional character-set specifier. Taiwanese
Chinese using the Big5 character set is encoded as "zh_TW.Big5". Note that
while most locale names follow these conventions, some don't.
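To make the three components concrete, here is a minimal sketch that splits a locale name into its parts. The function name parse_locale_name() is hypothetical (it is not a PHP built-in), and the pattern assumes the common language[_COUNTRY][.charset] layout described above. Names that don't follow the convention simply come back with every part unset.

```php
<?php
// Hypothetical helper, not part of PHP: split a locale name such as
// "zh_TW.Big5" into language, country, and character-set components.
function parse_locale_name($locale) {
    $parts = array('language' => null, 'country' => null, 'charset' => null);
    // Assumed layout: language[_COUNTRY][.charset]
    if (preg_match('/^([a-z]{2,3})(?:_([A-Z]{2}))?(?:\.(.+))?$/', $locale, $m)) {
        $parts['language'] = $m[1];
        // An optional group that did not match is either absent or empty.
        if (isset($m[2]) && $m[2] !== '') {
            $parts['country'] = $m[2];
        }
        if (isset($m[3]) && $m[3] !== '') {
            $parts['charset'] = $m[3];
        }
    }
    return $parts;
}

$p = parse_locale_name('zh_TW.Big5');
print $p['language'] . ' ' . $p['country'] . ' ' . $p['charset'] . "\n";
```

Only the language is required, so parse_locale_name('en') fills in the language and leaves the other two parts as null.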
Message Catalog
To incorporate I18N support into your program, maintain a message catalog of words and phrases and retrieve the appropriate string from the message catalog before printing it. Here's a simple message catalog with foods in American and British English and a function to retrieve words from the catalog:
<?php
// Message catalog: one array of key => translation pairs per locale.
$messages = array (
    'en_US' => array(
        'My favorite foods are' =>
            'My favorite foods are',
        'french fries' => 'french fries',
        'biscuit' => 'biscuit',
        'candy' => 'candy',
        'potato chips' => 'potato chips',
        'cookie' => 'cookie',
        'corn' => 'corn',
        'eggplant' => 'eggplant'
    ),
    'en_GB' => array(
        'My favorite foods are' =>
            'My favourite foods are',
        'french fries' => 'chips',
        'biscuit' => 'scone',
        'candy' => 'sweets',
        'potato chips' => 'crisps',
        'cookie' => 'biscuit',
        'corn' => 'maize',
        'eggplant' => 'aubergine'
    )
);

// Look up $s in the catalog for the current locale ($LANG).
function msg($s) {
    global $LANG;
    global $messages;
    if (isset($messages[$LANG][$s])) {
        return $messages[$LANG][$s];
    } else {
        // No translation found: log it so the catalog can be completed.
        error_log("l10n error:LANG:" .
            "$LANG,message:'$s'");
    }
}
?>
This short program uses the message catalog to print out a list of foods:
<?php
$LANG ='en_GB';
print msg('My favorite foods are').":\n";
print msg('french fries')."\n";
print msg('potato chips')."\n";
print msg('corn')."\n";
print msg('candy')."\n";
?>
My favourite foods are:
chips
crisps
maize
sweets
To have the program output in American English instead of British English,
just set $LANG to en_US.
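The msg() function above logs an error and returns nothing when a string is missing from the current locale's catalog. One possible variation, sketched here as an assumption rather than as the article's method, falls back to a default locale and finally to the lookup key itself. The name msg_or_default() and its parameters are hypothetical, and the catalog is passed in as an argument instead of read from globals so the example stands alone.

```php
<?php
// Hypothetical helper: msg_or_default() and its parameters are
// illustrative names, not part of the catalog code shown earlier.
function msg_or_default($messages, $lang, $default_lang, $s) {
    if (isset($messages[$lang][$s])) {
        return $messages[$lang][$s];
    }
    // Record the gap so a translator can fill it in later.
    error_log("l10n error: LANG: $lang, message: '$s'");
    if (isset($messages[$default_lang][$s])) {
        return $messages[$default_lang][$s];
    }
    // Last resort: show the lookup key itself.
    return $s;
}

$catalog = array(
    'en_US' => array('corn' => 'corn', 'candy' => 'candy'),
    'en_GB' => array('corn' => 'maize')  // 'candy' left untranslated on purpose
);

print msg_or_default($catalog, 'en_GB', 'en_US', 'corn') . "\n";   // maize
print msg_or_default($catalog, 'en_GB', 'en_US', 'candy') . "\n";  // candy
```

With a fallback like this, a partially translated catalog still produces readable output while the error log shows which strings remain to be localized.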
Variable Phrases
You can combine the msg() message retrieval function with
printf() to store phrases that require values to be substituted
into them. Consider the English sentence "I am 12 years old." In Spanish, the
corresponding phrase is "Tengo 12 años." The Spanish phrase can be built by
stitching together translations of "I am," the numeral 12, and "years old."
It's easier, though, to store them in the message catalogs as
printf()-style format strings:
<?php
$messages = array(
    'en_US' => array(
        'I am X years old.' =>
            'I am %d years old.'),
    'es_US' => array(
        'I am X years old.' =>
            'Tengo %d años.')
);
?>
You can then pass the results of msg() to
printf() as a format string:
<?php
$LANG ='es_US';
printf(msg('I am X years old.'), 12);
?>
Tengo 12 años.
For phrases that require the substituted values to be in a different order
in different languages, printf() supports changing the order of
the arguments:
<?php
$messages = array(
    'en_US' => array(
        'I am X years and Y months old.' =>
            'I am %d years and %d months old.'),
    'es_US' => array(
        'I am X years and Y months old.' =>
            'Tengo %2$d meses y %1$d años.')
);
?>
With either language, call printf() with the same order of
arguments (i.e., first years, then months):
<?php
$LANG ='es_US';
printf(msg('I am X years and Y months old.'),12,7);
?>
Tengo 7 meses y 12 años.
In the format string, %2$ tells printf() to use
the second argument, and %1$ tells it to use the first.
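One practical detail when writing such format strings: inside a double-quoted PHP string, $d would be parsed as a variable, which is why the catalog entries above use single quotes. The short sketch below shows the two equivalent spellings. The variable names are only for illustration, and sprintf() is used here so the result can be captured in a variable rather than printed directly.

```php
<?php
// Single quotes: the $ is literal, so %2$d reaches the formatter untouched.
$single = 'Tengo %2$d meses y %1$d años.';
// Double quotes: each $ must be escaped, or PHP looks for a variable $d.
$double = "Tengo %2\$d meses y %1\$d años.";

// Both spellings produce the same format string.
var_dump($single === $double);

// sprintf() returns the formatted text instead of printing it.
$text = sprintf($single, 12, 7);
print $text . "\n";  // Tengo 7 meses y 12 años.
```

Keeping catalog entries in single quotes avoids having to escape every positional specifier.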