Internationalization

CustomMailer can be used to send email in many of the world's languages, but it may take a little work by you and/or your recipients depending upon the language.  All the languages of Western Europe and the Americas generally work without extra effort, since the 8-bit character encodings and fonts necessary for these are standard in Windows and are supported directly by CustomMailer.  However, Eastern European, Asian, and African languages such as Greek, Russian, Chinese, Japanese, Korean, Arabic, and Hebrew need extra consideration to support the special fonts and encoding schemes that they require.  CustomMailer is written in Java, so all its strings are double-byte (Unicode), but more is needed for an end-to-end double-byte solution.  The following sections describe each aspect of using CustomMailer for sending mail internationally.

User interface
At present the user interface for CustomMailer (menus, dialogs, prompts, etc.) as well as the documentation for CustomMailer are all in English.  Our focus in CustomMailer has been on making sure it supports sending internationalized mail content, and we regret we have not had the resources to translate the documentation or application GUI itself into other languages.

Message template and mailing list input
As delivered, CustomMailer allows you to read message template and mailing list files encoded with the following character sets: US-ASCII, ISO-8859-1, Windows-1252 (also known as Cp1252), UTF-8, and Unicode.  The first three of these are single-byte character sets (each character fits in 8 bits), UTF-8 encodes Unicode characters as sequences of one or more 8-bit bytes, and "Unicode" here means the 16-bit (double-byte) encoding.  The following table shows the 8-bit character sets supported by CustomMailer.

Table - Eight-bit character sets supported by CustomMailer
   US-ASCII = yellow (codes 0-127)
   ISO-8859-1 = yellow + blue (blue = codes 160-255)
   Windows-1252 = yellow + blue + green (green = codes 128-159)

Yellow (codes 0-127):
  0 NUL      1 SOH      2 STX      3 ETX      4 EOT      5 ENQ      6 ACK      7 BEL
  8 BS       9 TAB     10 LF      11 VT      12 FF      13 CR      14 SO      15 SI
 16 DLE     17 DC1     18 DC2     19 DC3     20 DC4     21 NAK     22 SYN     23 ETB
 24 CAN     25 EM      26 SUB     27 ESC     28 FS      29 GS      30 RS      31 US
 32 space   33 !       34 "       35 #       36 $       37 %       38 &       39 '
 40 (       41 )       42 *       43 +       44 ,       45 -       46 .       47 /
 48 0       49 1       50 2       51 3       52 4       53 5       54 6       55 7
 56 8       57 9       58 :       59 ;       60 <       61 =       62 >       63 ?
 64 @       65 A       66 B       67 C       68 D       69 E       70 F       71 G
 72 H       73 I       74 J       75 K       76 L       77 M       78 N       79 O
 80 P       81 Q       82 R       83 S       84 T       85 U       86 V       87 W
 88 X       89 Y       90 Z       91 [       92 \       93 ]       94 ^       95 _
 96 `       97 a       98 b       99 c      100 d      101 e      102 f      103 g
104 h      105 i      106 j      107 k      108 l      109 m      110 n      111 o
112 p      113 q      114 r      115 s      116 t      117 u      118 v      119 w
120 x      121 y      122 z      123 {      124 |      125 }      126 ~      127 del

Green (codes 128-159):
128 €      129 (none) 130 ‚      131 ƒ      132 „      133 …      134 †      135 ‡
136 ˆ      137 ‰      138 Š      139 ‹      140 Œ      141 (none) 142 Ž      143 (none)
144 (none) 145 ‘      146 ’      147 “      148 ”      149 •      150 endash 151 emdash
152 ˜      153 ™      154 š      155 ›      156 œ      157 (none) 158 ž      159 Ÿ

Blue (codes 160-255):
160 nbsp   161 ¡      162 ¢      163 £      164 ¤      165 ¥      166 ¦      167 §
168 ¨      169 ©      170 ª      171 «      172 ¬      173 shy    174 ®      175 ¯
176 °      177 ±      178 ²      179 ³      180 ´      181 µ      182 ¶      183 ·
184 ¸      185 ¹      186 º      187 »      188 ¼      189 ½      190 ¾      191 ¿
192 À      193 Á      194 Â      195 Ã      196 Ä      197 Å      198 Æ      199 Ç
200 È      201 É      202 Ê      203 Ë      204 Ì      205 Í      206 Î      207 Ï
208 Ð      209 Ñ      210 Ò      211 Ó      212 Ô      213 Õ      214 Ö      215 ×
216 Ø      217 Ù      218 Ú      219 Û      220 Ü      221 Ý      222 Þ      223 ß
224 à      225 á      226 â      227 ã      228 ä      229 å      230 æ      231 ç
232 è      233 é      234 ê      235 ë      236 ì      237 í      238 î      239 ï
240 ð      241 ñ      242 ò      243 ó      244 ô      245 õ      246 ö      247 ÷
248 ø      249 ù      250 ú      251 û      252 ü      253 ý      254 þ      255 ÿ

(none) = no character assigned to this code; nbsp = non-breaking space; shy = soft hyphen; endash and emdash = the en dash and em dash characters.

Since it is a Windows application, CustomMailer can read your message template and mailing list files using the Windows-1252 character set, that is, the entire table shown above.  This is an extension of the "Latin 1" character set that, generally speaking, supports all the languages of Western Europe and the Americas.  CustomMailer also lets you enter message templates and mailing lists directly into CustomMailer from the keyboard or using copy and paste from other applications using the Windows-1252 character set.

CustomMailer can also read message template and mailing list files encoded with Unicode.  Using Unicode, CustomMailer can read content in essentially all world-wide alphabets (including Latin, Greek, Cyrillic, Hebrew, Arabic, Devanagari, etc.) and ideographs (including Chinese, Japanese, Korean, etc.).  At present, Unicode data can only be read in from files.  CustomMailer does not by itself support the entry of Unicode message templates and mailing lists directly into CustomMailer from the keyboard or by using copy and paste from other applications (however, see "Additional character sets" below).

The character set used to read your message template is specified using the variable messageTemplateCharset= in the file CustomMailerPreferences.txt located in the CustomMailer 4.0\CustomMailerApp folder.  (At present you must edit this value manually; a more convenient user interface for changing this value is planned for a future release.)  If no value is given (the default), then CustomMailer operates in "automatic" mode.  In this case, CustomMailer attempts to read your message template as "Unicode"; if that succeeds, it treats the message template character set as "Unicode", otherwise it uses "Windows-1252".  If a character set is specified in the Preferences file, for example, messageTemplateCharset=ISO-8859-1, then CustomMailer will read your message template file using the specified character set, and it is an error if your file contains characters outside this character set.  When saving a message template, CustomMailer will use the messageTemplateCharset value if specified, otherwise the character set of the most recently read message template, and otherwise "Windows-1252".  NOTE: The "Unicode" setting will read either "UnicodeLittle" or "UnicodeBig" files; however, CustomMailer saves all Unicode files as "UnicodeLittle", as is appropriate on Windows systems.
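The "automatic" mode described above can be illustrated with a minimal Java sketch.  The text does not say how CustomMailer decides that a file is "Unicode"; this sketch assumes the decision is based on a leading byte order mark (FF FE for "UnicodeLittle", FE FF for "UnicodeBig").  The class and method names are hypothetical and are not part of CustomMailer.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper illustrating "automatic" mode; not CustomMailer's actual code. */
final class TemplateCharsetGuess {

    private static final Charset WINDOWS_1252 = Charset.forName("windows-1252");

    /** Assumption: a byte order mark signals a Unicode file; anything else is Windows-1252. */
    static Charset guess(byte[] fileBytes) {
        if (fileBytes.length >= 2) {
            int b0 = fileBytes[0] & 0xFF;
            int b1 = fileBytes[1] & 0xFF;
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE; // "UnicodeLittle"
            }
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE; // "UnicodeBig"
            }
        }
        return WINDOWS_1252;
    }

    /** Decodes the file contents, skipping the 2-byte mark when one was found. */
    static String decode(byte[] fileBytes) {
        Charset charset = guess(fileBytes);
        int offset = charset.equals(WINDOWS_1252) ? 0 : 2;
        return new String(fileBytes, offset, fileBytes.length - offset, charset);
    }

    public static void main(String[] args) {
        byte[] littleEndian = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        System.out.println(guess(littleEndian) + " -> " + decode(littleEndian));
    }
}
```

A file without a byte order mark falls through to Windows-1252 here, which mirrors the fallback order given in the text.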

The character set used to read your mailing list is specified using the variable mailingListCharset= in the Preferences file, and it works the same way as messageTemplateCharset above.  That is, if a value is not specified (the default), then CustomMailer operates in "automatic" mode, reading the mailing list in "Unicode" if it can, otherwise using "Windows-1252".  If a value is specified, the specified character set will be used.  When saving the mailing list, CustomMailer will use the mailingListCharset value if specified, otherwise the character set of the most recently read mailing list, and otherwise "Windows-1252".

There are various applications you can use to create (or convert to) Unicode files.  For example, you can create (or import) your message template or mailing list in Microsoft Word, then select the "Save As..." menu command, and then change the "Save as Type" pop-up menu to "Unicode text (*.txt)".

As delivered, CustomMailer does not support reading message templates and mailing lists in other international character sets such as ISO-8859-n (n>1), Big5, GB2312, Shift-JIS, etc.  However, CustomMailer can be configured to provide this support if desired; see "Additional character sets" below.

Display and fonts
As delivered, CustomMailer is able to display all alphabet-based languages in both the message template and mailing list.  This includes languages that use alphabets such as Latin, Greek, Cyrillic, Hebrew, Arabic, Devanagari, etc.  However, the default fonts of CustomMailer do not support the display of ideograph-based languages such as Chinese, Japanese, Korean, Thai, etc.  If you read in a Unicode message template file that uses an ideograph-based character set, you will see the ideographs replaced by question marks or little box characters in CustomMailer.  However, the underlying content will still be correct and you can send messages written in these languages successfully to your recipients.  If you would like to enable CustomMailer to display message templates and mailing lists using ideographic characters, you can modify CustomMailer to support this as described in "Additional character sets" below.

Sending mail
As delivered, CustomMailer supports the sending of plain text mail messages using the following character encodings: US-ASCII, ISO-8859-1, Windows-1252, and UTF-8.  By default, CustomMailer will examine each email message and select the appropriate character encoding for sending it.  If your message is entirely within the yellow area in the table above, CustomMailer will send it as "US-ASCII".  If your message falls within the yellow and blue areas, CustomMailer will send it as "ISO-8859-1".  If your message uses characters in the yellow, blue, and green areas, CustomMailer will send it as "Windows-1252" (and not "Cp1252", which we found some systems do not recognize).  If your message contains Unicode characters not in the table above, then CustomMailer will automatically convert it to the "UTF-8" encoding scheme.  UTF-8 is the Internet standard for encoding 16-bit Unicode characters as 8-bit characters, which is necessary for sending Unicode data over the Internet since the SMTP standard only supports 8-bit codes.  CustomMailer will encode not only your message body, but also the SUBJECT and ORGANIZATION fields, as well as the proper name portions of the TO, FROM, CC, BCC, REPLY TO, and RETURN TO fields.
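The selection rule above (use the least capable encoding that can represent the whole message) might be sketched in Java as follows.  This is an illustration only: the class name is hypothetical, and it relies on the standard java.nio.charset encoders rather than on whatever CustomMailer does internally.  Note that Java's ISO-8859-1 encoder also accepts the control codes 128-159, which the table does not show as part of the blue area.

```java
import java.nio.charset.Charset;

/** Hypothetical sketch of "pick the smallest charset that fits"; not CustomMailer's code. */
final class SendCharsetChooser {

    // Candidates ordered from least to most capable, as listed in the text.
    private static final String[] CANDIDATES = {
        "US-ASCII", "ISO-8859-1", "windows-1252", "UTF-8"
    };

    /** Returns the name of the first candidate able to encode every character. */
    static String choose(String message) {
        for (String name : CANDIDATES) {
            if (Charset.forName(name).newEncoder().canEncode(message)) {
                return name;
            }
        }
        // Only reached for malformed text such as an unpaired surrogate.
        return "UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(choose("Hello"));         // US-ASCII
        System.out.println(choose("caf\u00e9"));     // ISO-8859-1
        System.out.println(choose("\u20ac 5"));      // windows-1252
        System.out.println(choose("\u4e2d\u6587"));  // UTF-8
    }
}
```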

Most Internet mail systems support full 8-bit transmission, but a few older systems only support 7-bit transmission.  If CustomMailer determines that a given recipient's SMTP server can only receive 7-bit data, CustomMailer converts any non-ASCII 8-bit characters using a 7-bit transfer encoding method known as "quoted-printable".  However, since there are a few 8-bit systems that have difficulties with quoted-printable encoding, CustomMailer always sends 8-bit characters to 8-bit systems.  Since US-ASCII encoded messages fit in 7 bits, they will work on both 7-bit and 8-bit systems and are sent as is.
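To show what "quoted-printable" does to non-ASCII 8-bit characters, here is a simplified Java sketch.  It is not CustomMailer's implementation, and it deliberately omits parts of the full encoding (the 76-character line limit with soft line breaks, and the special handling of trailing spaces), so treat it as an illustration of the =XX escaping only.  The class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Simplified quoted-printable escaping; omits soft line breaks and trailing-space rules. */
final class QuotedPrintableSketch {

    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder();
        for (byte b : data) {
            int v = b & 0xFF;
            boolean printableAscii = v >= 32 && v <= 126 && v != '=';
            boolean lineBreak = v == '\r' || v == '\n';
            if (printableAscii || lineBreak) {
                out.append((char) v);                    // 7-bit safe, sent as is
            } else {
                out.append(String.format("=%02X", v));   // escaped as '=' plus two hex digits
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(encode(latin1)); // prints "caf=E9"
    }
}
```

The output contains only 7-bit characters, which is why this transfer encoding is suitable for servers that cannot accept 8-bit data.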

CustomMailer also allows you to force the character encoding it uses for sending mail.  To send your message with a specific encoding, locate the line mailSendingCharset= in the file CustomMailerPreferences.txt in the CustomMailer 4.0\CustomMailerApp folder and change it to, for example, mailSendingCharset=ISO-8859-1.  This will force the message to be sent in the specified character encoding.  Any characters that fall outside the character set of a less-capable encoding are replaced by question marks ("?").  Forcing the character encoding to a more-capable encoding will also work.  If the more-capable encoding is UTF-8, the message will be reencoded and sent as UTF-8, regardless of whether UTF-8 was required.  If Windows-1252 is specified for an otherwise US-ASCII or ISO-8859-1 message, or if ISO-8859-1 is specified for an otherwise US-ASCII message, the characters won't have to be reencoded but the message will be designated as using the specified character set.
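The question-mark replacement described above is the same behavior that Java's standard String.getBytes(Charset) exhibits, so it can be previewed with a few lines of Java.  This sketch only demonstrates the effect of forcing a less-capable encoding; the class name is hypothetical and it is not taken from CustomMailer.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo of what a forced, less-capable charset does to a message. */
final class ForcedCharsetDemo {

    /** Encodes with the forced charset, then decodes again to show what a recipient would see. */
    static String roundTrip(String text, Charset forced) {
        // getBytes substitutes the charset's replacement byte ('?') for unmappable characters.
        byte[] wire = text.getBytes(forced);
        return new String(wire, forced);
    }

    public static void main(String[] args) {
        String message = "Gr\u00fc\u00dfe \u03a9";
        System.out.println(roundTrip(message, StandardCharsets.ISO_8859_1)); // Greek omega becomes "?"
        System.out.println(roundTrip(message, StandardCharsets.UTF_8));      // unchanged
    }
}
```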

CustomMailer handles HTML messages differently.  HTML supports its own internal character encoding scheme using what's known as "numeric character references", for example: &#133;.  CustomMailer automatically converts all non-ASCII characters in your HTML message to numeric character references.  The result is a US-ASCII encoded message which can be sent unmodified to either 7-bit or 8-bit systems.  In this way HTML messages can contain any non-ASCII characters (including Unicode) and they will be transmitted correctly.  This special handling only applies to the HTML portion of the message; the alternate text portion of an HTML message, as well as all the header fields, are handled like a plain text message and encoded as described above.
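The conversion to numeric character references can be sketched in Java as shown here.  This is a hedged illustration rather than CustomMailer's own routine: the class name is hypothetical, and it assumes each non-ASCII character is written as its decimal Unicode code point in the form &#N; (the text does not state which numbering CustomMailer uses).

```java
/** Hypothetical sketch: replace every non-ASCII character with a decimal numeric reference. */
final class HtmlNumericRefs {

    static String toAscii(String html) {
        StringBuilder out = new StringBuilder(html.length());
        // Iterate by code point so characters outside the 16-bit range stay whole.
        html.codePoints().forEach(cp -> {
            if (cp < 128) {
                out.append((char) cp);                   // plain US-ASCII passes through
            } else {
                out.append("&#").append(cp).append(';'); // e.g. U+00E9 becomes &#233;
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("<p>caf\u00e9</p>")); // prints "<p>caf&#233;</p>"
    }
}
```

Because the result contains only US-ASCII characters, no transfer encoding is needed for the HTML part.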

As delivered, CustomMailer does not support sending messages in other international character sets such as ISO-8859-n (n>1), Big5, GB2312, Shift-JIS, etc.  However, CustomMailer can be configured to provide this support if desired; see "Additional character sets" below.

Receiving mail
Finally, having received a message from you, your recipients must have their mail readers set up to recognize the proper encodings and display with the appropriate fonts.  Most modern mail readers automatically shift to the proper character set based on the character set specified by CustomMailer in the MIME header of the email.  No Windows, Macintosh, or Unix systems that we know of have a problem receiving US-ASCII or ISO-8859-1.  All current Windows, Macintosh, and Unix systems also can handle Windows-1252, though sometimes older systems have trouble recognizing the unique Windows-1252 characters or the supposedly equivalent "Cp1252" name for the Windows-1252 character set.  For this reason, staying within the ISO-8859-1 character set (or even US-ASCII) is a good idea if absolutely reliable mail is a must.

All current operating systems and mail readers are also capable of reading UTF-8.  However, some mail readers may need to be specially configured to recognize UTF-8 and display the resulting characters using appropriate fonts.  Almost all modern mail readers do this automatically.  Those that don't generally have a preferences or menu option to select the character coding explicitly.  Note that in many modern mail programs (for example, the Netscape mail reader) there is a way to specify the default Character Coding, but this just specifies the character coding that will be used when the mail message itself does not say what character set it is using, which is not an issue since CustomMailer always specifies the character set in its messages explicitly.

It is also necessary to make sure the font used by the mail reader to display the message supports the intended characters.  Almost all fonts support US-ASCII and ISO-8859-1.  Many fonts also support Windows-1252, but there are some that substitute little boxes, question marks, or spaces for the unique Windows-1252 characters.  For Unicode messages, the recipient needs to set up an appropriate font for their language.  For example, under the Netscape mail reader, in Preferences: Appearance: Fonts you can set the fonts for Unicode to any mainstream font like Courier, Times New Roman, or Helvetica and all the alphabetic languages will display correctly (including Greek, Russian, Arabic, Hebrew, Devanagari, etc.).  But the ideographic languages require special fonts.  For example, if you set the Netscape Unicode Font to MS Song (the Microsoft font used for Chinese), then you will see a Chinese message in Chinese ideographs, though then the other non-8-bit languages such as Greek, Russian, Arabic, Hebrew, Devanagari, etc. will no longer work. 

Fortunately, you can generally rely on the fact that almost any non-Western language recipient already receives lots of email from all over the world and therefore has already configured their email reader to display UTF-8 encoded mail in a font appropriate for their native language.

If you need more help, here are some useful links about how to set up a browser/mail reader for Chinese and other Asian languages.
     http://www2.meu.unimelb.edu.au/Webmentor/courses/nalsas/info/Demo/DemoMain.htm
     http://chinese.yahoo.com/docs/info/download.html
     http://www.hknet.com/HKNet/chinfaq.html
     http://www.people.virginia.edu/~mk3u/mk_lab/Chinese_opening.htm


Additional character sets (ISO-8859-n, Big5, GB2312, Shift-JIS, etc.)
As delivered, CustomMailer does not support other international character sets such as ISO-8859-n (n>1), Big5, GB2312, Shift-JIS, etc. for reading your message templates and mailing lists or for sending mail.  This is because the Java Runtime Environment (JRE 1.1.8) we ship with CustomMailer does not include Sun's internationalization libraries for encoding and decoding these character sets.  In addition, by default CustomMailer does not use the fonts your system may have for displaying ideographic character sets for your locale.  However, CustomMailer can be extended to provide both capabilities by following the steps below.  A more convenient installation procedure and preferences user interface is contemplated for a future release.  If you find your locale and language are not supported, please contact Wildcrest Associates.

1) Download and install CustomMailer international extensions
Download the file http://www.wildcrest.com/Software/CustomMailer/CMi18n.zip.  Unzip this file using WinZip (or similar software for reading .zip files) and place the resulting files in the folder:
    C:\Program Files\CustomMailer 4.0\JRE\1.1\lib
(or wherever you installed CustomMailer 4.0 if not in this default location).  These "i18n" extensions allow a large number of character sets to be supported.  For a complete list of the "extended" character encodings, see: http://java.sun.com/j2se/1.3/docs/guide/intl/encoding.doc.html

2) Change character set used for sending mail
To get CustomMailer to send mail using any of the "extended" character encodings, use a regular text editor like WordPad to open the file:
      C:\Program Files\CustomMailer 4.0\CustomMailerApp\CustomMailerPreferences.txt
(or wherever you installed CustomMailer 4.0 if not in this default location).  Toward the end of this file, locate the line:
   mailSendingCharset=
To send messages encoded as, for example, Big5 (Traditional Chinese), change this to:
   mailSendingCharset=Big5

3) Change character set used for reading message template files
If you use your word processor and/or input method system to create message template files using any of the "extended" character encodings, you can read these files into CustomMailer by doing the following.  Use a text editor to open the file:
      C:\Program Files\CustomMailer 4.0\CustomMailerApp\CustomMailerPreferences.txt
Toward the end of this file, locate the line:
   messageTemplateCharset=
To read message template files encoded in, for example, Big5, change this to:
   messageTemplateCharset=Big5

4) Change character set used for reading mailing list files
If you use your word processor, spreadsheet program, and/or input method system to create mailing list files using any of the "extended" character encodings, you can read these files into CustomMailer by doing the following.  Use a text editor to open the file:
      C:\Program Files\CustomMailer 4.0\CustomMailerApp\CustomMailerPreferences.txt
Toward the end of this file, locate the line:
   mailingListCharset=
To read mailing list files encoded in, for example, Big5, change this to:
   mailingListCharset=Big5

5) Select the locale of your system
Under the Windows Start menu go to Settings: Control Panel and run Regional Settings.  Make sure your system is set to the desired locale (which it probably already is).  Availability of locales depends on your particular version of Windows. 

6) Change fonts used for display
As delivered, the fonts used by CustomMailer can display all languages of Western Europe and the Americas plus certain others.  In addition, the CustomMailer international extensions (installed in step 1) enable CustomMailer to use appropriate fonts for the following 8 Asian and Middle Eastern languages, assuming these are available on your system:
   Arabic, Hebrew, Russian, simplified Chinese, traditional Chinese, Japanese, Korean, Thai
To use any of these, open the following file in a text editor:
      C:\Program Files\CustomMailer 4.0\CustomMailerApp\CustomMailerPreferences.txt
Toward the end of this file, locate the lines:
   messageHeadersFont=
   messageBodyFont=
   mailingListFont=

These let you control the font for displaying the message template headers, the message template body (typically a monospaced font), and the mailing list.  The default values for these are helvetica, courier, and helvetica, respectively.  These default settings support the entire Windows-1252 character set.  Each of the 8 languages listed above has its own locale-specific font definitions; to use them, change the three lines to:
   messageHeadersFont=sansserif
   messageBodyFont=dialoginput
   mailingListFont=sansserif

(It appears that the default fonts support Arabic, Hebrew, and Russian as well, so for them this change may be optional.)  If you wish to experiment, other possible values you can try on your system are: serif, dialog, monospaced, timesroman, and zapfdingbats.  These alternatives will mostly vary in how they treat Latin characters interspersed with the characters of your locale.

Having made these modifications, you should now be able to read in your character encoded message template and mailing list files, display them with a font appropriate for your locale, and send email using any desired character set. In addition, depending on how your operating system is set up, you may be able to enter and edit your message template and mailing list directly in CustomMailer using your system's special keyboard or input method facility or by using cut-and-paste to transfer text from other applications into CustomMailer.
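Before relying on a character set name entered in steps 2 to 4, it can help to confirm that a Java runtime recognizes it.  The sketch below is one possible way to do that and is not part of CustomMailer.  It assumes that the key=value lines of CustomMailerPreferences.txt can be parsed by java.util.Properties, which the documentation does not confirm, and the class and method names are hypothetical.  An empty or missing value is reported as "automatic", matching the default behavior described earlier.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.util.Properties;

/** Hypothetical checker for the charset settings; assumes a Properties-compatible file. */
final class PreferencesCharsetCheck {

    /** Parses key=value text such as the lines shown in steps 2 to 4. */
    static Properties parse(String text) {
        Properties prefs = new Properties();
        try {
            prefs.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return prefs;
    }

    /** Reports whether the charset named under the given key is usable by this Java runtime. */
    static String describe(Properties prefs, String key) {
        String value = prefs.getProperty(key, "").trim();
        if (value.isEmpty()) {
            return "automatic";
        }
        try {
            return (Charset.isSupported(value) ? "supported: " : "unsupported: ") + value;
        } catch (IllegalCharsetNameException e) {
            return "unsupported: " + value; // the name contains characters not allowed in charset names
        }
    }

    public static void main(String[] args) {
        Properties prefs = parse("mailSendingCharset=Big5\nmessageTemplateCharset=\n");
        System.out.println(describe(prefs, "mailSendingCharset"));
        System.out.println(describe(prefs, "messageTemplateCharset")); // automatic
    }
}
```

The result reflects the Java runtime that executes the check, so it is only meaningful for CustomMailer when run with the same runtime and extensions that CustomMailer uses.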