14.4 Handling Localized Input

So far we have discussed how to generate pages in different languages, but most applications also need to deal with localized input. As long as you're supporting only Western European languages, the only thing you typically need to worry about is how to interpret dates and numbers. The JSTL I18N actions can help you with this as well.

Example 14-5 shows a JSP page with the same form for selecting a language as in Example 14-1, plus a form with one field for a date and another for a number.

Example 14-5. Date and number input form (input.jsp)

<%@ page contentType="text/html" %>
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
<%@ taglib prefix="fmt" uri="http://java.sun.com/jsp/jstl/fmt" %>
<%--
Set the locale to the selected one, if any. Otherwise, let the
<fmt:setBundle> action pick the best one based on the Accept-Language
header.
--%>
<c:if test="${param.language == 'en'}">
<fmt:setLocale value="en" scope="session" />
</c:if>
<c:if test="${param.language == 'sv'}">
<fmt:setLocale value="sv" scope="session" />
</c:if>
<c:if test="${param.language == 'de'}">
<fmt:setLocale value="de" scope="session" />
</c:if>
<fmt:setBundle basename="input" var="inputBundle" />
<fmt:setBundle basename="input" scope="session" />
<html>
<head>
<title>
<fmt:message key="title" />
</title>
</head>
<body bgcolor="white">
<h1>
<fmt:message key="title" />
</h1>
<fmt:message key="select_language" />
<form action="input.jsp">
<c:set var="currLang" value="${inputBundle.locale.language}" />
<input type="radio" name="language" value="en"
${currLang == 'en' ? 'checked' : ''}>
<fmt:message key="english" /><br>
<input type="radio" name="language" value="sv"
${currLang == 'sv' ? 'checked' : ''}>
<fmt:message key="swedish" /><br>
<input type="radio" name="language" value="de"
${currLang == 'de' ? 'checked' : ''}>
<fmt:message key="german" /><br>
<p>
<input type="submit"
value="<fmt:message key="new_language" />">
</form>
<form action="process.jsp" method="post">
<fmt:message key="date" /><br>
<br>
<jsp:useBean id="now" class="java.util.Date" />
<input type="text" name="date">
(<fmt:formatDate value="${now}" dateStyle="full" />)
<p>
<fmt:message key="number" /><br>
<br>
<input type="text" name="number">
(<fmt:formatNumber value="1000.9" pattern="####.00"/>)
<p>
<input type="submit"
value="<fmt:message key="submit" />">
</form>
</body>
</html>

The language selection part, the use of a bundle, and the <fmt:message> action to display localized text are exactly as in Example 14-1: if a specific language is requested, the corresponding locale is set for the session; otherwise the <fmt:setBundle> action figures out which one to use based on the Accept-Language header.

The second form in the page (with the date and number entry fields) uses the <fmt:formatDate> and <fmt:formatNumber> actions described earlier to add samples for the date and number, respectively. This helps the user enter the values in the required format. I set the dateStyle attribute to full, just to make the difference between the languages more visible. The default style is a better choice for a real application.

On to the most interesting part. Example 14-6 shows the JSP page that processes the submitted values.

Example 14-6. Processing localized input (process.jsp)

<%@ page contentType="text/html" %>
<%@ taglib prefix="fmt" uri="http://java.sun.com/jsp/jstl/fmt" %>
<html>
<head>
<title>Parsed Date and Number</title>
</head>
<body bgcolor="white">
<h1>Parsed Date and Number</h1>
Date string converted to the internal Java Date type:
<fmt:parseDate value="${param.date}" dateStyle="full" />
<p>
Number string converted to the internal Java Number type:
<fmt:parseNumber value="${param.number}" pattern="####.00" />
</body>
</html>

This page reads and interprets (parses) the localized text values for the date and number sent as parameters and converts them to the appropriate Java objects that represent dates and numbers, using the <fmt:parseDate> and <fmt:parseNumber> actions described in Tables 14-11 and 14-12.

<fmt:parseDate> and <fmt:parseNumber> complement the <fmt:formatDate> and <fmt:formatNumber> actions, and support most of the same attributes to describe the format of the value to be parsed. Note that the parsing actions in Example 14-6 specify the same text format as the formatting actions that generate the samples in the form: dateStyle is set to full and pattern to ####.00. This allows the parsing actions to handle text values in the prescribed format for the locale selected and saved in the session by the I18N actions in the input.jsp page.

In this example, the parsed values are simply added to the response in their default format to prove that the parsing works no matter which language you select. In a real application, the parsed values can be used as input to another action that requires a java.util.Date or Number object instead of a text value representing a date or a number, for instance the database actions:

<fmt:parseDate value="${param.date}" dateStyle="full"
var="parsedDate" />
<fmt:parseNumber value="${param.number}" pattern="####.00"
var="parsedNumber" />
<sql:update>
INSERT INTO MyTable (DateCol, NumberCol) VALUES(?, ?)
<sql:dateParam value="${parsedDate}" />
<sql:param value="${parsedNumber}" />
</sql:update>
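
The pairing between formatting and parsing can also be seen outside a JSP page. The following is a minimal sketch in plain Java, assuming that the JSTL actions delegate to the standard java.text classes (DateFormat for dateStyle, DecimalFormat for pattern), which is how JSTL implementations commonly work. The class and method names are hypothetical and exist only for this illustration. The sketch formats a sample with a given locale, parses it back with the same style and locale, and turns a failed parse into an unchecked exception.

```java
import java.text.DateFormat;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.text.ParseException;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class for illustration; not part of JSTL or the book's examples.
final class LocalizedParsingSketch {

    // Format a date with the FULL style for the given locale (compare dateStyle="full").
    static String formatDate(Date date, Locale locale) {
        return DateFormat.getDateInstance(DateFormat.FULL, locale).format(date);
    }

    // Parse text written in the FULL style for the given locale.
    // Input that does not match is reported as an unchecked exception.
    static Date parseDate(String text, Locale locale) {
        try {
            return DateFormat.getDateInstance(DateFormat.FULL, locale).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a date in the expected format: " + text, e);
        }
    }

    // Format a number with the pattern ####.00 and the locale's decimal separator.
    static String formatNumber(double value, Locale locale) {
        return new DecimalFormat("####.00", DecimalFormatSymbols.getInstance(locale)).format(value);
    }

    // Parse a number written with the same pattern and locale.
    static Number parseNumber(String text, Locale locale) {
        try {
            return new DecimalFormat("####.00", DecimalFormatSymbols.getInstance(locale)).parse(text);
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in the expected format: " + text, e);
        }
    }

    public static void main(String[] args) {
        Date now = new Date();
        for (String tag : new String[] {"en", "sv", "de"}) {
            Locale locale = Locale.forLanguageTag(tag);
            String dateText = formatDate(now, locale);
            String numberText = formatNumber(1000.9, locale);
            // Parsing succeeds because the same style, pattern, and locale are used.
            System.out.println(tag + ": " + dateText + " -> " + parseDate(dateText, locale));
            System.out.println(tag + ": " + numberText + " -> " + parseNumber(numberText, locale));
        }
    }
}
```

Parsing with a different locale or style than the one used for the sample is the typical cause of a parse failure, which is why the form and the processing page must agree on both.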

Both parsing actions throw exceptions if the specified value cannot be interpreted as a number or a date. You can embed the actions in the body of a <c:catch> action element, as shown in Chapter 12, to deal with invalid values.

14.4.1 Dealing with Non-Western European Input

An HTML form can be used for input in languages other than Western European, but the charset discussed earlier comes into play here as well. First of all, when you create a page with a form for entering non-Western European characters, you must tell the browser which charset should be used for the user input. One way to give the browser this information is to hardcode a charset name as part of the contentType attribute of the page directive, as in Figure 14-4:

<%@ page pageEncoding="Shift_JIS" contentType="text/html;charset=UTF-8" %>

The user can then enter values with the characters of the corresponding language (e.g., Japanese symbols). But there's something else to be aware of here. When the user submits the form, the browser first converts the form-field values to the corresponding byte values for the specified charset. It then encodes the resulting bytes according to the HTTP standard URL encoding scheme, the same way special characters such as space and semicolon are converted when an ISO-8859-1 encoding is used. The bytes for all characters other than ISO-8859-1 a-z, A-Z, and 0-9 are encoded as the byte value in hexadecimal format, preceded by a percent sign. For instance, the symbols for "Hello World" in Japanese are sent like the following if the charset for the form is set to UTF-8:

%E4%BB%8A%E6%97%A5%E3%81%AF%E4%B8%96%E7%95%8C

This code represents the URL-encoded UTF-8 byte codes for the five Japanese symbols (three bytes for each symbol). In order to process this information, the container must know which charset the browser used to encode it.
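
To make the encoding step concrete, here is a minimal sketch in plain Java that reproduces the URL-encoded value shown above with java.net.URLEncoder and then decodes it twice with java.net.URLDecoder, once with the right charset and once with the wrong one. The class and method names are hypothetical, and the Japanese text is written with Unicode escapes so that the source file's own encoding does not matter.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical class for illustration only.
final class FormEncodingSketch {

    // Convert the text to bytes in the given charset, then URL-encode those bytes.
    static String encode(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    // Reverse the URL encoding and interpret the bytes in the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The five symbols from the text, as Unicode escapes.
        String hello = "\u4ECA\u65E5\u306F\u4E16\u754C";

        String wire = encode(hello, "UTF-8");
        System.out.println(wire); // %E4%BB%8A%E6%97%A5%E3%81%AF%E4%B8%96%E7%95%8C

        // Decoding with the charset the browser used restores the original text.
        System.out.println(decode(wire, "UTF-8").equals(hello)); // true

        // Decoding with the wrong charset silently yields 15 unrelated characters,
        // one per byte, instead of the original five.
        System.out.println(decode(wire, "ISO-8859-1").length()); // 15
    }
}
```

The wrong-charset case does not fail with an error; it just produces different characters, which is why a mismatch tends to show up as garbled text rather than as an exception.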

The problem is that even though the HTTP specification says that the charset name must be sent in the Content-Type request header, most browsers don't send it. It's therefore up to you to keep track of this and tell the container which charset to use to decode the parameter values. If a fixed charset is used (e.g., always UTF-8, as in this example), you can use the <fmt:requestEncoding> action (see Table 14-13) like this in the page that processes the input:

<fmt:requestEncoding value="UTF-8" />

This action tells the container which charset to use, so parameter values accessed after the action element are decoded correctly. Note that you must insert this action before any actions that access request parameters; the container may decode all parameters in one shot, so it must be told which charset to use before the first parameter is used.

As long as you need to deal with only one non-Western European language, this is not so hard. But what if you need to handle input in multiple non-Western European languages, picked at runtime in the same fashion as in the previous examples for Western European languages, with each language using a different charset? Luckily, the JSTL I18N actions make this a lot easier than it sounds. Example 14-7 shows a JSP page with a form for entering a date and a text value in Japanese, Russian, or Greek.

Example 14-7. Non-Western European input page (input_nw.jsp)

<%@ page contentType="text/html" %>
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
<%@ taglib prefix="fmt" uri="http://java.sun.com/jsp/jstl/fmt" %>
<c:set var="lang" value="${param.language}" />
<c:choose>
<c:when test="${lang == 'el'}">
<fmt:setLocale value="el" scope="session" />
</c:when>
<c:when test="${lang == 'ru'}">
<fmt:setLocale value="ru" scope="session" />
</c:when>
<c:otherwise>
<fmt:setLocale value="ja" scope="session" />
<c:set var="lang" value="ja" />
</c:otherwise>
</c:choose>
<fmt:setBundle basename="dummy" scope="session" />
<html>
<head>
<title>
Non-Western European Input Test
</title>
</head>
<body bgcolor="white">
<h1>
Non-Western European Input Test
</h1>
<form action="input_nw.jsp">
<input type="radio" name="language" value="ja"
${lang == 'ja' ? 'checked' : ''}>
Japanese<br>
<input type="radio" name="language" value="el"
${lang == 'el' ? 'checked' : ''}>
Greek<br>
<input type="radio" name="language" value="ru"
${lang == 'ru' ? 'checked' : ''}>
Russian<br>
<p>
<input type="submit" value="New Language">
</form>
<form action="process_nw.jsp" method="post">
Enter a date:<br>
<jsp:useBean id="now" class="java.util.Date" />
<input type="text" name="date">
(<fmt:formatDate value="${now}" dateStyle="full" />)
<p>
Enter some text:<br>
<input type="text" name="text">
<p>
<input type="submit" value="Send" >
</form>
</body>
</html>

This page looks similar to the one used for Western European input in Example 14-5. Apart from the set of supported languages, and the fact that English is used for all descriptive text (because I don't know the other languages), the main difference is that Japanese is selected if no language is requested or the requested one is not supported, instead of letting a <fmt:setBundle> action pick a language based on the Accept-Language header. The reason for this is that you can define only one fallback locale for an application, and I already defined it as the English locale for the previous examples in this chapter. If that were not the case, I could have used exactly the same approach here and defined the Japanese locale as the fallback locale.

Even so, I still use a <fmt:setBundle> action in this page, with a dummy base name. This is just a hack to overwrite the default localization context. Without it, the JSTL formatting and parsing actions pick up the locale from the default localization context set for the session by the other examples.

The final difference is that the data entry form now contains a field for a text value instead of a field for a numeric value, just to show you how to deal with pure text in non-Western European languages. Everything else is the same, and this similarity between the two examples illustrates the beauty of the JSTL I18N actions: they hide a lot of the details you otherwise have to take care of yourself.

One detail you have to deal with when you support input in non-Western European languages is the setting of the charset for the form page.
Note that no charset is specified as part of the contentType attribute. The charset is instead set automatically by the first JSTL I18N action that sets the locale for the page. In Example 14-7, it's done by the <fmt:setLocale> action, but all JSTL I18N actions that select a locale based on the Accept-Language header, such as <fmt:setBundle>, do the same. In addition to setting the charset for the response generated by the page, these actions also save the selected charset as a session scope variable named javax.servlet.jsp.jstl.i18n.request.charset. You'll soon see why this is important.

Example 14-8 shows the process_nw.jsp page, the page that processes the input.

Example 14-8. Processing non-Western European input (process_nw.jsp)

<%@ page contentType="text/html;charset=UTF-8" %>
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
<%@ taglib prefix="fmt" uri="http://java.sun.com/jsp/jstl/fmt" %>
<%@ taglib prefix="fn" uri="http://java.sun.com/jsp/jstl/functions" %>
<fmt:requestEncoding />
<html>
<head>
<title>Processing Non-Western European Input</title>
</head>
<body bgcolor="white">
<h1>Processing Non-Western European Input</h1>
Text string converted to a Java Unicode string:
${fn:escapeXml(param.text)}
<p>
Date string converted to the internal Java Date type:
<fmt:parseDate value="${param.date}" dateStyle="full" />
</body>
</html>

The good news is that the most interesting difference between this page and the one for processing Western language input in Example 14-6 is the <fmt:requestEncoding> action at the beginning of the page. This action sets the charset used to read the request parameters, as described earlier. Note that I don't specify a specific charset using the value attribute. In this case, the action first looks for a Content-Type header (in case browsers one day actually comply with the HTTP specification) and then for the charset saved by a JSTL I18N action in the variable javax.servlet.jsp.jstl.i18n.request.charset.
After the request encoding is set, all parameter values accessed through the EL expressions are converted to Unicode. The page directive contentType for the process_nw.jsp page specifies the UTF-8 charset for the response, so that all languages can be displayed correctly.

To recap, the full round-trip goes like this. The charset for the page with the form is set dynamically based on the selected language by the I18N actions. When the form is submitted, the request parameters are passed to the target page, encoded with the charset used for the page with the form. They get decoded to Unicode by the EL expressions based on the encoding set by the <fmt:requestEncoding> action, and then encoded as UTF-8 in the response due to the contentType attribute value. That's all there is to it.

There are a couple of things you should be aware of, though. First of all, the I18N JSTL actions can set the charset for the response only as long as no part of the response has been sent to the browser (this is true for all response headers). By default, JSP pages are buffered using a buffer large enough for this rarely to be a problem, but if it doesn't work for your own pages, try increasing the buffer size as described in Chapter 16.

Another issue is that this functionality is based on the assumption that all containers deal with the charset setting in the same way. Unfortunately, the JSP 1.2 and Servlet 2.3 specs were vague about these details, for instance, whether a charset defined by the contentType attribute takes precedence over a charset defined dynamically by an action. A specification errata (clarification) was issued to correct this, but some JSP 1.2 containers may still not behave as expected. The JSP 2.0 and Servlet 2.4 specifications include clarifications of these details as well, so with a web container compliant with these specification versions, you shouldn't have a problem.

When all goes as planned, the result of processing Greek input looks like Figure 14-5.

Figure 14-5. Processed Greek input

As with the Western European input example, the decoded request parameter values are just added to the response. In a real-world application you can do anything you like with the values, such as storing them in a database.
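
The round trip recapped above can be simulated in plain Java, without a container. This is only a sketch under stated assumptions: the form page's charset for Greek is taken to be ISO-8859-7 (the charset a JSTL implementation actually picks for a locale may differ), the runtime is assumed to support that charset, and all class and method names are hypothetical. No servlet API is involved; each method stands in for one step of the process.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical names throughout; this simulates the steps and is not container code.
final class CharsetRoundTripSketch {

    // Step 1: the browser URL-encodes the typed text using the form page's charset.
    static String browserSubmit(String typed, String formCharset) {
        try {
            return URLEncoder.encode(typed, formCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + formCharset, e);
        }
    }

    // Step 2: the container decodes the parameter to a Unicode string,
    // using the charset it was told to use for the request.
    static String containerDecode(String wireValue, String requestCharset) {
        try {
            return URLDecoder.decode(wireValue, requestCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + requestCharset, e);
        }
    }

    // Step 3: the response is written as UTF-8, whatever the form's charset was.
    static byte[] responseBytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // A Greek greeting (8 letters), written with Unicode escapes.
        String typed = "\u039A\u03B1\u03BB\u03B7\u03BC\u03AD\u03C1\u03B1";
        String formCharset = "ISO-8859-7"; // assumption: charset of the form page

        String wire = browserSubmit(typed, formCharset);
        String decoded = containerDecode(wire, formCharset);
        byte[] out = responseBytes(decoded);

        System.out.println("wire value:        " + wire);
        System.out.println("decoded correctly: " + decoded.equals(typed));
        System.out.println("UTF-8 bytes:       " + out.length);
    }
}
```

The key point the sketch illustrates is that steps 1 and 2 must use the same charset, while step 3 is free to use a different one because the value is held as Unicode in between.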