Spellex SDK for Java™ - Technical Support
Adding a spell checker to a Java applet using Spellex
This document describes how to use the Spellex Java SDK to create a Java applet that can check
text entered in Web pages. The technique presented here:
Works in all browsers that support Java
Runs entirely on the client side and requires no server-side code, servlets, or CGI scripts
Works for both signed and unsigned applets
Source code for a working example applet is
included with the Spellex Spelling Checker Engine Java SDK.
This document assumes familiarity with the Java
language, developing Java applets, adding applets to Web pages,
etc. See http://java.sun.com for a good source of information on Java and applets.
Why spell checking in an applet isn't as easy as you might think
Spelling checkers work by comparing words being
checked against a dictionary of words known to be spelled
correctly. Any words not found in the dictionary are reported
as misspelled. To avoid annoying the user with spurious error
reports, the dictionary should contain most common words in
a given language. This requirement means the dictionary must
contain a large number of words -- typically, 100,000 or more.
For efficiency, the dictionary should be compressed to reduce
memory and disk space usage and indexed for fast access. As
a result, dictionaries are implemented as large, complex data
structures that are typically stored in disk files.
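The lookup idea described above can be illustrated with a minimal sketch. It assumes a tiny in-memory word set in place of a real compressed, indexed dictionary; the class and method names are hypothetical and are not part of the Spellex class library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only; not part of the Spellex class library.
class DictionaryLookupSketch {

    // Returns the words in `text` that are not present in `dictionary`.
    // The dictionary is assumed to hold lower-case words.
    static List<String> findMisspelled(String text, Set<String> dictionary) {
        List<String> misspelled = new ArrayList<>();
        // Split on runs of characters that are neither letters nor apostrophes.
        for (String word : text.split("[^\\p{L}']+")) {
            if (word.isEmpty()) {
                continue; // split can yield an empty leading token
            }
            if (!dictionary.contains(word.toLowerCase(Locale.ROOT))) {
                misspelled.add(word);
            }
        }
        return misspelled;
    }

    public static void main(String[] args) {
        // A real dictionary would hold 100,000 or more words.
        Set<String> dictionary =
                new HashSet<>(Arrays.asList("the", "cat", "sat", "on", "mat"));
        System.out.println(findMisspelled("The cat szat on the mat.", dictionary));
    }
}
```

A hash set keeps the sketch short, but it stores every word uncompressed in memory, which is the cost that compressed, indexed dictionary files are meant to avoid.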
Most Web browsers prevent unsigned applets from
accessing disk files on the local computer for security reasons.
One solution to this restriction is to digitally sign the
applet. However, this approach introduces complications (see
"Creating signed, persistent Java applets," Dr. Dobb's Journal, Feb. 1999 for more information):
Internet Explorer and Netscape require different
Netscape requires the browser user to explicitly
grant permissions to the signed applet, resulting in additional
complexity and confusion that discourages casual use of
The dictionary files must be downloaded
to the client computer and placed in a known location,
or the location of the files must be configured in the
applet in some way.
Two alternative approaches for accessing dictionary
files exist which do not have these complications:
Store the dictionary files in the archive
(JAR or ZIP file) containing the applet, and access them
Store the dictionary files on the same Web
server as the applet and access them as URLs.
Both of these approaches require that the dictionary
files be accessed as InputStreams. Beginning with version
Java SDK allows lexicons (dictionaries) to be constructed
One further complication exists: Netscape allows
applets to access file resources in JAR or ZIP archives only
if the file has an extension included in a list of acceptable
extensions (see http://developer.netscape.com/docs/technote/java/getresource/getresource.html for more information). Spellex's dictionaries use "clx"
for compressed lexicons and "tlx" for text lexicons,
neither of which is included in Netscape's list of allowed
extensions. New extensions can be added to the list, but this
requires Netscape-specific code, which contradicts the design
goal of a single solution for all browsers. A simpler solution
is to rename the dictionary files to use an allowed extension,
such as "t" in place of "clx" and "txt"
in place of "tlx".
Adding a spell checker to an applet
We will assume that the applet spell-checks
text contained in a Java TextArea component, and that it has
a button or some other event source to start the spelling
check. We will use the SpellingDialog class from Spellex's
AWTDemo to interact with the user when spelling errors
are detected. (We used an AWT-based applet, but JFC/Swing
could be used just as well.) We'll also use the PropSpellingSession
class, which is part of the Spellex class library, to construct
a spelling session and initialize it from settings contained
within a properties (java.util.Properties) file. A spelling
session is an instance of the spell-check engine. It contains
methods for checking the spelling of text, looking up suggestions
for misspelled words, etc. When the applet is deployed, we
will store its classes and the properties file in a JAR file.
The properties file lists the spelling options
(e.g., "ignore capitalized words" or "report
doubled words") and the dictionaries used by the spelling
checker. More importantly, it specifies the location of the
dictionaries, and the method used to access them. In this
design, the dictionary files will be located on the Web server
in the same directory as the Web page containing the applet.
They could also be located in sub-directories, but cannot
be located in higher-level directories because some browsers
will not allow this. The dictionary files will be accessed
through URL streams for reasons that will be given shortly.
The properties-file lines that specify the location and access
method for dictionaries (lexicons) might look like the following:
The properties file lines specify the name of
the dictionary file (e.g., correct.tlx), the method of accessing
the file ("url", meaning the files are accessed
as URL streams), and the format of the dictionary ("t"
for text lexicons and "c" for compressed). Note
that Netscape's restriction on file extensions does not apply
when files are accessed as URL streams.
We could have elected to store the dictionary
files in the JAR file containing the applet. The PropSpellingSession
class supports this, and the JAR-file approach does have the
advantage of keeping the applet and its files together in
one place. However, compressed main dictionary files tend
to be large (ssceam2.clx, the American English dictionary,
is over 300K). If the applet's JAR file is large, the Web
page containing the applet will take a long time to load on
computers with slow Internet connections. If the dictionaries
are accessed as URL streams, loading of them can be deferred
until the spelling check starts.
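As a rough sketch of the URL-stream approach, the code below resolves a dictionary file name against the URL of the Web page and opens the stream only when asked, which is what allows loading to be deferred. It uses only standard java.net classes; the class name, method names, and example host are assumptions made for illustration, and PropSpellingSession's own loading code is not shown here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URL;

// Hypothetical illustration only; not part of the Spellex class library.
class DictionaryUrlSketch {

    // Resolves a dictionary file name relative to the Web page's URL.
    // In an applet, the page URL would typically come from getDocumentBase().
    static URI resolveDictionary(URI pageUri, String fileName) {
        return pageUri.resolve(fileName);
    }

    // Opens the dictionary as an InputStream. Nothing is downloaded until
    // this method is called, so the cost can be postponed until the
    // spelling check actually starts.
    static InputStream openDictionary(URI pageUri, String fileName) throws IOException {
        URL url = resolveDictionary(pageUri, fileName).toURL();
        return url.openStream();
    }

    public static void main(String[] args) {
        // Example page location (made up for this sketch).
        URI page = URI.create("http://www.example.com/spell/page.html");
        System.out.println(resolveDictionary(page, "correct.tlx"));
        System.out.println(resolveDictionary(page, "lex/ssceam2.clx"));
    }
}
```

The second call shows a dictionary in a sub-directory of the page's directory, which the design above also permits.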
The user enters some text in the applet's TextArea,
then clicks the button to start the spelling check. In response
to the button press, the applet creates a PropSpellingSession
object, which initializes the spelling-checker engine by setting
options and opening dictionaries specified in the properties
file. Because the properties file is stored in the applet's
JAR file, we use getResourceAsStream, which is a method of
java.lang.Class. The getResourceAsStream method locates a
file in the applet's code base (the JAR file), opens it, and
returns an InputStream object. The InputStream is used to
load properties into the java.util.Properties object. PropSpellingSession
takes care of the details required to load the dictionary
files as URLs.
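The properties-loading step might be sketched as follows, using only java.util.Properties and an InputStream. The resource name and the property key are placeholders invented for this example; they are not the names Spellex uses, and the PropSpellingSession constructor is deliberately not shown because its signature is not given here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

// Hypothetical illustration only; not part of the Spellex class library.
class PropertiesLoadingSketch {

    // Loads properties from a stream and closes the stream afterwards.
    // getResourceAsStream returns null when the resource is missing,
    // so that case is reported explicitly.
    static Properties loadProperties(InputStream in) throws IOException {
        if (in == null) {
            throw new IOException("Properties resource not found");
        }
        Properties props = new Properties();
        try {
            props.load(in);
        } finally {
            in.close();
        }
        return props;
    }

    public static void main(String[] args) throws IOException {
        // In an applet the stream would come from the JAR file, for example:
        //   InputStream in = getClass().getResourceAsStream("spelling.properties");
        // ("spelling.properties" is a made-up file name.)
        // Here an in-memory stream with a made-up key stands in for it.
        String sample = "exampleOption=true\n";
        InputStream in =
                new ByteArrayInputStream(sample.getBytes(StandardCharsets.ISO_8859_1));
        Properties props = loadProperties(in);
        System.out.println(props.getProperty("exampleOption"));
    }
}
```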
Because we will be checking the contents of
a TextArea component, we can use the TextAreaWordParser class,
which is part of Spellex's AWTDemo program. This class
implements Spellex's WordParser interface, which is
used by the engine to enumerate individual words in a text
source. WordParser-derived classes like TextAreaWordParser
also allow misspelled words to be corrected.
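To make the word-parser idea concrete, here is a minimal sketch of a parser that walks a text buffer word by word and can replace the current word in place. It does not implement Spellex's WordParser interface, whose methods are not listed in this document; the class and its method names are assumptions for illustration only.

```java
// Hypothetical illustration only; this is not Spellex's WordParser interface.
class SimpleWordParserSketch {
    private final StringBuilder text;
    private int wordStart = 0; // start of the current word
    private int wordEnd = 0;   // end of the current word (exclusive)

    SimpleWordParserSketch(String initialText) {
        text = new StringBuilder(initialText);
    }

    // Advances to the next word and returns it, or returns null when
    // no words remain.
    String nextWord() {
        int i = wordEnd;
        // Skip separators (anything that is not a letter).
        while (i < text.length() && !Character.isLetter(text.charAt(i))) {
            i++;
        }
        if (i >= text.length()) {
            wordStart = wordEnd = text.length();
            return null;
        }
        wordStart = i;
        while (i < text.length() && Character.isLetter(text.charAt(i))) {
            i++;
        }
        wordEnd = i;
        return text.substring(wordStart, wordEnd);
    }

    // Replaces the current word. Only meaningful after nextWord() has
    // returned a non-null word. Scanning resumes after the replacement.
    void replaceWord(String replacement) {
        text.replace(wordStart, wordEnd, replacement);
        wordEnd = wordStart + replacement.length();
    }

    String getText() {
        return text.toString();
    }

    public static void main(String[] args) {
        SimpleWordParserSketch parser = new SimpleWordParserSketch("Helo world");
        String word;
        while ((word = parser.nextWord()) != null) {
            if (word.equals("Helo")) {
                parser.replaceWord("Hello");
            }
        }
        System.out.println(parser.getText());
    }
}
```

Tracking the start and end offsets of the current word is also what makes it possible to highlight that word in a text component.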
The next and final step for the applet is to
construct a SpellingDialog object. SpellingDialog takes over
from this point. It calls on the TextAreaWordParser object
to obtain words from the TextArea one by one and passes them
to the spelling-checker engine for checking. When it encounters
a misspelled word, it displays the word and asks the engine
for a set of suggested replacements, which it also displays.
SpellingDialog also asks TextAreaWordParser to highlight the
misspelled word in the TextArea so the user can see the word
in context. The user can dispose of misspelled words by ignoring
them or replacing them. Any replacements are made directly
in the TextArea. When all words have been checked, the SpellingDialog
closes. The TextArea contains the checked and possibly corrected
text at this point.
Installing the applet
Once the applet has been compiled and tested
locally (using AppletViewer), it is ready for deployment.
The applet doesn't have to be signed to support the spell-check
features; of course, you can sign the applet if necessary
for other purposes. The following steps are required to deploy
the applet in a Web page:
Create a JAR file containing the applet's classes and properties file.
Upload the JAR file to the Web site directory where the Web page that uses the applet will reside.
Upload any dictionary files to the same directory on the Web site as the JAR file.
Upload the ssce.jar file (the Spellex class library) to the same directory on the Web site as the applet's JAR file.
Create a Web page with an APPLET tag similar to the following:
Upload the Web page to the same directory on the Web site as the JAR file.
Open the Web page in a browser, and you should
be able to enter text in the TextArea and check its spelling.
POSTing the text
At this point, we've described how to create
and deploy an applet that can check the spelling of some text,
but not much else. If you need to check spelling of text entered
into an existing applet, or a new applet you plan to develop,
then the technique described so far will be useful to you.
Presumably your applet does something useful with the text
entered by the user.
Many Web pages accept text entry from the user
in HTML forms. These forms typically contain "Submit"
buttons that send data entered in the form via a POST operation
to a CGI script on the Web server. An applet can implement
the entire form as AWT (or JFC) components, and submit the
text to the CGI script on the server from within the applet. This
is a general Java programming technique, so we will let the
Java experts explain it: See http://java.sun.com/docs/books/tutorial/networking/urls/readingWriting.html.
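For orientation, a minimal sketch of the POST technique using only standard java.net classes is shown below. The field name "body" and the class and method names are assumptions for illustration; the URL of the CGI script would be supplied by your own site.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

// Hypothetical illustration only; not part of the Spellex class library.
class FormPostSketch {

    // Encodes one form field the way a browser does for a POSTed form.
    static String encodeField(String name, String value)
            throws UnsupportedEncodingException {
        return URLEncoder.encode(name, "UTF-8") + "=" + URLEncoder.encode(value, "UTF-8");
    }

    // Sends the encoded form data to the script and returns the HTTP status code.
    // Not called in main, because it needs a real server to talk to.
    static int post(URL script, String encodedBody) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) script.openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // we intend to write a request body
        conn.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        OutputStream out = conn.getOutputStream();
        try {
            out.write(encodedBody.getBytes("UTF-8"));
        } finally {
            out.close();
        }
        return conn.getResponseCode();
    }

    public static void main(String[] args) throws IOException {
        // Shows only the encoding step; the text would come from the TextArea.
        System.out.println(encodeField("body", "Dear Sir & Madam"));
    }
}
```

Keep in mind that an unsigned applet can normally open connections only to the host it was loaded from, so the CGI script should live on the same Web server as the applet.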
Checking text in HTML forms
An alternative to the approach presented here involves checking
text entered into HTML forms. In this approach, the applet
provides public methods and properties that can be used by JavaScript code in the Web page.
For example, an applet could provide a public method named
"check" that takes a String as a parameter. When
this method is called, the applet invokes the SpellingDialog
to check the text passed in the String. Another public method
called "getText" returns the corrected text when
the spelling check is complete. Text in a textArea component
document.emailForm.body.value = document.spellingApplet.getText();
This code would be invoked as the "onclick"
attribute of a button in the form.
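On the Java side, the two public methods might take roughly the following shape. This is only a skeleton under stated assumptions: a real applet would extend java.applet.Applet and run the spelling dialog inside check(), and neither step is shown because the corresponding Spellex calls are not listed in this document. The class name is hypothetical.

```java
// Hypothetical illustration only; a real applet would extend
// java.applet.Applet and invoke the spelling dialog inside check().
class SpellingBridgeSketch {
    private String text = "";

    // Called by the page's script with the text of a form field.
    public void check(String textToCheck) {
        text = (textToCheck == null) ? "" : textToCheck;
        // An interactive spelling check would run here and update `text`
        // with the user's corrections. This sketch leaves it unchanged.
    }

    // Called by the page's script to fetch the (possibly corrected) text.
    public String getText() {
        return text;
    }

    public static void main(String[] args) {
        SpellingBridgeSketch bridge = new SpellingBridgeSketch();
        bridge.check("Some text from the form");
        System.out.println(bridge.getText());
    }
}
```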