Spellex SDK for Java™ - Technical Support

Problem: Most or all words are reported as misspelled. The suggestions list is empty or nearly empty.

Description:

Spellex is a lexicon-based (dictionary-based) spelling engine: it validates the spelling of words by checking that they exist in a dictionary of words known to be spelled correctly. The dictionary exists as a set of one or more files, which may be accessed directly or as resource streams. If none of the dictionary files can be opened, all words will be reported as misspelled and no suggestions will be offered. If only some of the dictionary files are opened, then many or most words will be reported as misspelled.

In the Java environment, many external factors can prevent files or resources from being accessed. For example, an unsigned Java applet is usually unable to open local files. The Spellex engine is simply Java code which is subject to the same constraints as any other Java code. The Spellex engine uses standard packages such as java.io to access dictionaries. It doesn't have any special powers or do anything tricky, undocumented, or "behind the scenes" which enables it to access otherwise inaccessible files or resources. If code you write can't open a certain file or access a certain resource stream, the Spellex engine will not be able to, either.

Virtually all issues with opening dictionaries stem from external or configuration factors:

  • The dictionary file name was misspelled (e.g., "ssecam2.clx" instead of "ssceam2.clx")

  • The dictionary file is located in a different directory (e.g., dictionary file name specified as "/myapp/dicts/ssceam2.clx", but ssceam2.clx is actually located in "/myapp/dicts/am")

  • Security issues prevent the file or resource from being accessed (e.g., in an applet, the dictionary is specified as a URL resource in a different domain).

  • The MainLexiconN value is malformed (e.g., set to "ssceam2.clx,c,resource" instead of "ssceam2.clx,resource,c")

  • The wrong dictionary format code is supplied (e.g., "ssceam.tlx,resource,c" instead of "ssceam.tlx,resource,t", or "ssceam2.clx,resource,t" instead of "ssceam2.clx,resource,c")
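
The last two causes in this list can be checked mechanically before the properties are handed to the engine. The following is a minimal sketch, not part of the Spellex SDK: the class LexiconSpecCheck and its check method are hypothetical, and it assumes that a MainLexiconN value has the form name,access,type, with the access methods ("file", "resource", "url") and format codes ("c", "t") shown in this document. It also assumes, based only on the examples above, that .clx names hold compressed lexicons and .tlx names hold text lexicons. The check is deliberately stricter than the engine itself may be.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the Spellex SDK.
class LexiconSpecCheck {
    // Access methods and format codes as they appear in this document.
    private static final List<String> ACCESS = Arrays.asList("file", "resource", "url");
    private static final List<String> TYPES = Arrays.asList("c", "t");

    /** Returns null if the value looks well formed, otherwise a description of the problem. */
    static String check(String value) {
        String[] parts = value.split(",", -1);
        if (parts.length != 3) {
            return "expected name,access,type but found " + parts.length + " field(s)";
        }
        String name = parts[0].trim();
        String access = parts[1].trim();
        String type = parts[2].trim();
        if (name.isEmpty()) {
            return "dictionary name is empty";
        }
        if (!ACCESS.contains(access)) {
            return "unknown access method \"" + access + "\"";
        }
        if (!TYPES.contains(type)) {
            return "unknown format code \"" + type + "\"";
        }
        // Assumption drawn from the examples above: .clx is compressed, .tlx is text.
        if (name.endsWith(".clx") && !type.equals("c")) {
            return ".clx name with format code \"" + type + "\" (expected \"c\")";
        }
        if (name.endsWith(".tlx") && !type.equals("t")) {
            return ".tlx name with format code \"" + type + "\" (expected \"t\")";
        }
        return null;
    }

    public static void main(String[] args) {
        String[] samples = {
            "ssceam2.clx,resource,c",
            "ssceam2.clx,c,resource",
            "ssceam.tlx,resource,c"
        };
        for (String sample : samples) {
            String problem = check(sample);
            System.out.println(sample + " -> " + (problem == null ? "looks well formed" : problem));
        }
    }
}
```

A check like this catches swapped fields and mismatched format codes only. It cannot tell whether the named file or resource actually exists.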

The key to solving problems with dictionary access is to diagnose the cause of the problem. Unfortunately this isn't always easy. Information in this document may help.

Starting with version 5.9, the PropSpellingSession class in the Spellex engine logs any failed attempts to open dictionaries to System.err. The System.err log may contain information that helps to explain why a dictionary could not be accessed.
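
In environments where System.err is not easy to see, it can help to capture it temporarily while the session is constructed. The sketch below uses only the standard System.setErr call. The class ErrCapture is hypothetical, and the message printed in main is a simulated stand-in, not actual Spellex output. In a real application the Runnable would construct the PropSpellingSession.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical helper for illustration; not part of the Spellex SDK.
class ErrCapture {
    /** Runs the task with System.err redirected into a buffer and returns what was written. */
    static String capture(Runnable task) {
        PrintStream original = System.err;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream replacement = new PrintStream(buffer, true);
        System.setErr(replacement);
        try {
            task.run();
        } finally {
            replacement.flush();
            System.setErr(original); // always restore the original stream
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        String log = capture(new Runnable() {
            public void run() {
                // Placeholder: construct PropSpellingSession here in a real application.
                System.err.println("simulated message written to System.err");
            }
        });
        System.out.println("Captured: " + log);
    }
}
```

System.setErr affects the whole JVM, so this is best kept to short diagnostic runs.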

The following list shows the standard Java classes and methods used internally for dictionary access in the Spellex engine:

  • CompressedLexicon(String fileName): java.io.RandomAccessFile(fileName, "r")

  • CompressedLexicon(InputStream is): java.io.DataInputStream(is)

  • FileTextLexicon(String fileName): java.io.FileInputStream(fileName)

  • PropSpellingSession:

    • "file" dictionary access (e.g., MainLexicon1=fileName,file,type):

      • If type is "c" or fileName appears to contain a compressed lexicon: CompressedLexicon(fileName, 0)

      • If type is "t" or fileName appears to contain a text lexicon: FileTextLexicon(fileName)

    • "resource" dictionary access (e.g., MainLexicon1=resourceName,resource,type):
      • java.io.InputStream is = getClass().getResourceAsStream(resourceName)

      • If type is "t": StreamTextLexicon(is)

      • If type is anything other than "t": CompressedLexicon(is)

    • "url" dictionary access (e.g., MainLexicon1=resourceName,url,type):
      • URL url = new java.net.URL(codeBase + resourceName) (codeBase is an optional parameter passed to the PropSpellingSession constructor)
      • java.io.InputStream is = url.openStream();

      • If type is "t": StreamTextLexicon(is)

      • If type is anything other than "t": CompressedLexicon(is)
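
Because the engine relies on these standard calls, the same calls can be tried without the Spellex classes to separate access problems from lexicon problems. The following is a rough sketch under that assumption. DictionaryAccessProbe and its method names are hypothetical, and the file and resource names in main are simply the examples used in this document. Note that getClass() here refers to the probe class, which may be loaded by a different class loader than PropSpellingSession, so a successful probe is a hint and not proof.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.RandomAccessFile;
import java.net.URL;

// Hypothetical diagnostic helper; uses only standard Java classes, not the Spellex SDK.
class DictionaryAccessProbe {

    /** Mirrors "file" access to a compressed lexicon: RandomAccessFile(fileName, "r"). */
    static String probeFile(String fileName) {
        try {
            RandomAccessFile raf = new RandomAccessFile(fileName, "r");
            try {
                return "file OK: " + raf.length() + " bytes";
            } finally {
                raf.close();
            }
        } catch (IOException | SecurityException e) {
            return "file FAILED: " + e;
        }
    }

    /** Mirrors "resource" access. An instance method, so getClass() is available. */
    String probeResource(String resourceName) {
        InputStream is = getClass().getResourceAsStream(resourceName);
        if (is == null) {
            return "resource FAILED: " + resourceName + " not found";
        }
        return readFirstByte("resource", is);
    }

    /** Mirrors "url" access: new URL(codeBase + resourceName), then openStream(). */
    static String probeUrl(String codeBase, String resourceName) {
        try {
            URL url = new URL(codeBase + resourceName);
            return readFirstByte("url", url.openStream());
        } catch (IOException | SecurityException e) {
            return "url FAILED: " + e;
        }
    }

    // Reads one byte to confirm the stream is usable, then closes it.
    private static String readFirstByte(String label, InputStream is) {
        try {
            try {
                return is.read() >= 0 ? label + " OK" : label + " FAILED: stream is empty";
            } finally {
                is.close();
            }
        } catch (IOException e) {
            return label + " FAILED: " + e;
        }
    }

    public static void main(String[] args) {
        // The names below are the examples used in this document; substitute your own.
        String codeBase = args.length > 0 ? args[0] : "";
        System.out.println(probeFile("/myapp/dicts/ssceam2.clx"));
        System.out.println(new DictionaryAccessProbe().probeResource("/ssceam2.clx"));
        System.out.println(probeUrl(codeBase, "ssceam2.clx"));
    }
}
```

Each probe returns the exception text on failure, which usually names the cause (file not found, access denied, malformed URL).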

One way to diagnose dictionary access problems is to insert test code into your application which attempts to open or access a dictionary file or resource in the same way as the Spellex engine (as listed above). For example, if you use PropSpellingSession to access "/ssceam2.clx" using a definition such as the following:

MainLexicon2=/ssceam2.clx,resource,c

then insert Java code similar to the following near the place in your code where PropSpellingSession is constructed:

InputStream is = getClass().getResourceAsStream("/ssceam2.clx");
if (null == is) throw new Exception("/ssceam2.clx not found");
CompressedLexicon lex = new CompressedLexicon(is);

Any exceptions raised can be caught and may help you to determine the reason PropSpellingSession was unable to access "/ssceam2.clx".

Note that when the "resource" access method is specified in the Properties object passed to PropSpellingSession, the PropSpellingSession class uses getClass().getResourceAsStream() to open a stream to the resource. If PropSpellingSession is constructed from a static method (such as "main"), in some Java implementations the call to getResourceAsStream will fail and the lexicon will not be opened. To avoid this, ensure that PropSpellingSession is constructed from a non-static object method (if necessary, create a class whose sole purpose is to call PropSpellingSession and provide access to the constructed PropSpellingSession object through a public member).

Different Java implementations have different rules about how resources with relative names are located. For example, if a Properties object contains the following definition:

MainLexicon1=ssceam.tlx,resource,t

the approach used to locate ssceam.tlx may vary depending on the Java implementation. It's generally safest to specify an absolute location for the resource:

MainLexicon1=/ssceam.tlx,resource,t
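
To see why the leading slash matters, it helps to look at the rule documented for Class.getResource in standard Java SE: a name beginning with "/" is looked up from the root of the class path, while any other name is prefixed with the package of the class, with dots replaced by slashes. The sketch below applies that documented rule by hand. ResourceNameDemo is a hypothetical class, java.util.Properties is only a stand-in for whichever class performs the lookup, and implementations that deviate from the documented rule will behave differently.

```java
// Hypothetical demo; applies the name-resolution rule documented for Class.getResource.
class ResourceNameDemo {

    /** Returns the name the class loader would be asked for, per the documented rule. */
    static String resolvedName(Class<?> owner, String name) {
        if (name.startsWith("/")) {
            return name.substring(1); // absolute: looked up from the resource root
        }
        String className = owner.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return name; // class is in the default package
        }
        // relative: prefixed with the owner's package, dots replaced by slashes
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }

    public static void main(String[] args) {
        Class<?> owner = java.util.Properties.class; // stand-in for the class doing the lookup
        System.out.println(resolvedName(owner, "ssceam.tlx"));  // java/util/ssceam.tlx
        System.out.println(resolvedName(owner, "/ssceam.tlx")); // ssceam.tlx
        // Prints the actual location if the resource is on the class path, otherwise null.
        System.out.println(ResourceNameDemo.class.getResource("/ssceam.tlx"));
    }
}
```

Under this rule, a relative name only works when the dictionary sits in the same package directory as the class that calls getResourceAsStream.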

Even if absolute paths to resources are specified, the location of the resource root may not be what you expect. For example, some servers allow the root of a servlet to be configured. If the servlet's root was configured to be something other than what you expect, then an absolute resource name such as "/ssceam.tlx" will not point to the actual location of the resource. Even if the root has not been changed through configuration, a common source of problems with dictionary access when the Spellex engine is used in a servlet is failing to place the dictionary files in the correct directory on the server.

Another common cause of all words being reported as misspelled is an expired license. The Spellex license information is contained within the spellex.jar file. An expired license disables the lexicons, so every word the spelling engine encounters is treated as misspelled. In evaluation versions, the license expires 30 days after it was issued. You can send your spellex.jar file to our support department for analysis.

 


Spellex Corporation © 2012. All rights reserved