Re: problem seems to be solved: unreadable .hash files (dictionaries)


Subject: Re: problem seems to be solved: unreadable .hash files (dictionaries)
From: Dom Lachowicz (cinamod@hotmail.com)
Date: Fri Mar 16 2001 - 16:35:16 CST


Hi Vlad and Paul,

This is some good news on the ispell front. I had all-but-given-up on ispell
working for us long-term. Now it seems that it might be at least feasible
again.

I have an RFP that's really simple to implement for whoever wants it:

We need to abandon "american.hash" - we need something more robust. What I
think we want is en_US.hash, de_DE.hash, etc... If we do this, we can
dynamically load dictionaries based on our current locale or even with the
"lang" attribute like my hack last night.

So I guess my suggested plan of action is this:
1) Rename the dictionaries (and start housing (*not necessarily shipping*)
some known working ones on the website)
2) And either:
a) Change ispell's SpellCheckInit() function to take a string of the form
'en_US' and have *it* create the proper .hash name so we can share 100% code
with Pspell
b) Keep passing the full path to the dictionary, and have that 1 ifdef in
our code for ispell/pspell
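
For option (a), the name mapping itself could be as small as the sketch below. This is only an illustration: build_hash_path() and its parameters are hypothetical names, not anything in the current ispell code, and it assumes the dictionaries live in one directory and are named <lang>.hash.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of ispell): build "<dict_dir>/<lang>.hash"
 * from a locale-style tag such as "en_US".
 * Returns 0 on success, -1 on bad arguments or if the buffer is too small. */
int build_hash_path(const char *dict_dir, const char *lang,
                    char *out, size_t out_size)
{
    int n;

    if (dict_dir == NULL || lang == NULL || out == NULL || out_size == 0)
        return -1;

    /* Reject empty tags and tags that contain a path separator. */
    if (lang[0] == '\0' || strchr(lang, '/') != NULL)
        return -1;

    n = snprintf(out, out_size, "%s/%s.hash", dict_dir, lang);
    if (n < 0 || (size_t)n >= out_size)
        return -1;   /* output would have been truncated */

    return 0;
}
```

SpellCheckInit() could then call something like this with the locale or "lang" value and open the resulting path, falling back to a default dictionary when it returns -1 or the file does not exist.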

Whaddya think?
Dom

>From: Paul Rohr <paul@abisource.com>
>To: Vlad Harchev <hvv@hippo.ru>, abiword-dev@abisource.com
>Subject: Re: problem seems to be solved: unreadable .hash files (dictionaries)
>Date: Fri, 16 Mar 2001 14:13:08 -0800
>
>Vlad,
>
>Thanks for the detective work!
>
>At 02:02 PM 3/16/01 +0400, Vlad Harchev wrote:
> > I remember a lot of people complained that AW can't use some hash files
> >(i.e. dictionaries for ispell) - that ispell module spits out some message
> >about incorrect header..
> > While helping other people to select a russian dictionary, I discovered
> >that 'file' utility knows ispell format (at least on my RH6.0) and that we
> >can judge whether the hash file will be loadable by ispell module or not
> >basing on the output of 'file' command. For example, here is an output for
> >the russian.hash file that can be used by AW's ispell:
> >
> >[hvv@h dictionary]$ file russian.hash
> >russian.hash: little endian ispell 3.1 hash file, 8-bit, capitalization, 26 flags and 100 string characters
> >[hvv@h dictionary]$
> >
> > It seems that hash files for which '7-bit' is mentioned in the output of
> >'file' command can't be used by AW.
>
>Bingo. That's it. If you grep the sources for NO8BIT, you'll see that one
>of the few things it affects is SET_SIZE, which in turn controls the size
>of various ispell structs, including the main hashtable.
>
> http://www.abisource.com/lxr/source/abi/src/other/spell/ispell.h#495
>
>The error message we usually get is a sanity check to make sure that
>ispell's not reading a hashtable of the wrong length. For example, see:
>
> http://bugzilla.abisource.com/show_bug.cgi?id=902
> http://bugzilla.abisource.com/show_bug.cgi?id=824
>
>Note that the hashtable loader currently just reads the entire struct from
>disk to memory here:
>
> http://www.abisource.com/lxr/source/abi/src/other/spell/lookup.c#159
>
>Gag. Methinks it would be prudent to just rewrite the loader to do the
>math to detect this situation and do the extra work needed to try and load
>7-bit content into the 8-bit structs we currently use.
>
> >Also it turns out that (at least for russian dictionary) it's possible to
> >specify whether to use 7-bit or 8-bit format of hash files by altering
> >Makefile for dictionary (there are makefile variables that control that).
> >So, it seems we have a hope of knowing the way of building ispell
> >dictionaries that will be understood by our ispell. At least we may try
> >to build .hash files for languages for which only unreadable by our iconv
> >compiled dictionaries are available..
>
>Exactly. Until someone's willing to write the code mentioned above to also
>load 7-bit dictionaries, we now have a few simple workarounds:
>
> - update the FAQ to tell folks not to use 7-bit dictionaries
> - ideally, point them to 8-bit alternatives
>
>Any volunteers? ;-)
>
>Paul
>
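
Vlad's check on the 'file' output could be scripted roughly as below. This is a sketch only: classify_file_output() and the enum are made-up names, and it assumes the description always contains the words "ispell" and "hash file" plus either "8-bit" or "7-bit", as in the russian.hash example quoted above. The exact wording may differ between versions of 'file'.

```c
#include <string.h>

/* Hypothetical result codes for one line of 'file' output. */
enum hash_kind {
    HASH_NOT_ISPELL,     /* line does not describe an ispell hash file */
    HASH_7BIT,           /* described as 7-bit: reported as unloadable by AW */
    HASH_8BIT,           /* described as 8-bit: expected to load */
    HASH_UNKNOWN_WIDTH   /* ispell hash file, but no bit width mentioned */
};

/* Classify a single "name: description" line as printed by 'file'. */
enum hash_kind classify_file_output(const char *line)
{
    const char *desc;

    if (line == NULL)
        return HASH_NOT_ISPELL;

    /* Look only at the description after the first ':' so that the
     * file name itself cannot match the keywords below. */
    desc = strchr(line, ':');
    desc = (desc != NULL) ? desc + 1 : line;

    if (strstr(desc, "ispell") == NULL || strstr(desc, "hash file") == NULL)
        return HASH_NOT_ISPELL;
    if (strstr(desc, "8-bit") != NULL)
        return HASH_8BIT;
    if (strstr(desc, "7-bit") != NULL)
        return HASH_7BIT;
    return HASH_UNKNOWN_WIDTH;
}
```

A packaging script could feed each line of 'file *.hash' through this and warn about anything that is not HASH_8BIT before the dictionary gets housed on the website.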



