Future Pspell Plans [Was: Re: Commit: Pspell fixes]


Subject: Future Pspell Plans [Was: Re: Commit: Pspell fixes]
From: Kevin Atkinson (kevina@users.sourceforge.net)
Date: Sun Nov 12 2000 - 20:32:26 CST


On Wed, 8 Nov 2000, Dom Lachowicz wrote:

> Pspell changes. Now should properly handle UTF8, so go ahead and use this to
> spell-check your international documents and test it out.

As I told Dom, I am thinking about adding a length field to some of
the Pspell methods. This will avoid having to convert the words first.
As it stands now, here is how the conversion process goes:

1) AbiWord converts to UTF8
2) Pspell converts UTF8 to some 8 bit format used by the spell checker

As you can see, there is an unnecessary extra level of conversion going on.

Pspell can already partly handle AbiWord's internal encoding, assuming that
I understand it right. It is labeled as "machine unsigned 16". However,
it expects the string to have a 2-byte null character at the end. The
reason for this is that Pspell will simply cast the "const char *"
passed in to "const unsigned short *". Thus, in order for it to tell where
the string ends, it needs to see a 0.
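
To make the terminator requirement concrete, here is a minimal sketch in C
of what a callee has to do when it is handed only a pointer. The helper name
u16_length is hypothetical and is not part of the Pspell API; it also assumes
the buffer really is an array of unsigned short, so that casting back is
safe.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Pspell function): counts the 16-bit units
   that come before the terminating 0 unit. Without that terminator the
   loop would run past the end of the buffer. */
static size_t u16_length(const char *str)
{
    const unsigned short *s;
    size_t n = 0;

    assert(str != NULL);
    s = (const unsigned short *)str;  /* cast back to the real element type */
    while (s[n] != 0)
        ++n;
    return n;
}
```

Note that a single zero byte is not enough: the scan compares whole 16-bit
units, which is why the terminator has to be 2 bytes wide.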

In a future version of Pspell (perhaps the next version) I plan on adding
support for a length field so that the string can be passed without any
conversion on AbiWord's part. Once this is done, all AbiWord will have to do
is cast the "unsigned short *" to a "const char *" and provide the length.
Pspell will then simply cast it back and convert it to the format
needed by the spell checker. I don't think I will bother returning length
information since, as opposed to allocating new memory and copying the
string, finding the length of a string is a fairly cheap process.
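
As a rough illustration of the length-field idea, the sketch below takes the
cast pointer plus an explicit length and converts to an 8-bit string. The
function name u16_to_latin1, the choice of Latin-1 as the 8-bit target, and
the decision to measure the length in 16-bit units rather than bytes are all
assumptions made for the example; none of them is settled above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not a Pspell function). Converts len 16-bit units
   to a null-terminated Latin-1 string in out. The input does not need a
   terminator because the length is given explicitly.
   Returns 0 on success, -1 if out is too small or a unit does not fit
   in 8 bits. */
static int u16_to_latin1(const char *str, size_t len,
                         char *out, size_t out_size)
{
    const unsigned short *s;
    size_t i;

    assert(str != NULL && out != NULL);
    if (len >= out_size)              /* need room for len chars plus '\0' */
        return -1;
    s = (const unsigned short *)str;  /* cast back to the real element type */
    for (i = 0; i < len; ++i) {
        if (s[i] > 0xFF)              /* not representable in Latin-1 */
            return -1;
        out[i] = (char)(unsigned char)s[i];
    }
    out[len] = '\0';
    return 0;
}
```

The caller can pass a word that sits in the middle of a larger buffer without
copying it or writing a terminator after it.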

Do you think this would be a good idea?

---
Kevin Atkinson
kevina at users sourceforge net
http://metalab.unc.edu/kevina/