Re: Integrating OpenMedSpel word list with Abiword's spell checker

From: Vidh <vidhu2366_at_gmail.com>
Date: Mon Mar 25 2013 - 10:11:51 CET

Hi Martin,
Any thoughts on this?

On Sat, Mar 23, 2013 at 12:16 AM, Vidh <vidhu2366@gmail.com> wrote:
> Integrating OpenMedSpel word list with Abiword's spell checker
> ====================================================
>
> To get familiar with the Abiword source code, I experimented with a
> few things around it.
> In the process I made some progress on this GSoC idea:
> http://www.abisource.com/wiki/Google_Summer_of_Code_2013#OpenMedSpel_plugin_for_AbiWord
>
> I am writing this mail to report what I have done related to this idea
> and to present some results.
>
> The goal is to make Abiword simultaneously spell check against
> OpenMedSpel and the default language dictionary.
> I have achieved this result by making a few changes to the "enchant"
> and "aspell" configuration files.
>
> Here are the steps I followed:
> ========================
> 1) Downloaded the OpenMedSpel word list from:
> http://www.e-medtools.com/openmedspel100.zip
>
> This had the following files:
>
> vidhoon@vidhoonv:/usr/lib/aspell$ ls -l ~/Downloads/openmedspel100
> total 2968
> -rw-rw-r-- 1 vidhoon vidhoon 607041 Feb 14 2007 OpenMedSpel 100.csv
> -rw-rw-r-- 1 vidhoon vidhoon 558312 Mar 14 01:38 OpenMedSpel 100.txt
> -rw-rw-r-- 1 vidhoon vidhoon 1169 Feb 14 2007 README_OpenMedSpel.txt
>
> 2) The txt file in the download had DOS line endings (CRLF), so I
> had to convert it:
>
> $dos2unix OpenMedSpel\ 100.txt OpenMedSpelunix.txt
>
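
If dos2unix is not installed, the conversion in step 2 can be
approximated with standard tools. This is only a minimal sketch: the
function names strip_cr and has_cr are made up for illustration, and it
assumes the only DOS residue in the file is carriage returns.

```shell
# Sketch: strip DOS carriage returns from a word list without dos2unix.
# strip_cr INPUT OUTPUT  (hypothetical helper; both names are placeholders)
strip_cr() {
    # tr deletes every CR byte in the stream, not only those at line ends.
    # The input file itself is left untouched.
    tr -d '\r' < "$1" > "$2"
}

# has_cr FILE  (hypothetical helper)
# Succeeds (exit status 0) if FILE still contains a carriage return.
has_cr() {
    grep -q "$(printf '\r')" "$1"
}
```

For example, strip_cr "OpenMedSpel 100.txt" OpenMedSpelunix.txt writes
a converted copy, and has_cr OpenMedSpelunix.txt should then fail.
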
> 3) Now I created a wordlist for aspell using the command below:
>
> $ aspell --lang=en create master ./openmedspel.rws \
>     < ~/Downloads/openmedspel100/OpenMedSpel\ 100.txt
>
> (Why did I choose Aspell? I could easily find the documentation
> needed to solve this problem.)
> This link was really helpful:
> http://aspell.net/0.50-doc/man-html/5_Working.html#SECTION00640000000000000000
>
> 4) After this, I located Aspell in my local system and found it at:
> /usr/lib/aspell
>
> In this location I could find all "multi" dictionary files and "rws"
> lists included in them.
> I copied the openmedspel.rws list to this location.
>
> 5) I took en_US.multi (since the OpenMedSpel word list is also US
> English) and found that it refers to "en_US-wo_accents.multi".
> Then I opened en_US-wo_accents.multi and added the newly created
> word list, as shown below:
>
> vidhoon@vidhoonv:/usr/lib/aspell$ cat en_US-wo_accents.multi
> # Generated with Aspell Dicts "proc" script version 0.60.2
> add en-common.rws
> add en_US-wo_accents-only.rws
> add openmedspel.rws
>
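
The manual edit in step 5 could also be scripted. The following is a
minimal sketch, not part of the original steps: add_wordlist is a
hypothetical helper name, and it assumes the .multi file simply lists
one "add <file>.rws" entry per line, as in the listing above. Files
under /usr/lib/aspell usually need root privileges to modify, so trying
it on a copy first seems safer.

```shell
# Sketch: append an "add <name>.rws" entry to an aspell .multi file,
# but only if that exact entry is not already present.
# add_wordlist MULTI_FILE RWS_NAME  (hypothetical helper)
add_wordlist() {
    multi=$1
    rws=$2
    # -x matches the whole line, -F treats the pattern as a fixed string.
    if ! grep -qxF "add $rws" "$multi"; then
        printf 'add %s\n' "$rws" >> "$multi"
    fi
}
```

Running add_wordlist en_US-wo_accents.multi openmedspel.rws twice
leaves a single "add openmedspel.rws" line in the file.
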
> 6) I understood that Abiword uses Aspell through "enchant", so I
> located enchant on my local system:
> /usr/share/enchant
>
> 7) I found the "private" enchant.ordering file that determines the way
> spell checker and dictionaries are chosen for each language.
> I learned more about this file from this link:
> https://listman.redhat.com/archives/xdg-list/2003-July/msg00188.html
>
> For the time being, I hacked this file to place "aspell" ahead of
> "myspell" in the ordering, so that aspell, whose en_US dictionary now
> includes the OpenMedSpel word list, gets picked for en_US.
>
> vidhoon@vidhoonv:/usr/lib/aspell$ cat /usr/share/enchant/enchant.ordering
> *:aspell,myspell,ispell //-> order changed in this line
> fi:voikko,ispell,myspell,aspell
> fi_FI:voikko,ispell,myspell,aspell
> he:hspell,myspell
> he_IL:hspell,myspell
> yi:uspell
> tr:zemberek
> tr_TR:zemberek
>
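
The reordering in step 7 can be sketched as a small filter instead of a
hand edit. This is an illustration under assumptions: prefer_aspell is
a hypothetical helper name, and the ordering file is taken to consist
of "tag:provider,provider,..." lines like the listing above. It prints
the result to stdout rather than modifying the file.

```shell
# Sketch: print an enchant.ordering file with "aspell" moved to the
# front of the provider list on the wildcard ("*") line only.
# prefer_aspell ORDERING_FILE  (hypothetical helper)
prefer_aspell() {
    awk -F: '
        $1 == "*" {
            # Split the provider list and rebuild it with aspell first,
            # keeping the other providers in their original order.
            n = split($2, p, ",")
            out = "aspell"
            for (i = 1; i <= n; i++)
                if (p[i] != "aspell") out = out "," p[i]
            print "*:" out
            next
        }
        { print }   # every other line passes through unchanged
    ' "$1"
}
```

One way to use it is to redirect the output to a new file, compare it
with the original, and only then replace the original.
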
> Now I can see that the Abiword spell checker does not underline words
> from the OpenMedSpel list, which indicates that the goal is achieved.
> That is, Abiword spell checks US English text against both the normal
> dictionary words and the OpenMedSpel words.
>
> I have attached a screenshot of Abiword illustrating this.
>
> So coming to the next steps,
> =======================
> 1) I would like to know how these steps, which involve only enchant
> and aspell (Abiword need not even be recompiled), can be transformed
> into an Abiword plugin.
> 2) Have I achieved the required result in the right way?
>
> Or is what is needed a UI-based option in Abiword that lets users add
> their own word lists to a language dictionary?
> I think this might fit the "plugin" definition!
>
> Any help, suggestions and thoughts are appreciated.
> thanks & regards!
>
>
> --
> Vidhoon Viswanathan
> +91 7760711773

-- 
Vidhoon Viswanathan
+91 7760711773
Received on Mon Mar 25 10:12:02 2013