Re: [Gnome-OCR] Integration of Tesseract-OCR... (fwd)

From: Ryan Pavlik <abiryan_at_ryand.net>
Date: Mon Nov 27 2006 - 20:47:53 CET

I think this would be a great addition. Another thing that we may wish
to keep in mind is that AbiWord does in fact work on Windows, so if the
design of this software is modular enough that a Windows TWAIN scanner
interface can be substituted in, the work is twice as valuable.

Always looking out for the Microsoft-stricken,
Ryan

Alan Horkan wrote:
> Forwarded from the Gnome Desktop Developer mailing list.
>
> I expect you, the AbiWord developers, will express the usual enthusiasm for
> this and for efforts to help improve AbiWord. The lecturer has good plans but
> a cautious outlook, and he recognises it might be some time
> before this becomes relevant to AbiWord, but we can still offer
> encouragement until then.
>
> Sincerely
>
> Alan Horkan
>
> http://advogato.org/person/AlanHorkan/
> http://alanhorkan.livejournal.com/
>
> ---------- Forwarded message ----------
> Date: Mon, 27 Nov 2006 10:55:06 +0100
> From: Emmanuel Fleury <fleury@labri.fr>
> To: desktop-devel-list@gnome.org
> Subject: [Gnome-OCR] Integration of Tesseract-OCR...
>
> Hi all,
>
> As I'm pretty new here just forgive me if I'm not at the right place. :)
>
> I am an associate professor at Bordeaux-I University (France) and I have
> submitted a project for students about integrating tesseract-OCR in
> Gnome (the student project starts only in January).
>
> My plan is more or less to:
> - make them develop a libgnome-ocr as a wrapper to tesseract-OCR,
> - clean the code of tesseract-OCR,
> - refactor tesseract-OCR within the Gnome libs and,
> - try to add some extra features.
>
> (I think the students will stop at the first item, but you never know!)
>
> I don't know exactly what the API should be or how such a library
> would be used, but with the help of Étienne Bersac (the author of
> libgnome-scan) I thought of a few examples:
> - A plug-in for Abiword (also outputting formatting information);
> - A plug-in for e-mail readers (image spam analysis);
> - ... and so forth ...
>
> I guess that the API should include an initialization function, functions
> for setting the image input format and the output format, plus some settings
> (recognition strategy, drawing recognition, where to store the output, etc).
>
> For now, while waiting for the project to start in January, I'm trying to
> port Tesseract-OCR to 64-bit platforms... and I'm a bit horrified by
> the way they handle basic types and data structures... My guess is that
> a lot of cleaning is needed there. :-/
>
> Anyway, is this project interesting for the Gnome community? Would you
> have any comments, advice, or objections?
>
> Regards
>
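
For illustration only, the API shape outlined in the forwarded message (an
initialization step, input and output format setters, and a few settings)
might look roughly like the sketch below. Nothing here is an existing
libgnome-ocr or Tesseract interface: every class, method, and format name is
hypothetical, the lists of supported formats are assumptions, and the
recognition backend is a stand-in function so that the example runs on its
own. Passing the backend in as a parameter is also one possible way to keep
the design modular enough to swap platform-specific pieces.

```python
from dataclasses import dataclass
from typing import Callable, Optional


# Hypothetical settings record; field names mirror the items listed in the
# forwarded message (strategy, drawing recognition, output location).
@dataclass
class OcrSettings:
    input_format: str = "tiff"
    output_format: str = "text"
    strategy: str = "default"
    recognize_drawings: bool = False
    output_path: Optional[str] = None


class OcrSession:
    """Hypothetical wrapper: initialization, format setters, recognition."""

    # Assumed format lists, chosen only to make validation demonstrable.
    SUPPORTED_INPUT = {"tiff", "png", "pnm"}
    SUPPORTED_OUTPUT = {"text", "html"}

    def __init__(self, backend: Callable[[bytes, OcrSettings], str]):
        # The backend is injected so that a different engine (or a test
        # stub) can be substituted without touching callers.
        self._backend = backend
        self.settings = OcrSettings()

    def set_input_format(self, fmt: str) -> None:
        if fmt not in self.SUPPORTED_INPUT:
            raise ValueError(f"unsupported input format: {fmt}")
        self.settings.input_format = fmt

    def set_output_format(self, fmt: str) -> None:
        if fmt not in self.SUPPORTED_OUTPUT:
            raise ValueError(f"unsupported output format: {fmt}")
        self.settings.output_format = fmt

    def recognize(self, image: bytes) -> str:
        result = self._backend(image, self.settings)
        # Optionally store the output where the caller asked for it.
        if self.settings.output_path is not None:
            with open(self.settings.output_path, "w", encoding="utf-8") as f:
                f.write(result)
        return result


def fake_backend(image: bytes, settings: OcrSettings) -> str:
    # Stand-in for a real OCR engine; it only reports what it was given.
    return (f"{len(image)} bytes read as {settings.input_format}, "
            f"emitted as {settings.output_format}")
```

A caller would create an OcrSession with some backend, set the formats, and
then call recognize() on the raw image bytes; an unsupported format name is
rejected with a ValueError before any recognition is attempted.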

-- 
Ryan Pavlik
AbiWord Win32 Platform Maintainer: www.abisource.com
AbiWord Community Outreach Project: www.cleardefinition.com/oss/abi/blog/
"Optimism is the father that leads to achievement."
 -- Helen Keller
"The folder structure in a modern Linux distribution such as Ubuntu
was largely inspired by the original UNIX foundations that were
created by men with large beards and sensible jumpers."
 -- Jono Bacon, The Ubuntu Guide
Received on Mon Nov 27 20:48:35 2006
