Abiword Win32 Unicode Strategy - thoughts and proof of concept

From: Jordi Mas <jmas_at_softcatala.org>
Date: Mon Apr 09 2007 - 10:59:00 CEST

Hello,

As many of you already know, Abiword on Win32 is an ANSI application. Since the
beginning of February, I have been experimenting with porting Abi to Unicode.

I think the benefits of Unicode are clear to all:

- Support for the many new "Unicode only" languages such as Hindi, Georgian,
Nepali, Armenian, Gujarati, Kannada, Konkani. We already have Abiword
translated to Nepali, which is a Unicode-only language. And more will come.

- Improving international and multilingual support:
* Better integration with the features of the NT platform
* Better support of the Multilingual UI (MUI) features of Windows 2000/XP/Vista

Some data that I think is also relevant:

* Microsoft dropped support for the last Win9x (Windows ME) on July 11,
2006[1]. Win9x is no longer a supported platform.

* On most of the websites whose statistics I can access, Win9x accounts for
less than 3% of users. Maybe someone can check the Abiword site stats? In any
case, it is a decreasing user base.

As I see it, the options we have are:

1) Port Abiword completely to Unicode after 2.6 and drop support for the Win9x
(ANSI) platforms from then on, always referring users of these platforms to
Abiword 2.6. Keep a stable 2.6 branch that we keep updating with important
bug fixes and security updates.

2) Port Abiword completely to Unicode and keep supporting ANSI and Unicode
builds using Unicows[2]. In my experience, Unicows builds are unstable.

3) Port Abiword completely to Unicode and keep supporting both platforms using
separate builds.

Option 1 is what makes the most sense to me. It allows us to concentrate
and grow as a full Unicode application without worrying about legacy systems
that, even in developing countries, are going to be extinct next year. Even for
those, we can still offer 2.6.

* My proof of concept

What I have been working on these days is basically option 3). I dropped option
2 because the Unicows port has always been very unstable for me.

* How what I have is implemented:

- The internal charset for Win32 is now UTF-8. Before, it was the user's
locale charset. Since the user locale can be single-byte in ANSI builds and is
double-byte in Unicode builds, I think it is better to standardize on UTF-8,
which is also the charset Abiword uses internally in most places.

- There is a new class called UT_Win32LocaleString that encapsulates the
strings that are displayed by and collected from the operating system (mainly
User Interface widgets) and that has a Unicode and an ANSI implementation.
The TCHAR type and the tchar.h macros are also used.

- About 70% of the dialog boxes have been ported so that they compile in both
ANSI and Unicode builds. The code of the non-ported dialog boxes has been
removed temporarily to allow compilation.
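
To illustrate the idea of keeping UTF-8 internally and converting only at the
boundary with the operating system, here is a minimal portable sketch. It is
not the code from the diff: the class name LocaleStringSketch and its methods
are hypothetical, and a real Win32 implementation would more likely rely on
MultiByteToWideChar with CP_UTF8 than on a hand-written decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the idea behind UT_Win32LocaleString:
// the application keeps UTF-8, and the wrapper produces the UTF-16
// text that a Unicode (wide-character) UI call would expect.
class LocaleStringSketch {
public:
    // Build the wide form from internal UTF-8 text.
    static LocaleStringSketch fromUTF8(const std::string& utf8) {
        LocaleStringSketch s;
        std::size_t i = 0;
        while (i < utf8.size()) {
            appendUTF16(s.m_wide, decodeOne(utf8, i));
        }
        return s;
    }

    const std::u16string& wide() const { return m_wide; }

private:
    std::u16string m_wide;

    // Decode one code point starting at utf8[i] and advance i past it.
    // Malformed input yields U+FFFD and advances by a single byte.
    static char32_t decodeOne(const std::string& utf8, std::size_t& i) {
        const unsigned char lead = static_cast<unsigned char>(utf8[i]);
        std::size_t extra = 0;   // number of continuation bytes expected
        char32_t cp = 0;         // code point being assembled
        char32_t minCp = 0;      // smallest value allowed for this length
        if (lead < 0x80) { ++i; return lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; minCp = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; minCp = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; minCp = 0x10000; }
        else { ++i; return 0xFFFD; }

        if (i + extra >= utf8.size()) { ++i; return 0xFFFD; }  // truncated
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char b = static_cast<unsigned char>(utf8[i + k]);
            if ((b & 0xC0) != 0x80) { ++i; return 0xFFFD; }
            cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, surrogates and out-of-range values.
        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            ++i;
            return 0xFFFD;
        }
        i += extra + 1;
        return cp;
    }

    // Append one code point as UTF-16, using a surrogate pair if needed.
    static void appendUTF16(std::u16string& out, char32_t cp) {
        assert(cp <= 0x10FFFF);
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
};
```

A dialog would then call something like
LocaleStringSketch::fromUTF8(label).wide() just before handing the text to
the UI, so the rest of the code never needs to know which build it is in.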

* Code

- Source code (diff):
http://www.softcatala.org/~jmas/abi/unicode.diff and
http://www.softcatala.org/~jmas/abi/ut_Win32LocaleString.cpp and
http://www.softcatala.org/~jmas/abi/ut_Win32LocaleString.h
- Win32 ANSI build:
http://www.softcatala.org/~jmas/abi/abi251_ansi_build_bin.zip
- Win32 UNICODE build:
http://www.softcatala.org/~jmas/abi/abi251_ansi_unicode_bin.zip
- Unicode Abiword running in Nepali (not possible with ANSI):
http://www.softcatala.org/~jmas/abi/nepali.png

* Estimation

My estimate is that, whatever approach we follow, there are still two months
of work left. It would also be good to have a long testing period, because
basically all the Win32 UI classes are going to be affected. Testing will
point out things that we missed during the porting. In fact, even today
there are encoding-related bugs in STABLE.

* Next steps

- Decide which approach we want to follow
- Evolve the current proof of concept to match the decision taken (for
example, moving from TCHAR to WCHAR if option 1 is chosen).
- Decide whether we wait until 2.4.6 is released to commit the final code or
use a branch. For me, if 2.4.6 is released in less than a month, a branch does
not make sense.

Well, that's all for now.

Best regards,

[1] http://www.microsoft.com/windows/support/endofsupport.mspx
[2] http://www.microsoft.com.nsatc.net/globaldev/handson/dev/mslu_announce.mspx

-- 
Jordi Mas i Hernàndez, HomePage http://www.softcatala.org/~jmas/
Bloc personal http://www.softcatala.org/~jmas/bloc/
Planeta Softcatalà: http://www.softcatala.org/planet/
Received on Mon Apr 9 11:00:40 2007