
Re: [MiNT] XaAES multi-language help needed



On 17.1.2011 17:06, Jo Even Skarstein wrote:
On Mon, 2011-01-17 at 00:41 +0100, Jan Krupka wrote:
Jo Even Skarstein wrote:
On Sun, 2011-01-16 at 02:01 +0100, Bohdan Milar wrote:

I have started to work on the Czech translation. For now I am sending
the xa_help.cz file. It is in the ISO-8859-2 encoding, which is the
same one used by the gettext-localized (.po files) GNU tools in
SpareMiNT.

Doesn't the Atari charset have all the necessary characters? If so,
how is this solved in TOS?

Unfortunately it doesn't. The font in the TOS ROM has Hebrew characters
instead of the international characters or the frame (box-drawing) parts
we know from the PC. Here in the Czech Republic we have had several
solutions: TSR programs that load a Czech font, later replaced by GDOS
fonts, or even special TOS versions with translated texts and replaced
character sets.

Ok, so there is already a Czech font in use. Wouldn't using 8859-2 cause
even more incompatibilities?

It depends. In fact I have been "fighting" against the conservative majority of Czech Atari users for nearly the last 10 years.
The key question is: what do we want to be compatible with?

1. Do we want to be compatible with the old data produced ages ago on Atari and stored on SCSI hard drives with PC-incompatible partition tables? Well, in most cases we can copy those files to a server and convert them to something readable.

2. Or do we want to be compatible with the whole Unix world, share files via the network, browse the web etc.?

And this is a question for all of us, not only for Czech, Greek and Russian users.

First of all I have to emphasise that the following basic characters (char numbers 32 to 126) are in THE SAME positions in all the mentioned charsets:

 !"#$%&'()*+,-./
0123456789:;<=>?
@ABCDEFGHIJKLMNO
PQRSTUVWXYZ[\]^_
`abcdefghijklmno
pqrstuvwxyz{|}~

So most texts (especially English ones) will be readable no matter which solution is chosen.
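
A quick way to check this claim is to decode bytes 32 to 126 with several codecs and compare the results. This is only a sketch: it uses codecs from Python's standard library, and it assumes AtariST and Kamenicky are not available there, so CP850 stands in for the DOS side.

```python
# Codecs from Python's standard library. AtariST and Kamenicky are left out
# because this sketch assumes there is no built-in codec for them.
CODECS = ["ascii", "cp850", "iso8859_1", "iso8859_2", "iso8859_5", "iso8859_7"]

# Bytes 32..126 inclusive: the printable range shown in the table above.
printable = bytes(range(32, 127))

# Decode the same byte string with every codec.
decoded = {name: printable.decode(name) for name in CODECS}

# If the range is shared, every codec yields the same text.
all_same = len(set(decoded.values())) == 1

if __name__ == "__main__":
    print(decoded["iso8859_2"])
    print("identical in all codecs:", all_same)
```

The differences between the charsets only start above byte 127, which is why English text survives any of the choices discussed here.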

--- DOS vs. ISO ---

It all happened like this: western European atarists use the AtariST encoding (as it is called by the GNU recode software). It was created by Atari as a modification ("improvement") of IBM PC Latin 1 (code page 850 - http://en.wikipedia.org/wiki/Code_page_850) used in MS DOS. The Czech language uses something like 20 characters that are not included in the AtariST and CP 850 charsets. The coders from the Czech Atari community who tried to help Czech Atari users (they patched TOS and prepared some tools) decided to use Kamenicky, a locally developed character encoding (originally made for MS DOS, where it competed with the "official" PC Latin 2): http://en.wikipedia.org/wiki/Kamenick%C3%BD_encoding - BTW, here you can see what one stupid national character can do in a URL :-) So all software written in or translated to Czech, and nearly all Czech texts (including HYPs) produced on Atari in the 90s, are in Kamenicky.

How it continued: In 2001 I got my first PC (with Linux) and wanted to share data. Of course Linux worked in ISO-8859-2, so I decided to force MiNT to use ISO-8859-2 as well. I found out that ISO-8859-1 was already in use. KGMD contained ttyvfont (a utility to set fonts in virtual consoles), and in /usr/share/misc/ttyvfonts/ there was a set of ISO-8859-1 bitmap fonts. I took those and, using (most probably) GEMFONT 1.22, modified them to ISO-8859-2. That was much easier than making whole new characters (as would be needed for ISO-8859-7), because Czech (and the other languages covered by ISO-8859-2) needs only diacritic modifications of normal Latin characters (so e.g. I copied "o" five times and added dots and other marks above or below it etc.). In addition, one other (former) atarist and I translated tens of GEM programs (incl. qed) and offered both ISO-8859-2 and Kamenicky versions of all the RSCs.
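
The "base letter plus diacritic" idea can be illustrated with Unicode decomposition. This is a minimal sketch and not the procedure used in GEMFONT: the function name split_glyph and the list of letters are made up for the illustration, and it only shows which plain Latin glyph each Czech lowercase letter could be copied from.

```python
import unicodedata

# Czech lowercase letters with diacritics (NFC makes sure they are precomposed).
CZECH_LOWER = unicodedata.normalize("NFC", "áčďéěíňóřšťúůýž")


def split_glyph(ch):
    """Return (base letter, [names of the combining marks]) for one character."""
    decomposed = unicodedata.normalize("NFD", ch)
    base, marks = decomposed[0], decomposed[1:]
    return base, [unicodedata.name(m) for m in marks]


# Hypothetical "recipe" table: which base glyph to copy and which mark to add.
recipe = {ch: split_glyph(ch) for ch in CZECH_LOWER}

if __name__ == "__main__":
    for ch, (base, marks) in recipe.items():
        print(ch, "=", base, "+", ", ".join(marks))
```

Every letter in the list reduces to an ASCII base letter plus one mark, which is why editing an existing Latin bitmap font is so much less work than drawing a Greek or Cyrillic one.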

Current situation: western atarists use (I guess) the dead AtariST encoding with nearly no problems - all English characters are OK and they don't care about the few special ones (like the sharp s in German or the "tailed" c in French). In the Czech Republic I use ISO-8859-2 and only have a problem when I want to use old software localized to Czech without an RSC, or when I want to read an old Czech HYP file. BUT here we are talking about MiNT. In SpareMiNT there are hundreds of GNU and other free programs ported from other Unix systems. All of them use ISO-8859-1 as the default. Of course you can hardly notice the difference when using the system in English. But tens of the programs ported to SpareMiNT already include a Czech localization and documentation, which are, of course, in ISO-8859-2.
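
To show what the mismatch looks like in practice, here is a small sketch that encodes a Czech sentence in ISO-8859-2 and then decodes the same bytes once with the right charset and once as ISO-8859-1. The sample sentence is just an illustration and is not taken from any of the localized programs.

```python
import unicodedata

# A common Czech test phrase; NFC keeps every letter as a single code point.
text = unicodedata.normalize("NFC", "Příliš žluťoučký kůň")

# The bytes as a Czech .po file or document in ISO-8859-2 would store them.
raw = text.encode("iso8859_2")

# Decoded with the matching charset, the text comes back unchanged.
as_latin2 = raw.decode("iso8859_2")

# Decoded with the ISO-8859-1 default, the accented letters turn into other symbols.
as_latin1 = raw.decode("iso8859_1")

# Number of character positions that display differently under ISO-8859-1.
changed = sum(a != b for a, b in zip(text, as_latin1))

if __name__ == "__main__":
    print("ISO-8859-2:", as_latin2)
    print("ISO-8859-1:", as_latin1)
    print("characters shown wrongly:", changed)
```

No bytes are lost in the wrong decoding, only displayed with the wrong glyphs, so switching the font or charset is enough to make the text readable again.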

So do we need Atari machines as access tools for 20-year-old closed-source programs and historical data? Or do we want to use MiNT as a modern multi-language system for work, data sharing and enjoying hundreds of localised GNU applications? I would vote for ISO and offer help with converting open source GEM software (and PD software with an RSC) to ISO.

--- Technical notes ---

No matter which charsets we choose (and since we decided not to use UTF), we need a set of fonts for each region (IL1 = Western European, IL2 = Eastern European, IL5 = Russian Cyrillic, IL6 = Arabic, IL7 = Greek, ...). We also need a keyboard layout, maybe one for each language. And the system should handle it. I guess there is not (or soon will not be) a problem with the keyboard layout. It is only a matter of preparing the tables and teaching the system to switch among the chosen ones easily.

It is a bit more complicated with the fonts (but I guess not as complicated as moving to UTF). We need free fonts (even bitmap ones) to use in distributions. We need good software to set the system font and, maybe better, to change the system font on the fly. That would even solve the ISO/Kamenicky (or ISO/AtariST) dilemma for some users. Here we have to say: NVDI is dead! We need a new free VDI. Is it fVDI? Maybe, but I still have problems with it on real hardware (e.g. it did not run with MiNT 1.17.x, but I have had no time to try it with the new official 1.17.0 yet).

....

I am sorry, I have no time and energy to continue thinking now...
I hope it is clear what I tried to say.
In short - losing compatibility with the past (and it is debatable whether it even is a loss for western Europe) is much more comfortable than losing compatibility with the present and future, although there are some technical topics to be solved.
I attached the original KGMD ISO-8859-1 and my ISO-8859-2 fonts.

Jo Even

Regards,

Bohdan

Attachment: iso2a-8.fnt
Description: Binary data

Attachment: iso2a-16.fnt
Description: Binary data

Attachment: iso-atari-8.fnt
Description: Binary data

Attachment: iso-atari-16.fnt
Description: Binary data