Yudit: edit your multi-language text easily


In this article I will show you how to write multi-language texts without the cumbersome OpenOffice.org. Back in 1999, the Hungarian Gáspár Sinai needed to edit Hungarian and Japanese texts. So he decided to write an editor that was Unicode [1] compliant. Once he had done the basic work, it was a straightforward task to include other languages, and Yudit [2] was born.

Yudit was built for Unix, but Sinai also produced a version for Windows.

In this article I’ll show you how easy it is to write multi-language documents with Yudit. Every time I refer to documents, I’ll be referring to plain text documents.

The Unicode standard

7-bit ASCII covers only the basic Latin alphabet and a few other characters. To write in my own language (Italian) I have to use an 8-bit extension of ASCII.

The first time I tried to write an Italian document containing some Russian words, I immediately ran into a problem: if extended ASCII gives me Italian letters, what do I use to get Cyrillic letters as well? Then I discovered that there are many ASCII extensions: codes 0-127 match those of 7-bit ASCII, while codes 128-255 provide different characters depending on the chosen language.
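To see the problem concretely, here is a minimal Python sketch. The byte value and the three code pages (ISO 8859-1, ISO 8859-5 and KOI8-R) are my own choice for illustration; they are only examples of the many 8-bit extensions in circulation.

```python
# One byte above 127, read through three different 8-bit code pages.
raw = bytes([0xE0])

as_western = raw.decode("iso8859_1")   # Western European, covers Italian
as_cyrillic = raw.decode("iso8859_5")  # ISO Cyrillic
as_koi8 = raw.decode("koi8_r")         # KOI8-R, another Cyrillic code page

# The same byte yields three different letters.
print(as_western, as_cyrillic, as_koi8)

# Bytes in the range 0-127 decode identically in all three code pages.
plain = b"ciao"
same = plain.decode("iso8859_1") == plain.decode("iso8859_5") == plain.decode("koi8_r")
print(same)
```

A file written with one code page and read with another therefore shows the wrong letters, and no single 8-bit code page holds both the Italian accented vowels and the Cyrillic alphabet.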

Unicode was designed to answer such questions, and to replace the many different, overlapping ASCII extensions. It assigns every known letter a unique code, and it is compatible with ASCII (codes 0-127), so that every ASCII document is a Unicode document, too. Of course, a Unicode document is not necessarily an ASCII document. In the UTF-8 encoding, a character takes from one to four bytes (the original design allowed up to six). The high-order bits of the first byte determine how many bytes the character occupies. Unicode also groups characters by family, for example Latin or Cyrillic, in contiguous ranges of codes.
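The role of the high-order bits of the first byte can be sketched in a few lines of Python. This assumes the UTF-8 encoding in its current form, where a character takes at most four bytes; the function name utf8_length and the sample characters are mine.

```python
def utf8_length(first_byte: int) -> int:
    """Return the length of a UTF-8 sequence, judging by its first byte."""
    if first_byte < 0x80:            # 0xxxxxxx: plain ASCII, one byte
        return 1
    if first_byte >> 5 == 0b110:     # 110xxxxx: two bytes
        return 2
    if first_byte >> 4 == 0b1110:    # 1110xxxx: three bytes
        return 3
    if first_byte >> 3 == 0b11110:   # 11110xxx: four bytes
        return 4
    # 10xxxxxx is a continuation byte; 11111xxx is never used.
    raise ValueError("not a valid UTF-8 lead byte")


# Latin a, a with grave, Cyrillic ya, em dash, musical G clef.
samples = ["a", "\u00e0", "\u044f", "\u2014", "\U0001D11E"]
for ch in samples:
    encoded = ch.encode("utf-8")
    # The length predicted from the first byte matches the real length.
    assert utf8_length(encoded[0]) == len(encoded)
    print(f"U+{ord(ch):04X} takes {len(encoded)} byte(s)")
```

Because ASCII characters keep their single-byte form, a plain ASCII file is already valid UTF-8.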

Enter the (Y)Unicode eDITor

Yudit is designed to save texts in Unicode. So you can use it to write something more complex than programs: for instance, I used Yudit instead of vi to write this article.

When you run the program, you are given a window like the one in figure 1.

Figure 1: the (Y)Unicode eDITor that I used to write this article

From top to bottom you’ll find: an information bar below the titlebar, which indicates the name of the file you’re currently editing and the text encoding (usually utf-8); the buttons; the editor; and a data bar, where you read the cursor’s line and column, the font size and the current character’s Unicode number. At the bottom of the window there is a command line, similar to the one in vi or Emacs. The command line is good for quickly entering commands. You access it either by escaping from the editor, or by clicking on it with the mouse.

The same operations lead you from the command line to the editor. Be careful: once you enter a valid command you’ll no longer be in the command line, but back in the editor.

The first feature I noticed was the cursor, shaped like an arrow: it points in the direction you’re going to write. The only directions, up to now, are left-to-right (as with English) and right-to-left (as with Hebrew). The cursor direction and the text color both change when you reverse the writing direction: do it by clicking on the button, or typing Ctrl-D. Figure 2 shows a multi-language document with this feature. Unicode allowed me to mix Italian, French, English, Spanish, Icelandic, Russian, (polytonic) Greek and Hebrew in the same document. I have to tell you that I typed only the Italian, English and Russian lines, and cut and pasted the others from the internet.

Figure 2: Yudit edits a multi-language (and multi-directional) text
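Writing direction is also a property that Unicode records for each character. The short Python sketch below only reads that property from the standard unicodedata module; it says nothing about how Yudit itself handles direction, and the sample letters are my own choice.

```python
import unicodedata

# Latin a, Cyrillic ya, Greek alpha, Hebrew alef.
letters = ["a", "\u044f", "\u03b1", "\u05d0"]

# "L" means left-to-right, "R" means right-to-left.
classes = {ch: unicodedata.bidirectional(ch) for ch in letters}

for ch, bidi in classes.items():
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {bidi}")
```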

Suppose you’ve just opened Yudit for the first time. It’s ready to write left-to-right in plain ASCII. Now reverse the cursor and try to write something.

It’s easy, isn’t it? Don’t forget that backspace and delete keys exchange their usual behavior when writing from right to left.

As Yudit is completely multi-language, you can decide which end-of-line (EOL) character to use. DOS, Mac and Unix line endings, as well as the Unicode line and paragraph separators, are available. You can even mix them, though it isn’t advisable.
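For reference, the five terminators can be written out as a Python sketch. I assume here that "Mac" means the classic Mac OS carriage return; the helper normalize_eol is hypothetical and only shows why a mixed file is best converted to a single convention.

```python
# The line terminators named in the text, as Python escapes.
EOLS = {
    "Unix": "\n",
    "DOS": "\r\n",
    "Mac (classic)": "\r",
    "Unicode line separator": "\u2028",
    "Unicode paragraph separator": "\u2029",
}

# A text that mixes all five; str.splitlines() treats each as a boundary.
mixed = "one\ntwo\r\nthree\rfour\u2028five\u2029six"
lines = mixed.splitlines()
print(lines)


def normalize_eol(text: str, eol: str = "\n") -> str:
    """Rewrite every line terminator to a single chosen one.

    A trailing terminator, if present, is dropped.
    """
    return eol.join(text.splitlines())


print(repr(normalize_eol(mixed, EOLS["DOS"])))
```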

You can always ask Yudit to help you: just enter help in the command line and you’ll be shown a FAQ file.

Babylon by bus

Yudit allows you to change the current font family and size. To change them you can either click the buttons or use the function keys (font family) and Ctrl-A (smaller size) or Ctrl-Z (bigger size). If you feel that there aren’t enough font families to choose from, you can install other TrueType fonts; Yudit’s newest versions also accept OpenType fonts. Be careful, because not every installed font family provides all the glyphs. Even if every letter was readable while you typed your document, after changing the font you may find some letters missing, replaced by their Unicode numbers.

Finally, note that Yudit applies font changes to the whole text, not to parts of it. Highlighting some text does not restrict a font change to the selection, as it would in a common word processor.

It’s now time to talk about Yudit’s most interesting feature: easy-to-type Unicode characters. There are several ways to do it, and they involve changing keyboard mapping.

If you start typing in plain English, the mapping (input, from now on) button will say “straight”. This shows that you can type letters as they appear on your keyboard, or your normal mapping.

If you want extra letters, you can change the input with the F1-F12 keys. If you need more input keys, click on the input button to see how many keyboard maps are provided. There you can also provide other maps.

Yudit assigns default keyboard maps to function keys. If they wholly match your needs, you’re okay; if they don’t, you can change the bindings.

Figure 3 shows the keyboard map (KMap) setup window.

Figure 3: the keyboard maps window

To change the default bindings, just select one of the available inputs (from the list on the left), one of the function keys (from the list in the center), and click the arrow button between them. In the right-hand list you can see some input-output associations.

You can type most languages either directly or by typing a sequence of Latin letters for every glyph. For instance, to enter the Cyrillic “я” you type the “q” letter with the JAWERTY mapping, or the “ja” string with a Russian-transliterated mapping. As you’ll easily understand, every Latin string reproduces (transliterates) the sound of the original letter.
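The idea behind both kinds of mapping can be sketched in Python. The two tables below are tiny, hypothetical excerpts that I wrote for illustration; they are not Yudit's keyboard map files, and the longest-match rule is an assumption about how such a map would naturally be applied.

```python
# Hypothetical one-key map in the spirit of JAWERTY.
JAWERTY = {"q": "я", "w": "в", "e": "е", "r": "р", "t": "т", "y": "ы"}

# Hypothetical transliteration map: some letters need two keys.
TRANSLIT = {"ja": "я", "ju": "ю", "j": "й", "m": "м", "o": "о"}


def apply_map(keys: str, table: dict) -> str:
    """Convert typed Latin keys, always preferring the longest match."""
    longest = max(len(k) for k in table)
    out = []
    i = 0
    while i < len(keys):
        for size in range(longest, 0, -1):
            chunk = keys[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += len(chunk)
                break
        else:
            # No mapping for this key: pass it through unchanged.
            out.append(keys[i])
            i += 1
    return "".join(out)


print(apply_map("q", JAWERTY))       # one key per letter
print(apply_map("ja", TRANSLIT))     # two keys for the same letter
print(apply_map("moja", TRANSLIT))   # a whole word
```

Trying the longest sequence first matters: otherwise “ja” would come out as “й” followed by an unmapped “a”.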

Okay, you now know how to type almost everything. The only thing I have left to tell you is how to type characters that are not immediately available in any of the mappings.

Suppose you have to type an em-dash. You are unlikely to change the default binding and tie the em-dash to the hyphen key, which you will probably still need. Instead, type its Unicode number (2014): switch to “unicode” input, then type the string “u2014”. As soon as you type “u” it is underscored, since it is an escape character in this input. The subsequent valid characters (hexadecimal digits, i.e., 0-9 and a-f) are underscored, too. As soon as you complete a valid sequence (four hexadecimal digits), the underscored string is replaced by the corresponding Unicode character: in my example, an em-dash.
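The conversion itself is simple, as this minimal Python sketch shows. It works on a finished string rather than key by key, so it only imitates the result of the “unicode” input; the function name is mine, and accepting uppercase A-F is my assumption.

```python
import re

# "u" followed by exactly four hexadecimal digits, as described in the text.
ESCAPE = re.compile(r"u([0-9a-fA-F]{4})")


def expand_unicode_input(typed: str) -> str:
    """Replace each uXXXX sequence with the character it names."""
    return ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), typed)


result = expand_unicode_input("u2014")
print(f"U+{ord(result):04X}")   # prints U+2014
```

Unlike the interactive input, this sketch cannot tell an escape from an ordinary word that happens to start with “u” and four hexadecimal letters, which is one reason Yudit keeps this behavior in a separate input mode.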

Tony once said I was a LaTeX guru, but I’m only a LaTeX power user. Anyway, Yudit has a TeX input mode that substitutes glyphs for TeX commands (for example, if you type “\times” you’ll get “×”).

Conclusions

I introduced a simple, but powerful, text editor that allows you to easily enter multi-language texts and to save them according to Unicode encoding.

I find it handy when writing articles for Italian or foreign magazines. Unfortunately, they often look for RTF documents, or similar. In this case Yudit isn’t suitable.

In case you need a text editor as fast as vi, or as powerful as Emacs, maybe you won’t find Yudit to your liking. However, if your need is to easily type multi-language, plain text documents, maybe Yudit is what you need.

Bibliography

[1] The Unicode Consortium.

[2] The (Y)Unicode eDITor.

Author information

Gianluca Pignalberi is Free Software Magazine's Compositor.