What would a keyboard optimised for Luxembourgish look like?

[This article was first published on Econometrics and Free Software, and kindly contributed to R-bloggers.]


I’ve been using the BÉPO layout for my keyboard since 2010-ish, and it’s been one of the best computing
decisions I’ve ever made. The BÉPO layout is optimized for French, but it works quite well
for many European languages, English included (the only issue you might have with the BÉPO layout
for English is that the w is a bit far away).

To come up with the BÉPO layout, ideas from a man named August Dvorak were applied to the French
language. Today, the keyboard layout that is optimised for English is named after him: the DVORAK
layout. Dvorak’s ideas were quite simple: unlike the QWERTY layout, his layout had to be based on
the character frequencies of the English language. The main idea is that the most used
characters of the language should be on the home row of the keyboard. The home row is the row where
your fingers rest on the keyboard when you are not typing (see picture below).

The problem with the “standard” layouts, such as QWERTY, is that they’re all absolute garbage, and
not optimized at all for typing on a computer. For instance, look at the heatmap below, which shows
the most used characters on a QWERTY keyboard when typing a standard English text:

(Heatmap generated on https://www.patrick-wied.at/projects/heatmap-keyboard/.)

As you can see, most of the characters used to type this text are actually outside of the home row, and
the majority of them are on the left-hand side of the keyboard. Dvorak’s idea was, first, to put the
most used characters on the home row and, second, to split the typing load evenly, 50% for each hand.

The same text on the DVORAK layout shows how superior it is:

As you can see, this is much much better. The same idea was applied to develop the BÉPO layout for
French. And because character frequency is quite similar across languages, learning a layout such as
the BÉPO not only translates to more efficient typing for French, but also for other languages, such
as English, as already explained above.
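
To make the home row criterion concrete, here is a minimal sketch in base R that measures the share of letters typed on the home row for each layout. The key sets follow the standard QWERTY and Dvorak home rows; the `key_share()` helper and the sample sentence are hypothetical and only for illustration. They are not how the heatmaps above were produced, and a pangram is not representative text.

```r
# Share of the letters in `text` that sit on a given set of keys.
# `key_share()` is a hypothetical helper, not part of any package.
key_share <- function(text, keys) {
  chars <- strsplit(tolower(text), split = "")[[1]]
  letters_only <- chars[chars %in% letters]  # keep a-z, drop spaces and punctuation
  mean(letters_only %in% keys)
}

# Letter keys on the home row of each layout
qwerty_home <- c("a", "s", "d", "f", "g", "h", "j", "k", "l")
dvorak_home <- c("a", "o", "e", "u", "i", "d", "h", "t", "n", "s")

# Illustrative sample only; use a longer, realistic text for a fair comparison
sample_text <- "the quick brown fox jumps over the lazy dog"

key_share(sample_text, qwerty_home)
key_share(sample_text, dvorak_home)
```

Even on this tiny sample, the home row share is higher for DVORAK than for QWERTY. Swapping in a different set of keys (for instance the keys assigned to the left hand) gives the hand balance in the same way.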

I’m writing this blog post partly because of the confinement
that many people on Earth are currently facing due to the coronavirus. I have a job where I spend
my whole day typing, and am lucky enough to be able to work from home, which means that I get
to use my mechanical keyboard for work, which is really great. (I avoid taking my mechanical
keyboard with me to work, because I am never in the same spot for very long, between meetings and client
assignments…). But to have a mechanical keyboard that’s easy to take with me,
I decided to buy a second mechanical keyboard, a 40% keyboard from Ergodox (see picture below):

Because I don’t even want to see the QWERTY keycaps, I bought blank keycaps to replace the ones that
come with the keyboard. Anyway, this made me think about how crazy it is that in 2020 people still
use absolute garbage keyboard layouts (and keyboards by the way) to type on, when their job is
basically only typing all day long. It made me so angry that I even made a video, which you can enjoy
here.

The other thing I thought about was the specific case of Luxembourg, a country with 3 official
languages (Luxembourgish, French and German), a very large Portuguese minority, and where English
has become so important in recent years that the government distributed leaflets in English to the
population (along with leaflets in French, Luxembourgish, German and Portuguese of course) explaining
what is and is not allowed during the period of containment. What would a keyboard optimized for
such a unique country look like?

Of course, the answer that comes to mind quickly is to use the BÉPO layout; even though people routinely
write in at least 3 of the above-mentioned languages, French is still the one that people use most
of the time for written communication (at least, that’s my perception). The reason is that while
Luxembourgish is the national language, and the language of the native population, French has
always been the administrative language, and laws are still written in French only, even though
they’re debated in Luxembourgish in the parliament.
However, people also routinely write emails in German or English, and more and more people also
write in Luxembourgish. This means that a keyboard optimized for Luxembourgish, or rather, for
the multilingual nature of Luxembourg, should take all these
different languages into account. Another thing to keep in mind is that Luxembourgish uses many French words,
and as such, writing these words should be easy.

So let’s start with the BÉPO layout as a base. This is what it looks like:

A heatmap of character frequencies of a French, or even English, text would show that the most used
characters are on the home row. If you compare DVORAK to BÉPO, you will see that the home row is
fairly similar. But what strikes my colleagues when they see a picture of the BÉPO layout is the
fact that the characters é, è, ê, à and ç can be accessed directly. They are so used to having
these characters only accessible by using some kind of modifier key that their first reaction is to
think that this is completely stupid. However, what is stupid is not having these letters easily
accessible, and instead having, say, z easily accessible (the French “standard” layout is called
AZERTY, which is very similar to, and just as stupid as, the QWERTY layout. The letter Z is so easy to
type, but is almost non-existent in French!).

So let’s analyze character frequencies of a Luxembourgish text and see if the BÉPO layout could be
a good fit. I used several text snippets from the Bible in Luxembourgish for this, and a few lines
of R code:

library(tidyverse)  # pipes, purrr::map() and the stringr functions used below
library(rvest)      # read_html(), html_node() and html_text()

urls <- c("https://cathol.lu/article4887",
          "https://cathol.lu/article1851",
          "https://cathol.lu/article1845",
          "https://cathol.lu/article1863",
          "https://cathol.lu/article1857",
          "https://cathol.lu/article4885",
          "https://cathol.lu/article1648",
          "https://cathol.lu/article1842",
          "https://cathol.lu/article1654",
          "https://cathol.lu/article1849",
          "https://cathol.lu/article1874",
          "https://cathol.lu/article4884",
          "https://cathol.lu/article1878",
          "https://cathol.lu/article2163",
          "https://cathol.lu/article2127",
          "https://cathol.lu/article2185",
          "https://cathol.lu/article4875")

Now that I’ve got the URLs, let’s get the text out of them:

pages <- urls %>%
  map(read_html)

texts <- pages %>%
  map(~html_node(., xpath = '//*[(@id = "art_texte")]')) %>%
  map(html_text)

texts is a list containing the raw text from the website. I used several functions from the {rvest}
package to do this. I won’t comment on them, because this is not a tutorial about webscraping (I’ve
written several of those already), but a rant about keyboard layouts, gosh darn it.

Anyway, let’s now take a look at the character frequencies, and put that in a neat data frame:

characters <- texts %>%
  unlist() %>%                                # one character string per page
  tolower() %>%
  str_extract_all(pattern = "[:alpha:]") %>%  # keep letters only, one per element
  unlist() %>%
  table() %>%                                 # count each letter
  as.data.frame()

Computing the frequencies is now easy:

characters <- characters %>%
  mutate(frequencies = round(Freq/sum(Freq)*100, digits = 2)) %>%
  arrange(desc(frequencies)) %>%  
  janitor::clean_names()
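
To see what this pipeline produces without scraping anything, here is a minimal base-R sketch of the same counting step applied to a short string. The function name `char_frequencies()` and the sample sentence are hypothetical; the column names simply mirror the ones used above. Note that whether accented characters count as letters in base R regular expressions depends on the locale, so this assumes a UTF-8 session.

```r
# Hypothetical helper: letter counts and percentage frequencies for one string.
char_frequencies <- function(text) {
  # Split into single characters and keep letters only
  chars <- unlist(strsplit(tolower(text), split = ""))
  chars <- chars[grepl("^[[:alpha:]]$", chars)]
  counts <- table(chars)
  out <- data.frame(x = names(counts),
                    freq = as.integer(counts),
                    stringsAsFactors = FALSE)
  # Same formula as above: share of each letter, in percent
  out$frequencies <- round(out$freq / sum(out$freq) * 100, digits = 2)
  # Most frequent letters first
  out <- out[order(-out$frequencies), ]
  rownames(out) <- NULL
  out
}

# Illustrative input only, not taken from the scraped texts
char_frequencies("Moien, wéi geet et?")
```

The result has one row per distinct letter, sorted from most to least frequent, which is the shape the rest of the analysis relies on.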

Let’s start with the obvious differences: there is not a single instance of the characters è, ê
or ç, which are used in French only. There are however instances of ü, ä, and ë. Ideally these
characters would be easily accessible, but their frequencies are so low that putting them
behind a modifier key would not be a huge issue. However, since ç does
not appear at all, maybe it could be replaced by ä, and ê could be replaced by ë. But we must
keep in mind that the average Luxembourger very often has to switch between so many languages, so
I would suggest that the French characters that get replaced should still be accessible
using a modifier such as Alt Gr. As for the rest, the layout as it stands is likely quite ok.
Well, actually I know it’s ok, because when I write in Luxembourgish using the BÉPO layout, I find
it quite easy to do. But let’s grab a French and a German text, and see how the rankings
of the characters compare. Let’s get some French text:


french <- "Au commencement, Dieu créa les cieux et la terre.
La terre était informe et vide: il y avait des ténèbres à la surface de l'abîme, et l'esprit de Dieu se mouvait au-dessus des eaux.
Dieu dit: Que la lumière soit! Et la lumière fut.
Dieu vit que la lumière était bonne; et Dieu sépara la lumière d'avec les ténèbres.
Dieu appela la lumière jour, et il appela les ténèbres nuit. Ainsi, il y eut un soir, et il y eut un matin: ce fut le premier jour.
Dieu dit: Qu'il y ait une étendue entre les eaux, et qu'elle sépare les eaux d'avec les eaux.
Et Dieu fit l'étendue, et il sépara les eaux qui sont au-dessous de l'étendue d'avec les eaux qui sont au-dessus de l'étendue. Et cela fut ainsi.
Dieu appela l'étendue ciel. Ainsi, il y eut un soir, et il y eut un matin: ce fut le second jour.
Dieu dit: Que les eaux qui sont au-dessous du ciel se rassemblent en un seul lieu, et que le sec paraisse. Et cela fut ainsi.
Dieu appela le sec terre, et il appela l'amas des eaux mers. Dieu vit que cela était bon.
Puis Dieu dit: Que la terre produise de la verdure, de l'herbe portant de la semence, des arbres fruitiers donnant du fruit selon leur espèce et ayant en eux leur semence sur la terre. Et cela fut ainsi.
La terre produisit de la verdure, de l'herbe portant de la semence selon son espèce, et des arbres donnant du fruit et ayant en eux leur semence selon leur espèce. Dieu vit que cela était bon.
Ainsi, il y eut un soir, et il y eut un matin: ce fut le troisième jour.
Dieu dit: Qu'il y ait des luminaires dans l'étendue du ciel, pour séparer le jour d'avec la nuit; que ce soient des signes pour marquer les époques, les jours et les années;
et qu'ils servent de luminaires dans l'étendue du ciel, pour éclairer la terre. Et cela fut ainsi.
Dieu fit les deux grands luminaires, le plus grand luminaire pour présider au jour, et le plus petit luminaire pour présider à la nuit; il fit aussi les étoiles.
Dieu les plaça dans l'étendue du ciel, pour éclairer la terre,
pour présider au jour et à la nuit, et pour séparer la lumière d'avec les ténèbres. Dieu vit que cela était bon.
Ainsi, il y eut un soir, et il y eut un matin: ce fut le quatrième jour.
Dieu dit: Que les eaux produisent en abondance des animaux vivants, et que des oiseaux volent sur la terre vers l'étendue du ciel.
Dieu créa les grands poissons et tous les animaux vivants qui se meuvent, et que les eaux produisirent en abondance selon leur espèce; il créa aussi tout oiseau ailé selon son espèce. Dieu vit que cela était bon.
Dieu les bénit, en disant: Soyez féconds, multipliez, et remplissez les eaux des mers; et que les oiseaux multiplient sur la terre.
Ainsi, il y eut un soir, et il y eut un matin: ce fut le cinquième jour.
Dieu dit: Que la terre produise des animaux vivants selon leur espèce, du bétail, des reptiles et des animaux terrestres, selon leur espèce. Et cela fut ainsi.
Dieu fit les animaux de la terre selon leur espèce, le bétail selon son espèce, et tous les reptiles de la terre selon leur espèce. Dieu vit que cela était bon.
Puis Dieu dit: Faisons l'homme à notre image, selon notre ressemblance, et qu'il domine sur les poissons de la mer, sur les oiseaux du ciel, sur le bétail, sur toute la terre, et sur tous les reptiles qui rampent sur la terre.
Dieu créa l'homme à son image, il le créa à l'image de Dieu, il créa l'homme et la femme.
Dieu les bénit, et Dieu leur dit: Soyez féconds, multipliez, remplissez la terre, et l'assujettissez; et dominez sur les poissons de la mer, sur les oiseaux du ciel, et sur tout animal qui se meut sur la terre.
Et Dieu dit: Voici, je vous donne toute herbe portant de la semence et qui est à la surface de toute la terre, et tout arbre ayant en lui du fruit d'arbre et portant de la semence: ce sera votre nourriture.
Et à tout animal de la terre, à tout oiseau du ciel, et à tout ce qui se meut sur la terre, ayant en soi un souffle de vie, je donne toute herbe verte pour nourriture. Et cela fut ainsi.
Dieu vit tout ce qu'il avait fait et voici, cela était très bon. Ainsi, il y eut un soir, et il y eut un matin: ce fut le sixième jour."
characters_fr <- french %>%
  tolower() %>%
  str_extract_all(pattern = "[:alpha:]") %>%  # keep letters only, one per element
  unlist() %>%
  table() %>%                                 # count each letter
  as.data.frame() %>%
  mutate(frequencies = round(Freq/sum(Freq)*100, digits = 2)) %>%
  arrange(desc(frequencies)) %>%
  janitor::clean_names()

Let’s now do the same for German:


german <- "Am Anfang schuf Gott Himmel und Erde.
Und die Erde war wüst und leer, und es war finster auf der Tiefe; und der Geist Gottes schwebte auf dem Wasser.
Und Gott sprach: Es werde Licht! und es ward Licht.
Und Gott sah, daß das Licht gut war. Da schied Gott das Licht von der Finsternis
und nannte das Licht Tag und die Finsternis Nacht. Da ward aus Abend und Morgen der erste Tag.
Und Gott sprach: Es werde eine Feste zwischen den Wassern, und die sei ein Unterschied zwischen den Wassern.
Da machte Gott die Feste und schied das Wasser unter der Feste von dem Wasser über der Feste. Und es geschah also.
Und Gott nannte die Feste Himmel. Da ward aus Abend und Morgen der andere Tag.
Und Gott sprach: Es sammle sich das Wasser unter dem Himmel an besondere Örter, daß man das Trockene sehe. Und es geschah also.
Und Gott nannte das Trockene Erde, und die Sammlung der Wasser nannte er Meer. Und Gott sah, daß es gut war.
Und Gott sprach: Es lasse die Erde aufgehen Gras und Kraut, das sich besame, und fruchtbare Bäume, da ein jeglicher nach seiner Art Frucht trage und habe seinen eigenen Samen bei sich selbst auf Erden. Und es geschah also.
Und die Erde ließ aufgehen Gras und Kraut, das sich besamte, ein jegliches nach seiner Art, und Bäume, die da Frucht trugen und ihren eigenen Samen bei sich selbst hatten, ein jeglicher nach seiner Art. Und Gott sah, daß es gut war.
Da ward aus Abend und Morgen der dritte Tag.
Und Gott sprach: Es werden Lichter an der Feste des Himmels, die da scheiden Tag und Nacht und geben Zeichen, Zeiten, Tage und Jahre
und seien Lichter an der Feste des Himmels, daß sie scheinen auf Erden. Und es geschah also.
Und Gott machte zwei große Lichter: ein großes Licht, das den Tag regiere, und ein kleines Licht, das die Nacht regiere, dazu auch Sterne.
Und Gott setzte sie an die Feste des Himmels, daß sie schienen auf die Erde
und den Tag und die Nacht regierten und schieden Licht und Finsternis. Und Gott sah, daß es gut war.
Da ward aus Abend und Morgen der vierte Tag.
Und Gott sprach: Es errege sich das Wasser mit webenden und lebendigen Tieren, und Gevögel fliege auf Erden unter der Feste des Himmels.
Und Gott schuf große Walfische und allerlei Getier, daß da lebt und webt, davon das Wasser sich erregte, ein jegliches nach seiner Art, und allerlei gefiedertes Gevögel, ein jegliches nach seiner Art. Und Gott sah, daß es gut war.
Und Gott segnete sie und sprach: Seid fruchtbar und mehrt euch und erfüllt das Wasser im Meer; und das Gefieder mehre sich auf Erden.
Da ward aus Abend und Morgen der fünfte Tag.
Und Gott sprach: Die Erde bringe hervor lebendige Tiere, ein jegliches nach seiner Art: Vieh, Gewürm und Tiere auf Erden, ein jegliches nach seiner Art. Und es geschah also.
Und Gott machte die Tiere auf Erden, ein jegliches nach seiner Art, und das Vieh nach seiner Art, und allerlei Gewürm auf Erden nach seiner Art. Und Gott sah, daß es gut war.
Und Gott sprach: Laßt uns Menschen machen, ein Bild, das uns gleich sei, die da herrschen über die Fische im Meer und über die Vögel unter dem Himmel und über das Vieh und über die ganze Erde und über alles Gewürm, das auf Erden kriecht.
Und Gott schuf den Menschen ihm zum Bilde, zum Bilde Gottes schuf er ihn; und schuf sie einen Mann und ein Weib.
Und Gott segnete sie und sprach zu ihnen: Seid fruchtbar und mehrt euch und füllt die Erde und macht sie euch untertan und herrscht über die Fische im Meer und über die Vögel unter dem Himmel und über alles Getier, das auf Erden kriecht.
Und Gott sprach: Seht da, ich habe euch gegeben allerlei Kraut, das sich besamt, auf der ganzen Erde und allerlei fruchtbare Bäume, die sich besamen, zu eurer Speise,
und allem Getier auf Erden und allen Vögeln unter dem Himmel und allem Gewürm, das da lebt auf Erden, daß sie allerlei grünes Kraut essen. Und es geschah also.
Und Gott sah alles an, was er gemacht hatte; und siehe da, es war sehr gut. Da ward aus Abend und Morgen der sechste Tag."
characters_gr <- german %>%
  tolower() %>%
  str_extract_all(pattern = "[:alpha:]") %>%  # keep letters only, one per element
  unlist() %>%
  table() %>%                                 # count each letter
  as.data.frame() %>%
  mutate(frequencies = round(Freq/sum(Freq)*100, digits = 2)) %>%
  arrange(desc(frequencies)) %>%
  janitor::clean_names()

Let’s now visualize how the rankings differ across these three languages. For this, I’m using the newggslopegraph()
function from the {CGPfunctions} package:

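# Rows are already sorted by decreasing frequency, so the row index is the rank
characters$rank <- seq_len(nrow(characters))
characters_fr$rank <- seq_len(nrow(characters_fr))
characters_gr$rank <- seq_len(nrow(characters_gr))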

characters_fr <- characters_fr %>%
  select(letters = x, rank) %>%
  mutate(language = "french")

characters_gr <- characters_gr %>%
  select(letters = x, rank) %>%
  mutate(language = "german")

characters <- characters %>%
  select(letters = x, rank) %>%
  mutate(language = "luxembourguish")

characters_df <- bind_rows(characters, characters_fr, characters_gr)

CGPfunctions::newggslopegraph(characters_df, 
                              language,
                              rank,
                              letters,
                              Title = "Character frequency ranking for the Luxembourguish official languages",
                              SubTitle = NULL,
                              Caption = NULL,
                              YTextSize = 4) 
## Registered S3 methods overwritten by 'lme4':
##   method                          from
##   cooks.distance.influence.merMod car 
##   influence.merMod                car 
##   dfbeta.influence.merMod         car 
##   dfbetas.influence.merMod        car


characters_df 
##    letters rank       language
## 1        e    1 luxembourguish
## 2        n    2 luxembourguish
## 3        s    3 luxembourguish
## 4        a    4 luxembourguish
## 5        i    5 luxembourguish
## 6        t    6 luxembourguish
## 7        d    7 luxembourguish
## 8        r    8 luxembourguish
## 9        h    9 luxembourguish
## 10       u   10 luxembourguish
## 11       g   11 luxembourguish
## 12       m   12 luxembourguish
## 13       o   13 luxembourguish
## 14       l   14 luxembourguish
## 15       c   15 luxembourguish
## 16       w   16 luxembourguish
## 17       é   17 luxembourguish
## 18       k   18 luxembourguish
## 19       f   19 luxembourguish
## 20       ä   20 luxembourguish
## 21       z   21 luxembourguish
## 22       p   22 luxembourguish
## 23       j   23 luxembourguish
## 24       ë   24 luxembourguish
## 25       b   25 luxembourguish
## 26       v   26 luxembourguish
## 27       ü   27 luxembourguish
## 28       q   28 luxembourguish
## 29       x   29 luxembourguish
## 30       y   30 luxembourguish
## 31       e    1         french
## 32       u    2         french
## 33       i    3         french
## 34       t    4         french
## 35       s    5         french
## 36       a    6         french
## 37       l    7         french
## 38       r    8         french
## 39       n    9         french
## 40       d   10         french
## 41       o   11         french
## 42       c   12         french
## 43       m   13         french
## 44       p   14         french
## 45       é   15         french
## 46       q   16         french
## 47       v   17         french
## 48       f   18         french
## 49       b   19         french
## 50       è   20         french
## 51       x   21         french
## 52       y   22         french
## 53       j   23         french
## 54       à   24         french
## 55       z   25         french
## 56       g   26         french
## 57       h   27         french
## 58       ç   28         french
## 59       î   29         french
## 60       e    1         german
## 61       n    2         german
## 62       d    3         german
## 63       a    4         german
## 64       s    5         german
## 65       r    6         german
## 66       t    7         german
## 67       i    8         german
## 68       u    9         german
## 69       h   10         german
## 70       g   11         german
## 71       c   12         german
## 72       l   13         german
## 73       m   14         german
## 74       f   15         german
## 75       o   16         german
## 76       b   17         german
## 77       w   18         german
## 78       ü   19         german
## 79       ß   20         german
## 80       v   21         german
## 81       z   22         german
## 82       p   23         german
## 83       j   24         german
## 84       k   25         german
## 85       ö   26         german
## 86       ä   27         german

Certain things pop out of this plot: the rankings of the German and Luxembourgish languages are
more similar than the rankings of French and Luxembourgish, but overall, the three languages have
practically the same top 10 characters. Using the same base as the BÉPO layout should be
comfortable enough, but the characters h and g, which are not very common in French, are much
more common in Luxembourgish, and should thus be better placed. I would advise against using the
German ergonomic/optimized layout, however, because as I said in the beginning, French is still
probably the most written language, certainly more often written than German. So even though the
frequencies of characters are very similar between Luxembourgish and German, I would still prefer
to use the French BÉPO layout.
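
The similarity claim can also be checked numerically. The sketch below re-enters the letters in rank order from the raw data printed above and computes Spearman’s rank correlation on the letters that each pair of languages shares. The vector names and the `rank_similarity()` helper are hypothetical, and restricting the comparison to shared letters is an assumption of this sketch, not something the slopegraph does.

```r
# Letters in rank order, copied from the raw data listing above
rank_lu <- c("e", "n", "s", "a", "i", "t", "d", "r", "h", "u", "g", "m", "o", "l", "c",
             "w", "é", "k", "f", "ä", "z", "p", "j", "ë", "b", "v", "ü", "q", "x", "y")
rank_fr <- c("e", "u", "i", "t", "s", "a", "l", "r", "n", "d", "o", "c", "m", "p", "é",
             "q", "v", "f", "b", "è", "x", "y", "j", "à", "z", "g", "h", "ç", "î")
rank_de <- c("e", "n", "d", "a", "s", "r", "t", "i", "u", "h", "g", "c", "l", "m", "f",
             "o", "b", "w", "ü", "ß", "v", "z", "p", "j", "k", "ö", "ä")

# Hypothetical helper: Spearman correlation of the positions of shared letters
rank_similarity <- function(a, b) {
  common <- intersect(a, b)
  cor(match(common, a), match(common, b), method = "spearman")
}

rank_similarity(rank_lu, rank_de)  # Luxembourgish vs German
rank_similarity(rank_lu, rank_fr)  # Luxembourgish vs French

# Positions of h and g in each ranking
match(c("h", "g"), rank_lu)
match(c("h", "g"), rank_fr)
```

With these rankings, the Luxembourgish-German correlation comes out higher than the Luxembourgish-French one, and h and g sit at ranks 9 and 11 in Luxembourgish against 27 and 26 in French.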

I don’t know if there ever will be an ergonomic/optimized layout for Luxembourgish, but I sure
hope that more and more people will start using layouts such as the BÉPO, which are really great
to use. It takes some time to get used to, but in general after about one week of usage, maybe two,
you should be as fast as you were on the legacy layout.

Hope you enjoyed! If you found this blog post useful, you might want to follow
me on twitter for blog post updates and watch my
youtube channel. If you want to support
my blog and channel, you could buy me an espresso or
paypal.me, or buy my ebook on Leanpub.
