Changing keyboard layout on Touchpad

From WebOS Internals
Revision as of 21:46, 20 October 2011 by Compvir (talk | contribs) (→‎V. Offsets: 3.0.4 offsets)

Virtual keyboard layouts on the TouchPad are hardcoded into the LunaSysMgr binary. Therefore, editing the keyboard layouts requires a binary patch to the LunaSysMgr executable. HP/Palm has not released the source for LunaSysMgr.

This article describes how to perform that patch. Patches for Dvorak and other layouts can be found in this PreCentral thread.

Built-in layouts

There are three built-in keyboard layouts: QWERTY, QWERTZ and AZERTY. On some keys of these layouts, extended characters can be accessed with a long tap. The long tap looks up the extended character set defined for that key and displays those characters for selection. For example, the "a" key may offer á, à, â or ã.

I. Keyboard layouts

Each keyboard layout consists of a main layout and 3 interchangeable bottom rows on the keyboard for different applications.

The main keyboard layout describes 5 rows of 12 keys each.

The 3 bottom rows are 12 keys each.

In total, there are 8 rows of 12 keys each.

Each key is described by 4 double words (4 bytes each), i.e. 16 bytes per key.

  1. Button type (big endian)
    Only the last 2 bytes are used.
    They are mostly a set of flags, but too little is known to decode all of them.
    1. The last byte is the button type:
      1. BF - no button, i.e. the button exists and types its character when pressed, but it is invisible (on the normal keyboard, tap to the right of 0 or to the left of 1 to see what is typed)
      2. 3F - normal button
      3. 40 - long button (like the spacebar)
    2. The second-to-last byte is the length class:
      1. 40 - short (mostly spacers)
      2. 80 - normal width
      3. C0 - long (Shift etc.)
      4. A0 - very long (spacebar)
    A normal button looks like 0x0000803F.
  2. UTF-16LE code of the main symbol, followed by a 2-byte source code (0x0000 for a normal symbol, 0x0001 for a special key such as Shift or Enter, 0x2001 for text from a static variable)
  3. UTF-16LE code of the secondary symbol, followed by a 2-byte source code (same values as for the main symbol)
  4. Extended character set address (little endian)
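
The record layout above can be sketched in Python. This is only an illustration, not part of the patching script: the function names and the sample record are made up, and it assumes that the hex values quoted in the list (0x0000803F, 0x0001, 0x2001) are written in file byte order.

```python
import struct

# Source codes as they appear in the file, in file byte order (an assumption).
SOURCE_KINDS = {
    b"\x00\x00": "normal",
    b"\x00\x01": "special",
    b"\x20\x01": "static",
}

def decode_symbol(raw: bytes):
    """Split a 4-byte symbol field into (kind, UTF-16 code unit)."""
    (code,) = struct.unpack("<H", raw[:2])
    return SOURCE_KINDS.get(raw[2:4], "unknown"), code

def decode_key(record: bytes) -> dict:
    """Decode one key record made of four 4-byte fields."""
    if len(record) != 16:
        raise ValueError("a key record is 4 dwords = 16 bytes")
    return {
        "width_class": record[2],   # second-to-last byte of the first dword
        "button_type": record[3],   # last byte of the first dword
        "main": decode_symbol(record[4:8]),
        "sec": decode_symbol(record[8:12]),
        "ext_addr": struct.unpack("<I", record[12:16])[0],
    }

# Made-up record: normal button, "q" / "[", pointer 0x12345678.
sample = bytes.fromhex("0000803F" "71000000" "5B000000" "78563412")
key = decode_key(sample)
```

Here `key["button_type"]` is 0x3F and `key["width_class"]` is 0x80, matching the "normal button" values from the list.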

About the symbols (fields 2 and 3): if the most significant byte (the rightmost one in little-endian order) is 01, the field is not a symbol but either text from a static variable defined in the file (smileys, .com, etc.) or a special button such as Shift or Enter. If the main symbol is considered regional by the system (this is hardcoded somewhere in LunaSysMgr), the behaviour is as follows:

  • When shift is pressed, the capital letter is produced.
  • Without shift, the small letter is produced.
  • With caps lock on, the capital letter is produced.
  • When switched to secondary keys ([]-/ button), the secondary symbol is
    always produced.
  • The button is white.

If the main symbol is not considered regional (unfortunately, Russian letters are not considered regional):

  • When shift is pressed, the secondary symbol is produced.
  • Without shift, the main symbol is produced (lowercase, if applicable).
  • With caps lock on, the main symbol is produced (uppercase, if applicable).
  • When switched to secondary keys ([]-/ button), the behaviour above
    does not change, and neither does the look of the keys.
  • The button is light grey.

To make non-regional keys act like regional ones, the only workaround is to duplicate them as main and secondary symbols.
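
The two sets of rules can be summarised as a small Python model. It is a reading of the rules above, not code taken from LunaSysMgr; the function name is hypothetical, and the case where shift and caps are both active is not described in this article, so the model simply lets shift win.

```python
def produced_symbol(main, sec, regional,
                    shift=False, caps=False, secondary_mode=False):
    """Return the symbol a key would type under the rules described above."""
    if regional:
        if secondary_mode:
            return sec                      # secondary keys: always the secondary symbol
        return main.upper() if (shift or caps) else main.lower()
    # Non-regional key: switching to secondary keys changes nothing.
    if shift:
        return sec                          # assumption: shift takes priority over caps
    return main.upper() if caps else main.lower()
```

For example, a regional key "q" / "[" types "Q" with shift, while a non-regional key "1" / "!" types "!" with shift and keeps typing "1" after switching to secondary keys.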

An alternative is to modify some code in LunaSysMgr to make it treat the characters as regional. This approach was researched by Isaac Garzón (isagar2004) and is automated in the script described in section III, Patching script.

Note that any all-zero key is simply skipped during keyboard generation, so in practice a row can contain fewer than 12 buttons.

The extended character set address is an internal pointer, not a file offset, so it takes some work to find which address corresponds to which character set.

II. Extended characters sets

There are 65 extended character sets.

Namely:

  • 17 for letters (E R T Y U I O P A S D G L Z C N M). They are shared by all 3 keyboard layouts.
  • 3*10 for numbers (1 2 3 4 5 6 7 8 9 0). Each layout has its own set for each number.
  • 5+5+6 for punctuation and .com. There are 4 punctuation sets each for QWERTY and QWERTZ and 5 for AZERTY, plus a separate .com set for each layout.
  • One set for http:// and one for choosing the keyboard size. They are shared by all 3 layouts.


Each set is a sequence of double words holding the UTF-16 codes of the characters (or links to static text), terminated by a zero dword.

For example, for the A-letter set, it is:

   41000000 c0000000 c1000000 c2000000 c3000000 c4000000 c5000000 e6000000 aa000000 00000000
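
As an illustration, the dump above can be decoded with a few lines of Python. This is a sketch with a hypothetical function name; it assumes each dword is a UTF-16LE code unit followed by a 2-byte source code, and it leaves entries with a non-zero source code (static text, special keys) as raw hex.

```python
import struct

def decode_set(hex_dump: str) -> list:
    """Decode a zero-terminated extended character set from a hex dump."""
    data = bytes.fromhex(hex_dump)
    entries = []
    for pos in range(0, len(data) - 3, 4):
        dword = data[pos:pos + 4]
        if dword == b"\x00\x00\x00\x00":
            break                                   # zero dword ends the set
        (code,) = struct.unpack("<H", dword[:2])
        # Plain characters have a zero source code; keep anything else as hex.
        entries.append(chr(code) if dword[2:] == b"\x00\x00" else dword.hex())
    return entries

a_set = decode_set(
    "41000000 c0000000 c1000000 c2000000 c3000000 "
    "c4000000 c5000000 e6000000 aa000000 00000000"
)
```

The result holds nine entries, starting with "A" (0x41) and ending with "ª" (0xAA).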


III. Patching script

The patching script runs on the TouchPad itself: it generates the required layout from a JSON file and then patches LunaSysMgr on the fly.

It is easy to use. All you need to do is enable developer mode and have novaterm to access the console (you can also use ssh if you have installed openssh or dropbear from the Preware feeds).

Disclaimer

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

Use it at your own risk.

It is a shell script which, besides shell built-ins, uses the following commands:

  • echo (for feedback and for passing variables through pipelines)
  • sha1sum (for determining the version)
  • mv, cp, rm (mostly operations on temp files, as file operations turned out to be faster than string variables)
  • hexdump (reading hex codes)
  • dd (patching the main file)
  • sed (parsing JSON)
  • diff, ls, awk (checking differences during the self test and checking size during patching)
  • date (for the self test and timing only)
  • sleep (for pausing after mount and Luna stop, as some people reported occasional errors)
  • initctl (to stop and start Luna during replacement)
  • mount (to remount / as rw and ro)

As all these commands are part of the core *nix toolset, they are already present on the TouchPad, so there is no need to install any other software.

Script description

The script consists of several files:

  • patch.sh - main file that governs the process
  • funcs.sh - repository of functions used during the patching process
  • map.sh - variables mapping human-readable names to machine codes where necessary (like width and class of buttons)
  • vars - variables that are subject to change:
    • VERSIONS - Set of versions that are used during the version check
    • FILEPATH - Path to LunaSysMgr. Usually /usr/bin/LunaSysMgr, but it can be changed to run the script on a Linux machine or elsewhere instead of the TouchPad.
    • SHA1SUMPARAMS - Parameters for the sha1sum command, as they differ between the TouchPad and a Debian-based OS (-cs on the TouchPad and --check --status on Debian)
  • <version>.offsets.sh files - They contain offset and sha1sum information for each version of LunaSysMgr. DO NOT EDIT THESE FILES UNLESS YOU ARE ABSOLUTELY SURE OF WHAT YOU ARE DOING.
  • rus.json (optional) - Contains an example layout file. It describes the author's original Cyrillic Russian layout replacement for AZERTY. DO NOT USE THIS FILE AS A STARTING POINT. Use it only as an example.

Definitions

  • Main file - the file given as FILEPATH in the "vars" file
  • Backup file - the file with path "FILEPATH.orig"
  • Original file - the main or backup file whose version is detected. If both versions are detected, the main file is used.

IF THE MAIN AND BACKUP FILES HAVE DIFFERENT BUT DETECTABLE VERSIONS (e.g. AFTER AN OTA UPDATE), THE BACKUP FILE WILL BE REPLACED WITH THE NEW ONE DURING PATCHING.
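
The selection rule in these definitions can be sketched as follows. The real script is a shell script that calls sha1sum; this Python version only illustrates the logic, with hypothetical function names and a table of known checksums that the caller must supply (no real checksums are given here).

```python
import hashlib

def version_of(data: bytes, known_sums: dict):
    """Return the version whose SHA-1 matches the file contents, or None."""
    return known_sums.get(hashlib.sha1(data).hexdigest())

def pick_original(main_version, backup_version):
    """Choose the original file: the main file wins if its version is detected."""
    if main_version is not None:
        return "main", main_version
    if backup_version is not None:
        return "backup", backup_version
    return None, None
```

With a main file that is already patched (version not detected) and a backup whose version is detected, `pick_original` returns the backup, which matches the "Backup file will be used" message in the script output.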

Script Usage

The script supports the following actions.

  • sh patch.sh --help
    Displays information about the developer and a short usage summary.
  • sh patch.sh version
    Checks the version of the main and backup files and reports whether patching or generation is available.
  • sh patch.sh generate
    If a detectable version is found, generates (or forcibly regenerates) the <version>.data and <version>.template.json files from the original file.
    The first one is used afterwards for patching and is kept mainly to save time later. It can safely be deleted after each execution; the script regenerates it whenever needed.
    The second contains the full JSON description of the layouts in the original file.
  • sh patch.sh check
    Performs a self test of the script. The operations are as follows:
    • Checks the versions of the main and backup files
    • Generates the <version>.data file if not present. If present, just reads the data from it
    • Generates <version>.template.json if not present.
    • Reads and parses <version>.template.json.
    • Copies the original file to a temp location and patches it
    • Compares the patched and original files and echoes the result
    • Deletes the patched file
    If the self test fails, it is strongly recommended NOT to use this script and to contact the developer.
    WARNING: Parsing JSON can take 5-8 minutes for a full template. Be patient.
  • sh patch.sh check <layout file>
    Almost the same as the self test, but it uses the given template and does not delete the patched file; instead it echoes its name (<version>.patched).
    Also, if the REGIONPATCH parameter is supplied in the JSON file, it patches code sections. ONLY AVAILABLE FOR 3.0.2 VERSIONS OF DEVICE AND EMULATOR.
    WARNING: Parsing JSON can take 5-8 minutes for a full template. Be patient.
  • sh patch.sh patch <layout file>
    Main patching procedure. After building the patched file, it backs up the main file (if needed) and replaces the main file with the patched one.
    Also, if the LANGCODE parameter is supplied, it adds lines of the format

<syntaxhighlight lang="javascript"> {"languageName":"KeyBoardPatch_<langcode>","languageCode":"<langcode>","countries":[]}, </syntaxhighlight>

    to /usr/lib/luna/customization/locale.txt (or /etc/palm/locale.txt if missing). ONLY AVAILABLE FOR 3.0.2 VERSIONS OF DEVICE AND EMULATOR.
    Before adding langcodes, it deletes old ones (from previous patching runs).
    If any of the specified langcodes are already present in locale.txt, the script skips them with a message.
    WARNING: Parsing JSON can take 5-8 minutes for a full template. Be patient.
    WARNING: Replacing the main file is followed by a Luna restart. Do not leave any unsaved data open, and do not be alarmed.
    WARNING: If LANGCODE is supplied, a full device restart is required for the changes to take effect.
  • sh patch.sh revert
    Restores backed up file if it has detectable version.
    Also removes any KeyBoardPatch_ langcodes from /usr/lib/luna/customization/locale.txt (or /etc/palm/locale.txt if missing)
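
For illustration, the locale.txt line format used by the patch action can be reproduced in Python. This is a sketch of the line format only, not the script's own code (the script is a shell script), and the function name is hypothetical.

```python
import json

def locale_line(langcode: str) -> str:
    """Build one locale.txt entry in the format shown for the patch action."""
    entry = {
        "languageName": "KeyBoardPatch_" + langcode,
        "languageCode": langcode,
        "countries": [],
    }
    # Compact separators reproduce the format without spaces; the trailing
    # comma is part of the line as it is written to locale.txt.
    return json.dumps(entry, separators=(",", ":")) + ","
```

Calling `locale_line("ru")` yields a line with languageName KeyBoardPatch_ru and languageCode ru.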

WARNING: While running, the script may create the following files in its folder:

  • tmp, tmp0
    Temp files used by the script during template generation and parsing. They should be deleted automatically after execution. If the script is interrupted, feel free to delete them yourself (if they get in your way).
  • <version>.data
    File that contains important layout variables: offsets, button counts, virtual addresses and so on. They are needed for almost every operation of the script. If deleted, the file will be regenerated. It is strongly advised not to delete it, because it speeds up the process.
  • <version>.template.json
    JSON file containing the layout info for the user

JSON layout file structure and editing hints

The layout file has a standard JSON structure.

Here is a short example:

<syntaxhighlight lang="javascript"> {

   "layouts": {
       "QWERTY": {
           "0": {
               "0": {
                   "WIDTH": "none",
                   "CLASS": "invisible",
                   "MAIN": {"TYPE": "C", "CHAR": "Q"},
                   "SEC": {"TYPE": "C", "CHAR": "["},
                   "EXT": ""
               },
               "1": {
                   "WIDTH": "normal",
                   "CLASS": "normal",
                   "MAIN": {"TYPE": "C", "CHAR": "1"},
                   "SEC": {"TYPE": "C", "CHAR": "!"},
                   "EXT": "SET0"
               }
           }
       }
   },
   "sets": {
       "SET0": {
           "0": {"TYPE": "C", "CHAR": "1"},
           "1": {"TYPE": "C", "CHAR": "!"},
           "2": {"TYPE": "C", "CHAR": "¹"},
           "3": {"TYPE": "C", "CHAR": "¼"},
           "4": {"TYPE": "C", "CHAR": "½"},
           "5": {"TYPE": "C", "CHAR": "¡"}
       }
   },
   "params": {
       "REGIONPATCH": "0",
       "LANGCODE": {
           "0": ""
       }
   }

} </syntaxhighlight>

The file consists of 3 main objects: layouts, sets and params.

  1. layouts
    "layouts" describes the 3 main layouts:
    1. QWERTY
    2. QWERTZ
    3. AZERTY
    Each layout consists of at most 8 rows (0 - 7): 0 - number row, 1-3 - letter rows, 4 - bottom row, 5 - default bottom row, 6 - bottom row for URLs, 7 - bottom row for email
    Each row consists of at most 12 buttons (0 - 11)
    Each button is described by several parameters:
    1. WIDTH
      The width of the button. It can take 5 values:
      1. "none" - No special width set. Autosized by the system
      2. "short" - Short button, roughly half the width of a normal one
      3. "normal" - Standard-sized button
      4. "long" - Long button, like shift and enter in the original layouts.
      5. "spacebar" - Very long button. Used only for the spacebar and the language/special character switcher. Also used in conjunction with CLASS="spacebar"
    2. CLASS
      Class or type of the button. It can be:
      1. "none" - no class. "If there is no class, then there is no button" (almost a quotation :-) )
      2. "normal" - standard button class
      3. "invisible" - Self-explanatory. The button exists but cannot be seen. Mostly used for indents. Such buttons can have characters assigned and will even type them (look to the left and right of the number row in QWERTY and try typing there)
      4. "spacebar" - Used for the spacebar. (Presumably the width is not stored in just 1 byte but extends into half of the class byte as well.)
    3. EXT
      The name of the set of extended characters that are accessed by a long press. If a button has no extended characters, an empty string ("") is used
    4. MAIN and SEC
      Main and secondary character respectively. Each has two parameters:
      1. TYPE
        Type of character
        1. "C" - normal Character
        2. "S" - Special character (like shift, enter, backspace etc.)
        3. "V" - character from a static Variable (smileys, .com, http:// etc.)
      2. CHAR
        The character itself.
        If the type is "C", it is just a normal UTF-8 character
        If the type is "S" or "V", its value is a hex numeric constant.
  2. sets
    Sets are sets of extended characters that are accessed by long tap on a button.
    There are 65 sets (0-64) and each set is named SET#.
    Each set consists of several characters (0,1,...)
    Each character is described by TYPE and CHAR which are absolutely the same as in layout description.
  3. params
    "params" contains special parameters for the patching process. It is optional and should be used with CAUTION.
    Right now it supports only 2 options:
    1. REGIONPATCH
      This option enables the patch of the regional status of keys made by Isaac Garzón (isagar2004).
      By default it is off. Enabling it, i.e. setting the value to "1", makes some changes in the ASM code of LunaSysMgr and alters its behaviour in assigning key backgrounds and look
    2. LANGCODE
      This option exists only for aesthetic reasons. It adds a two-character abbreviation to /usr/lib/luna/customization/locale.txt (or /etc/palm/locale.txt if missing) with the name KeyBoardPatch_<langcode>, so you can assign it to the chosen keyboard in language selection. Any number of additional langcodes can be given. Just specify them like

<syntaxhighlight lang="javascript"> "0": "ru", "1": "ua" </syntaxhighlight>

etc.
WARNING: A full device reboot is required for these changes to take effect.
WARNING: DO NOT USE THIS OPTION WITH ANY LOCALIZATION PACKAGE.
NOTICE: It does not work with version 3.0.4 yet, as languages are now taken not from localizations but from the spell checker. So it is still not clear how to add new langcodes.

The example above defines the first two buttons (0, 1) of the number row (0) in the QWERTY layout. The first is an invisible button with characters Q and [, and the second is a normal button with 1 and !, with SET0 bound to it.

SET0 consists of 6 characters (0-5): 1, !, ¹, ¼, ½, ¡
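
Before copying an edited layout file to the device, the values described above can be checked on a computer. The following Python sketch is not part of the patching script; the function name is hypothetical, and it only checks the value lists and index ranges given in this section. Parameters that were deleted from the template are skipped.

```python
WIDTHS = {"none", "short", "normal", "long", "spacebar"}
CLASSES = {"none", "normal", "invisible", "spacebar"}
TYPES = {"C", "S", "V"}

def check_layouts(doc: dict) -> list:
    """Return a list of human-readable problems found in the "layouts" object."""
    problems = []
    for name, rows in doc.get("layouts", {}).items():
        for r, buttons in rows.items():
            if not (r.isdigit() and int(r) <= 7):
                problems.append(f"{name}: row {r} is outside 0-7")
            for b, key in buttons.items():
                where = f"{name}[{r}][{b}]"
                if not (b.isdigit() and int(b) <= 11):
                    problems.append(f"{where}: button index is outside 0-11")
                if "WIDTH" in key and key["WIDTH"] not in WIDTHS:
                    problems.append(f"{where}: unknown WIDTH {key['WIDTH']!r}")
                if "CLASS" in key and key["CLASS"] not in CLASSES:
                    problems.append(f"{where}: unknown CLASS {key['CLASS']!r}")
                for field in ("MAIN", "SEC"):
                    sym = key.get(field, {})
                    if "TYPE" in sym and sym["TYPE"] not in TYPES:
                        problems.append(f"{where}: unknown {field} TYPE {sym['TYPE']!r}")
    return problems
```

A typical use is `check_layouts(json.load(open("my_layout.json", encoding="utf-8")))` after `import json`, where the file name is only an example; an empty list means no problems were found by these checks.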

WARNING: The symbols \ and " must be escaped in JSON, so to use them type "\\" and "\"" respectively.
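
If in doubt about escaping, a JSON library can produce the correct text. A minimal Python sketch (the helper name is hypothetical):

```python
import json

def symbol(char: str, kind: str = "C") -> str:
    """Serialise one TYPE/CHAR pair; json.dumps escapes \\ and " as needed."""
    return json.dumps({"TYPE": kind, "CHAR": char}, ensure_ascii=False)

backslash = symbol("\\")   # CHAR is written as "\\" in the JSON text
quote = symbol('"')        # CHAR is written as "\"" in the JSON text
```

`ensure_ascii=False` keeps characters such as ¹ or ½ as plain UTF-8 instead of \u escapes, matching the style of the template.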

NOTICE: If an object or a parameter of an object has not been changed from the original template, you can delete it from the template to speed up parsing, but do it ONLY IF YOU REALLY KNOW how JSON is structured and are sure that you won't ruin its structure.


In the keyboard layout, you can easily change the button type flags and the primary and secondary symbols.

Characters in the extended character sets can be changed too.

You can choose which set is assigned to which button by changing EXT in the layout.

WARNING: Do not change the names of the sets, or they won't be read.

WARNING: It is not possible to increase the number of rows, buttons or characters in a set, as the size of the data in the binary file cannot grow. Any extra characters, rows, buttons, and even layouts and sets will be ignored during patching.

NOTICE: You can, however, decrease these numbers (except, presumably, the number of rows) by filling unneeded buttons and extended symbols with zeros, as follows.

To remove a button, set its WIDTH and CLASS to "none", the MAIN and SEC TYPE to "C" and CHAR to an empty string (""), and leave EXT empty too.

To remove an extended character from a set, just set TYPE to "C" and CHAR to an empty string.

To remove a SET (or rather, to stop using it), remove the reference to it from the EXT parameter in the layout.
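
The removal rules can be expressed as two small Python helpers for editing a loaded template. They are an illustration with hypothetical names, not part of the patching script.

```python
EMPTY_SYMBOL = {"TYPE": "C", "CHAR": ""}

def removed_button() -> dict:
    """A button entry that is zeroed out and therefore skipped."""
    return {
        "WIDTH": "none",
        "CLASS": "none",
        "MAIN": dict(EMPTY_SYMBOL),
        "SEC": dict(EMPTY_SYMBOL),
        "EXT": "",
    }

def trim_set(chars: dict, keep: int) -> dict:
    """Blank every character of a set from index `keep` on; slot count is unchanged."""
    return {
        index: (sym if int(index) < keep else dict(EMPTY_SYMBOL))
        for index, sym in chars.items()
    }
```

For example, `trim_set(doc["sets"]["SET0"], 3)` keeps the first three characters of SET0 and blanks the rest, where `doc` is the parsed layout file.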

Patching process

The patching process is straightforward.

First, download the archive with the script (keypatch.zip) and extract it.

You will get a folder named "keypatch".

Copy this folder to your device.

Now connect to your device via novaterm or ssh and enter the directory.

Run the template generation:

#sh patch.sh generate
Checking main file...
Main version not detected
Checking backup file...
Backup file 3.0.2 device version found
Backup file will be used
Reading data from file /usr/bin/LunaSysMgr.orig
Storing data into 302.data
Generating template
Writing template to file 302.template.json
Ok.

Now copy this template to your computer and open it with an editor that can save UTF-8 files without a BOM (Byte Order Mark) and with Unix-style line endings, for example Notepad++ (on Windows) or vim (on *nix). Editing instructions are in the previous subsection, JSON layout file structure and editing hints.

After editing, copy the new file into the same folder and run the patch or check action (use "check" instead of "patch" if you want to inspect the patched file first or replace the original file manually):

sh patch.sh patch <new layout file>
Main version not detected
Checking backup file...
Backup file 3.0.2 device version found
Backup file will be used
Found file data 302.data
Reading template <new layout file> and redefining variables
Copying source file to temp location
Applying patch to tempfile
Patched file 302.patched generated
Installation started
Remounting / as rw
Original file aleready backed up in /usr/bin/LunaSysMgr.orig
Stopping Luna
Coping file 302.patched to /usr/bin/LunaSysMgr
Starting Luna
Remounting / back as ro
Removing temp file 302.patched
Ok.

or

sh patch.sh check <new layout file>
Checking main file...
Main version not detected
Checking backup file...
Backup file 3.0.2 device version found
Backup file will be used
Found file data 302.data
Mon Aug 29 15:08:50 MSD 2011
Reading template <new layout file> and redefining variables
Mon Aug 29 15:09:07 MSD 2011
Copying source file to temp location
Applying patch to tempfile
Patched file 302.patched generated
Mon Aug 29 15:09:15 MSD 2011
Ok.


IV. Changing layout manually

In the keyboard layout, you can easily change the button type flags and the primary and secondary symbols.

The extended character set address can be changed too, but you have to deduce it from the initial values in the layouts. It is not very hard but takes time.


Characters in the extended character sets can be changed too. Although you cannot change the number of characters in a particular set, you can pick a set with the required number of symbols and reassign it to the desired button by changing the address in the layout.
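
Any manual edit must keep the file size unchanged. A hex editor works, and so does a short Python helper such as the sketch below; the function name is hypothetical and the offsets to pass in are the ones listed in section V.

```python
def patch_bytes(image: bytearray, offset: int, new: bytes) -> None:
    """Overwrite bytes in place; refuse anything that would change the size."""
    if offset < 0 or offset + len(new) > len(image):
        raise ValueError("patch would run past the end of the file")
    image[offset:offset + len(new)] = new
```

A possible workflow is to read a copy of LunaSysMgr into a `bytearray`, call `patch_bytes` for each changed dword, and write the result to a new file, leaving the original untouched.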


Once you are satisfied with the result, you need to replace LunaSysMgr on the device with the newly edited file. For this you need shell access to the device. You can use either an ssh server from the Preware packages or novaterm. You cannot use an on-device terminal, as Luna has to be stopped during the process.

First, copy the file to the device using USB drive mode (or scp it). Let the new file be /media/internal/LunaSysMgr.

After that, run the following commands.

Remount / for write access:

#mount -o remount,rw /

Back up the original file:

#cp -dpR /usr/bin/LunaSysMgr /usr/bin/LunaSysMgr.orig

Stop Luna:

#initctl stop LunaSysMgr

Copy the new file:

#cp /media/internal/LunaSysMgr /usr/bin/LunaSysMgr

Start Luna again:

#initctl start LunaSysMgr

Remount / as read-only again:

#mount -o remount,ro /

And enjoy your new keyboard.


V. Offsets

3.0.0 emul   3.0.0 device   3.0.2 device   3.0.4 device   Keyboard / rows
0x003EE840   0x003BEE80     0x003CACB0     0x3CA920       QWERTY Main layout
0x003EEC00   0x003BF240     0x003CB070                    QWERTY Default bottom row
0x003EECC0   0x003BF300     0x003CB130                    QWERTY URL bottom row
0x003EED80   0x003BF3C0     0x003CB1F0                    QWERTY Email bottom row
0x003EEE40   0x003BF480     0x003CB2B0     0x3CAF20       QWERTZ Main layout
0x003EF200   0x003BF840     0x003CB670                    QWERTZ Default bottom row
0x003EF2C0   0x003BF900     0x003CB730                    QWERTZ URL bottom row
0x003EF380   0x003BF9C0     0x003CB7F0                    QWERTZ Email bottom row
0x003EF440   0x003BFA80     0x003CB8B0     0x3CB520       AZERTY Main layout
0x003EF800   0x003BFE40     0x003CBC70                    AZERTY Default bottom row
0x003EF8C0   0x003BFF00     0x003CBD30                    AZERTY URL bottom row
0x003EF980   0x003BFFC0     0x003CBDF0                    AZERTY Email bottom row
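
The offsets in the table are consistent with 16-byte key records stored row by row, 12 keys per row, with the three bottom rows directly following the 5-row main layout. Under that reading, the file offset of a single key can be computed as in this Python sketch (the function name is hypothetical):

```python
KEY_SIZE = 16        # 4 dwords per key
KEYS_PER_ROW = 12
ROWS_PER_LAYOUT = 8  # 5 main rows + 3 bottom rows

def key_offset(main_layout_offset: int, row: int, col: int) -> int:
    """File offset of the key at (row, col), counted from the main layout offset."""
    if not (0 <= row < ROWS_PER_LAYOUT and 0 <= col < KEYS_PER_ROW):
        raise ValueError("row must be 0-7 and col 0-11")
    return main_layout_offset + (row * KEYS_PER_ROW + col) * KEY_SIZE
```

For the 3.0.0 emulator, `key_offset(0x003EE840, 5, 0)` gives 0x003EEC00, the listed offset of the QWERTY default bottom row.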

Extended character set offsets are too numerous to list in full. Only the offset of the first set and the order of the sets are given here. As each set ends with 0x00000000, the sets are easy to tell apart.

The offset of the first set (QWERTY 1) is:

3.0.0 emul   3.0.0 device   3.0.2 device   3.0.4 device
0x003C33DC   0x00388BB4     0x00393608     0x39040C

The sets follow in this order (each line names the key to which the character set is bound):

QWERTY 1
QWERTY 2
QWERTY 3
QWERTY 4
QWERTY 5
QWERTY 6
QWERTY 7
QWERTY 8
QWERTY 9
QWERTY 0
E
R
T
Y
U
I
O
P
A
S
D
G
L
Z
C
N
M
QWERTY ,/
QWERTY .?
QWERTY '"
QWERTY -_
Hide keyboard(size of it)
URL key /
QWERTY .com
QWERTZ 1
QWERTZ 2
QWERTZ 3
QWERTZ 4
QWERTZ 5
QWERTZ 6
QWERTZ 7
QWERTZ 8
QWERTZ 9
QWERTZ 0
QWERTZ ,;
QWERTZ .:
QWERTZ #?
QWERTZ -'
QWERTZ .com
AZERTY 1
AZERTY 2
AZERTY 3
AZERTY 4
AZERTY 5
AZERTY 6
AZERTY 7
AZERTY 8
AZERTY 9
AZERTY 0
AZERTY ,?
AZERTY .;
AZERTY :/
AZERTY @_
AZERTY !*
AZERTY .com
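
Given the offset of the first set and the order above, the start of every set can be found by walking the zero terminators. The Python sketch below uses a hypothetical function name and assumes the sets are stored back to back with no padding between them, which this article does not state explicitly.

```python
ZERO = b"\x00\x00\x00\x00"

def set_offsets(blob: bytes, first: int, count: int) -> list:
    """Return the start offsets of `count` consecutive zero-terminated sets."""
    offsets, pos = [], first
    for _ in range(count):
        offsets.append(pos)
        while True:
            dword = blob[pos:pos + 4]
            if len(dword) < 4:
                raise ValueError("ran off the end before finding a terminator")
            pos += 4
            if dword == ZERO:
                break          # the next set starts right after the terminator
    return offsets
```

With the whole LunaSysMgr file read into `blob`, `set_offsets(blob, 0x00393608, 65)` would list the 65 set offsets for the 3.0.2 device in the order given above, under the stated assumption.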