Mapping HP48 Text to Unicode

The Problem

The HP48 calculators use a text encoding based on the Latin-1 character set (a.k.a. ISO 8859-1), with the exception of 34 characters: 0x1F and 0x7F through 0x9F. Instead of leaving these as the normal Latin-1 control codes, HP re-purposed these mostly unused codes for 34 characters better suited to display on a high-end calculator. Problems appear when the re-purposed characters are present in HP48 text or file names that are used on a different computing platform (e.g., when transferring a file from an HP48 to a PC). Software that doesn’t attempt to handle these special characters can produce garbage data, bugs, and crashes.

Table of the HP48 size 3 font. Blue HP48 characters differ from the Latin-1 character set.

Table of the HP48 size 2 font. Blue HP48 characters differ from the Latin-1 character set.

Table of the HP48 size 1 font. Blue HP48 characters differ from the Latin-1 character set.

Since the HP48 was originally created, Unicode has become the ubiquitous standard. Unicode has room for over 1 million characters. This means that it is now possible to convert HP48 text to characters that much of the world now uses.

However, with so many similar-looking characters to choose from, the issue sometimes becomes which one to use. For example, the number 0 and the letter O look somewhat similar, depending on what font is being used.

Solution

To convert an HP48 character to a Unicode character, use the following mapping table:

HP48			Unicode
Decimal	Hex	I/O Char*	Name	Char	Hex	UTF-8
31	1F		Ellipsis	…	2026	E2 80 A6
127	7F		Medium Shade	▒	2592	E2 96 92
128	80	\<)	Angle	∠	2220	E2 88 A0
129	81	\x-	Latin Small Letter a with Macron	ā	0101	C4 81
130	82	\.V	Nabla	∇	2207	E2 88 87
131	83	\v/	Square Root	√	221A	E2 88 9A
132	84	\.S	Integral	∫	222B	E2 88 AB
133	85	\GS	Greek Capital Letter Sigma	Σ	03A3	CE A3
134	86	\|>	Black Right-Pointing Triangle	▶	25B6	E2 96 B6
135	87	\pi	Greek Small Letter Pi	π	03C0	CF 80
136	88	\.d	Partial Differential	∂	2202	E2 88 82
137	89	\<=	Less-Than or Equal To	≤	2264	E2 89 A4
138	8A	\>=	Greater-Than or Equal To	≥	2265	E2 89 A5
139	8B	\=/	Not Equal To	≠	2260	E2 89 A0
140	8C	\Ga	Greek Small Letter Alpha	α	03B1	CE B1
141	8D	\->	Rightwards Arrow	→	2192	E2 86 92
142	8E	\<-	Leftwards Arrow	←	2190	E2 86 90
143	8F	\|v	Downwards Arrow	↓	2193	E2 86 93
144	90	\|^	Upwards Arrow	↑	2191	E2 86 91
145	91	\Gg	Greek Small Letter Gamma	γ	03B3	CE B3
146	92	\Gd	Greek Small Letter Delta	δ	03B4	CE B4
147	93	\Ge	Greek Small Letter Epsilon	ε	03B5	CE B5
148	94	\Gn	Greek Small Letter Eta	η	03B7	CE B7
149	95	\Gh	Greek Small Letter Theta	θ	03B8	CE B8
150	96	\Gl	Greek Small Letter Lamda	λ	03BB	CE BB
151	97	\Gr	Greek Small Letter Rho	ρ	03C1	CF 81
152	98	\Gs	Greek Small Letter Sigma	σ	03C3	CF 83
153	99	\Gt	Greek Small Letter Tau	τ	03C4	CF 84
154	9A	\Gw	Greek Small Letter Omega	ω	03C9	CF 89
155	9B	\GD	Greek Capital Letter Delta	Δ	0394	CE 94
156	9C	\PI	Greek Capital Letter Pi	Π	03A0	CE A0
157	9D	\GW	Greek Capital Letter Omega	Ω	03A9	CE A9
158	9E	\[]	Black Square	■	25A0	E2 96 A0
159	9F	\oo	Infinity	∞	221E	E2 88 9E

* Not all I/O characters are listed here.

All remaining HP48 characters can be directly mapped to Unicode. For example, an HP48 ‘A’ is 0x41 and in Unicode is 0041. This applies to the ranges 0x00 to 0x1E, 0x20 to 0x7E, and 0xA0 to 0xFF.
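The table and the direct-mapping rule can be sketched in Python roughly as follows. The code points come straight from the table; the names `HP48_TO_UNICODE`, `hp48_to_unicode`, and `unicode_to_hp48` are hypothetical, chosen only for this illustration, and raising an error on unmappable characters is one possible policy rather than a requirement.

```python
# Code points copied from the mapping table; every other HP48 byte maps to itself.
HP48_TO_UNICODE = {
    0x1F: 0x2026, 0x7F: 0x2592, 0x80: 0x2220, 0x81: 0x0101,
    0x82: 0x2207, 0x83: 0x221A, 0x84: 0x222B, 0x85: 0x03A3,
    0x86: 0x25B6, 0x87: 0x03C0, 0x88: 0x2202, 0x89: 0x2264,
    0x8A: 0x2265, 0x8B: 0x2260, 0x8C: 0x03B1, 0x8D: 0x2192,
    0x8E: 0x2190, 0x8F: 0x2193, 0x90: 0x2191, 0x91: 0x03B3,
    0x92: 0x03B4, 0x93: 0x03B5, 0x94: 0x03B7, 0x95: 0x03B8,
    0x96: 0x03BB, 0x97: 0x03C1, 0x98: 0x03C3, 0x99: 0x03C4,
    0x9A: 0x03C9, 0x9B: 0x0394, 0x9C: 0x03A0, 0x9D: 0x03A9,
    0x9E: 0x25A0, 0x9F: 0x221E,
}

# The mapping is one-to-one, so it can simply be inverted.
UNICODE_TO_HP48 = {u: h for h, u in HP48_TO_UNICODE.items()}


def hp48_to_unicode(data: bytes) -> str:
    """Convert raw HP48 bytes to a Unicode string."""
    return "".join(chr(HP48_TO_UNICODE.get(b, b)) for b in data)


def unicode_to_hp48(text: str) -> bytes:
    """Convert a Unicode string back to HP48 bytes.

    Raises ValueError for characters with no HP48 equivalent
    (an assumed policy; substituting a placeholder would also work).
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp in UNICODE_TO_HP48:
            out.append(UNICODE_TO_HP48[cp])
        elif cp <= 0xFF and cp not in HP48_TO_UNICODE:
            out.append(cp)  # directly mapped ranges
        else:
            raise ValueError(f"U+{cp:04X} has no HP48 equivalent")
    return bytes(out)


if __name__ == "__main__":
    print(hp48_to_unicode(b"\x87r\x8d\x9f"))
```

Because every replacement code point lies above 0xFF, the inverted dictionary never collides with the directly mapped ranges, which is what keeps the round trip lossless.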

If you are using UTF-8, each Unicode character must be encoded as a 1-, 2-, or 3-byte sequence (every character in this mapping is below U+10000, so none needs the 4-byte form). Details are available at http://en.wikipedia.org/wiki/Utf-8.
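As a rough sketch of that encoding step, the function below hand-encodes a code point below U+10000 into UTF-8 bytes. The name `utf8_encode` is hypothetical; in practice Python's built-in `str.encode("utf-8")` does the same job, and it is used here only as a cross-check.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point below U+10000 as UTF-8 (1 to 3 bytes)."""
    if cp < 0 or 0xD800 <= cp <= 0xDFFF or cp >= 0x10000:
        # Surrogates are not encodable; 4-byte forms are out of scope here.
        raise ValueError(f"unsupported code point: {cp:#x}")
    if cp < 0x80:
        return bytes([cp])                       # 0xxxxxxx
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6),          # 110xxxxx
                      0x80 | (cp & 0x3F)])       # 10xxxxxx
    return bytes([0xE0 | (cp >> 12),             # 1110xxxx
                  0x80 | ((cp >> 6) & 0x3F),     # 10xxxxxx
                  0x80 | (cp & 0x3F)])           # 10xxxxxx


if __name__ == "__main__":
    for cp in (0x41, 0x03C0, 0x2220):
        encoded = utf8_encode(cp)
        # Cross-check against the built-in encoder.
        assert encoded == chr(cp).encode("utf-8")
        print(f"U+{cp:04X} -> {encoded.hex(' ').upper()}")
```

The three sample code points cover the 1-, 2-, and 3-byte cases; the last two correspond to the π and ∠ rows of the table.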

Rationale

Character 0x80 (angle)
1. Instead of using ∠ 2220 for character 0x80, others have incorrectly used ∟ 221F. This is the Right Angle character and is not intended for a generic angle. Also, it does not visually match the HP48.
2. While ∡ 2221 is visually an even better match, this character often does not render properly on various computer platforms and software. In short, some users will just see empty boxes.
Character 0x81 (x-bar)
1. In theory, Unicode allows two characters to be visually combined if the 2nd character is a “combining character”. This would allow for the display of x̄ by using x followed by the “combining macron” character, which would be 0078 followed by 0304. However, there are two problems with this.
  1. The combination of these two characters often renders poorly or not at all and will leave the user confused. Depending on the font, it may fail to render outright or simply be difficult to read at the default settings.
    For additional examples of how x-bar is inconsistently rendered based on font, see http://www.kreativekorp.com/charset/encoding.php?file=hp-48.kte&char=81.
  2. Using two characters to represent one HP48 character breaks the pattern of having a simple one-to-one mapping. Some HP48 developers will likely introduce bugs in their code when converting back from Unicode to HP48 characters.
2. Instead, ā 0101 is used. It is a single Unicode character, so it is easy for HP48 developers to deal with, leading to fewer bugs. Also, x-bar is used in statistics as the notation for average, and ā looks like an ‘a’ for average.
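The one-to-one argument can be checked with a short Python sketch. It relies only on the standard `unicodedata` module; the variable names are illustrative.

```python
import unicodedata

combining = "x\u0304"  # 'x' followed by COMBINING MACRON: two code points
single = "\u0101"      # LATIN SMALL LETTER A WITH MACRON: one code point

print(len(combining), len(single))

# Unicode has no precomposed x-with-macron, so NFC normalization
# leaves the two-code-point sequence unchanged.
print(unicodedata.normalize("NFC", combining) == combining)

# By contrast, 'a' followed by the combining macron composes to U+0101.
print(unicodedata.normalize("NFC", "a\u0304") == single)
```

Since normalization cannot collapse the x-bar sequence into a single code point, a converter that used it would have to recognize a two-character sequence when mapping back to 0x81.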
Character 0x82 (nabla)
1. The character ∇ 2207 was chosen over other triangles since this is the Nabla character, which is used in mathematics. Details can be read at http://en.wikipedia.org/wiki/Nabla_symbol.
Characters 0x8D through 0x90 (arrows)
1. In Unicode, there are a large number of characters that represent arrows. However, 2190 through 2193 were chosen because these are simple arrow characters that don’t carry any additional implied meaning. Also, this set of arrow characters supports all four directions, whereas some of the other sets do not. Lastly, some of the alternative arrow characters are not rendered consistently on some computing platforms.
Characters 0x85, 0x8C, 0x9B, 0x9C, 0x9D (various Greek symbols)
1. These are Greek symbols that could have alternatively been represented by various mathematical or electrical Unicode characters. However there are several reasons for preferring the Greek symbols:
  1. We can gain insight into the original HP48 developers’ intentions by looking at how they translated these characters when using ASCII transfer mode 2 or 3 over a serial link. These characters were translated into \GS, \Ga, \GD, \PI, and \GW respectively. If we assume that “G” stands for Greek, then these translations mean Greek Capital Sigma, Greek lowercase alpha, Greek Capital Delta, Capital Pi, and Greek Capital Omega (a lowercase omega looks like a ‘w’). This pattern holds for all the other translated Greek letters as well, except for \pi, which is plainly lowercase pi.
  2. Using all Greek symbols results in a visually clean look. In contrast, when symbols from math, electronics, and Greek symbols are mixed together, they often look sloppy because they don’t line up, have different line weights, and different drawing styles.
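For text that arrives already translated into these backslash sequences, a decoder might look roughly like the sketch below. It is only an illustration under stated assumptions: it covers just the I/O sequences listed in the mapping table (the table itself notes that not all are listed), it assumes each sequence is a backslash followed by exactly two characters, and it leaves anything else untouched, including any other escapes the calculator may emit. The names `IO_TO_UNICODE` and `io_to_unicode` are hypothetical.

```python
# Keys are the two characters that follow the backslash in the table's
# "I/O Char" column; values are the Unicode characters from the same rows.
IO_TO_UNICODE = {
    "<)": "\u2220", "x-": "\u0101", ".V": "\u2207", "v/": "\u221A",
    ".S": "\u222B", "GS": "\u03A3", "|>": "\u25B6", "pi": "\u03C0",
    ".d": "\u2202", "<=": "\u2264", ">=": "\u2265", "=/": "\u2260",
    "Ga": "\u03B1", "->": "\u2192", "<-": "\u2190", "|v": "\u2193",
    "|^": "\u2191", "Gg": "\u03B3", "Gd": "\u03B4", "Ge": "\u03B5",
    "Gn": "\u03B7", "Gh": "\u03B8", "Gl": "\u03BB", "Gr": "\u03C1",
    "Gs": "\u03C3", "Gt": "\u03C4", "Gw": "\u03C9", "GD": "\u0394",
    "PI": "\u03A0", "GW": "\u03A9", "[]": "\u25A0", "oo": "\u221E",
}


def io_to_unicode(text: str) -> str:
    """Replace the table's backslash sequences with Unicode characters.

    Sequences that are not in the table are copied through unchanged
    (an assumption of this sketch, not documented HP48 behavior).
    """
    out = []
    i = 0
    while i < len(text):
        key = text[i + 1:i + 3]
        if text[i] == "\\" and key in IO_TO_UNICODE:
            out.append(IO_TO_UNICODE[key])
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(io_to_unicode(r"\GS x \-> \oo, \Ga \<= \pi"))
```

Note how the "G" prefix keeps the Greek letters together in the dictionary, which mirrors the naming pattern described above.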
Character 0x9E (box)
1. Instead of using ■ 25A0 (Black Square), others have incorrectly used ▬ 25AC, which is the Black Rectangle. It does not visually match the HP48.

Other HP48 to Unicode Mappings

http://www.kreativekorp.com/charset/encoding.php?file=hp-48.kte – Differs from above on HP48 characters 0x81 and 0x85. Characters 0x1F and 0x7F are not dealt with.
http://www.kostis.net/charsets/hp48.htm – Differs from above on HP48 characters 0x80, 0x85, and 0x9E. Characters 0x1F, 0x7F, and 0x81 are not dealt with.

Note: efforts are being made (or will be made) to rectify the differences.

Other Resources

Unicode Standard: http://unicode.org/
Unicode Character Name Index: http://www.unicode.org/charts/charindex.html
HP48 ASCII Transfer mode translations: http://holyjoe.net/hp/tiotable.htm
Newsgroup Post: https://groups.google.com/d/topic/comp.sys.hp48/hek271hUD-E/discussion
Matching Tables:
- http://www.ascii.ca/hp48.htm