Theory of the Month: Rainer Hannig

Who is the candidate for the “Voynich Theory of the Month” in June 2020?

It is Prof. Dr. Rainer Hannig, who has even managed to briefly be featured in the VM’s Wikipedia entry before being excised again. Hannig has followed the usual spiel of VTotM to the letter:

  1. He is an “outsider”, namely an Egyptologist with no direct links to cryptography, or medieval manuscripts, or…
  2. His solution doesn’t build on previous work, but is the result of a maverick approach.
  3. He assumes an initially simple substitution cipher where one letter represents one sound, and one ciphertext word is equivalent to one plaintext word. He is original insofar as he assumes the underlying plaintext language to be Hebrew, which would be exotic enough not to have been considered by other researchers, and at the same time not completely implausible.*)
  4. After having some initial success in manufacturing Hebrew words out of this, the enciphering rules become increasingly complex the longer he progresses. Things turn into a labyrinthine set of rules with multi-value letters, the omission and reintroduction of vowels, etc.
  5. At the same time, little thought is given to problems of the transcription, which may well hold surprises for the would-be decipherer — have we correctly identified different and identical characters? Is <ch> really <cc> or a different character? Is <r> the same as <s>?
  6. The multi-faceted structure of VM words, the complex rules governing their composition, is ignored. Which is odd considering such a large number of alternative ways the author had in enciphering his text — if he had so many different options to compose his ciphertext, why do all the words so strictly adhere to only a narrow selection of rules?
  7. While it is possible to create a string of words in this way, the creation of meaningful sentences remains elusive, even when one discards most of Hebrew grammar from the game (as Hannig apparently does). A coherent narrative spanning paragraphs is nowhere in sight. And this is where I consider the case closed and lose interest.

As an example, let me give you Hannig’s translation from f17r:

I am a bull ready which facilitates and renews house and ruins.
You are a piece of lamb which opens the mouth and is discouraged
when eye-in-eye.

Or f2v, the nymphaea page:**)

Surely, Nymphaea is the twin. Enough juice in the tip.
Drink carefully, this is like something which provides spirit.
Will come juice with repetition. Juice facilitates prophecies...
like rebellion in presence of philosophers.
All which is in Greek about is silence without talking. [sic]
When not speaking about juice, spoke: Do dig... spoken in Arabic.

We’ve had a number of those theories, and I do not only present this piece of scientifically and methodologically somewhat unsound work out of malice (though I wonder about the quality of Hannig’s other work, if his VM paper is representative of it) or to ridicule it. But it is exemplary for a mistake made so often in VM approaches that it cannot be pointed out often enough.

No, really.

*) Interestingly enough it seems that Hannig never bothered with the question whether the text is supposed to be read left-to-right as in Western languages, or right-to-left as in Hebrew, but opted for left-to-right from the start. Which is strange, considering his background in hieroglyphics.

**) Notice the highly repetitive text with a very limited vocabulary.


How Bad is Bad Enough?

Recently, I made my first tests to revive the Strokes theory of how the VM was enciphered and arrived at a quota of around 80% of the VM text (by volume, Currier A) which could be composed of Robert Firth’s 24 building blocks (or “syllables”). Now, is that a good or a bad result?

  • It can be considered bad insofar as it’s “only” 80%. There are a number of degrees of freedom involved in the experiment, namely as regards transcription and block composition. Assuming that the VM text wasn’t written as a completely random string of symbols but governed by some kind of “grammar” which dictates possible word compositions, it’s not surprising that it’s possible to reconstruct a good chunk of this tome from some set of building blocks, especially if these building blocks are freely chosen. (And some of them consist only of a single letter, hey!) So one could argue that, if Firth’s blocks are any good, they should be able to cover more than 80% of the ciphertext.
  • On the other hand, one can consider the 80% surprisingly good. For example, the current set of 44 blocks allows for the representation of just two sets of characters forming the Latin alphabet, one of uppercase and one of lowercase letters.*) This means that any special characters — Arabic digits, Greek letters — aren’t covered by this repertoire and drop through. Likewise Firth’s blocks didn’t include some gallows letters from the start, and thus are unable to compose words containing them.
    Let us add to this the high probabilities that

    • Firth’s block set isn’t completely correct, and
    • There are errors in the transcription system. With this I don’t mean a mistake in the transcription process, but an error in the transcription system, i.e. two different ciphertext letters are consistently considered the same (or vice versa), or two separate letters are transcribed as one letter (or the other way around). The ciphertext character set being unknown has notoriously been one of the obstacles of tackling the VM.**)

Any of these mistakes would naturally result in lower composition rates, and in the light of this, one could consider the 80% surprisingly good.
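
To make the “composition rate” a bit more tangible, here is a minimal Python sketch. It rests on several assumptions of mine: that “volume” means the number of transcription glyphs, that a word counts as covered only if it can be split completely into blocks, and that each transcription character is one glyph. The block set and the word list are hypothetical toy data, not Firth’s actual blocks or a real transcription.

```python
from functools import lru_cache


def can_compose(word, blocks):
    """Return True if `word` is a concatenation of blocks (blocks may repeat)."""
    usable = tuple(b for b in blocks if b)  # ignore empty blocks

    @lru_cache(maxsize=None)
    def ok(i):
        # Can the remainder of the word, starting at index i, be composed?
        if i == len(word):
            return True
        return any(word.startswith(b, i) and ok(i + len(b)) for b in usable)

    return ok(0)


def composition_rate(words, blocks):
    """Share of the text volume (counted in glyphs) made up of composable words."""
    total = sum(len(w) for w in words)
    covered = sum(len(w) for w in words if can_compose(w, blocks))
    return covered / total if total else 0.0


# Hypothetical toy data, for illustration only.
toy_blocks = {"qo", "k", "ee", "dy", "ch", "ol", "d", "aiin"}
toy_words = ["qokeedy", "chol", "daiin", "shedy"]

rate = composition_rate(toy_words, toy_blocks)
print(f"{rate:.1%} of the toy text can be composed from the toy blocks")
```

In the toy data, “shedy” drops through because no block contains “s”, which is the same effect described above for characters missing from the block repertoire.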

How to proceed from here?

As opposed to everyone else who seems bored under the Corona lockdown, I myself am actually quite busy. Nevertheless, there are two avenues of attack I’d like to write little pieces of software for:

  • One would be an interactive “Fiddler”. Basically, a piece of interactive software which lets you reassign plaintext letters to blocks and to change the blocks on the fly to see what effect this would have on a “decipherment.” With a little luck, and patient fiddling, one or the other readable word might come out of this, hinting at the “true” composition and assignment of the block…
  • Of course, there’s also the opportunity for a brute force attack. Empty diskspace is unused assets. The idea is to introduce random variations to the block set and see how these variations influence both the “composition rate” (i.e. the volume of text that can be synthesized with the blocks) and the number of blocks required for the composition.***) Letting the software run for a few hours and leaving the “better” results to survive (and discarding those which make the result “worse”) might start an evolution towards a better optimized set of blocks.

Now all I need to do is find the time to hack the code.

*) Depending on whether some of the fancier Renaissance additions like “j” and “w” or separate letters for “u” and “v” should be considered.

**) The other being the underlying unknown plaintext language.

***) This might require a bit of explanation. I tried similar “evolution runs” in the past, but one problem was that they tend to “erode” the blocks used further and further until you’re left with a set of single-letter blocks. And this is only logical, because if you have a transcription which uses, say, 44 different transcription letters, it will be possible to cover 100% of the transcription with 44 single-letter blocks, if each “block” contains exactly one letter of the set of transcription symbols. Thus there have to be two criteria whether one set of blocks is “better” or “worse” than the other in explaining the ciphertext:

  • The “better” set must cover more volume of the ciphertext, and
  • The “better” set must not use more blocks than the previous set. (Or, in other terms, the blocks used must be larger or of equal length.)
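
The two criteria can be sketched as an acceptance test inside a simple mutate-and-keep loop. This is only a rough Python illustration under my own assumptions, not the code I still have to hack: the mutation operators (grow, shrink or alter one block by a single glyph), the helper names and the toy data are all hypothetical, and “volume” is again counted in glyphs.

```python
import random
from functools import lru_cache


def can_compose(word, blocks):
    """Return True if `word` is a concatenation of blocks (blocks may repeat)."""
    usable = tuple(b for b in blocks if b)

    @lru_cache(maxsize=None)
    def ok(i):
        if i == len(word):
            return True
        return any(word.startswith(b, i) and ok(i + len(b)) for b in usable)

    return ok(0)


def composition_rate(words, blocks):
    """Share of the text volume (counted in glyphs) made up of composable words."""
    total = sum(len(w) for w in words)
    covered = sum(len(w) for w in words if can_compose(w, blocks))
    return covered / total if total else 0.0


def is_better(candidate, current, words):
    """Both criteria: strictly more coverage AND not more blocks than before."""
    return (composition_rate(words, candidate) > composition_rate(words, current)
            and len(candidate) <= len(current))


def mutate(blocks, alphabet, rng):
    """Hypothetical mutation: change one randomly chosen block by one glyph."""
    pool = sorted(blocks)  # sorted, so that runs are reproducible for a given seed
    i = rng.randrange(len(pool))
    block = pool[i]
    op = rng.choice(["grow", "shrink", "alter"])
    if op == "grow":
        block += rng.choice(alphabet)
    elif op == "shrink" and len(block) > 1:
        block = block[:-1]
    else:  # alter one glyph in place (also the fallback for one-glyph blocks)
        j = rng.randrange(len(block))
        block = block[:j] + rng.choice(alphabet) + block[j + 1:]
    pool[i] = block
    return frozenset(pool)  # duplicates collapse, so the set can only shrink


def evolve(words, blocks, generations=200, seed=0):
    """Keep a mutated block set only if it passes the acceptance test."""
    rng = random.Random(seed)
    alphabet = sorted({g for w in words for g in w})
    current = frozenset(blocks)
    for _ in range(generations):
        candidate = mutate(current, alphabet, rng)
        if is_better(candidate, current, words):
            current = candidate
    return current


# Hypothetical toy data, for illustration only.
toy_words = ["qokeedy", "chol", "daiin", "shedy", "qokaiin", "chedy"]
start_blocks = {"qo", "k", "ee", "dy", "ch", "ol"}
best = evolve(toy_words, start_blocks)
print(sorted(best), composition_rate(toy_words, best))
```

Because the number of blocks is never allowed to grow, the run cannot “erode” into one single-letter block per transcription symbol unless the starting set was already that large.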

“C’est ne pas un mot”

Edit: I’ve just added a whole page dedicated to the “Face Value-Fallacy”, because I feel it’s important more people are aware of it.

One of the pitfalls of VM research is the presumption to take its text at face value — these letters that make up the text look so very much like Latin letters (except… not quite ;-)), that it’s tempting to presume that each ciphertext letter indeed does represent one plaintext letter. And from that starting point the next logical step is to presume that each ciphertext word corresponds to one plaintext word.

But upon closer inspection, this presumption is not borne out by observation, except by the fact that the letters are grouped into small sequences, separated by visual spaces. A lot of features speak against this assumption of “words”,*) namely —

  • The words of the VM show a high internal structure: Many letters appear only word-initial, some only word-terminal, and many show a high dependency on their neighborhood. While these features are not unheard of in natural languages — compare “q”, which is always followed by “u” in most western languages, or the German “ß”-s which has a strong tendency to appear word-terminal — no language exhibits so many of these features and such a strongly regulated word-internal grammar.
  • The letters aren’t evenly distributed on the page. It’s common knowledge that the gallows characters are concentrated on the page tops and paragraph starts. While this could be explained by them being ornamental versions of regular characters, Julian Bunn’s analysis from 2016 shows a bunch of certain characters “crowd” in line-initial or line-terminal positions, which is a pretty odd feature, if one character really represents one plaintext letter.
  • Unless we are very wrong about the character set used for the VM, one VM word simply doesn’t have enough information content to encipher a plaintext word.**)
  • “Sentences” often differ by only slight changes from word to word or show word repetitions, so that it almost looks like words are not independent but “morphing” one into the other, and the true information content doesn’t lie in the words themselves, but in the changes introduced between them.***) This is also difficult to reconcile with the idea that each VM word corresponds to a plaintext word.

No. There is much too much going on in the encipherment of the VM. A ciphertext word is not a plaintext word, and a ciphertext letter does not correspond to a plaintext letter, I’m willing to bet on both.

It’s still my conviction that the fiendishness of the VM encipherment doesn’t lie in its complexity, but in its seeming simplicity: Taken at face value, it looks like something dead simple to solve, and so even a moderately complicated scheme escapes the eye of the beholder. We’re missing the forest for the trees which look like shrubbery.
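
The first point in the list above, glyphs that appear only word-initial or only word-terminal, is easy to check against any transcription. Here is a minimal Python sketch, assuming that every transcription character is one glyph (which, as said, is itself uncertain) and using a hypothetical handful of words instead of a real transcription file.

```python
from collections import Counter, defaultdict


def positional_profile(words):
    """Count, per glyph, how often it occurs word-initial, medial and final."""
    profile = defaultdict(Counter)
    for word in words:
        last = len(word) - 1
        for i, glyph in enumerate(word):
            if i == 0:
                position = "initial"  # one-glyph words are counted here only
            elif i == last:
                position = "final"
            else:
                position = "medial"
            profile[glyph][position] += 1
    return profile


def positional_share(profile, glyph, position):
    """Fraction of a glyph's occurrences that fall into the given position."""
    total = sum(profile[glyph].values())
    return profile[glyph][position] / total if total else 0.0


# Hypothetical toy words, for illustration only.
toy_words = ["qokeedy", "qokain", "chedy", "daiin"]
profile = positional_profile(toy_words)

for glyph in sorted(profile):
    shares = {p: round(positional_share(profile, glyph, p), 2)
              for p in ("initial", "medial", "final")}
    print(glyph, shares)
```

A glyph whose share for one position is close to 1.0 over a large text is a candidate for the kind of strict word-internal “grammar” described above.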

*) Subsequently I’ll use the term “word” for “a short sequence of glyphs in the VM, separated from the rest by visual breaks.”

**) It could be that the VM character set is much more complicated than presumed and contains many more fine details which discriminate between different characters, but I doubt this for reasons of practicability: The VM characters are already quite small, and it would have been impossible for the author to write down his letters so exactly on rough vellum that small nuances would have been legible for a reader. (Not to make too fine a point on this.)

***) Wouldn’t it be fascinating if the word sequence “walter winter” would be used in such a manner to encipher the word “in”?

Fresh attempt at the strokes (1): Robert’s observation

When all you’ve got is a hammer, then everything looks like a nail. For the VM, the same holds true for number crunching, which seems to be about the only tool we have to get any information out of the VM — no matter how misleading it may be.

Now I’ve decided to go back to my notorious “Strokes” theory, which, as I found out to my shock, dates back to early 2005 (without having made much progress, I have to admit.) Read all about it!

Breathing space

Many Voynicheros assume that the enciphering system of the VM treats the space between words as a particular character, like any other of the alphabet.

This is an attitude we have grown accustomed to, since we’ve grown up with computers, where the space has an ASCII code like the rest of “A” to “Z”, and before that the typewriter, where the space bar was a key similar to the others.

But it’s a fairly modern attitude. Until fairly recently, a space was just that — an empty gap between words, but not a character or symbol in its own right. (Even the venerable Enigma cipher machine of WWII fame didn’t feature that character in its symbol set.) Rather, it was considered a part of visual design, like a line break (for which the Enigma didn’t have a symbol either). Word breaks were useful to discriminate between word boundaries, but they contained no information in themselves. Throughout much of the middle ages, textswerewrittenassimplyalongsequenceofletters, and it was up to the reader to find the word breaks. (Compare this to modern typography, where it’s for the better part up to the reader to learn about stressed syllables etc.)

Even though this practice had pretty much ended by the presumed genesis of the VM (early 15th century), and word breaks were regularly used to increase readability, I don’t think that any encipherer would already have thought of treating the spaces thus generated as particular characters which would be enciphered like regular letters. Hence, I also think it’s futile to search for such enciphering characteristics in the VM.


The one thing constant when examining the VM is the fact that nothing is as it looks at first glance. Whether this is Voynich’s personally devised conundrum, designed to enhance the appeal of the book, or simply the result of an otherwise innocent enciphering scheme developed by a 15th century scholar, Voynicheros learn quickly not to trust superficial appearances.

This thought struck me while perusing Julian Bunn’s new book about the Puzzles of the VM, and coming across his description of f68r2, which is generally assumed to depict the Moon, the open star cluster of the Pleiades, and maybe Aldebaran, the singularly brightest star in the Pleiades’ vicinity.

Click this link for a high-res scan of f68r2 courtesy Jason Davies

As usual, attempts to use the names “Pleiades” and “Aldebaran” as cribs to break the VM cipher led to nothing, and also the mysterious wavy line connecting the stars and the Moon has only been met with tortuous, hardly convincing explanations. But what if these are not the Pleiades?

In terms of astronomy, the Pleiades are an open cluster of around 500 stars in relative proximity to Earth. Depending on visibility conditions, usually either six or eight, but rarely seven of these stars can be discerned with the naked eye, since numbers 7 and 8 have almost the same apparent brightness. In many cultures, the Pleiades are nevertheless associated with the number “7” (probably due to the “magic” qualities of this number). The Pleiades supposedly are depicted in the shape of seven dots on the early bronze age Nebra sky disk and are also known in German as the “Siebengestirn”, the “seven stars.” OTOH, in Japanese astronomy six of its stars form the constellation of “Mutsuraboshi” (conveniently meaning “six stars”), and the paintings in the caves of Lascaux apparently feature a suspicious cluster of six, not seven stars.

So, under the impression that interpreting the f68r2 constellation as the Pleiades is leading us nowhere, what else could it depict?

I was reminded of the fact that classical astrology always spoke of seven “planets” — not planets in the modern astronomical sense, but objects changing their position relative to the fixed stars. Namely, these were the Sun and the Moon, plus those bodies visible to the naked eye we consider planets today: Mercury, Venus, Mars, Jupiter, and Saturn. What if the f68r2 objects really depicted the seven planets — Where would that lead us?

Being a reader of comic books, the wavy line between Moon and “Pleiades” reminded me of “speed lines”, used to indicate motion in comics. By analogy, let’s take a wide leap and say, the “Pleiades” are moving away from the Moon in this picture — after the Moon has been hit by “Aldebaran”, maybe? Using this as a starting point, as usual there is no difficulty in coming up with wild speculations: Is this maybe a picture explaining an early astronomical theory about the creation of the solar system — A massive star struck the Sun, and the collision broke seven pieces of rock free which went on to become the planets, Moon and Earth?

Of course, that would be an unusually advanced theory for a manuscript presumably written in the 15th century — It would predate Copernicus’ ideas of a heliocentric universe by at least half a century.*) Also, it requires one to assume that the VM author got a little confused by his own genius when he drew the Moon in the centre of the picture rather than the Sun. (And it pretty clearly is the Moon.**)) On the other hand, it has been assumed that the VM is a collection of advanced scientific theories, and if the balneological section is interpreted as a treatise about anatomy, similar advanced views on astronomy could not be ruled out either.

It is certainly tempting to fantasize that the author had developed a theory for the creation of the planets which could still have held up its head in the 19th century. But, the longer I sort my thoughts about it, the less I believe in it. While I’m also unconvinced that the f68r2 objects are the Pleiades, the “heliocentric hypothesis” simply requires too many stretches of the imagination.

So, once more, not only the first appearance, but also the second one of a feature in the VM seems to be misleading.

*) While the idea of a heliocentric universe had been known since antiquity, it had never found much acceptance before the middle of the 16th century.

**) A “weaker” version of the heliocentric hypothesis would be the assumption that the body in the centre of f68r2 is not the Moon, but the Earth. In that case, one could assume a geocentric universe being depicted, with the planets being the result of the collision of the Earth with a massive object which later vanished into deep space again. This being a very early depiction of Earth in a modern astronomical context, it’s unclear how the VM author would have drawn it.

New Book to Read over the Holidays

Just in time for the holiday season, another Venerable Voynich Veteran™, namely Julian Bunn, has treated us with a book about the Voynich manuscript:

Puzzles of the Voynich Manuscript: An Illustrated Guide to the Perplexing Puzzles of MS Beinecke 408 is a small, but invaluable tome of 69 pages, apparently self-published through Amazon’s services. At some 10€, Puzzles is reasonably priced, and the Kindle e-book version is even free!

In Julian’s own words, the book is not supposed to offer an in-depth analysis to the progressed Voynichero, but is meant to be an introduction to newcomers to the field of the VM who wish to get a first overview over “what the fuss is all about,” and the book does this job admirably well.

Julian gives a concise summary of the manuscript’s various enigmatic features, touching on proposed answers and solutions, but never really advocating a viewpoint. After a first quick read in the subway (which Voynichero would wait any longer than absolutely necessary to read such a new book?), I haven’t noticed any relevant omissions or errors.*) Of course, on some points one might wish for a more exhaustive treatment, but obviously the question of which level of detail an introduction should go to is a matter of personal preference.

There are a few points of criticism, but these are minor. At US letter format (almost equivalent to A4), Puzzles is a bit big and unwieldy. (OTOH, the illustrations, which are all very well reproduced, obviously benefit from a larger format, so I assume this was a deliberate decision.) One would have wished for better typography (the lettering is at times jarringly bad), and generally a more careful eye for layout. For example, several times one page is filled with only a line of text or two, because the subsequent page is occupied completely by an illustration. A table of contents and numbered chapters would have made finding a particular spot again easier (though of course at only 69 pages, one can quickly browse through the tome.) Finally, being a Wikipedia editor has spoiled me, and I would have wished for a more comprehensive list of references which would make it easier to track statements and observations to their sources. (There isn’t even an imprint in the book…)

But overall the book is a must-have for people interested in the Voynich, and if friends of yours ask you what the whole hullabaloo is about, you can safely point them to Julian’s work. They won’t go amiss.

*) with the exception of calling my blog a source of “profound insights,” which is a nice compliment, but like all compliments a slight exaggeration, IMHO

“If it was solvable, it would have been solved.”

Fellow-Voynichero Rich SantaColoma has recently summarized his theory, that the Voynich manuscript is a forgery perpetrated by probably no-one else but Voynich himself, in his blog. In this context, the question whether the content of the VM is “genuine” keeps coming up, ie, is the VM “solvable,” does it have content, or is it just gibberish?

I don’t personally see a clear link between the questions of authenticity and content. All combinations are plausible: A 15th century manuscript filled with enciphered text or gibberish, or a contemporary forgery with nonsense or genuine content. I’m currently leaning to the latter option — that it’s a Vorgery*), but that Voynich included stronger hints in the text that the VM was written by Bacon than given by the illustrations alone. Unfortunately he was a bit too clever for his own good, and nobody looked through what he had considered a simple cipher — and after all he himself could hardly drop too many hints how to solve that encrypted text…**)

But sometimes the argument comes up, the VM must be gibberish, because if it was enciphered, we would have long since deciphered it. After all, for a century the best codebreakers of the world have been at work over it, and the enciphering — if there was one — would have been done by an either medieval or early 20th century dilettante, by current standards.

Enter the Dorabella cipher:


The story of this cipher is, in short, that composer Edward Elgar sent this note to a lady friend of his, Dora Penny in 1897, along with another letter of different provenance. Dora never managed to read the message. And to this day, no convincing deciphering of this innocuous missive has been achieved.***)

So, let’s look at the note:

  • It certainly is meaningful. It is difficult to conceive why Elgar would have sent a note with gibberish to Dora, with which he wanted to remain on good terms.
  • While Elgar was fond of wordplay and riddles and certainly had a bit of a cryptographic savvy, Dora was naive in that regard, so it’s reasonable to assume that Elgar’s method wasn’t too sophisticated — especially in view of the fact that Elgar apparently did not provide Dora with a key, but trusted she’d find the solution on her own.
  • The note bears a striking resemblance to the well-known pigpen cipher (also known as masonic cipher), being set up in three groups of symbols (single, double and triple arc) in one of eight positions each. Elgar is known to have used pigpen ciphers, and, being a simple substitution cipher, pigpen is usually fairly easy to crack.

And yet, nobody has managed to come up with a solution to what Elgar must have intended as a trivial puzzle.
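
To illustrate how small the symbol inventory described above is, here is a minimal Python sketch of “three groups times eight positions”. Everything beyond that count is an assumption of mine: the numbering of the orientations, the 24-letter alphabet with I/J and U/V merged, and above all the order in which symbols are assigned to letters. It is emphatically not a proposed solution, only the general shape of a simple substitution key.

```python
from itertools import product
import string

ARCS = (1, 2, 3)          # single, double and triple arc
ORIENTATIONS = range(8)   # eight positions; this numbering is hypothetical

# Every (arcs, orientation) pair is one cipher symbol: 3 * 8 = 24 symbols.
SYMBOLS = list(product(ARCS, ORIENTATIONS))

# ASSUMPTION: a 24-letter alphabet in which J and V are merged into I and U.
ALPHABET = [c for c in string.ascii_uppercase if c not in "JV"]

# ASSUMPTION: symbols are assigned to letters in plain alphabetical order.
HYPOTHETICAL_KEY = dict(zip(SYMBOLS, ALPHABET))


def decode(symbols, key=HYPOTHETICAL_KEY):
    """Apply a simple substitution key to a sequence of (arcs, orientation) pairs."""
    return "".join(key[s] for s in symbols)


# Made-up symbol sequence, not taken from the actual note.
print(len(SYMBOLS), decode([(1, 0), (1, 1), (2, 0)]))
```

With only 24 symbols and a one-to-one key, this is exactly the kind of scheme one would expect to fall quickly, which makes the lack of a convincing reading all the more remarkable.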

So, when we compare this with the VM, where we are completely in the dark about the method, the character set used, and the plaintext language and can assume that the author made the breaking of his cipher as hard as possible (provided the VM is authentic), then it may not be so unreasonable to assume there is a solution, but it has eluded us.

Also, while the claim repeatedly comes up that “the best heads in cryptography” were confronted with it, and this is certainly true, it was always only a pastime to them, and none of them were able to devote their full time and resources to the VM.

Thus I’m fairly confident that there is a truth to the VM, and the truth is still out there.

(If you’re interested in Dorabella, Nick Pelling has — as always — already posted an in-depth treatment of the cipher on his blog.)

*) That was a genuine typo, but I’ll leave it in for kicks. ;-)

**) Besides, once he had sold the VM, there was nothing left to gain for him by raising the value of the manuscript by dropping Bacon hints.

***) There are a few suggested translations, but constructs like “Luigi Ccibunud luv’ngly tuned liuto studo two” bear all the hallmarks of VM translation failures.