We are all hatching our beliefs and theories about the origin and the encryption of the VM, with varying degrees of zest and conviction. Ultimately, none of those theories has as yet managed to convince anyone but the inventors of the idea themselves.*) But in my personal little Voynich world, the various theories have different “half-lives” with which their plausibility decays. On this page, I’ve set up a list of symptoms which indicate to me that a proposed VM decipherment idea is flawed.
You can check your own hypothesis against this and see how you fare: If you score three out of ten or more, my take is the idea is doomed. (In all fairness, the Stroke theory currently rates 2/10.)
Now, here is my Top Ten of Bad Signs. As you will notice, many of them follow the “mortar and pestle” pattern, where the algorithm fragments the VM text into a structureless pulp and allows one to more or less arbitrarily reconstruct a surmised “plaintext”. Of course, such a procedure will always render some plaintext, regardless of a) the original plaintext contents and b) the enciphering method originally used.
- Anagramming. Anagramming is the devil’s work (at least if we’re talking about lossy anagramming, where there is no rule for reconstructing the original plaintext, as opposed to regular reordering, as in reversing a letter sequence). The number of possible words derived through anagramming is immense, and so we shouldn’t be surprised if it’s easy to reconstruct existing vocabulary from any sequence of letters, all the more so if the frequency distribution has been matched to the frequencies in our presumed plaintext language.
But not only is this always possible (to a certain extent), the big problem is that the ambiguities introduced this way (since any one letter sequence can be anagrammed into a number of valid words) would not only hamper the code breaker, but the intended audience as well: How are they supposed to know which of the possible solutions to take? (The answer commonly given is that they would know from “context”, which boils down to saying that the text can only be deciphered if the plaintext is known…)
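To get a feeling for how quickly this ambiguity arises, here is a minimal Python sketch. The word list is a made-up toy vocabulary (an assumption for illustration, not anything derived from the VM), and the helper names are mine.

```python
from collections import defaultdict

# Hypothetical toy vocabulary, chosen only to illustrate the point.
WORDS = ["stale", "least", "steal", "tales", "slate", "teals",
         "rose", "sore", "eros"]


def anagram_classes(words):
    """Group words by their sorted letters; each group is one anagram class."""
    classes = defaultdict(list)
    for word in words:
        classes["".join(sorted(word))].append(word)
    return dict(classes)


def candidate_readings(letters, classes):
    """Return every vocabulary word that uses exactly the given letters."""
    return classes.get("".join(sorted(letters)), [])


classes = anagram_classes(WORDS)
# One scrambled letter sequence, six equally "valid" readings.
print(candidate_readings("etsla", classes))
```

Even with nine words in the vocabulary, a single five-letter sequence already has six readings; with a real dictionary the reader has nothing but “context” to choose by.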
- MIMO. “Multiple in, multiple out”, which means that the proposed algorithm maps several plaintext letters onto one ciphertext letter or vice versa or both; in other words, qo may sometimes mean “x” and sometimes “u”. To myself, I call this “arbitrary bottleneck ambiguities”.
If you follow the reasoning of the VM code breakers closely, you will see that MIMO schemes are invented because the fairly rigid and schematic structure of VM ciphertext words doesn’t match the (relatively) liberal structure of plaintext words. Hence, this amount of flexibility is built into the theory to dissolve the dichotomy.
But, as Sherlock Holmes already told us over a century ago, it’s a mistake to adapt the facts to fit the theory.
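The price of that flexibility can be made concrete with a small Python sketch. The mapping table below is entirely hypothetical (none of its entries is a claimed reading of any VM glyph group); it only shows how the number of readings multiplies once a ciphertext unit is allowed more than one plaintext value.

```python
from itertools import product

# Hypothetical one-to-many table: each ciphertext unit may stand for
# several plaintext letters. The values are invented for illustration.
TABLE = {
    "qo": ["x", "u"],
    "k": ["a"],
    "ee": ["e", "i", "o"],
    "dy": ["s", "m"],
}


def readings(tokens, table):
    """Enumerate every plaintext string the table allows for the tokens."""
    options = [table[token] for token in tokens]
    return ["".join(choice) for choice in product(*options)]


result = readings(["qo", "k", "ee", "dy"], TABLE)
print(len(result))   # 2 * 1 * 3 * 2 = 12 readings for one short word
print(result[:3])
```

Twelve readings for a single four-unit word is mild; over a whole line the product grows to the point where almost any desired plaintext can be picked out.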
- Dropped letters (esp. vowels). This falls somewhere between Anagrams and MIMO. For the moment, assume you have a four-letter ciphertext word and allow that one plaintext letter has been dropped anywhere in the course of the enciphering. With five different possible locations and a plaintext alphabet of, say, 25 letters, that alone gives you up to 5 × 25 = 125 possible plaintext words. For every word of the ciphertext.
How is one to read such a text with any degree of fluency?
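The arithmetic for the dropped-letter case is easy to check by brute force. The sketch below assumes a 25-letter alphabet (simply the first 25 letters of the Latin alphabet) and a placeholder four-letter word; both are stand-ins, not VM data.

```python
from string import ascii_lowercase


def restorations(word, alphabet):
    """All strings obtained by inserting one alphabet letter anywhere in word."""
    return [
        word[:i] + letter + word[i:]
        for i in range(len(word) + 1)   # five insertion points for four letters
        for letter in alphabet
    ]


alphabet = ascii_lowercase[:25]          # assumption: 25-letter plaintext alphabet
candidates = restorations("abcd", alphabet)

print(len(candidates))        # 125 = 5 positions * 25 letters
print(len(set(candidates)))   # 121, since doubling a letter looks the same from either side
```

A few of the 125 insertions coincide, so the count of distinct candidates is slightly lower, but the order of magnitude stands, and it applies to every single ciphertext word.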
- Exotic language, undiscovered dialect, words not used in period… Probably the most notorious example of this is Leo Levitov’s “polyglot oral tongue” (by which I reckon he meant to say, “a dialect mixing several languages that has hitherto only been transmitted orally”).
While strictly speaking this can’t be ruled out, it adds a factor of implausibility to one’s theory. After all, everything we know about the VM points to a central European origin in the late Middle Ages or Renaissance. So, while the idea of the VM being from the Far East has some merits based on linguistic aspects, any theory involving an “exotic language” must not only decipher the text but at the same time also explain the origin of the “polyglot oral tongue.”
- Text does not fit the pictures. Simple, but often wrongly rated as of minor importance. We have plenty of herbs, we have astronomical/astrological imagery, and we have girlies galore in a trashcan.
Any decipherment must explain why they are here, or (if they are supposed to be a red herring to throw a decipherer off track) why somebody would have gone to such lengths to include these weird and unique images, which are bound to attract everybody’s attention, rather than make the VM look innocuous.
- Reliance on micrography. Plenty of solution attempts focus on minute distinctions between various letter shapes and stroke lengths and orientations and whatnot. Of course, this introduces degrees of freedom into the solution and makes it easier to arrive at legible text.
But. With the 600 dpi images currently available, which look so neat and huge on the monitor, people tend to forget that the VM is actually a fairly small book. If you print two VM pages on one sheet of letter or A4 paper, you arrive at roughly the original size. Now take a look at how tiny these letters are, and consider that you had to write that text with a fickle quill running out of ink every other word, on uneven vellum, and supposedly at a decent speed so you’ll finish the thing before the Renaissance runs out.
How much detail can you preserve in your writing? It’s a very instructive and quick experiment which I highly recommend.
- More than three steps in the enciphering process. Again, the VM is a pretty extensive text, comparable to a short novel in volume. It’s not reasonable to assume that the author piled enciphering step upon enciphering step, else he would never have accomplished anything. Mind you, in his time (if we stick with a 15th-century provenance) the simple expedient of using an invented alphabet was considered safe even for highly sensitive diplomatic correspondence.
How many more steps is the VM author likely to have built on top of this?
- The algorithm does not explain the statistical features of the VM. About the only reasonable clue for decipherment the VM has yielded is its statistics. What we know about the enciphering, we know through counting words and letters. Fortunately (but up to now fairly uselessly), some of these features stick out like sore thumbs, such as the notorious letters and letter sequences which occur only word-initially, word-finally or in connection with certain other letters. Any theory which is unable to explain these features at a basic level (i.e. as a consequence of the algorithm proposed) is useless.
A common example is the suggestion that the VM is written as a sequence of abbreviations, which is clearly at odds with the observation that statistically the ciphertext appears to be “bloated”, rather than compressed.
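The kind of counting meant here needs very little machinery. The following Python sketch tallies, for each glyph, how often it occurs at the start, in the middle, or at the end of a word. The sample is a handful of words in EVA-style transliteration picked as a toy input (an assumption for illustration, not a quoted line of the manuscript), and the function name is my own.

```python
from collections import Counter, defaultdict


def positional_counts(words):
    """Count, per glyph, occurrences in initial, medial and final position."""
    counts = defaultdict(Counter)
    for word in words:
        last = len(word) - 1
        for i, glyph in enumerate(word):
            if i == 0:
                slot = "initial"
            elif i == last:
                slot = "final"
            else:
                slot = "medial"
            counts[glyph][slot] += 1
    return counts


# Toy sample in EVA-style transliteration, for illustration only.
sample = "qokedy qokeedy chedy shedy daiin qokain okaiin".split()
counts = positional_counts(sample)

for glyph in "qny":
    print(glyph, dict(counts[glyph]))
```

Run over a full transliteration, a table like this is where the position-bound letters show up, and any proposed algorithm should be able to say why its output would produce such a table.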
- Exceptions and deceptions. As a last resort, when theories don’t fit the facts, exceptions begin to crop up in the plan. Why they should be there or how the encipherment would benefit from them are considerations that usually don’t enter the equation. But they allow one to adapt the facts to the theory (again…): “Hey, previously I said qo is ‘x’, but for my translation to work here I need a ‘u’ in this spot. Well, obviously that was an exception the VM author introduced!”
A similar technique is the invention of deceptions. Namely, some feature which doesn’t line up with the proposed algorithm is declared a “deception” on the part of the author, meant to throw the sleuths off track. This could be the extraneous writings or parts of the ciphertext. Of course, it’s always possible to discard disagreeable aspects of the VM this way, and to me this has a remarkable similarity to Creationist attempts to put a literal reading of the Bible on a “scientific” footing.
- Solution makes no sense. When all is said and done, and the final translation reads something like “Left left oh fleeting left foot for fodder flitter flatter all that’s left is left”, maybe then it’s time for the meticulous researcher to reconsider.
*) Hence a cynic might argue that the topmost sign that your theory is likely wrong is that it is trying to decipher the VM, ha ha.