When one follows the developments around the VM, research into it seems like a stream: new ideas and theories appear somewhere upstream in the distance, come closer, and let themselves be examined more fully while they drift by, before they follow the water downstream and finally vanish in the dusk of oblivion. Rarely does one of them leave so much as a beacon behind.
One pattern that crops up time after time, and goes mostly unnoticed even by seasoned old-time Voynicheros, is the more or less explicit assumption that one VM letter is the equivalent of one plaintext letter, and that one VM word corresponds to one plaintext word.
Now, I can see where this comes from. It’s a natural assumption: it’s the simplest and most straightforward way to do it (which in itself ought to be a warning sign: if the VM had been enciphered “simply and straightforwardly”, it would have been solved long ago…), it’s the way ciphers were done in the period, and it lends itself readily to easy analysis.
Unfortunately, in all probability this is not the way the VM was cooked up. Let’s look at a few of the arguments against this case:
- The main body of the VM consists of a rather limited character set. Depending on your choice of transcription, you’ll arrive at somewhere around 17 different characters, which make up the better part of the VM. Admittedly, there is a large number of “minor” characters, but these are fairly rare. Now, assuming that the VM has a European medieval background, the conceivable character sets for the plaintext are Latin, Greek, Cyrillic and perhaps Arabic or Hebrew letters. But unfortunately all of these writing systems sport more than 20 different characters, plus possibly numerals, punctuation and other special characters. The VM character set simply appears to be too small to accommodate a complete alphabet.
- Words show a very strong internal structure. VM words are far from random, au contraire: while nobody has as yet been able to completely determine the rules underlying their composition, it’s pretty obvious that there are rules which are followed fairly consistently. To my knowledge, there are no European languages, and only very few in the world at all, which have such a rigid set of rules for word composition, whether they are written in their original alphabet or in that of the VM. What is worse: if, in addition to switching to a new alphabet, the author employed an enciphering system (which he obviously did, because if the VM were a simple substitution cipher, we would have solved it), this would tend to weaken the word structure and make it appear more random. (The only schemes which might increase the word regularity would be something like a transposition cipher with alphabetical sorting, which is conceivable, but wouldn’t exactly render the task of deciphering easier even for the intended audience.)
- One could argue that the VM author dropped, e.g., the vowels from the plaintext before enciphering, hence reducing the number of different VM characters required. But this in turn would have reduced the word regularity. Bear in mind that the EVA transcription allows the better part of the VM to be read aloud, which shows that the VM ciphertext retains a structure that assigns to the VM letters functions comparable (but probably not identical) to “consonants” and “vowels”.
- All of the above work together to give the VM an information content of roughly two characters’ worth per ciphertext word; obviously far too little for any natural language.
- No words in the VM have been identified which seem to have the function of particles in natural languages, like articles, “and”, “or” and similar.
- On a more subtle level, the average word length also appears to decrease towards the end of a line, which doesn’t sit well with a 1:1 correspondence.
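Two of the observations above, the small "core" character set and the drift in word length along a line, are easy to check against whatever transcription you prefer. The following Python is only a minimal sketch under stated assumptions: the function names `core_alphabet_size` and `mean_word_length_by_position` are my own, the 95% coverage threshold is an arbitrary choice, each transcription character is treated as one VM character (which is not true of every transcription alphabet), and the sample lines are made-up placeholders, not VM text.

```python
from collections import Counter


def core_alphabet_size(text, coverage=0.95):
    """Smallest number of distinct characters that together account for
    at least `coverage` of all non-whitespace characters in `text`."""
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return 0
    running = 0
    # Walk through the characters from most to least frequent.
    for size, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        if running >= coverage * total:
            return size
    return len(counts)


def mean_word_length_by_position(lines):
    """Average word length at each word position within a line
    (index 0 = first word of the line)."""
    length_sums = Counter()
    word_counts = Counter()
    for line in lines:
        for pos, word in enumerate(line.split()):
            length_sums[pos] += len(word)
            word_counts[pos] += 1
    return [length_sums[pos] / word_counts[pos] for pos in sorted(word_counts)]


# Placeholder input: invented strings, NOT a VM transcription.
toy_lines = ["abcd ab", "ab a x"]
print(core_alphabet_size(" ".join(toy_lines)))
print(mean_word_length_by_position(toy_lines))  # [3.0, 1.5, 1.0]
```

Run on a real transcription, the first function gives a rough figure for how many characters make up "the better part" of the text, and the second shows whether words get shorter further along the line. Since lines differ in word count, counting positions from the line end instead of the line start may be the more telling variant.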
So, all in all, this is why I’m convinced that the VM is something other than a simple substitution or even a transposition cipher. Something else is going on: the enciphered information is categorized in letters and grouped in words, but these ciphertext letters are not plaintext letters, and the ciphertext words are not plaintext words.