“… Year 2 might reform w spelling, so that which and one would take the same konsonant, wile Year 3 might well abolish y replasing it with i and Iear 4 might fiks the g/j anomali wonse and for all …”
Okay, most of you probably know the English spelling reform proposal usually attributed to Mark Twain, and while we Germans in general would welcome it if you Anglophones made your spelling match the pronunciation at least somewhat (“now”, “plough”, “rough”: which two of them sound more alike…?), what’s the bearing on the VM?
Time and again I have noticed that people take the transcription of the VM to be the Real Thing(TM) itself. It is not, any more than the word “apple” is an actual apple.
I’m convinced (I’m constantly under the urge to write “convicted”, but I feel that’d be wrong) that the crucial trick the VM author played on us does not lie in the enciphering scheme. In all probability, the scheme will turn out to be something new and original, but not overly complicated. I’d even bet that one page of the VM will turn out to be sufficient to break the manuscript’s code, and later generations will sneer at us, “What took those suckers so long to decipher something so laughably simple?”
It was the brilliant idea of the author to invent a special alphabet for his creation, with just the right amount of ambiguity to it.
No matter how much number-crunching power we hurl at the VM, it’ll all come to naught as long as the data we feed to our computers is flawed, and that data is the transcription we use. As opposed to “conventional” codebreaking problems, where we generally have a good idea of what the ciphertext alphabet looks like, all we have for the VM are sophisticated guesses. Garbage in, garbage out, they say about computers, and our “garbage” is a flawed transcription.
We really can’t be sure whether EVA “r” and “s” represent the same ciphertext letter. All those different hooks on the “ch”: are they meant to be the same? Are they different? Do they mean anything? Does the hook make a difference like that between an “a” and an “ä”, or is it only like “e” and “é”?*) Is “iiin” one letter, two, three or four? Why does “qo” look like two letters, but behave as one?
All this has a devastating impact on any test we might want to run on the text. If we do a letter frequency count, “a” and “ä” really should go into different baskets, while “e” and “é” should not. How can you arrive at a meaningful word-length distribution if you can’t even count the letters in “daiin” reliably? Our statistical shells bounce off the alphabet fortress and detonate in our own camp with lots of smoke and fog.
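To make the point concrete, here is a minimal Python sketch of how the same transcribed words yield different statistics depending on which alphabet you assume. Everything in it is my own illustration: the function names, the list of groups treated as single letters, and the four sample strings are assumptions, and apart from “daiin” the samples are toy strings assembled from the groups mentioned above, not quotations from the manuscript.

```python
from collections import Counter


def tokenize(word, multigraphs=()):
    """Split a transcribed word into glyphs by greedy longest match.

    `multigraphs` lists character groups assumed to be ONE letter.
    With an empty list, every transcription character is its own letter.
    """
    groups = sorted(multigraphs, key=len, reverse=True)
    glyphs = []
    i = 0
    while i < len(word):
        for group in groups:
            if word.startswith(group, i):
                glyphs.append(group)
                i += len(group)
                break
        else:
            # No assumed group starts here: take a single character.
            glyphs.append(word[i])
            i += 1
    return glyphs


def letter_counts(words, multigraphs=(), merge=None):
    """Count glyphs; `merge` maps variants onto one basket, e.g. {"s": "r"}."""
    merge = merge or {}
    counts = Counter()
    for word in words:
        for glyph in tokenize(word, multigraphs):
            counts[merge.get(glyph, glyph)] += 1
    return counts


def length_distribution(words, multigraphs=()):
    """Map word length (in glyphs) to the number of words of that length."""
    return Counter(len(tokenize(word, multigraphs)) for word in words)


# Toy strings assembled for illustration, NOT quoted from the manuscript.
sample = ["daiin", "qoiiin", "chor", "chos"]

# Two guesses about the alphabet; neither is claimed to be right.
one_char_one_letter = ()
grouped = ("iiin", "iin", "qo", "ch")

for name, groups in [("char = letter", one_char_one_letter), ("grouped", grouped)]:
    print(name, [tokenize(w, groups) for w in sample])
    print("  word lengths:", dict(length_distribution(sample, groups)))

print("r and s separate:", letter_counts(sample)["r"])
print("r and s merged:  ", letter_counts(sample, merge={"s": "r"})["r"])
```

Under the first guess “daiin” is a five-letter word, under the second it has three letters, and the count for “r” doubles on this toy sample as soon as “s” is thrown into the same basket. None of this says which guess is right; it only shows that every number downstream inherits the guess.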
This is not meant to dissuade anyone in general from working on the VM with statistical tests. I just want to point out that utmost diligence is required. Above all, never forget that you’re working under assumptions which may or may not be true.
(I simply write this because I recently happened to fall into that very trap myself.)
*) In case I’m not making myself clear to the tiny and negligible community of non-speakers of German: “a” and “ä” are really two different letters, while “é” (in French) would simply mean that the “e” is voiced rather than mute. I hope these fancy characters make it to your computers…