Opposed to what?

Every now and then I check out the Voynich category at the dmoz open directory site. (For those of you who don’t know it, dmoz tries to be for links what Wikipedia is for information.) I have submitted my little blog to their review process and hope that in due time the Voynichthoughts will show up there as well.

Only today did I notice that there is a subsection “Opposing views” in the Voynich directory, and now I wonder — opposed to what? We all know that there is no universally accepted theory of the VM, and even what is considered “mainstream” in VM research is subject to debate. Methinks everybody with an opinion on the VM will be in opposition, more or less, to their peers.

The mind boggles.

A nessecary speling reform?

“… Year 2 might reform w spelling, so that which and one would take the same konsonant, wile Year 3 might well abolish y replasing it with i and Iear 4 might fiks the g/j anomali wonse and for all …”

Okay, most of you probably know about Mark Twain’s English spelling reform proposal, and while we Germans in general would welcome it if you Anglophones made your spelling match the pronunciation at least somewhat (“now”, “plough”, “rough”: which two of them sound more alike?), what’s the bearing on the VM?

Time and again I have noticed that people take the transcription of the VM to be the Real Thing(TM) itself. It is not, any more than the word “apple” is an actual apple.

I’m convinced (I’m constantly under the urge to write “convicted”, but I feel that’d be wrong) that the crucial trick the VM author played on us does not lie in the enciphering scheme. In all probability, the scheme will turn out to be something new and original, but not overly complicated. I’d even bet that one page of the VM will prove sufficient to break the manuscript’s code, and later generations will sneer at us, “What took those suckers so long to decipher something so laughably simple?”

It was the brilliant idea of the author to invent a special alphabet for his creation, with just the right amount of ambiguity to it.

No matter how much number-crunching power we hurl at the VM, it’ll all come to naught as long as the data we feed to our computers is flawed, and that data is the transcription we use. In “conventional” codebreaking problems we generally have a good idea of what the ciphertext alphabet looks like; for the VM, all we have are sophisticated guesses. Garbage in, garbage out, they say about computers, and our “garbage” is a flawed transcription.

We really can’t be sure whether EVA “r” and “s” represent the same ciphertext letter. All those different hooks on the “ch”: are they meant to be the same? Are they different? Do they mean anything? Does the hook make a difference like that between an “a” and an “ä”, or only like that between “e” and “é”?*) Is “iiin” one letter, two, three or four? Why does “qo” look like two letters but behave as one?

All this has a devastating impact on any test we might want to run on the text. If we do a letter frequency count, “a” and “ä” really should go into different baskets, while “e” and “é” should not. How can you arrive at a meaningful word-length distribution if you can’t even count the letters in “daiin” reliably? Our statistical shells bounce off the alphabet fortress and detonate in our own camp with lots of smoke and fog.
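
To make the point concrete, here is a minimal Python sketch. It counts glyphs and word lengths for the same handful of EVA-style words under two readings of the alphabet. Both glyph inventories (PLAIN and GROUPED), the word list, and the greedy longest-match rule are assumptions made up for this illustration, not a claim about how the VM alphabet really works.

```python
from collections import Counter

def tokenize(word, glyphs):
    """Split a transcribed word into glyphs by greedy longest match.

    `glyphs` holds the multi-character sequences that a given hypothesis
    treats as single letters; everything else falls back to one character.
    """
    longest = max((len(g) for g in glyphs), default=1)
    out, i = [], 0
    while i < len(word):
        for size in range(min(longest, len(word) - i), 0, -1):
            piece = word[i:i + size]
            if size == 1 or piece in glyphs:
                out.append(piece)
                i += size
                break
    return out

# Two hypothetical readings of the same transcription (assumptions only).
PLAIN = set()                                # every EVA character is a letter
GROUPED = {"ch", "sh", "qo", "iin", "iiin"}  # some sequences count as one letter

words = ["daiin", "qokeedy", "chedy", "daiin"]  # illustrative EVA-style words

for name, glyphs in (("plain", PLAIN), ("grouped", GROUPED)):
    tokens = [tokenize(w, glyphs) for w in words]
    freq = Counter(g for t in tokens for g in t)   # letter frequency count
    lengths = [len(t) for t in tokens]             # word-length distribution
    print(name, lengths, freq.most_common(3))
```

Under the plain reading “daiin” has five letters, under the grouped reading it has three, and the frequency table changes accordingly. Same text, different statistics, depending only on what we decided a letter is.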

This is not meant to dissuade anyone in general from working on the VM with statistical tests. I just want to point out that utmost diligence is required. Especially, never forget that you’re working under assumptions which may or may not be true.

(I simply write this because these days I happened to fall into that very trap myself.)

*) In case I’m not making myself clear to the tiny and negligible community of non-speakers of German: “a” and “ä” are really two different letters, while “é” (in French) simply means that the “e” is voiced rather than mute. I hope these fancy characters make it to your computers…

Stroke theory, after round 1: Elmar’s corner

Okay, I’ve made a mistake, so my attacks lost their punch.

Dennis Stallings and other acute readers have pointed out to me that the hit ratio I achieved, around 40% by token*), was much lower than what even their superficial attempts achieved (around 80%).

At first, I attributed this to the Takahashi transcription which I had used, and which features a number of words running together (like “cthaiinydaiin” or “cheoeesykeor”), which in all probability should be split into two words each. But I doubted that those run-togethers would really be numerous enough to account for half of the possible hits I had obviously missed.

Turns out, I had made a mistake at one point: Robert Firth had worked from the Currier transcriptions, while I was using EVA, assuming that the two could be unambiguously converted back and forth into each other. I was wrong there. The translation between the two systems is “lossy”, hence an unsophisticated (i.e. “dumb”) matching system such as the one I used will of course yield different results in the two domains.
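
A toy Python sketch of what “lossy” means here. The two alphabets and both conversion tables are invented for this example; they are NOT the real EVA or Currier tables, and the numbers prove nothing about my actual results. The sketch only shows how a dumb exact-match count can differ between two transcription systems when one merges glyphs that the other keeps apart.

```python
# Toy alphabets invented for this sketch; NOT the real EVA or Currier tables.
A_TO_B = {"r": "R", "s": "R", "o": "O", "y": "Y"}  # system A splits what B merges
B_TO_A = {"R": "r", "O": "o", "Y": "y"}            # going back, B must pick one

def convert(text, table):
    """Transliterate character by character using the given table."""
    return "".join(table[ch] for ch in text)

def round_trip(word):
    """Convert A -> B -> A; a lossless mapping would return the input."""
    return convert(convert(word, A_TO_B), B_TO_A)

def hits(words, patterns):
    """Dumb matcher: count the tokens that appear verbatim in the pattern set."""
    return sum(1 for w in words if w in patterns)

text_a = ["sor", "ror", "roy", "sor"]   # hypothetical text in system A
patterns_a = {"ror", "roy"}             # hypothetical word list in system A

# The same text and the same word list, transliterated into system B.
text_b = [convert(w, A_TO_B) for w in text_a]
patterns_b = {convert(p, A_TO_B) for p in patterns_a}

print([w for w in text_a if round_trip(w) != w])   # words damaged by the round trip
print(hits(text_a, patterns_a), "of", len(text_a))
print(hits(text_b, patterns_b), "of", len(text_b))
```

In system B the matcher scores more hits, simply because “sor” and “ror” have collapsed into the same string. Neither count is wrong; they answer different questions.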

Thus, either I adapt my programs to use Currier, or I find a real EVA equivalent to Robert’s odd and even groups.

Time for some infighting, Mr. Voynich!

*) I’m also indebted to Dennis for pointing out to me the difference between “the number of words” (which is usually understood to mean the number of different words) and “the number of tokens” (the total number of words). Thus, “I was very, very ignorant” amounts to 4 (different) words, but 5 tokens in the above count. “By token” would mean something like “by volume”.
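
For the programmers among you, the distinction in a few lines of Python. This is only a sketch: the helper names and the crude tokenizer are my own assumptions, and the “matched” set is a made-up stand-in for whatever words a matching program happens to recognise.

```python
import re
from collections import Counter

def split_tokens(text):
    """Crude tokenizer: lowercase the text and keep runs of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())

def hit_ratio_by_token(counts, matched):
    """Share of all tokens (running words) covered by the matched words."""
    return sum(counts[w] for w in matched) / sum(counts.values())

def hit_ratio_by_word(counts, matched):
    """Share of different words covered by the matched words."""
    return sum(1 for w in matched if w in counts) / len(counts)

tokens = split_tokens("I was very, very ignorant")
counts = Counter(tokens)

print(len(tokens), "tokens,", len(counts), "different words")

matched = {"very"}   # hypothetical: the only word our matcher recognises
print(hit_ratio_by_token(counts, matched))
print(hit_ratio_by_word(counts, matched))
```

A frequent word weighs heavily “by token” and counts only once “by word”, which is why the two ratios can be far apart for the same matcher.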

Welcome to the whacky world of Wilfrid Voynich…

Not long ago, the writer of these lines, “in the mad pride of intellectuality”, undertook the enterprise of setting up this blog.

Shortly thereafter (obviously, else it would be an event of the future) Nick Pelling wrote a very kind review about my blog. He even went as far as calling me “a friend”, regardless of the fact that he still makes me pay for his books.

Anyway, in a side remark he suggested I might delve into a discussion of the Sagittarius’ archer’s crossbow. (Hope I got the apostrophe’s right.) I’m not sure what might qualify me for this topic, except for a certain tendency to snipe rhetorically at other people’s ideas. But I recalled having collaborated some time ago on the translation of Jens Sensfelder’s article on this very subject. (Jens’ conclusion was, in a nutshell, that it’s probably a late medieval/early renaissance crossbow with few spectacular features.) So I dug into my digital cellars, finally came up with the scans of a crumpled printout of Jens’ manuscript (the original file having long since been lost), and without much ado posted it in this blog.

Four days later Nick announced that he had posted the original article on his own blog. Power to the man, because back in 2003 he was involved in the translation of the article as well (which I had completely forgotten), so he has every right to do that. (And he even still has the original files, because he’s better organized than I am.) But what escapes me is why he asked me about the article in the first place.

Sometimes I feel the people attracted to the Voynich are as hard to understand as the Voynich itself…

Anyway, in a nutshell, read all about the Crossbow here!

P.S.: I will not call Nick a friend, unless he starts buying my books.

170,000 — perhaps not nearly enough

Give or take a few thousand, the VM consists of about 170,000 glyphs.

That is comparatively a lot. More than enough to start doing statistics on it.

On the other hand, it’s not really that much. For really complex statistics, the text will be too short and the results meaningless.

What I mean is: Frequency analysis on a text of several kByte enciphered with a simple substitution will most likely reveal the underlying key. The same analysis on a text of 100 letters probably won’t yield much that is useful.
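
A rough Python sketch of why sample size matters. It draws random “texts” of different lengths from a known letter distribution and measures how far the observed frequencies stray from the true ones. The ten-letter alphabet and its Zipf-like weights are hypothetical, not measured from any real language, and the sketch only illustrates sampling noise; it does not break any cipher.

```python
import random
from collections import Counter

ALPHABET = "abcdefghij"
# Hypothetical Zipf-like weights; not measured from any real language.
WEIGHTS = [1 / rank for rank in range(1, len(ALPHABET) + 1)]
TOTAL = sum(WEIGHTS)
TRUE_FREQ = {ch: w / TOTAL for ch, w in zip(ALPHABET, WEIGHTS)}

def sample_text(n, rng):
    """Draw n letters independently from the hypothetical distribution."""
    return "".join(rng.choices(ALPHABET, weights=WEIGHTS, k=n))

def estimation_error(text):
    """Total variation distance between observed and true letter frequencies."""
    counts = Counter(text)
    return 0.5 * sum(abs(counts[ch] / len(text) - p)
                     for ch, p in TRUE_FREQ.items())

rng = random.Random(0)   # fixed seed so that repeated runs agree
for n in (100, 1_000, 10_000, 100_000):
    print(n, round(estimation_error(sample_text(n, rng)), 4))
```

The error shrinks as the text grows. A simple letter count needs only ten baskets here; statistics over letter pairs, triples or whole words need vastly more baskets, so even 170,000 glyphs can turn out to be a small sample.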

What is reasonable (and what is not)

Sometimes in VM research, logic is employed in the place of fact. “It would have been logical for the VM author to do this or that”, or “It wouldn’t make sense…”

Tacitly, these assertions assume —

  • The VM author was sane and acted in a completely logical and rational manner
  • We have a complete and thorough understanding of what he wanted to achieve with the VM
  • We have full insight into his mindset, i.e. how he planned to achieve what he wanted to achieve

Hm. Now look again at the VM and probe your heart.

The deceptive nature of invented deceptions

It is a bad habit to discard parts of the VM which don’t fit with your theories or ideas as “deceptions” by the author, intended to distract wannabe codebreakers and to throw us off track.

While it actually might be the case, resorting to this stunt is technically simply an excuse for not explaining VM features.

Unless this tool is used with utmost self-restraint (some folks discard all illustrations in the VM because they don’t fit their “translations”), one will start to “explain” away each and every thing, and can then arrive at any preconceived or desired conclusion.

Drosera?

Aside from the notorious sunflower, there are precious few plants from the VM’s herbal section which have been identified with any degree of certainty.

Here’s one from the German edition of the Wikipedia:

f56r (top) is associated with Drosera intermedia (bottom):

[Image: Voynich manuscript, f56r]

[Image: Drosera intermedia (German: Mittlerer Sonnentau) in bloom]

When browsing the VM, I noticed that f53r also bears a certain resemblance:

[Image: Voynich manuscript, f53r]

But what do I know about plantography…?