The great tragedy of Science

August 19, 2009

The great tragedy of Science — the slaying of a beautiful hypothesis by an ugly fact.
Thomas Henry Huxley, “Biogenesis and abiogenesis” (1870)

Voynicheros Velcome the “Voyager”!

July 6, 2011

Jason Davies, apparently quite new to the Voynich scene, has appeared on the stage with quite some thunder:

Hitherto, browsing through the Voynich manuscript was always a tedious affair, with people either using Beinecke’s fairly awkward library tool, or perusing the image gallery they had set up at home for themselves. Either way, discussing matters regarding the VM has always been inconvenient when referring to details beyond a mere folio number: one had to refer to “the weirdo character on the second line of f213r”, or such, to communicate special points of interest. Of course, this is prone to misunderstandings. The fact that the VM mailing list doesn’t allow attachments helped to compound the difficulties.

Showing a detail of f77r with the URL 'http://www.jasondavies.com/voynich/#f77r/0.144/0.467/5.00'

Now, Jason has taken matters into his own hands and has set up a website with a tool called The Voyager which contains the high-res scans of the Voynich, and a simple interface to navigate through the various images and zoom in and out of them, much like Google Maps. So far so good, but the real hoot of it all is the fact that the current settings for viewing an image are stored in the URL of said page. This means that pointing out a particular detail of any of the folios to someone else on the web is as simple as moving to the detail in question, copying the URL from the address line of the browser, and sending this URL to the other party, who will simply paste it into their address field: ta-dah!
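For the curious, here is a minimal Python sketch of how such a “view in the URL” scheme can be taken apart and put back together. It is only a guess based on the one example URL shown above: I am assuming that the part after the `#` is folio, horizontal position, vertical position and zoom, in that order, and the function names are my own invention, not anything from Jason's code.

```python
from urllib.parse import urlsplit

BASE = "http://www.jasondavies.com/voynich/"

def parse_voyager_url(url):
    """Split the URL fragment into folio plus three numbers.

    Assumption: the fragment is folio/x/y/zoom. The meaning of the
    three numbers is guessed from a single example, not documented.
    """
    folio, x, y, zoom = urlsplit(url).fragment.split("/")
    return {"folio": folio, "x": float(x), "y": float(y), "zoom": float(zoom)}

def build_voyager_url(folio, x, y, zoom, base=BASE):
    """Rebuild a shareable URL from a view description (same assumption)."""
    return f"{base}#{folio}/{x:.3f}/{y:.3f}/{zoom:.2f}"

view = parse_voyager_url("http://www.jasondavies.com/voynich/#f77r/0.144/0.467/5.00")
same_url = build_voyager_url(**view)
```

The point of the design is that the whole view state lives in the fragment, so nothing has to be stored on a server for a link to work.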

Thanks, Jason, for a simple, elegant and powerful little piece of software! This really rocks.

Not a Rose by Any Other Name

June 27, 2011

Thing with the VM is, it offers so precious little in terms of “hard” information helping us to decipher it that we tend to cling to whatever shred of fact we can find to guide us along. First and foremost among these are the statistics on the VM, but using them brings with it the danger of relying too heavily on the results.

For example, to a high degree of certainty, we will have gotten our transcription wrong somewhere; not in the sense of writing down some individual error, but of consistently misidentifying letters as being different or separate where they are truly the same or compound, or vice versa. But as long as we don’t know whether “iiin” is one letter or four, whether “ch” is one letter, two different letters or just “cc” in disguise, and as long as we can’t even know for sure whether there are one, two, or four different gallows, all our estimates about character frequencies and word lengths are on very shaky ground. And hence, all our results.
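To make this concrete, here is a small Python sketch that counts “letters” in the same handful of words under two readings: one where every transcription character is its own letter, and one where “iiin” and “ch” are each a single letter. The three sample words are just illustrative EVA-style strings, and the function names are mine; this is not a real transcription tool.

```python
from collections import Counter

def tokenize(word, glyphs=()):
    """Split a word into letters, treating each listed glyph as ONE letter.

    Greedy longest match; anything not covered by a listed glyph
    falls back to a single transcription character.
    """
    ordered = sorted(glyphs, key=len, reverse=True)
    tokens, i = [], 0
    while i < len(word):
        for g in ordered:
            if word.startswith(g, i):
                tokens.append(g)
                i += len(g)
                break
        else:  # no multi-character glyph matched at position i
            tokens.append(word[i])
            i += 1
    return tokens

def glyph_counts(words, glyphs=()):
    """Letter frequencies for a word list under a given reading."""
    counts = Counter()
    for w in words:
        counts.update(tokenize(w, glyphs))
    return counts

sample = ["daiiin", "chol", "qokaiiin"]  # illustrative words only
split_reading = glyph_counts(sample)                          # "iiin" = 4 letters
merged_reading = glyph_counts(sample, glyphs=("iiin", "ch"))  # "iiin" = 1 letter
```

Under the first reading “i” is by far the most frequent letter and “daiiin” is six letters long; under the second, “i” does not exist at all and the same word has three letters. Same ink, entirely different statistics.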

So, while I’m all for statistical tests (after all, it’s all we’ve got, right?), and while I’m wishing for a “universal statisticator” which would spit out the essential statistical parameters for various ciphertext candidates, I recommend taking such results with a grain of salt.

Take for example something as simple as a monoalphabetic substitution cipher.*) If someone employed that cipher and used the VM alphabet for the ciphertext, we’d probably despair at even something as simple as that, because our prime tool in that case, viz. statistical frequency analysis, would fail for sure as long as we fed it with a transcription which misidentified the ciphertext character set.

Likewise, word length distributions etc. are all prone to be distorted by systematic transcription errors. So, while I agree that statistics can give us valuable hints, I’d regard them as circumstantial evidence, but not as hard facts, and if one test gives results which don’t put a theory in accordance with the VM, I wouldn’t dismiss the theory immediately, if the rest of the story looked good.

As for Zipf’s law, which is so often quoted in the context of the VM: I’ve got my hesitations about that in particular. IIUC, we don’t really know what it means if a distribution does or doesn’t obey Zipf’s law, and since random texts and even the sizes of cities can follow Zipf’s law, I wouldn’t assign too much importance to this test.
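For reference, this is roughly what a “Zipf test” boils down to, sketched in Python: rank the words by frequency and fit a straight line to log(frequency) against log(rank). A slope near -1 is what people call “obeying Zipf’s law”. The function names are my own, and the plain least-squares fit is just the simplest choice, not necessarily what any published VM study used. Note that the code only measures the slope; it says nothing about what the slope means, which is exactly my problem with it.

```python
import math
from collections import Counter

def rank_frequency(words):
    """Return (rank, count) pairs, most frequent word first."""
    counts = sorted(Counter(words).values(), reverse=True)
    return list(enumerate(counts, start=1))

def loglog_slope(pairs):
    """Least-squares slope of log(count) against log(rank)."""
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(count) for _, count in pairs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Toy "text" whose counts are exactly 12/rank, i.e. perfectly Zipfian.
toy_words = ["a"] * 12 + ["b"] * 6 + ["c"] * 4 + ["d"] * 3
slope = loglog_slope(rank_frequency(toy_words))
```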

So, my advice is, stay tuned, but don’t stand with bated breath.

*) I know, the VM is not a monoalphabetic substitution.

Slightly edited from a post to the Voynich Mailing List.

An Eye for an Eye, and a… Letter for a Letter…?

June 15, 2011

When one follows the developments around the VM, research into it seems to be like a stream, where new ideas and theories appear somewhere upstream in the distance, come closer, and let themselves be examined more fully while they drift by, before they follow the water downstream and finally vanish in the dusk of oblivion. Rarely will one of them leave so much as a beacon behind.

One pattern that crops up time after time, and goes mostly unnoticed by even the old-time and seasoned Voynicheros, is the more or less explicit assumption that one VM letter is equivalent to one plaintext letter, and that one VM word corresponds to one plaintext word.

Now, I can see where this would come from: Of course it’s natural to assume. It’s the simplest and most straightforward way to do it (which in itself ought to be a warning sign: if the VM had been enciphered “simply and straightforwardly”, it would have been solved long ago…), it’s the way ciphers were done in period, and it lends itself readily to easy analysis.

Unfortunately, in all probability this is not the way the VM was cooked up. Let’s look at a few of the arguments against this case:

Understanding the Voynich…

February 23, 2011

… without a decent decipherment is the equivalent of a blind man trying to find his way with a wet noodle rather than a stick.

Creativity is Making the Complicated Simple

September 22, 2010

Charles Mingus

I’ve received some comments that people were having problems understanding what the whole Stroke Theory is really about. Admittedly, I’ve contributed my fair share to this confusion by letting the theory develop and reporting the “increments”, but never providing a concise overview of the status quo.

I want to make amends for that and have written a short PDF document which outlines the basic tenets of the ST and explains some features and consequences of it. I will keep this document updated, so that it will always reflect the current state of affairs, and you don’t need to wade through pages of history.

The actual workings of the enciphering system according to the ST take up only three pages; the rest is essentially discussion of special properties and background information.

Here goes.

Just so You Know…

August 24, 2010

I’m not dead. I haven’t fallen off the edge of the world. I was just busy.

And right now I’m hacking away at a little tool which should settle the question of the Stroke Theory once and for all.

But it could be a little while until it’s sophisticated enough, up and running…

Crossing the Line from Amateur to Dilettante

May 27, 2010

Although in general I recommend the Voynich mailing list to everybody interested in the subject, the current discussion about using distributed computing to crack the VM with a statistical brute-force attack very much reminds me of people planning some open-heart surgery with their medical knowledge gained from watching a few episodes of House.

“If at first you don’t succeed, use a bigger hammer.” The sad thing is that while one supposed side effect of this would be to make the project and VM research more public (in the vein of the SETI project), and hence to attract more people to it, I’m afraid a bungling approach like the one currently under consideration would have the opposite effect of exposing us to ridicule.

(Not to mention the fact that people seem to think that using a distributed approach would invalidate all previously gathered statistical information about the VM, like that it certainly is not a simple substitution, and that with a high degree of confidence one VM word is not equivalent to one plaintext word.)

voynich.net is up again

May 27, 2010

Rich SantaColoma has taken over the helm at voynich.net. This site features some information about the VM, but most importantly it serves as the subscription point for the Voynich mailing list (which had temporarily been mirrored at Rich’s own site).

So, everybody feel free to subscribe to the list (if you haven’t already done so). The list still is the central hub for information and “research” regarding the VM.

Thanks for the good job, Rich!

What did Huxley mean?

May 4, 2010

I just noticed that apparently somebody came across this site by searching for: the great tragedy of science — the slaying of a beautiful hypothesis by an ugly fact what did he mean

To avoid leaving the poor soul in the dark, the way I understood this quote is that in the course of scientific research, many a “beautiful” (elegant, simple, powerful) hypothesis is developed when one digs into a topic. Unfortunately, it happens all too often that later experimental findings “rear their ugly head” by giving proof that the beautiful theory is beautiful, but false. And since we are not in the art department, truthfulness takes precedence over beauty.

In the case of the VM (admittedly not exactly a “science”), beautiful theories about its encryption constantly get slain by statistical or systematic evidence to the contrary. Unfortunately, there is a tendency for people to twist the facts (or ignore them outright), rather than giving up the perceived beauty of their approach.

Chasing your own Tail

May 3, 2010

As I mentioned the other day, I had committed a serious omission in my assessment of the Stroke Theory.

I had written a little tool to analyse the VM ciphertext and decompose it into the hypothetical “syllables” of the Stroke Theory, where each ciphertext “syllable” would represent one plaintext letter. This tool worked by constantly modifying the hypothetical syllable set and retaining those modifications which led to an overall increase of “coverable text” (= ciphertext words that could be composed from the syllable set). The tool got “saturated” (i.e., further changes would no longer increase the overall coverage) when the syllable set could compose 66% and 74% of the ciphertext by volume, for Currier A and B, respectively. This was interesting, but by no means convincing.
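In case the description is too abstract, here is a stripped-down Python sketch of the idea. It is not the actual tool: the names are made up for this post, “by volume” is taken to mean that every occurrence of a word counts, and the only modification tried is toggling one candidate syllable in or out of the set, which is my simplification for illustration. (Without further constraints on the set, adding syllables can never hurt, so a real tool needs tighter rules than this toy.)

```python
def can_compose(word, syllables):
    """True if the word is a concatenation of syllables from the set."""
    reachable = [True] + [False] * len(word)  # reachable[k]: word[:k] is composable
    for end in range(1, len(word) + 1):
        for s in syllables:
            start = end - len(s)
            if start >= 0 and reachable[start] and word.startswith(s, start):
                reachable[end] = True
                break
    return reachable[-1]

def coverage(words, syllables):
    """Fraction of word occurrences that the syllable set can compose."""
    if not words:
        return 0.0
    return sum(1 for w in words if can_compose(w, syllables)) / len(words)

def hill_climb(words, syllables, candidates):
    """Toggle candidates in or out; keep a change only if coverage rises."""
    best = set(syllables)
    best_score = coverage(words, best)
    improved = True
    while improved:  # stop once a full pass brings no gain ("saturated")
        improved = False
        for c in candidates:
            trial = best ^ {c}  # add c if absent, remove it if present
            score = coverage(words, trial)
            if score > best_score:
                best, best_score, improved = trial, score, True
    return best, best_score

toy_text = ["ab", "ab", "abc", "cd"]  # toy ciphertext, not VM data
final_set, final_score = hill_climb(toy_text, {"a"}, ["b", "c", "x"])
```

On the toy text the climb saturates at 75%: nothing among the candidates can ever produce the “d” in the last word.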

A different approach used a second tool which would synthesize ciphertext from innocuous plaintext, according to the rules of the Stroke Theory. This test was simply on a qualitative basis, to see whether the ciphertext rendered this way would look anything like the VM, and in my humble opinion it actually did.

It took me about a year to figure out that one could combine the two approaches, namely by letting the analysis tool work on the results of the synthesis tool. In a perfect world, namely if the analysis tool worked correctly, this chase of one’s own tail should result in a 100% coverage of the plaintext.

Of course it didn’t. For instance, the plaintext used (the German 15th century “Weinbuch”) contained a few characters (like Arabic digits or umlauts) which couldn’t be properly transcribed in the synthesis and which could thus not be recovered.

Still, the result was that about 68% of the plaintext words by volume could be synthesized before the analysis program showed signs of stalling. This teaches us two things:

  1. It’s, pardon my French, fucking close to the results for the VM, and
  2. This can’t be due to special unencrypted characters alone.

Upon closer inspection, I found the culprit indeed. Namely, I had set the minimum syllable length in the analyzer to 2 strokes, and this is of course stupid. Letters like “I”, “l”, “o” will hardly require more than one stroke (and indeed required only one in my synthesis). Thus, these letters were effectively indecipherable, and this may well account for a good deal of the lost coverage.

This error of course would also hold true for the VM. (There is no reason why the VM author should have chosen to use a minimum of two strokes per plaintext letter.) And the fact that this programming error led to almost the same amount of lost coverage in the VM as in the synthesized text could be a hint that the same effects are in play, and that I’m thus on the right track by chasing my own tail…
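The effect of that minimum-length bug is easy to reproduce in a toy setting. The Python sketch below uses a made-up stroke table (digits stand in for strokes; neither the table nor the word list has anything to do with the real ST or the Weinbuch) in which two letters need only one stroke. With the analyzer allowed one-stroke syllables, everything is covered; with the minimum set to 2, every word containing a one-stroke letter drops out.

```python
# Hypothetical stroke table: plaintext letter -> strokes (digits as stand-ins).
STROKES = {"i": "1", "o": "2", "n": "34", "t": "56", "e": "78"}

def encode(word):
    """Synthesize toy ciphertext: concatenate the strokes of each letter."""
    return "".join(STROKES[ch] for ch in word)

def can_compose(word, syllables):
    """True if the word is a concatenation of syllables from the set."""
    reachable = [True] + [False] * len(word)
    for end in range(1, len(word) + 1):
        for s in syllables:
            start = end - len(s)
            if start >= 0 and reachable[start] and word.startswith(s, start):
                reachable[end] = True
                break
    return reachable[-1]

def coverage(cipher_words, min_len):
    """Coverage when the analyzer only admits syllables of >= min_len strokes."""
    syllables = {s for s in STROKES.values() if len(s) >= min_len}
    covered = sum(1 for w in cipher_words if can_compose(w, syllables))
    return covered / len(cipher_words)

plain = ["net", "ten", "in", "on", "tone", "tent"]  # toy plaintext
cipher = [encode(w) for w in plain]

full = coverage(cipher, min_len=1)     # one-stroke syllables allowed
crippled = coverage(cipher, min_len=2) # the bug: minimum of two strokes
```

Here the crippled analyzer recovers only the three words built entirely from two-stroke letters, i.e. half the text, even though it is handed the correct syllables for everything else.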
