Period wordlists

Dan wrote to me some time ago, and once again I must apologize: I’m currently fairly busy with other projects and can’t devote as much time to the VM as I should. Nevertheless, I should finally give him the floor:

Yeah yeah, here’s another theory. Actually I’m not going into the theory, but simply asking if you can provide any assistance in resources I am seeking. Let me back up a bit – I’m a full time software developer of over 20 years, and I have had some insights regarding the manuscript. I’ve written software to generate various statistics about the document and have found some surprising and very obvious (once distilled down to hard numbers) patterns that further validate the insights. These are not “hunches”, or “gut feelings” or any mystical, nutty stuff. It’s simply what it is, and the analysis doesn’t lie.

I am currently running brute force deciphering attempts using additional software I have developed, based on my theory of how the document is ciphered. The main resource I am lacking at this time is simply word lists of the candidate languages the manuscript may have been written in (in its decoded form of course), and specifically, the vernacular and spelling of those languages when the manuscript was written in the 1400s.

I have always assumed the Voynich manuscript was a hoax, but when it was positively dated a few years ago I took a harder look, again with the expectation that it was a hoax but at least a hoax contemporary to the 15th century. My attempt was actually to prove (just to myself) that very thing – that it is just a contrived hoax. Unfortunately the insights and analysis I have done over the last few years have left no other option but to follow the logical progression until it peters out and comes to a dead end. I have not yet reached that point.

Thanks for your time, and again, if you know of simple word lists (or who can provide them or assist in that) of good candidate languages from the 15th century, that would be quite helpful.

This question isn’t so easy to answer. First of all, even when taking the C14 dating of the vellum as a given, we still have about a century of leeway regarding the actual production date of the manuscript. A century is a long time, and languages can change a lot in it.

Secondly, languages weren’t “codified” as strictly as they are today, and pretty much everyone would write down their MSs in their local dialect, not to mention the fact that strict orthography wasn’t enforced yet either. This means that even two people from the same region writing at the same time wouldn’t necessarily employ the same spelling. (An extreme example of this is the Bayeux Tapestry, which admittedly predates the VM by some 400 years, where the name of William the Conqueror is written, IIRC, in no fewer than seven different ways.) Hence, to make a long story short, any word list should be taken with a grain of salt.

I did some statistics in the past myself, and to get decent wordlists I simply went to Gutenberg.org, downloaded a few works I considered representative of the era, and ran my own little wordcount scripts on these files.
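For anyone who wants to try the same approach, here is a minimal sketch of what such a wordcount script might look like. It is not the script I actually used. The function names are made up for illustration, and the handling of the `*** START OF` / `*** END OF` marker lines assumes the usual layout of Gutenberg plain-text files, which not every file necessarily follows.

```python
import re
from collections import Counter

# Runs of word characters minus digits and underscore, i.e. roughly "letters".
# Apostrophes and hyphens split words, so "can't" becomes "can" and "t".
WORD_RE = re.compile(r"[^\W\d_]+")


def strip_gutenberg_boilerplate(text):
    """Return the text between the '*** START OF' and '*** END OF' lines.

    Assumption: the file carries both marker lines. If either one is
    missing, the text is returned unchanged.
    """
    start = text.find("*** START OF")
    end = text.find("*** END OF")
    if start == -1 or end == -1 or end <= start:
        return text
    # Skip the remainder of the START marker line itself.
    newline = text.find("\n", start)
    if newline == -1 or newline > end:
        return text
    return text[newline + 1:end]


def count_words(text):
    """Count lowercased words in a string."""
    return Counter(m.group(0).lower() for m in WORD_RE.finditer(text))


def build_wordlist(paths, encoding="utf-8"):
    """Sum the word counts over several plain-text files."""
    totals = Counter()
    for path in paths:
        with open(path, encoding=encoding) as f:
            totals.update(count_words(strip_gutenberg_boilerplate(f.read())))
    return totals
```

Calling `build_wordlist` with a list of downloaded file names (whatever you happened to save them as) and then `.most_common(1000)` on the result gives a frequency-sorted list. Note that nothing here normalizes spelling variants, so the same word spelled three ways shows up as three entries.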

IMHO, prime candidates for the plaintext languages are Latin, English, French, German (including the various dialects like Swiss), and perhaps Spanish. But though I wouldn’t bet on it, more exotic options like Hungarian, Finnish or maybe the Lingua Franca can’t be ruled out either.

Sorry, but this is probably a less simple answer than you asked for.