Strokes, Round 3: Fast forward

Unwilling to face defeat, I decided to give the Stroke Theory a new run. This time, as opposed to the previous attempts where I tried to analyse the ciphertext to recreate the original plaintext, I decided to go the other way around and synthesize something which might look like Voynichese.

So, I wrote me a little BASIC program that did the following:

  • Take a few lines of plaintext (in different languages),
  • Take a “Stroke Code”, which determines into which strokes the capital and minor letters are decomposed (or, if you wish, into which ciphertext letter sequence each plaintext letter is translated),
  • Drop all whitespace from the plaintext,
  • Translate the plaintext into ciphertext, according to the following rule:

    Each word consists of

    • Either a single uppercase plaintext letter, or
    • A lowercase letter, followed by an uppercase plaintext letter.*)

    (The latter was done to reflect the Currier B grammar**), where it appears that all words have to consist of an end group, with an optional start group preceding them.)
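For readers who want to play along, the translation rule above can be sketched roughly as follows. This is a Python illustration, not the original BASIC program. The names `encode`, `want_pair`, `UPPER` and `LOWER` are made up for the example, and the two tables hold only a small excerpt of the stroke codes listed further down. How the program decided between the two word shapes is not stated here, so the sketch leaves that decision to the caller (an assumption).

```python
import random

# Excerpt of the stroke-code table given later in this post (not the full table).
UPPER = {"G": "EB", "L": "EA", "I": "A", "E": "EEEA", "S": "BN", "O": "BG"}
LOWER = {"a": "OP", "t": "AE"}


def encode(plaintext, want_pair):
    """Translate plaintext into space-separated ciphertext words.

    want_pair is a zero-argument callable that returns True when the next
    word should be a lowercase+uppercase pair. The rule the original program
    used for this choice is unknown, so it is supplied by the caller.
    """
    letters = [ch for ch in plaintext if not ch.isspace()]  # drop all whitespace
    words = []
    i = 0
    while i < len(letters):
        if i + 1 < len(letters) and want_pair():
            # Start group (lowercase code) followed by end group (uppercase code).
            start = LOWER.get(letters[i].lower(), "*")
            end = UPPER.get(letters[i + 1].upper(), "*")
            words.append(start + end)
            i += 2
        else:
            # A single end group on its own.
            words.append(UPPER.get(letters[i].upper(), "*"))
            i += 1
    return " ".join(words)


# Hand-picked choices that reproduce the first seven words of the Latin sample.
choices = iter([False, True, False, False, True, False, True])
print(encode("Gallia est o", lambda: next(choices)))
# -> EB OPEA EA A OPEEEA BN AEBG

# A seeded coin flip as a stand-in for the unknown rule (assumption).
rng = random.Random(0)
print(encode("Gallia est o", lambda: rng.random() < 0.5))
```

Anything missing from the tables comes out as "*", which is how untranslatable characters are marked in the samples below.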

I used five different source texts, hoping they’d qualify as halfway period at least:

  • Latin: De bello gallico, by the salad dressing dude,
  • German: Das Weinbuch, a book about winemaking, anon,
  • French: Li Romanz de l’estoire dou Graal, Robert de Boron (attributed)
  • Italian: La divina commedia, by Dante
  • English: Canterbury Tales, by Chaucer

Here are the results. Note that the strokes, named *A* through *Y*, bear fairly arbitrary names. (I’ll use asterisks to denote the code for a stroke, to avoid confusion with plain or ciphertext letters.) A closer inspection could relate them to their EVA counterparts.

Latin:

EB OPEA EA A OPEEEA BN AEBG AIMA RPA BN AG PIM PBN OPA AIA YOAEI PEAE EEEA SNAE ANK EEEA BN * OYLA OPANK TAIMA LA RPAEI AIMA PAIA SBG ALA AIA AE AOEEEA AEB OPEEEA * AEI EA A AEI RRPAEI OYLA A AE AEI RPA * AEEEEA PEAE PAEI AIMA OYLA PA AN BN BG ANK LA AIMA AA AIA CXLA OPB SEEA AE OPEEEA * RPBG BN AE PEAEI EB AEI EA EA A AEI YOAN EEEA EA EA OPAIA AELA PE* ARA OAIMA RPEEEA SNEA PAIA EB LA AEI * PAIA BN AE PAE LA AEA BN *EA EEEA CXA ANN TBN A AIA AEEEEA ANK BN EEEA AG A EEA AEEEEA ANK LA AIA AE * EB AEI EA EA OBN AEI ANN OPBGK TA AEAEI RPA BN EB AEI ANK LA RRPAIA OPEEA ALA AIMA EEEA AIA *AEI ANN SEEA EB A SNAIMA AEI AE PEBG RPAEI EEEA AEBN SEBGK LA OPAIA AEI AG PIM A OAA AE* AEA BG PELA RRPBG RRPAIA A TAIMA AEBG ANK AEA BN SNA AIMA A SNLA AIA AEANN SEEA CXAEI EEEA * YOANK OAN AE SEANK EEEA OPBGK TBG AG AEI B TEA AELA OPAE BGK TEEEA

German:

AG PEEEA BMF SEA B AEA SEAIA OAEEEA PEAE ANK *ANN EEEA ABN BG AAIMA AEI RPANN EEEA ARKEEEA AIA RPEEEA RPAIA A AE OPEA AEEEA PAIA EEEA OPAIA AG EEEA RRP EB EEEA SNAIMA AEI B AMK SE* SNLA RPAG EEEA PEAEI TB AEA AEI AIA OAEEEA AIMA UXEEEA PEBN LA SAEA EEEA AIA ** A EEEA SNA EEEA SNAIMA OPB ARKEEEA AIA AE* IM SEANK SNLA SAEA EEEA * BG AOEEEA SNA EEEA BN ** SEBN A RPAE * UXAIA AG *AEI RPAIA RRPAEI RPBN A SEAG PELA SAMK EEEA AE* OAAEI BMF BN PEEEA A* BN SEAIA AE OAA SEAEA LA AEIM BG AIA AG EEEA RPAMK BG ANK AIA SEANK AIA *IM AIA AG AMK AEEEA ANN EEEA ANK PEB SNA RPAE UXAIA OA AG A EEEA EEA PAIA CXEEEA ANK BMF LA BN AEI RRPEEEA AIA EXF*A EEEA AEA SEAIA AE * IM RPAG OANN EEEA SNA SEIM OEA AEEEA * EEEA A SAEA EEEA BN A RPAE *

French:

BN OPIM BG A PEAG BG PIM EEEA RPAE AEBG LA AE YOEEEA SAEA SEEEEA TANK SEAE AA AN SEAE A AE EEEA AE EA PAIMA SEAIA EEEA TANK OYLA EEEA AG SEIM AEI AIA AE B EEEA BGK TEEEA *AEA EEEA SNLA SN* B ANK A BMF IM EEEA AIA PBN AE EEEA AIA AEEEEA ANK ANK SE* AN OPANK EA EEEA BN OAA EXF* EEA A SNAE OAEEEA BN YOANK OAN AREEEA AE EEEA SNAEI RPLA AIA B A EEEA ANK SNAEI UXEEEA AIA TEEEA SEAIA AE SEANK ANK EEEA *EEEA AE AEA TB ARA EEEA PE BGK LA SEAG PEEEA UXBN OAIA AEA AEEEA RPIM BG A SEANK OA AEB OP* * TBN AEI IM AEI EA *EEEA AE BN OLA AEEEA EEEA PEANK OA AE AIMA BG TAE OAEEEA AE BG TANK RRPEEEA AIA BMF *AIMA BG LA AE OAEEEA OABG AEEEA TANK SN* AIMA BG TAE OAEEEA EEA PEBG A EXF*EEEA AEAIMA BG TAE OAEEEA BN LA EEEA TANK SN*

Italian:

AIA SEEA AIMA EEEA BMF BMF OAG EEEA EA SAEI RRPAIMA PAIA OAA RPBG BN AEANK OPIM A AEAEI RRPA ANK PAE ANK BG UXAEI PAN EEEA ANK TAIA OPBN EEEA AIM AEI BG BN B LA PEAEI * B AEA * EA OPAG PANK A AE AE OPIM PAEI SEANK OPBN AIMA AEI ANK PEA AE OP* OPAEA PBGK LA AEI AIA AEBG OPAG PANK OYLA OPEA EEEA ANK OP* B OBN OPAG TANK AEI SEBN AE AEI SNEEEA AIM OPBN SEEA IM OPEB EB A AEI EEEA AEI SNAN PEAEI EEEA AEBG PEAE SEB AREEEA AIA SEEA YOEEEA AIA BN PEEEA PEANK A AIA BG IM OPEA OPAN AEI LA PEAEI * AE AEI RPAE *AEI RRPAEI ANK OPB AEA SEAN OB O* AN A * AIMA OANK AEEEEA *AIMA AEI YOEEEA PEAE PEAEI AE AE AEI ANK AG EEEA EA AOEEEA RPB AEA A UXA AEANK BG IM OPA * OAA ANK * OAEEEA EA OPEA AEANK EEEA B BG SNEEEA B ARA IM AEA OBN SBG PEAE SE* A BG RPBG AIA SNBG AOEEEA RPANK A OAA PEB OAIMA PIM A AIA AEANK OPA *

English:

* AREEEA AIA AEAEA AEI AEAEI YOANK PEA PBN * * PAE ARAEA PBN SNAEA O* EEEA PEBN SN* BG OAE * AE AREEEA OAANK OLA EB ARAE BG EEA AIMA AEI PEB ARAEA OPAE ARAN A SEANK B EEEA AG AEBG AEAEA EEEA PEBG OAE * AEI AIA AG AOAEI AE AEA SEAG SEIM EEEA PE* UXEEEA PAIA PAIA BN LA B AREA PB BG LA PE* OEEA *AEA A B ARIM A ANK AE LA SEEEEA AIA CXEEEA AIA OAEEEA ANK *AG PBN AE AEA SEEEA ABG *EEEA PE* * AEA EEEA AIA BMF SEAN AR* PELA SNEEEA AMK EEEA * A AE AEA AEA A SNBN *BG OAE SEANN ANK EEEA OPAE AR

An asterisk “*” denotes a plaintext letter which could not be translated (umlaut or special character.)

I think this looks promising, considering that it is the first try and that, beyond the general scheme, hardly anything has been done yet to fine-tune the details to match the structure of Voynichese.

The repetitiveness of the original text is there (probably mostly because no more than some 500 words are possible with this scheme), and the slightly altered words in sequence can be found as well. *O* shows up almost only word-initially, while *A* seems to cluster at word beginnings and ends, similar to the behaviour of qo and dy.
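Observations about where a stroke sits within a word can be checked by counting rather than by eye. The following Python sketch is one way to do that. The function name `stroke_positions` is made up for illustration, and the sample is just the first seven words of the Latin output above, so the numbers say nothing about the full texts.

```python
from collections import Counter


def stroke_positions(ciphertext):
    """Count how often each symbol is word-initial, word-final, and used overall.

    A one-symbol word counts as both initial and final. The "*" placeholder
    is counted like any other symbol.
    """
    initial, final, total = Counter(), Counter(), Counter()
    for word in ciphertext.split():
        initial[word[0]] += 1
        final[word[-1]] += 1
        total.update(word)
    return initial, final, total


sample = "EB OPEA EA A OPEEEA BN AEBG"
initial, final, total = stroke_positions(sample)
for stroke in sorted(total):
    print(stroke, initial[stroke], final[stroke], total[stroke])
```

In this tiny sample both occurrences of *O* are word-initial, and four of the seven words end in *A*.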

For you to check the results, here is the list of stroke codes used. The left column is the plaintext letter, the right column is the corresponding stroke/ciphertext sequence. For example, the plaintext letter “B” is composed of a vertical slash (which is represented by stroke *A*), and two “crescents”, open to the left, which have stroke code *N*, giving in total *ANN*. Correspondingly, “P” is *AN*, a vertical slash with one crescent.***)

A AEI
B ANN
C B
D AG
E EEEA
F EEA
G EB
H AEA
I A
K AMK
L EA
M AIMA
N AIA
O BG
P AN
Q BGK
R ANK
S BN
T AE
U LA
V IM
Z BMF

a OP
b AO
c S
d OA
e SE
f AE
g CX
h AR
i P
k ARK
l A
m RRP
n RP
o O
p YO
q OY
r PE
s SN
t AE
u T
v UX
x UX
z EXF
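As a rough check on the “some 500 words” figure mentioned earlier, the table can be used to count the words this scheme allows. The sketch below is Python and only restates the table above. The variable names are made up for illustration, and words containing the “*” placeholder are not counted.

```python
from itertools import product

# Stroke codes copied from the table above, written as letter:code pairs.
UPPER = dict(pair.split(":") for pair in (
    "A:AEI B:ANN C:B D:AG E:EEEA F:EEA G:EB H:AEA I:A K:AMK L:EA "
    "M:AIMA N:AIA O:BG P:AN Q:BGK R:ANK S:BN T:AE U:LA V:IM Z:BMF").split())
LOWER = dict(pair.split(":") for pair in (
    "a:OP b:AO c:S d:OA e:SE f:AE g:CX h:AR i:P k:ARK l:A m:RRP "
    "n:RP o:O p:YO q:OY r:PE s:SN t:AE u:T v:UX x:UX z:EXF").split())

# Upper bound: every uppercase letter alone, plus every lowercase+uppercase pair.
letter_combinations = len(UPPER) + len(LOWER) * len(UPPER)

# Distinct ciphertext words can be fewer, because different letters share a code
# (lowercase f and t are both AE, lowercase v and x are both UX).
distinct_words = set(UPPER.values())
distinct_words |= {lo + up for lo, up in product(LOWER.values(), UPPER.values())}

print(letter_combinations)   # 22 + 23 * 22 = 528
print(len(distinct_words))   # at most 528, lower once shared codes collapse
```

With 22 uppercase and 23 lowercase entries, the letter combinations come to 528, which fits the estimate of roughly 500.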

All in all, I think it shows promise.

Room for improvement:

  • Of course, the composition of strokes and their arrangement (Is “E” really *EEEA*, or *AEEEE*, or something completely different?),
  • The rules for word composition starting with the individually translated letters.

I think it’s still too early to deduce a plaintext language from these results.

Comments and critique welcome, as always.

*) Of course, the roles of upper and lower case could be switched. It’s all guesswork at this time.

**) Note that in this attempt I worked with Currier B, whereas the previous two rounds were played on Currier A. Just for the kicks.

***) I know this is quite cumbersome to follow, and for internal use I employ Greek letters, but I haven’t found a ready way of incorporating them into this blog or my program code.


7 thoughts on “Strokes, Round 3: Fast forward”

  1. How about Latin, German, French, Italian and English of the SAME text? This would add a level of control. Done your way, with source text of different meanings, containing different words, it would be harder to know if it is the language or content which are responsible for a greater or lesser similarity to VMs patterns.

    I love this, though. What you have done is what I was calling “working backwards with cipher”. You’re testing your premise from the backside… putting something into cipher/code, to see if the patterns resemble patterns of the VMs. I hope you keep going with this.

  2. Well, the problem is to find people fluent enough in 15th century French/Italian/English… ;-)

    But I don’t think the differences resulting from the various topics should be that grave. Mind you, we’re still in a very preliminary phase where I’m just trying to see whether the scheme is feasible at all.

  3. You are probably correct, and at this point in your experiment it would not make much difference to control the subject matter. But for later, I think it is something to use to look for meaningful results. I’m jumping ahead, and you are probably planning this… but anyway, I have (as I have mentioned to you) planned to approach different methods like this. A breakdown for input into any experimental cipher might be:

    English:
    A poem.
    A passage from the Bible.
    An alchemical text.
    A song.
    A work of medieval literature.
    A work of Renaissance literature.

    French: All the same source texts for the above, for:
    A poem.
    A passage from the Bible.
    An alchemical text.
    A song.
    A work of medieval literature.
    A work of Renaissance literature.

    German: The same, etc.

    That way, say your BASIC program’s results for, say, the song in Italian suddenly show a very high similarity to Voynichese patterns. The control of using the same input would then isolate this as a possibly promising result. The language, subject, and phrasing are “zeroed in on”, without one of those individual factors, alone, being the cause of the similarity… because the other factors, in other combinations, did not come up with as great a similarity. Now one might want to focus on Italian songs, to play with the stroke theory, and see what input in that language and genre might improve the results, and what tweaking the cipher might need to improve them further, etc…

  4. Hi Rich,

    As of now, any such analysis would work on a letter-to-letter level, ie which letters are how frequent, which letter sequences are forbidden or mandatory (“q” always followed by “u”, “c” (in German) always followed by either “h” or “k”) — that kind of stuff.

    So, while I think tests on different languages might perhaps help in narrowing down the plaintext language, I’d guess that the differences in style between works of the same language (poem, Bible, …) would be too subtle to show up here.

    So many questions…
