
WoR Chapter 84 code


Satsuoni


Yes, it's a small sample size, no contest there. I still think there should be WAY more 2's if it's a 26-letter substitution, though. There's no way letters 20-26 of the alphabet just so happen to be all the least common letters - that would be absurd, but it's the only explanation I can think of for why we have 64 1's and only 18 2's. That much at least is so dramatic of an outlier that I don't think we can ascribe it to the sample size. I suggested 16 partly because of the distribution, but partly also because having so many 1's means we have to have a lot of numbers that are in the teens in order to account for all of them if this is any sort of substitution cipher. 10 letters doesn't begin to account for it; if the 10-11-12 pattern is valid, we must have at least 12 letters, and 17, 18 and 19 aren't really terribly significant for the Cosmere. 
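For anyone who wants to recheck the digit tally, here is a minimal Python sketch. It assumes the ciphertext is the 154-digit string you get by joining the digit groups posted later in this thread; the name `CIPHERTEXT` is only for illustration, and a different transcription of the epigraph would shift the counts.

```python
from collections import Counter

# Assumed transcription: the three rows of digit groups from this thread, joined.
CIPHERTEXT = (
    "11182510111271249151210101114102151171121011121713"
    "44831110715142541434109161491493412122541010125"
    "127101519101112341255115251215755111234101112915121061534"
)

# Count how often each single digit appears, ignoring any grouping into letters.
digit_counts = Counter(CIPHERTEXT)

for digit in "0123456789":
    print(digit, digit_counts[digit])
```

This only counts raw digits, so it says nothing by itself about how the digits group into letters.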

 

That said, I will freely admit that my argument is not really based on cryptography but more on a generalized sense of logic and pattern recognition. I don't have the background in math to do any sort of serious cryptanalysis, so I'm more just trying to poke at this for fun and see what I can come up with. I have no idea what necessary pairs are, though I'd be happy to learn. :)

Link to comment
Share on other sites

Necessary pairs are digits that have to be paired with a 1 in front of them, because otherwise you get numbers like 35.  Besides which, the fact that we don't see a single 27, 28 or 29 really suggests it's a 25-26 letter substitution.

 

Also, the attribution reads "From the Diagram, Book of the 2nd Ceiling Rotation: pattern 15".  Could this be a hint somehow, a part of the cipher?

 

If all the vowels are in the teens that would explain the difference too.  

Edited by Aminar
Link to comment
Share on other sites

Ok, my prefix splitter spits out strings like "apearlytreeamedalrearconnvelinknmonesumsmsonrrkneerizethearonrfabidgiawnearstrexton" as possible decodings.

Note that it only works for English (as opposed to phonetic spelling). If somebody could come up with a utility that transforms English words into their phonetic approximations, like "the" to "ze", that would be very useful.

The script also only works on prefix codes, so something that has "1" and "14" as separate codes would stymie it.
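To illustrate why prefix codes matter here, a rough Python sketch (not the actual splitter script, which isn't posted): with a prefix-free code set the digit string splits in exactly one way, while a set containing both "1" and "14" does not qualify. The function names and the small code set are made up for the example.

```python
def is_prefix_free(codes):
    """Return True if no code is a proper prefix of another code."""
    return not any(a != b and b.startswith(a) for a in codes for b in codes)


def split_prefix_code(digits, codes):
    """Split a digit string into codes; unambiguous only when codes are prefix-free."""
    longest = max(len(code) for code in codes)
    tokens, current = [], ""
    for digit in digits:
        current += digit
        if current in codes:
            tokens.append(current)
            current = ""
        elif len(current) >= longest:
            raise ValueError(f"no code matches {current!r}")
    if current:
        raise ValueError(f"leftover digits {current!r}")
    return tokens


# Hypothetical code set, chosen only to demonstrate the idea.
example_codes = {"10", "11", "12", "3", "4"}
print(is_prefix_free(example_codes))
print(is_prefix_free({"1", "14"}))
print(split_prefix_code("1011123", example_codes))
```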

OK, I am going to hardcode "12" "13" as codes and see what happens then.

Link to comment
Share on other sites

Quick thought.

 

Obviously they are fools The Desolation needs no usher It can and will sit where it wishes and the signs are obvious that the spren anticipate it doing so soon The Ancient of Stones must finally begin to crack It is a wonder that upon his will rested the prosperity and peace of a world for over four millennia
-          From the Diagram, Book of the 2nd Ceiling Rotation: pattern 1
That's pattern 1 of the Second ceiling.
This is pattern 15.  Could they be related?
Link to comment
Share on other sites

Remember that Alethi only has 25 letters, which could mean the codes run 1-25.  The high number of 111's implies some form of frequent letter combination...  We need to get the Alethi Font guy in here...

 

Also 5 is a double letter.  Several 55s show up...

 

14 9 shows up twice in a row?

 

11 18 25 10 1 1 12 7 1 24 9 15 12 10 10 11 14 10 21 5 1 17 1 12 10 11 12 17 13

 

4 4 8 3 11 10 7 15 14 25 4 14 3 4 10 9 16 14 9 14 9 3 4 12 12 25 4 10 10 12 5

 

12 7 10 15 19 10 11 1 23 4 12 5 5 1 15 25 12 15 7 5 5 11 1 23 4 10 11 12 9 15 12 10 6 15 3 4 

 

Here's a possible doublet set to work with...

 

And Here's the Alethibet.  A B D E F G H I J K L M N O P R S T U V Y Z SH TH CH

 

I don't think it's a straight substitution.  We need to work out the vowels.

How do we know the Alethibet?

Link to comment
Share on other sites

How do we know the Alethibet?

There's a font and everything.  Somebody decoded the crazy Pitch Jump language from the Way of Kings artwork.

 

Anyway, here's a frequency analysis of Taravangian's Diagram as we've seen it, minus the locations, the numbers, and Q:/A:'s

 

e : 240
t : 186
o : 169
i : 144
a : 130
s : 127
n : 126
h : 116
r : 104
l : 71
u : 57
c : 50
d : 49
p : 47
f : 46
w : 45
m : 45
y : 35
b : 33
g : 27
k : 23
v : 17
x : 2
z : 1
j : 1
q : 1
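If anyone wants to reproduce a table like this, a short Python sketch follows. It is not the tool used for the numbers above; the function name and the sample sentence (borrowed from the pattern 1 epigraph) are just for illustration.

```python
from collections import Counter


def letter_frequencies(text):
    """Count the letters a-z in text, case-insensitively, most common first."""
    counts = Counter(ch for ch in text.lower() if "a" <= ch <= "z")
    return counts.most_common()


sample = "The Desolation needs no usher"
for letter, count in letter_frequencies(sample):
    print(f"{letter} : {count}")
```

Feeding it the full epigraph text (minus locations, numbers and Q:/A: markers) should give a list in the same format.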

Edited by Aminar
Link to comment
Share on other sites

There's a font and everything.  Somebody decoded the crazy Pitch Jump language from the Way of Kings artwork.

The problem is, to apply it to substitution I need the list of all words in tWoK written in alethi, to build probability tables.

Link to comment
Share on other sites

 

 

Necessary pairs are pairs where if you don't pair them with a one you get 35 and the like. 

Aminar, I'm still not entirely sure I understand your explanation. I think I understand generally what you're talking about, but I have no clue how it applies to the issue. I'm good at languages, but math is...not really terribly intuitive for me. 

 

As far as the 27-28-29 issue goes, I see two possible 17's, one possible 18 and one possible 19. Not a lot, although again, this is a small sample size. If we ignore the 10-11-12 theory, there are two possible 27's and one possible 29. If we assume that 10-11-12 is a word, there is still one possible 27 (it could be interpreted as 1-2-7, 12-7, or 1-27). I'm not really sure we have enough data to draw a firm conclusion, although I will admit that the idea of a substitution with more than 27 letters is...highly implausible to say the least. 
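The "possible" counts can be checked mechanically by looking for each two-digit number as adjacent digits anywhere in the string. A minimal Python sketch, assuming the ciphertext is the 154-digit string formed by joining the digit groups posted in this thread (`CIPHERTEXT` and `count_possible` are illustrative names):

```python
# Assumed transcription: the three rows of digit groups from this thread, joined.
CIPHERTEXT = (
    "11182510111271249151210101114102151171121011121713"
    "44831110715142541434109161491493412122541010125"
    "127101519101112341255115251215755111234101112915121061534"
)


def count_possible(digits, code):
    """Count every position where code appears as adjacent digits (overlaps allowed)."""
    last_start = len(digits) - len(code)
    return sum(1 for i in range(last_start + 1) if digits.startswith(code, i))


for code in ("17", "18", "19", "27", "28", "29"):
    print(code, count_possible(CIPHERTEXT, code))
```

A hit only means the two digits sit next to each other; whether they really form one letter depends on how the neighbours are grouped.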

 

I don't think the location of the writing within the room has any particular significance to the meaning of the cipher...given the way the locations are described, it really sounds to me like Taravangian just wrote on every possible available writing surface, including the entire room and most of its furnishings. The two "ceiling rotation" epigraphs could possibly be connected, I suppose. The only other two connected Diagram epigraphs are the ones where you read the alternating letters of the paragraph and get two different messages, and the link there is much more obvious.

 

If anyone is curious, I went ahead and found a website that counts letters. Here are the frequency statistics for all of the other Taravangian epigraphs, except for the one that is comprised entirely of dates. (I excluded the attribution at the bottom of each quote, obviously.) Edit: Actually, you should probably use the other one. I forgot about taking out the Q/A's...

A 142
B 36
C 56
D 51
E 263
F 48
G 27
H 125
I 158
J 1
K 22
L 77
M 51
N 138
O 182
P 48
Q 3
R 115
S 150
T 211
U 67
V 22
W 48
X 2
Y 37
Z 1
Edited by Bibliovortex
Link to comment
Share on other sites

 

Well our minds are in the same place...

Link to comment
Share on other sites

Haha, yeah...I noticed yours right after I posted. I haven't gone through and compared all of the frequencies to the overall distribution for English, but O stands out to me as being unusually frequent - normally E, I and T are the most common. 

 

Also, I'd like to venture a guess that whatever number of letters we are working with, 5 is almost certainly a consonant. If the 10-11-12 is "the," we have a double 5 right after a (possible) E, which means it can't be O; AA, II and UU are all so rare in English that we can safely discard them. (If that's a 1-2 and not a 12, 5 could be O, but I tend to think not.)

Link to comment
Share on other sites

Agreed.  Also, I'm pretty sure 25 or 2 is a vowel...

Link to comment
Share on other sites

Okay, after looking at Wikipedia's frequency chart, it looks like we have a pretty normal distribution of letters after all. The lower half of the chart is relatively distorted, but I'd expect that with pretty much any sample that's under 10,000 words or so, and not all the charts agree on the frequencies for English in general anyway. All the most common letters are within a couple places of each other in the rankings, though.

 

If 25 is a letter. ;) You may be right about 2, especially since it's the second most common number and vowels are more common than consonants. I think that if we have any 25's they have to be consonants - there's a 2-5-1-2-1-2 sequence, and I'm almost positive the 1-2-1-2 is 12-12, which would make it EE. There's also a 2-5-1-2 a bit later, which means that two of the four possible 25's are probably right before an E. You can't have any vowel before a double E, and IE and UE are not terribly common. AE and OE are almost nonexistent in American English. 

Edit: Of course, we could have a word break, which would neutralize my objection in the second case. Words starting with EE are rare, though, so if I have that right it probably is in the middle of a word.

 

Edit: Yeah, I should look more carefully. It's 1-2-1-2-2-5; the problem is still there, but it would be a vowel after the double E. *headdesk*

Edited by Bibliovortex
Link to comment
Share on other sites

How do we know the Alethibet?

Alethi Bookmark 02 02 02

 
I'm inclined to think the code uses English letters.
Pattern pointed out that the other epigraphs use English letters that aren't in Alethi (e.g. c, x, q), and that they reference some unknown hieroglyphics that the Diagram *actually* used. This code (much like Navani's notes) is most likely transliterated into English to give us English readers the "feel" of what it was like for native Rosharians.  I also played around trying to decode the numbers into Alethi, and nothing came up.
Link to comment
Share on other sites

Either that or 25 is Y...  But even that is questionable.  It's also possible 1 8 or 18 make up a vowel.  There is only one 18 on the board and it's right at the beginning.  But I'm off to bed, may this be solved when I wake up.  (Also, we should try to find Odium within the sample.)

Edited by Aminar
Link to comment
Share on other sites

Good morning everybody - I see you were busy while I slept. Don't stick too much to abundances. We have a very short sample here, and some epigraphs contain "insane" phrases like "WhereWhereWhere". The letter combination "Wh" could be much more abundant than statistically predicted if there are many questions in this piece of coded text.

 

who, where, why ...

 

Considering double letters: letters can be double or the ending of one word and the beginning of another.

 

Capitals: Ouch, yes. There are capital letters in the other epigraphs. That gives us the possibility of assigning two different numbers to one letter, if needed to make sense: one for the lowercase letter, one for the capital.

 

For the moment I suspect our 111s are "wh", so I will split them up as 1-11, making 1=w and 11=h. (Splitting as 11-1 would just swap the two; that could be necessary for an isolated 11.)

Having said that, I'll play around. Here is the link again, for all of you who don't want to use pen and paper or write a program yourselves:

http://cryptoclub.org/tools/cracksub_topframe.php

 

Edit: http://home.comcast.net/~acabion/numb_pairs_extended.html (from harakeke) - This could also be useful to solve a substitution cipher. Enter a guessed word as crib and rotate it through.

 

I am stuck again with the assumption above. But I realized how many possibilities there are to split the numbers.

1118251011 alone could be 11-18-25-10-11, 1-11-8-25-10-11, 11-1-8-25-10-11, 11-1-8-2-5-10-11, and so on. Brute force probably won't help us, so we need somebody with good intuition. 
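To get a feel for how quickly the possibilities multiply, here is a small Python sketch that lists every way to split a digit string into numbers from 1 to 26 (one or two digits, no leading zero). The function name and the 26-letter limit are assumptions for the example.

```python
def all_splits(digits, max_code=26):
    """Return every split of digits into numbers 1..max_code of one or two digits."""
    if not digits:
        return [[]]
    results = []
    for width in (1, 2):
        head = digits[:width]
        # Skip incomplete heads, anything starting with 0, and out-of-range values.
        if len(head) < width or head[0] == "0" or int(head) > max_code:
            continue
        for rest in all_splits(digits[width:], max_code):
            results.append([int(head)] + rest)
    return results


splits = all_splits("1118251011")
print(len(splits), "possible splits")
for split in splits:
    print("-".join(str(number) for number in split))
```

The recursion is fine for a short prefix like this one; for the full message the number of splits grows too fast to list them all.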

 

First, let's look at which digits must stick together or stand alone to represent a letter.

 

1118-25-10-1112-7-124-9-15-12-10-10-1114-10-215-117-112-10-111217-13-

4-4-8-3-11-10-7-15-14-25-4-14-3-4-10-9-16-14-9-14-9-3-4-121225-4-10-10-125- 

12-7-10-15-19-10-11123-4-125-5-115-25-1215-7-5-5-11123-4-10-1112-9-15-12-10-6-15-3-4

 

  • 10, because every 0 is preceded by a 1, so 0 alone does not exist.
  • 3, 4, 5, 6, 7, 8, 9, because any of them concatenated with the following digit yields a number larger than 25 (26); each can only stand alone or close a doublet.
  • A - goes after every 3-9, because a doublet starting there would be larger than 25 (26).
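The rules above can also be checked by counting. This Python sketch marks a boundary with "-" only if every valid split into numbers 1 to 26 breaks there; it counts splits from the left and from the right instead of applying the rules one by one, which is a swapped-in method, and the function names are illustrative.

```python
def count_parses(digits, max_code=26):
    """Return (prefix, suffix): numbers of valid splits of digits[:k] and digits[k:]."""
    def valid(token):
        return token[0] != "0" and 1 <= int(token) <= max_code

    n = len(digits)
    prefix = [1] + [0] * n
    for k in range(1, n + 1):
        for width in (1, 2):
            if k - width >= 0 and valid(digits[k - width:k]):
                prefix[k] += prefix[k - width]
    suffix = [0] * n + [1]
    for k in range(n - 1, -1, -1):
        for width in (1, 2):
            if k + width <= n and valid(digits[k:k + width]):
                suffix[k] += suffix[k + width]
    return prefix, suffix


def mark_forced_breaks(digits, max_code=26):
    """Insert '-' wherever all valid splits agree that a break must occur."""
    prefix, suffix = count_parses(digits, max_code)
    total = prefix[-1]
    out = digits[:1]
    for k in range(1, len(digits)):
        # prefix[k] * suffix[k] is the number of splits that break at position k.
        if total and prefix[k] * suffix[k] == total:
            out += "-"
        out += digits[k]
    return out


print(mark_forced_breaks("1118251011"))
```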

So, let's go on with this...

Edited by Pattern
Link to comment
Share on other sites

Well, I'm signing off for the night.  Good luck Pattern!

 

Some general thoughts after today:

 

It's most likely English, not Alethi. Alethi doesn't make sense for a variety of reasons.

 

I don't think it uses a keyword (a la Vigenere etc.). Nothing in the context of the other epigraphs jumps out as being a keyword, and I just get the feeling (both in and out of character) that this code is supposed to be self-contained.

 

It might be a two-step code that combines a substitution cipher and an every-other-letter transposition cipher. Which adds another wrinkle to looking for patterns (double letters, etc.) in the ciphertext.

 

I'm partial toward the parsing of "11 18 25 10 11 12 71 24 91 51 21 01 01 11 41 02 15 11 71 12 10 11 12 17 13 44 83 11 10 71 51 42 54 14 34 10 91 61 49 14 93 41 21 22 54 10 10 12 51 27 10 15 19 10 11 12 34 12 55 11 52 51 21 57 55 11 12 34 10 11 12 91 51 21 06 15 34" This is a simple "split the whole thing into pairs of 2" organization. It's straightforward for the encoder and unambiguous for the decoder. On the other hand, it isn't great statistically. The strings of numbers >26, such as "71 51 42 54" are problematic even if you only take every other letter.  But it might work in a Numbered Key Cipher with redundant cells. There's a handy tool for working on those at: http://home.comcast.net/~acabion/numb_pairs_extended.html , though I haven't found a key that works.  Might be worth trying out with some of the other proposed parsings.
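For reference, the pair parsing is easy to reproduce. A minimal Python sketch, assuming the ciphertext is the 154-digit string behind the parsing above (`CIPHERTEXT` and `split_into_pairs` are illustrative names):

```python
# Assumed transcription of the epigraph's digits as one continuous string.
CIPHERTEXT = (
    "11182510111271249151210101114102151171121011121713"
    "44831110715142541434109161491493412122541010125"
    "127101519101112341255115251215755111234101112915121061534"
)


def split_into_pairs(digits):
    """Cut the digit string into consecutive two-digit chunks."""
    return [digits[i:i + 2] for i in range(0, len(digits), 2)]


pairs = split_into_pairs(CIPHERTEXT)
too_big = [pair for pair in pairs if int(pair) > 26]

print(" ".join(pairs))
print(len(pairs), "pairs,", len(too_big), "of them above 26")
```

The second line of output shows how many pairs fall outside 1-26, which is the statistical problem mentioned above.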

 

Another thought is that it could be a polyphonic cipher that just uses the numbers 1-10. But I think that would be just too nasty.

Edited by harakeke
Link to comment
Share on other sites

Concerning the Numbered Key Cipher: I see a problem with the cipher length, the largest number in the cipher and the resulting key length:

 

I read about solving Numbered Key Ciphers:  https://sites.google.com/site/bionspot/solving-a-numbered-key-cipher

 

In our code we have 154 digits = 77 doublets.

The largest number is 93, which means the length of the keyed alphabet is 93+1=94 (the extra 1 is for 00).

Expected number of blanks: approx. 6

This results in an estimated key length of 94-6=88.

 

The key would likely be longer than our plaintext message consisting of 77 doublets. Does that make sense, or did I misunderstand something?
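The arithmetic behind that estimate, written out as a tiny Python sketch. The figures are the ones quoted in this post (the blank count of about 6 is taken as given), and the variable names are only for illustration.

```python
digit_count = 154            # digits in the coded epigraph
doublets = digit_count // 2  # two digits per cipher number

largest_number = 93                  # largest doublet in the pair parsing
alphabet_length = largest_number + 1 # one extra cell for 00
expected_blanks = 6                  # approximate figure quoted above

estimated_key_length = alphabet_length - expected_blanks

print("doublets:", doublets)
print("estimated key length:", estimated_key_length)
print("key longer than message:", estimated_key_length > doublets)
```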

Edited by Pattern
Link to comment
Share on other sites

Ok, do I get this right? Homophonic substitution smears out the statistical distribution of letters.

The code parsed to numbers from 1 to 25 (26) yields a distribution that fairly closely resembles the distribution in English (not exactly, but that is not expected with approximately 100 symbols).

Therefore I would rule out homophonic substitution.
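A toy illustration of the smearing effect, as a Python sketch. The homophone table and the sample plaintext are invented for the example; a real homophonic cipher would usually pick among the codes at random rather than in strict rotation.

```python
from collections import Counter
from itertools import cycle


def homophonic_encode(plaintext, homophones):
    """Replace each letter with the next code from its list of homophones."""
    pickers = {letter: cycle(codes) for letter, codes in homophones.items()}
    return [next(pickers[ch]) for ch in plaintext if ch in pickers]


# Hypothetical table: the common letter "e" gets three codes, the others one each.
table = {"e": [1, 2, 3], "t": [4], "a": [5]}
plain = "eeeeeettta"

cipher = homophonic_encode(plain, table)
plain_counts = Counter(plain)
cipher_counts = Counter(cipher)

print("most common plaintext letter:", plain_counts.most_common(1))
print("most common cipher code:", cipher_counts.most_common(1))
```

The peak for "e" is spread over three codes, so the cipher's frequency profile is flatter than the plaintext's. A ciphertext that still shows an English-like profile is therefore weak evidence against this kind of scheme.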

Link to comment
Share on other sites

It can be used for that, or used for something else, depending on how many codes you  use for each letter.

 

I cannot stop thinking that, in Brandon's place, I would have encoded the message like this:

1. Split it into blocks of ten letters.

2. Take every second letter of each block splitting it in two of 5 letters each

3. Write out the letter numbers of each subblock as a decimal number

4. Take the resulting decimal number to the power of 15 modulo however many digits it had originally

5. Write all resulting blocks in sequence.

6. Encode sequence into any distribution at all using arithmetic coding of Huffman mods

7. Laugh madly looking at 17thsharders trying to decode it.

:ph34r: :ph34r: :ph34r:

Link to comment
Share on other sites

And yet you consider a nonprefix code which is an order of magnitude harder than prefix one XD

Article on that topic:

http://scholarworks.sjsu.edu/cgi/viewcontent.cgi?article=1175&context=etd_projects

Edit: ok, straightforward bruteforce doesn't work when '10' is considered a code and code length is limited to 2 (It is impossible to parse message in a meaningful way in that case). Good to know, I guess... Unless I made a mistake in the program somewhere.

Edited by Satsuoni
Link to comment
Share on other sites
