New test

Some time ago BION asked anyone with a way to distinguish a Two-Square ciphertext (ct) from a Foursquare ciphertext to share it. I may have found a simple test for that. See this chart.

I ran my new test on the 10,000 examples in BION’s data set of plaintext enciphered twice, once using Two-Square and once using Foursquare. There was a significant difference in the test results for the two types. In the graph, the bars represent how many of the 10,000 ciphertexts of each type scored in the range indicated, with the X-axis numbers representing the top of the range. For example, as indicated by the bars directly over the 0.6 label, 1657 of the 10,000 Two-Square cts scored between 0.55 and 0.6 on my new test, while 530 of the Foursquare cts scored in that same range. All the ct samples here were of length 200; the results were less clear for shorter texts. For my plaintext samples I used my solutions to ACA ciphers, including stilted types like the P-12s. Normal plaintext (i.e., not the stilted P-12 types) averaged about 0.3, depending on length; for plaintexts of length 200 or more, the average was 0.25, with very few scoring over 0.3.

The test is similar to the Normor test and is simple to program, though less simple for pencil-and-paper solvers. To compute it, count the number of appearances of each letter of the alphabet and divide each count by the length of the ciphertext. Then sum the absolute values of the differences between the observed and expected frequencies for every letter. I used the frequency counts shown in Appendix A (English) of Caxton Foster’s Cryptanalysis for Microcomputers as my baseline expected values. For example, if I counted 20 E’s in the ciphertext (using BION’s data, i.e., a 200-letter ct), that’s 10%, or 0.1, of the total. The expected frequency of E in English is 0.125, so the difference is 0.025. Add that to the difference for every other letter, including the ones that do not appear in the ct, and the sum is the result of the test. My code in Delphi is below.

for k := 97 to 122 do freq[k,2] := 0;            // clear the observed counts
for i := 1 to Length(S) do                       // count each lowercase letter
  if S[i] in ['a'..'z'] then
    freq[Ord(S[i]),2] := freq[Ord(S[i]),2] + 1;
for j := 97 to 122 do                            // convert counts to fractions
  freq[j,2] := freq[j,2]/Length(S);
r := 0;
for k := 97 to 122 do                            // sum the absolute differences
  r := r + Abs(freq[k,1]-freq[k,2]);

The value of r is the result of the test. The first row of the “freq” array (i.e., freq[97,1], freq[98,1], etc.) must first be populated with the expected frequencies of the letters in the target language; S is the ciphertext string. After the first loop clears the observed counts, the second loop counts the letters of S into the [.,2] positions of the array, the next loop divides those counts by the length of the ciphertext, and the final loop computes the running total of the absolute differences. The test is so simple that someone before me must have already invented it, but, if so, it has escaped my notice.
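For anyone who wants a runnable starting point, here is a minimal, self-contained sketch of the same computation. The Expected table below uses commonly published approximate English letter frequencies as placeholders, not Foster’s exact Appendix A values, and the names FrequencyTest and Expected are illustrative rather than taken from my programs:

program FreqTest;
{$APPTYPE CONSOLE}

const
  // Approximate English letter frequencies for 'a'..'z' (placeholders;
  // substitute Foster's Appendix A values to reproduce the test exactly).
  Expected: array['a'..'z'] of Double = (
    0.082, 0.015, 0.028, 0.043, 0.127, 0.022, 0.020, 0.061, 0.070,
    0.002, 0.008, 0.040, 0.024, 0.067, 0.075, 0.019, 0.001, 0.060,
    0.063, 0.091, 0.028, 0.010, 0.024, 0.002, 0.020, 0.001);

function FrequencyTest(const S: string): Double;
var
  c: Char;
  i: Integer;
  Observed: array['a'..'z'] of Double;
begin
  for c := 'a' to 'z' do Observed[c] := 0;   // clear the counts
  for i := 1 to Length(S) do                 // count lowercase letters
    if S[i] in ['a'..'z'] then
      Observed[S[i]] := Observed[S[i]] + 1;
  Result := 0;                               // sum the absolute differences
  for c := 'a' to 'z' do
    Result := Result + Abs(Expected[c] - Observed[c]/Length(S));
end;

begin
  WriteLn(FrequencyTest('attackatdawn'):0:3);
end.

On a 200-letter sample like those in BION’s data set, a score around 0.25 to 0.3 looks like normal plaintext, while the Two-Square and Foursquare cts in the chart score higher.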

I didn’t intend this as a diagnostic test for type. Rather, I was hoping to solve another problem. Most cipher types, when decrypted with a wrong key, produce fewer high-frequency letters and more low-frequency letters than normal plaintext. The Normor test produces high numbers for these while the tetragram frequency scores are low. As the solution gets closer to correct, this changes to be closer to normal: the Normor test results get smaller as the tetragram frequency scoring gets higher. However, some types, such as the Morse-based types and the Grandpre, often produce false solutions during hillclimbing that outscore the correct solution because they produce many high-frequency letters. Since the frequency order of the letters may be close to normal, the Normor test doesn’t help. I was hoping this new test, which for now I’m just calling the frequency test, would provide a tool during hillclimbing scoring to prevent these false solutions. I tried implementing it in a Grandpre word-based hillclimbing program without success so far, but I may still try using it for that purpose. I have not incorporated it into my Analyzer yet, nor have I tested it on other cipher types besides the two shown in the chart.
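For concreteness, one way the frequency test might be folded into a hillclimbing objective is as a penalty on the tetragram score. This is only a sketch of the idea, not code from my solvers; TetragramScore and the weight w are hypothetical placeholders:

// Hypothetical combined objective: TetragramScore and w are placeholders.
// A poor frequency-test result drags down an otherwise inflated tetragram score.
function CombinedScore(const Decrypt: string; w: Double): Double;
begin
  Result := TetragramScore(Decrypt) - w * FrequencyTest(Decrypt);
end;

The hope is that a false solution stuffed with high-frequency letters would score well on tetragrams but poorly on the frequency test, so with a suitable w it would no longer outscore the correct solution.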

One thought on “New test”

  1. That’s really interesting. I can’t think of a reason why two-square ciphertext would be closer to plaintext than four-square ciphertext. But I did notice something like that when I did a neural net ID test for distinguishing between two- and four-squares. I included the frequencies for the even- and odd-positioned ciphertext, and including those frequencies improved the ID test results. Of course, I didn’t include a test against standard English frequencies, but the neural net training algorithm might have given higher weights to the more frequent plaintext letters.

    Also, after testing your Portax algorithm, I used the same even- and odd-positioned ciphertext frequencies as in the two-square versus four-square test to train a neural net to distinguish between Portax and other ciphers. The result behaved very much like your Portax test. Neural nets don’t use division, so this test didn’t use ratios of frequencies, just the frequency differences, but the results were almost identical.
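    As an illustration of the even/odd-position features described above, a sketch along these lines would build the two 26-element frequency vectors (the names and layout are illustrative only, and it assumes a ciphertext of at least two letters):

    type
      TFreq = array['a'..'z'] of Double;

    procedure EvenOddFrequencies(const S: string; var FE, FO: TFreq);
    var
      i, nE, nO: Integer;
      c: Char;
    begin
      for c := 'a' to 'z' do begin FE[c] := 0; FO[c] := 0; end;
      nE := Length(S) div 2;      // count of even-numbered positions
      nO := Length(S) - nE;       // count of odd-numbered positions
      for i := 1 to Length(S) do
        if S[i] in ['a'..'z'] then
          if i mod 2 = 0 then FE[S[i]] := FE[S[i]] + 1
          else FO[S[i]] := FO[S[i]] + 1;
      for c := 'a' to 'z' do begin
        FE[c] := FE[c]/nE;        // 26 even-position frequencies
        FO[c] := FO[c]/nO;        // 26 odd-position frequencies
      end;
    end;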
