Ultimate strictly academic ID test.

Is there a test to distinguish a Gromark cipher (letters only, no primer given) from a running key cipher? I think this is the ultimate useless ID test. In the Cryptogram, a Gromark is always accompanied by its primer, so there’s no question about what type it is. Also, according to ACA rules a running key is about 50 letters, always shorter than a Gromark cipher.

Nevertheless, I got interested in IDing Gromark ciphers when only the letters are given, with no primer. But my ID tests gave poor results. The most common error was misidentifying a Gromark as a Progressive key cipher. I came up with a progressive key ID test using logs of digraphs that worked pretty well at IDing progressive key ciphers. But it didn’t help ID Gromark ciphers. When I ran my improved ID test on Gromark ciphers, the test no longer misidentified them as progressive keys. Instead it misidentified them as Running key ciphers.

I have a feeling there is a test based only on ciphertext letter frequencies that will distinguish running keys from Gromarks. In case anybody wants to try coming up with such a test, I generated 10,000 Gromarks and 10,000 running keys. They are in a zip file with the link below:

https://drive.google.com/file/d/1jPP6vjUFzpK4PqFwPprtXqRlWfMchCCX/view?usp=sharing

Each line has 7 fields separated by commas: (1) the running key’s “key”, which is just the first half of its plaintext, (2) the second half of the running key’s plaintext, which is equal to the Gromark’s entire plaintext, (3) Gromark key, (4) Gromark primer, (5) Gromark ciphertext, (6) running key type: 0 is Vigenere, 1 is Variant, (7) running key ciphertext.

The running key plaintext is twice as long as the Gromark plaintext so their ciphertext lengths are equal.
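For anyone who wants to experiment with the file, here is a minimal sketch of a reader for the seven-field format described above. The field names, the `CipherPair` type, and the `load_pairs` helper are my own hypothetical labels, not anything defined in the zip file. The sketch also assumes one record per line with no quoted commas.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class CipherPair(NamedTuple):
    """One record; the field names are illustrative, not from the data file."""
    rk_key: str          # (1) first half of the running key plaintext
    plaintext: str       # (2) second half, also the Gromark plaintext
    gromark_key: str     # (3)
    gromark_primer: str  # (4)
    gromark_ct: str      # (5)
    rk_type: int         # (6) 0 = Vigenere, 1 = Variant
    rk_ct: str           # (7)


def parse_line(line: str) -> Optional[CipherPair]:
    """Parse one comma-separated record; return None if it is malformed."""
    fields = [f.strip() for f in line.rstrip("\n").split(",")]
    if len(fields) != 7 or fields[5] not in ("0", "1"):
        return None
    return CipherPair(fields[0], fields[1], fields[2], fields[3],
                      fields[4], int(fields[5]), fields[6])


def parse_lines(lines: Iterable[str]) -> Iterator[CipherPair]:
    """Yield every well-formed record, silently skipping the rest."""
    for line in lines:
        record = parse_line(line)
        if record is not None:
            yield record


def load_pairs(path: str) -> list:
    """Read all records from a text file extracted from the zip."""
    with open(path, encoding="utf-8") as handle:
        return list(parse_lines(handle))
```

Calling `load_pairs` on the extracted text file (whatever it is named) should give a list of records from which the two ciphertext columns, `gromark_ct` and `rk_ct`, can be pulled for frequency counts.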

2 thoughts on “Ultimate strictly academic ID test.”

  1. My Analyzer has a test for Gromark which is fairly accurate compared to other types of its length, like Progressive Key. You produced a file with both those types last year and my program worked well on those two. For Running Key, though, my program mostly uses length, as you suggest, because of the ACA guidelines. I commented out the length test part and ran the program on the first five Running Key cts and in every case Gromark outscored Running Key. Other types were mostly on top with both Gromark and RK down below third place most of the time.

    Since Running Key and Autokey both involve plaintext being enciphered by plaintext, it seems logical that letter frequencies should reflect a specific skew. Without actual testing, my hypothesis would be to identify the ct letters produced by enciphering ETAION, or even ETA, with those same letters in Vigenere and expect those letters to predominate. The same could be done for Beaufort and Variant.
    For example, here are those letters enciphered in Vigenere, with key on left in caps, and pt the top row, ct in the columns:
    A aeinot
    E eimrsx
    I imqvwb
    N nrvabg
    O oswbch
    T txbghm
    The letters B, I, and M are the three most often produced letters in this chart, which assumes the six letters AEINOT are of equal frequency in pt. Thus one would expect BIM to be of higher than usual frequency in a Vig-enciphered RK. Since E and T, the two most frequent letters, when used as pt and key in either order produce ct X, I would expect the X to be in high frequency, too. I pasted a few thousand characters from your RK cts into my program and the A came out as most frequent. Since you mix Vigenere and Variant, this is unsurprising. The A enciphers itself as A in both. I’m too lazy to write a program separating the Vig from the Var types right now, but you might want to do so to see if these individual letter tests are predictive. The X did appear to have an elevated frequency in the short segments I tested.
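The chart above can be reproduced with a few lines of Python. This is only a sketch of the counting idea: the function names are made up for illustration, and like the chart it treats the six letters AEINOT as equally likely, which real plaintext is not.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase


def vigenere_encipher_letter(key: str, pt: str) -> str:
    """Vigenere: ciphertext letter = key + plaintext (mod 26)."""
    return ALPHABET[(ALPHABET.index(key) + ALPHABET.index(pt)) % 26]


def pair_chart(letters: str = "AEINOT") -> dict:
    """Map each key letter to the row of ct letters it produces."""
    return {
        key: "".join(vigenere_encipher_letter(key, pt) for pt in letters)
        for key in letters
    }


def chart_counts(letters: str = "AEINOT") -> Counter:
    """Count how often each ct letter appears anywhere in the chart."""
    return Counter("".join(pair_chart(letters).values()))


# Print the chart in the same layout as above, then the top three ct letters.
for key, row in pair_chart().items():
    print(key, row.lower())
print(chart_counts().most_common(3))
```

Passing a longer string such as "ETAOINSHR" to `chart_counts` would extend the same tally to more plaintext letters, still under the equal-frequency simplification.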

  2. Yes, the plain letter pairs that are encrypted to give Vigenere or Variant/Beaufort cipher letters were used to calculate running key ciphertext frequencies by ROT13 in the JF 2002 Cryptogram. He used these frequencies to try telling Vigenere running keys from Variant/Beaufort running keys. For the short running key ciphers used by the ACA this rarely works. But I think there is a good chance something like this will work in telling Gromarks from running keys.
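The pair-based calculation can be sketched as follows. This is not ROT13's published method, just one way to get expected ciphertext frequencies. It assumes the key letter and plaintext letter are drawn independently from the same English distribution, and it uses commonly quoted approximate letter percentages rather than figures from the article. Variant is taken to mean ct = pt minus key.

```python
# Approximate English letter frequencies in percent, A through Z.
# These are commonly quoted round figures, used here as an assumption.
ENGLISH_PERCENT = [
    8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77,
    4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98,
    2.36, 0.15, 1.97, 0.07,
]


def normalized(weights):
    """Scale a list of non-negative weights so it sums to 1."""
    total = sum(weights)
    return [w / total for w in weights]


def expected_ct_frequencies(variant=False, plain=None):
    """Expected running key ct distribution from independent letter pairs.

    Vigenere: ct = pt + key (mod 26). Variant: ct = pt - key (mod 26).
    """
    p = normalized(plain if plain is not None else ENGLISH_PERCENT)
    out = [0.0] * 26
    for k in range(26):          # key letter index
        for t in range(26):      # plaintext letter index
            c = (t - k) % 26 if variant else (t + k) % 26
            out[c] += p[k] * p[t]
    return out


def ranked_letters(freqs):
    """Return the letters A-Z sorted from most to least expected."""
    order = sorted(range(26), key=lambda i: freqs[i], reverse=True)
    return "".join(chr(ord("A") + i) for i in order)
```

Under this model the Variant distribution always peaks at A, because any key letter paired with the same plaintext letter gives A. That is consistent with the high A count reported in the first comment for the mixed Vigenere and Variant sample. Comparing `ranked_letters(expected_ct_frequencies())` with observed counts from field (7) would be one way to check how well the independence assumption holds.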

    I’m thinking of using the 20,000 ciphers at the link to train a neural net using nothing but ciphertext frequencies to come up with a test that distinguishes running keys from Gromarks. It might work.
