On Paillier Ciphersystem, Binary Search and the ASIS CTF 2014

The ASIS CTF happened last weekend. Although I ended up not playing all I wanted, I did spend some time working on a crypto challenge that was worth a lot of points in the game. The challenge was about a sort of not well-known system, the Paillier cryptosystem.


The Cryptosystem

The challenge was started by netcating to nc asis-ctf.ir 12445:

  > Here we use a well-known cryptosystem, which introduced in late 90s as a part of PhD Thesis. This cryptosystem is a probabilistic asymmetric algorithm, so computer nerds are familiar with the basics. The power of this cryptosystem is based on the fact that no efficient general method for computing discrete logarithms on conventional computers is known. In real world it could be used in a situation where there is a need for anonymity and a mechanism to validate, like election. What's the name of this cryptosystem?

The answer would return an oracle:

paillier
The secret is: 642807145082286247713777999837639377481387351058282123170326710069313488038832353876780566208105600079151229044887906676902907027064003780445277944626862081731861220016415929268942783162708816500946772808327134830134079094377390069335173892453677535279678315885291394526580800235039893009001625481049390361761336337647597773237774304907266052473708273977012064983893047728185071148350402161227727584760493541436392061714945686426966193498593237383322044894184438689354989491800002299012669235551727100161976426628760839759603818593410342738847167887121724862806632881028892880165650108111619269651597119870237519410
Tell us your choice:
[E]ncrypt: [D]ecrypt:

Of course, simply decrypting the secret wouldn't work.

Pai-what?

The Paillier cryptosystem was named after Pascal Paillier, its inventor, in 1999. It is a probabilistic asymmetric algorithm used for applications such as electronic voting.

Being probabilistic means that a message is encrypted with some randomness, so it can generate several different ciphers. All the ciphers will be decrypted back to the same message (but not the other way around).

Being asymmetric means that it is based on public-key cryptography, i.e., two keys are generated, a public and a private. The public key is used to encrypt the message and the private key is used to decipher a ciphered message.

All right, now tell me something interesting...

What matter for us is the fact that this system has a homomorphic propriety. Well, from Latin, homo means the same (like homogeneous) and morphic means form (like metamorphosis). So this propriety says that the cipher conserves the form of the message, even if we play with it. This propriety in a cryptographic algorithm is also called malleability.

In other words, if we know only the public key (which is the modulom n, a multiplication of two large prime numbers), we can manipulate the cipher and still get the original content of the plain message.

For example, the multiplication of a cipher 1 by a cipher 2 is decrypted as a sum of the message 1 to message 2:

And, look at this: we can also exponentiate the cipher by some constant k!

Pretty cool, huh? I think so... :)

Simple implementation of a Paillier system

Let's highlight the implementation of this system in Python. All we need is a large prime number generator, which we can borrow from pyCrypto.

The public key has five elements:

The script below shows how this works:

import Crypto.Util.number as cry
import gmpy
import random

def generate_keys(nbits):
    p = cry.getPrime(nbits//2)
    q = cry.getPrime(nbits//2)
    n = p*q
    g = n+1
    l = (p-1)*(q-1)
    mu = gmpy.invert(((p-1)*(q-1)), n)
    return (n, g, l, mu)

def encrypt(key, msg, rand):
    # c = g^m * r^m % n^2
    n_sqr = key[0]**2
    return (pow(key[1], msg, n_sqr)*pow(rand, key[0], n_sqr) ) % n_sqr

def decrypt(key, cipher):
    # m = L(c^l mod n^2)*mu mod n
    return (((pow(cipher, key[2], key[0]*key[0]) - 1)// key[0]) * key[3]) % key[0]

def get_random_number(n):
    return random.randint(n//2, n*2)

def paillier_poc():
    N_BITS = 1024
    MSG = 1337

    key = generate_keys(N_BITS)
    rand = get_random_number(N_BITS)

    cipher = encrypt(key, MSG, rand)
    decipher = decrypt(key, cipher)

    print("The message is {}\n".format(MSG))
    print("The cipher is {}\n".format(cipher))
    print("The decipher is {}\n".format(decipher))

if __name__ == '__main__':
    paillier_poc()

All right, time to go back to our challenge.


Understanding the Challenge

Performing some recon in the oracle, I noticed the following:

  • Encryption and decryption of any integer works...
[E]ncrypt: [D]ecrypt: e
Tell us your message to encrypt: 1
Your secret is: 73109965080485247131710209266123910705889636744106672869822932981580432295328645599823550448181731566435402609978665387224898646975403769881196399448975370668935092605229755765060164052771714510987944591017615792396157094596290728393053648253053017939625091326878542241485082342371560710778399247063411414649475517288243167425022137869055256778307340931947663486971023680806406250041891606619955393621120918102708442427400288119511466304393700124201965017764148482926998000012235997591413309617388902575733355188418714479900913342627281937156809563150498906460101268562252351167461233533852277300215020108137992142
Tell us your choice:
------------------------
[E]ncrypt: [D]ecrypt: d
Tell us your secret to decrypt: 73109965080485247131710209266123910705889636744106672869822932981580432295328645599823550448181731566435402609978665387224898646975403769881196399448975370668935092605229755765060164052771714510987944591017615792396157094596290728393053648253053017939625091326878542241485082342371560710778399247063411414649475517288243167425022137869055256778307340931947663486971023680806406250041891606619955393621120918102708442427400288119511466304393700124201965017764148482926998000012235997591413309617388902575733355188418714479900913342627281937156809563150498906460101268562252351167461233533852277300215020108137992142
Your original message is: 1
  • ... but up to a size! I tried to input a ridiculous large number and it was rejected. Anything a bit larger than an encrypted message was rejected. This is important! It means that the we might have a modulo here. It also means that we cannot just multiply two ciphers and ask the oracle, since the message would be too large.

  • The secret was changing periodically (probably each hour). It means that the keys (the module) were changing too. This ruins any plan of brute forcing it.

  • Remember I said that the Paillier encryption is probabilistic, i.e., different ciphers can map back to the same value? This was clear from the oracle: if you asked repetitively to encrypt some number, say 1, it would return different messages each time. Interesting enough, however, if you restarted the system (leaving the session and netcating again), the same order of different ciphers would appear again. Completely deterministic.

  • Everything that is not an integer number was rejected by the oracle (no shortcuts here).

Automatizing Responses

Whenever we have a netcat challenge, I like to have a clean script to get and send messages (copying from terminal is lame).

In addition, all the numbers in this challenge were really long (the encrypted messages had 614 chars), so we need to perform operations in a consistent and efficient way.

To work with long numbers in Python, I use the Decimal library. For instance, considering that we could want to add two messages, I set the context to 1240 bytes:

decimal.getcontext().prec = 1240

For now on I will be using modifications of this snippet to interact with the oracle:

import decimal
import socket

def nc_paillier():
    # create socket
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))

    # answer the initial question
    print s.recv(4096)
    s.send(b'paillier')

    # get the secret
    print s.recv(4096)
    m = s.recv(4096)

    # cleaning it
    m = (m.split(": ")[1]).split('\n')[0]

    # it's good to print (because it changes periodically)
    print("The secret is: ")
    print(m)

    # change from str to long decimal
    mdec = decimal.Decimal(m)

    '''
        From here you can do whatever you want.
    '''
    # If you want to encrypt messages
    msg_to_e = '1'

    s.send(b'E')
    print s.recv(4096)
    s.send(msg_to_e)
    me = s.recv(4096)
    me = me.split(": ")[1]

    print("Secret for %s is:" %(msg_to_e))
    print(me)

    medec = decimal.Decimal(me)

    # If you want to decrypt messages
    msg_to_d = me

    s.send(b'D')
    s.recv(4096)
    s.recv(4096)
    s.send(msg_to_d)
    md = s.recv(4096)
    md = md.split(": ")[1].strip()

    print("Decryption is: ")
    print(md)

    mddec = decimal.Decimal(md)

if __name__ == "__main__":
    # really long numbers
    decimal.getcontext().prec = 1240

    PORT = 12445
    HOST = 'asis-ctf.ir'

    nc_paillier()

Solving the Challenge

Finding out the Modulo

At this point we know that we cannot do anything in the Paillier system without knowing the modulo, n. Of course, this value was not given. However, from the recon we have learned that we can find it from the oracle.

The exactly value when the oracle cycles back to the beginning is our n. This value should not return any message since n%n = 0. So, we are looking for this None result.

What's the 101 way to search for a number? Binary search, of course! Adapting one of my scripts, I wrote the snippet below to look for this number.

The only problem is, the secret has 614 chars and we cannot scan all the 1E614 integer numbers! However, the only way that adding two encrypted messages would work was if the module n lied at somewhere around 614/2. For this reason we set our low and high search values around 1E306 and 1E308:

import decimal
import socket

def bs_paillier(lo, hi, s):
    if hi < lo: return None
    mid = (hi + lo) / 2
    print("We are at: ")
    print(mid)

    s.send(b'E')
    s.recv(4096)
    s.recv(4096)
    s.send(str(mid))
    ans = s.recv(4096)
    print ans

    if 'None' in ans:
        print "Found it!"
        return mid + 1
    elif 'Your message is' in ans:
        return bs_paillier(lo, mid-1, s)
    else:
        return bs_paillier(mid+1, hi, s)

def get_mod_paillier():

    # create socket, answer first question
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect((HOST, PORT))
    s.recv(4096)
    s.send(b'paillier')
    s.recv(4096)

    # start binary search
    hi, lo = 10**306, 10**308
    mod = bs_paillier(lo, hi, s)
    print mod

if __name__ == "__main__":

    PORT = 12445
    HOST = 'asis-ctf.ir'

    get_mod_paillier()

The script took around 5 minutes to be finished, returning our n for the round:

17671943390317527594740575037779239788090749028363849573873871285525785364877468659238291287413782918855995881353189626069716161186805808731291508724925847487655603905895106750055611619881911787280882269077856999823769344599404478814635216943095238063240285592085964648122007040660676934950342692770738186633

We are ready to break this system!

Convoluting and Decrypting the Answer

We use the first homomorphic propriety to craft a message which will give us the flag:

def convolution(e1, e2, m2, mod):
    return (e1 * e2 )%(mod*mod)

This returns:

112280008116052186646368021111109237871341887479615992317427354059600212357857288108129264030737539972269011409996912208818076010168198529262326772763879996822102998582534816837233418191109543582310863266347760528961946352593880742472731062537568071764692096627743690922908221673060800965135164509962029222789740380393803936034322030637125612078147029021325858011668393839188326210425016990153989714767763564793169128967154011677910768035056556423954036535193602627342043345257626256977045594687342553610052190731351090959432138598081567054374046432685112246445454537701332996220698854386609070078979440083610540257

Sending it back with the decrypting option returns our (possible) flag minus 1 (remember, this is the message 2).

Getting the Flag!

Of all the possible options of decoding our flag, it was obvious that they were hexadecimal values that should be converted to ASCII.

It also helped the fact that I had in mind the ASCII representation of the word ASIS, which is given by 41 53 49 53:

$ python -c "print '0x41534953'[2:].decode('hex')"
ASIS

This is easier with the following script:

def print_hex(secret):
    # cutting L in the end
    a = hex(secret)[:-1]
    # cutting the \x symbol
    b = a[2:].decode('hex')
    return b

Running it, we get the hexadecimal form of our flag:

32487808320243150435316584796155571093777738593139558163862909500838275925645449950017590

And its ASCII decoding,

0x415349535f3835633966656264346331353935306162316631396136626437613934663836

Leads to the flag:

ASIS_85c9febd4c15950ab1f19a6bd7a94f87

Cool, right?

If you think so, all scripts I mentioned are here.

Hack all the things! (: