I had fun thinking up the stupidest data expansion protocol: AONT hiding mearly the parity of data. Want data to take up 8x as much space? make 8 random bits whose parity matches the bit you’re replacing. Unsure of the security of it? Loop it several times and take up A LOT of space.

This code uses the same ‘barely glanced at wikipedia’ knowledge of an All Or Nothing Transform, but with the added knowledge of what an AEAD mode like GCM is doing. Yeah, I’m happy with that.

Wonder what would happen if you combined this with an error correction algorithm or probabalistic encryption and then intentionally introduced errors?

Or… what if the amount of time you wanted to spend reading the data was no object? After all, it’s just meant to be incompressible nonsense.

Anywho, here’s the code. After it all, I’ll explain what I’m trying to do here (still slightly broken as I’m not holding bits in python 3 vs 2…)

#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import absolute_import
from __future__ import unicode_literals
from __future__ import print_function
import sys
from six.moves import zip, range
from Crypto.Hash import SHA256
from Crypto.Protocol.KDF import PBKDF2
from Crypto.Random import get_random_bytes
from Crypto.Cipher import AES

PY3 = True
if sys.version_info < (3, 0):
    PY3 = False

def parityOf(int_type): #bin()
    int_type = int.from_bytes(int_type, byteorder="big")
    parity = 0
    while (int_type):
        parity = ~parity
        int_type = int_type & (int_type - 1)
    if parity == -1:
        parity = 0
    if parity == 1:
        raise Exception('hello')
    return parity

def enlarge(data,ratio=8):
    # data = AE(data)
    result = b''
    for byte in data:
        print(byte)
        first = get_random_bytes(ratio-1)
        firstPar = parityOf(first)
        bytePar = parityOf(byte)
        last = 0
        if bytePar and not firstPar:
            last = 1
        elif not bytePar and firstPar:
            last = 1
        print('byte: "{}" parity:{}'.format(byte,bytePar))
        print('first: "{}", parity:{}'.format(first,firstPar))
        result = result + first
        result = result + int.to_bytes(last, 1, byteorder="big")
    return result
    # return AE(result)
def emit(data,ratio=8):
    n = max(1, ratio)
    result = b''
    for index in range(0, len(data),n):
        section = data[index:index+n]

        print(section)
        byte = parityOf(section)
        print('section: {}, retrived byte: {}'.format(section,byte))
        result = result + int.to_bytes(byte, 1, byteorder="big")
        print('result so far: {}'.format(result))
    return result

def AE(content):
    """
    generate a key using deriviation,
    although you won't need to remember
    that password if you have all the
    content.
    """
    key_raw = PBKDF2(
        get_random_bytes(32),
        salt=get_random_bytes(16),
        dkLen=32,  # 16,
        count=1000  # , #1000
        # prf=None
    )
    token = WA(key_raw).encrypt(content)
    """
    hash the token, then xor with the 32 bit key.
    concattenate token with xor'd key.
    """
    hashable = SHA256.new(token).digest()
    if PY3:  # py2 won't test
        chard = int.from_bytes(hashable, byteorder="big") ^ int.from_bytes(
            key_raw, byteorder="big")
        chard = chard.to_bytes(32, byteorder="big")
    else:
        chard = b"".join(
                [chr(ord(a) ^ ord(b)) for a, b in
                 zip(hashable, key_raw)])
    return token + chard


def AD(cyphertext):
    """
    The last 32 bits of the cyphertext are the xor of
    the hash of the preceeding cypher, and the 32 bit key.
    pulling that together into a base64 string allows
    AES to decrypt the content.
    """
    hashable = SHA256.new(cyphertext[:-32]).digest()
    key_xored = cyphertext[-32:]
    if PY3:  # py2 won't test
        key2 = int.from_bytes(hashable, byteorder="big") ^ int.from_bytes(
            key_xored, byteorder="big")
        key2 = key2.to_bytes(32, byteorder="big")
    else:
        key2 = b"".join(
            [chr(ord(a) ^ ord(b)) for a, b in
             zip(hashable, key_xored)])
    return WA(key2).decrypt(cyphertext[:-32])


class WA():
    secret = None

    def __init__(self, secret):
        self.secret = secret

    def encrypt(self, data):
        # padded = pad(data, 16, 'pkcs7')
        iv = get_random_bytes(AES.block_size)
        lock = AES.new(self.secret, AES.MODE_GCM, iv)
        return iv + lock.encrypt(data)

    def decrypt(self, data):
        iv = data[:AES.block_size]
        templock = AES.new(self.secret, AES.MODE_GCM, iv)
        return templock.decrypt(data[AES.block_size:])


data = enlarge(b'hello there, this is a message to you.',8)
print(data)
result = emit(data,8)
print(result)


cyp = AE(b"Secret Message!")
print("cyphertext is {}".format(str(cyp)))
plaintext = AD(cyp)
print("plaintext is: " + str(plaintext))

The “Parity Of” function is from S.O., which explains half my problems for not knowing what I’m after.

The “enlarge” function loops through each… well whatever unit python wants to be using. I should normalize it into an object where I know what a slice of it is… usually a byte.

Anywho, the method I chose was to get, say 7/8, compute parity, and then add the 1/8 on to it. One could simply make a parity and then chose between two of them, or add to one of them if rounding wasn’t a thing (yeah, it’s a thing).

Emit really only has to take your declared ratios (say, in a header), split the data into blocks, and compute parity.

Memories

Remember the last time I wrote about an AONT? Well, this one’s a little less dramatic.

From the Congredi days, I know we’re after a nice XOR-SHA-AES based transform, but in this case we’re grown-ups and we’ll use an AEAD-enabled system, GCM mode. Other than that, still the same.

Are there weaknesses to this sort of parity based expansion?

Well, time, space, stupid programmers like me.

If you flip one bit on that message, without flipping another one too, you’ve hurt the message. So… technically, after a fashion, the resulting “expanded” code, before it get’s AONT’d, can have any bits within an N length chunk flipped, so long as parity is preserved. That’s kinda cool.

If you’ve got a cleaner way to handle that darn parity calculation that’ll actually work smoothly, I’d like to know.