Wednesday, June 23, 2010

Reed Muller Code and Plotkin Composition

Reed Muller Code is an elegant code that can be used for error correction. Reed Muller Code RM(r,m) is constructed using monomials of m logical variables and maximum degree r. For example,

RM(0,m): 1
RM(1,2): 1 + x1 + x2
RM(1,3): 1 + x1 + x2 + x3
RM(2,2): 1 + x1 + x2 + x1 x2
RM(1,3): 1 + x1 + x2 + x3
RM(2,3): 1 + x1 + x2 + x3 + x1 x2 + x1 x3 + x2 x3
RM(3,3): 1 + x1 + x2 + x3 + x1 x2 + x1 x3 + x2 x3 + x1 x2 x3

The monomial for RM(0,m) consists of only value 1. Hence, the generator matrix for RM(0,m) = [1 1 ... 1] (m times), which is repetition code.

The generator matrix of RM(1,2) can be constructed by considering all combination of terms 1, x_1 and x_2:
1| 1111
x1| 1100
x2| 1010

We see that the generator matrix consists of 3 rows (input bits) and 4 columns (output bits). This means out of 2^4 possible code words, only 2^3 of them are valid. The code rate is then 3/4. The minimum Hamming distance of the code is 2.

As an example for those without background in information coding, if the input bits are x=[1 1 0]', then the output bits are Gx = [0 0 1 1]' in modulo 2 arithmetic. The following table lists all 16 possible 4-bit combination, and the valid code words in RM(1,2):

PatternValid?PatternValid?
0000Yes1000No
0001No1001Yes
0010No1010Yes
0011Yes1011No
0100No1100Yes
0101Yes1101No
0110Yes1110No
0111No1111Yes

The generator matrix of RM(2,2) can be constructed by considering all combination of terms 1, x_1, x_2 and x_1 x_2:
1| 1111
x1| 1100
x2| 1010
x1 x2| 1000

The four rows are linearly independent. Hence, the code spans the whole space {0,1}^4. In other words, RM(2,2) has rate 1, and every 4-bit sequence is a valid code word.

It is easy to see that, in general, RM(r,m) is composed of monomials that D = \sum_{i=0}^{r} nchoosek(m,i) number of terms, which gives a code rate of D/2^{m}. The minimum Hamming distance is 2^{m-r}.

Plotkin Composition
Plotkin observed that any monomial f(x1, x2, ... , xm) can be decomposed into

f(x1, x2, ... , xm) = g(x1, x2, ... , xm) + xi h(x1, x2, ... , xm)

where g(x1, x2, ... , xm) contains all the terms from f(x1, x2, ... , xm) that do not contain xi. After factorization, h(x1, x2, ... , xm) also does not contain any xi. In other words, both g(.) and h(.) have one less variable than f(.). Furthermore, g(.) has the same order has f(.), and h(.) has an order that is one less than f(.). That is, RM(r,m) can be constructed by RM(r,m-1) and RM(r-1,m-1).

Let u be a valid codeword in RM(r,m-1), and v be a valid codeword in RM(r-1,m-1), then a (u, u+v) is a valid codeword in RM(r,m), with reordering or indexes. Beautiful!

As an example, consider RM(2,3). Its generator matrix is
1| 11111111
x1| 111100
0
0
x2| 11
0
0110
0
x3| 10
10
1
0
1
0
x1 x2| 11
0
00
0
0
0
x1 x3| 101
00
0
0
0
x2 x3| 100010
0
0

We break down the 8-bit code into two 4-bit codes:

1 + x1 + x2 + x3 + x1 x2 + x1 x3 + x2 x3 = 1 + x2 + x3 + x2 x3 + x1 (1 + x2 + x3)

On the right hand side, in blue, the four non-zero vectors are exactly as same as those present in RM(2,2), 1 + x2 + x3 + x2 x3. On the right side, in red, the vectors spans the same codeword space as those spanned from addition of bases from RM(1,2), 1 + x2 + x3, and RM(2,2), 1 + x2 + x3 + x2 x3.

For example, consider a valid codeword in RM(2,3), [0 1 0 1 1 0 1 0]'. We know that u = [1 0 1 0]' and u+v = [0 1 0 1] '. It is easy to find v = u + (u+v) = [1 1 1 1]'. Indeed, u is in RM(1,2) and v is in RM(2,2).