Algorithmic Results in List Decoding
Abstract
Error-correcting codes are used to cope with the corruption of data
by noise during communication or storage. A code uses an encoding
procedure that judiciously introduces redundancy into the data to produce an associated codeword. The redundancy built into the codewords
enables one to decode the original data even from a somewhat distorted
version of the codeword. The central trade-off in coding theory is the
one between the data rate (amount of non-redundant information per
bit of codeword) and the error rate (the fraction of symbols that could
be corrupted while still enabling data recovery). The traditional decoding algorithms did as badly at correcting any error pattern as they
would do for the worst possible error pattern. This severely limited the
maximum fraction of errors those algorithms could tolerate. In turn,
this was the source of a big hiatus between the error-correction performance known for probabilistic noise models (pioneered by Shannon)
and what was thought to be the limit for the more powerful, worst-case
noise models (suggested by Hamming).
In the last decade or so, there has been much algorithmic progress
in coding theory that has bridged this gap (and in fact nearly eliminated it for codes over large alphabets). These developments rely on
an error-recovery model called list decoding, wherein for the pathological error patterns, the decoder is permitted to output a small list of
candidates that will include the original message. This book introduces
and motivates the problem of list decoding, and discusses the central
algorithmic results of the subject, culminating with the recent results
on achieving list decoding capacity.
Part I
General Literature
1
Introduction
1.1
Remark 1. [Arbitrarily varying channel]
The stochastic noise model assumes knowledge of the precise probability law governing the channel. The worst-case model takes a conservative, pessimistic view of the power of the channel, assuming only a limit on the total amount of noise. A hybrid model called the Arbitrarily Varying Channel (AVC) has also been proposed to study communication under channel uncertainty. Here the channel is modeled as a jammer which can select from a family of strategies (corresponding to different probability laws) and the sequence of selected strategies, and hence the channel law, is not known to the sender. The strategy can in general vary arbitrarily from symbol to symbol, and the goal is to do well against the worst possible sequence. A less powerful model is that of the compound channel, where the jammer has a choice of strategies, but the chosen channel law does not change during the transmission of the various symbols of the codeword. AVCs have been the subject of much research; the reader can find a good introduction to this topic as well as numerous pointers to the extensive literature in a survey by Lapidoth and Narayan [54]. To the author's understanding, it seems that much of the work has been of a non-constructive flavor, driven by the information-theoretic motivation of determining the capacity under different AVC variants. There has been less focus on explicit constructions of codes or related algorithmic issues.
1.2
claim holds with high probability for a random code drawn from a natural ensemble. In fact, the proof of Shannon's capacity theorem for q-ary symmetric channels can be viewed in this light. For Reed-Solomon codes, which will be our main focus later on, this claim has been shown to hold; see [19, 58, 59].
patterns one can hope to correct many more errors than the above
limit faced by the worst-case error pattern. However, since we assume
a worst-case noise model, we do have to deal with bad received words
such as r. List decoding provides an elegant formulation to deal with worst-case errors without compromising the performance for typical noise patterns: the idea is that, in the worst case, the decoder may output multiple answers. Formally, the decoder is required to output a list of all codewords that differ from the received word in a fraction p of symbols.
Certainly returning a small list of possibilities is better and more
useful than simply giving up and declaring a decoding failure. Even if
one deems receiving multiple answers as a decoding failure, as mentioned above, for many error patterns in the target noise range, the
decoder will output a unique answer, and we did not have to model
the channel stochastics to design our code or algorithm! It may also be
possible to pick the correct codeword from the list, in case of multiple
answers, using some semantic context or side information (see [23]).
Also, if the output list contains a unique closest codeword, we can output it as the maximum-likelihood choice. In general, list
decoding is a stronger error-recovery model than outputting just the
closest codeword(s), since we require that the decoder output all the
close codewords (and we can always prune the list as needed). For
several applications, such as concatenated code constructions and also
those in complexity theory, having the entire list adds more power
to the decoding primitive than deciding solely on the closest codeword(s).
Some other channel and decoding models. We now give pointers
to some other relaxed models where one can perform unique decoding
even when the number of errors exceeds half the minimum Hamming
distance between two codewords. We already mentioned one model
where an auxiliary channel can be used to send a small amount of side
information which can be used to disambiguate the list [23]. Another
model that allows one to identify the correct message with high probability is one where the sender and recipient share a secret random key,
see [53] and a simplified version in [67].
Finally, there has been work where the noisy channel is modeled
as a computationally bounded adversary (as opposed to an all-powerful
adversary), that must introduce the errors in time polynomial in the
block length. This is a very appealing model since it is a reasonable
hypothesis that natural processes can be implemented by efficient computation, and therefore real-world channels are, in fact, computationally bounded. The computationally bounded channel model was put
forth by Lipton [56]. Under standard cryptographic assumptions, it
has been shown that in the private key model where the sender and
recipient share a secret random seed, it is possible to decode correctly
from error rates higher than half-the-minimum-distance bound [21,48].
Recently, similar results were established in a much simpler cryptographic setting, assuming only that one-way functions exist, and that
the sender has a public key known to the receiver (and possibly to the
channel as well) [60].
1.3
The number of codewords within Hamming distance $pn$ of the worst-case received word $r$ is clearly a lower bound on the runtime of any list decoder that corrects a fraction $p$ of errors. Therefore, in order for a polynomial time list decoding algorithm to exist, the underlying codes must have the a priori combinatorial guarantee of being $p$-list-decodable, namely every Hamming ball of radius $pn$ has a small number, say $L(n)$, of codewords for some polynomially bounded function $L(\cdot)$. This packing constraint poses a combinatorial upper bound on the rate of the code; specifically, it is not hard to prove that we must have $R \le 1 - p$, or otherwise the worst-case list size will grow faster than any polynomial in the block length $n$.
Remarkably, this simple upper bound can actually be met. In other words, for every $p$, $0 < p < 1$, there exist codes of rate $R = 1 - p - o(1)$ which are $p$-list-decodable. That is, non-constructively we can show the existence of codes of rate $R$ that offer the potential of list decoding up to a fraction of errors approaching $(1 - R)$. We will refer to the quantity $(1 - R)$ as the list decoding capacity. Note that the list decoding
capacity is twice the fraction of errors that one could decode if we insisted on a unique answer always, quite a substantial gain! Since the
message has Rn symbols, information-theoretically we need at least a
fraction R of correct symbols at the receiving end to have any hope
of recovering the message. Note that this lower bound applies even if
we somehow knew the locations of the error and could discard those
misleading symbols. With list decoding, therefore, we can potentially
reach this information-theoretic limit and decode as long as we receive
slightly more than Rn correct symbols (the correct symbols can be
located arbitrarily in the received word, with arbitrary noise affecting
the remaining positions).
To realize this potential, however, we need an explicit description of such capacity-achieving list-decodable codes, and an efficient algorithm to perform list decoding up to the capacity (the combinatorics only guarantees that every Hamming ball of certain radius has a small number of codewords, but does not suggest any efficient algorithm to actually find those codewords). The main technical result in this survey will achieve precisely this objective: we will give explicit codes of rate $R$ with a polynomial time list decoding algorithm for a fraction $(1 - R - \varepsilon)$ of errors, for any desired $\varepsilon > 0$.
The above description was deliberately vague on the size of the alphabet $\Sigma$. The capacity $1 - R$ for codes of rate $R$ applies in the limit of large alphabet size. It is also of interest to ask how well list decoding performs for codes over a fixed alphabet size $q$. For the binary ($q = 2$) case, to correct a fraction $p$ of errors, list decoding offers the potential of communicating at rates up to $1 - H(p)$. This is exactly the capacity of the binary symmetric channel with crossover probability $p$ that we discussed earlier. With list decoding, therefore, we can deal with worst-case errors without any loss in rate. For binary codes, this remains a non-constructive result, and constructing explicit codes that achieve list decoding capacity remains a challenging goal.
1.4
List decoding was proposed in the late 1950s by Elias [13] and Wozencraft [78]. Curiously, the original motivation in [13] for formulating list decoding was to prove matching upper and lower bounds on
1.5
This survey is devoted to algorithmic results in list decoding. The main technical focus will be on giving a complete presentation of the recent algebraic results achieving list decoding capacity. We will only provide pointers or brief descriptions for other works on list decoding.
The survey is divided into two parts. The first part (Chapters 1–5) covers the general literature, and the second part focuses on achieving list decoding capacity. The author's Ph.D. dissertation [28] provides a more comprehensive treatment of list decoding. In comparison with [28], most of Chapter 5 and the entire Part II of this survey discuss material developed since [28].
We now briefly discuss the main technical contents of the various chapters. The basic terminology and definitions are described in Chapter 2. Combinatorial results which identify the potential of list decoding in an existential, non-constructive sense are presented in Chapter 3. In particular, these results will establish the capacity of list decoding (over large alphabets) to be $1 - R$. We begin the quest for explicitly and algorithmically realizing the potential of list decoding in Chapter 4, which discusses a list decoding algorithm for Reed-Solomon (RS) codes; the algorithm is based on bivariate polynomial interpolation. We conclude the first part with a brief discussion in Chapter 5 of algorithmic results for list decoding certain codes based on expander graphs.

In Chapter 6, we discuss folded Reed-Solomon codes, which are RS codes viewed as a code over a larger alphabet. We present a decoding algorithm for folded RS codes that uses multivariate interpolation plus some other algebraic ideas concerning finite fields. This lets us approach list decoding capacity. Folded RS codes are defined over a polynomially large alphabet, and in Chapter 7 we discuss techniques that let us bring down the alphabet size to a constant independent of the block length. We conclude with some notable open questions in Chapter 8.
2
Definitions and Terminology
2.1
2.2
2.3
construct codes over smaller alphabets starting with codes over a larger alphabet.
The basic idea behind code concatenation is to combine two codes,
an outer code Cout over a larger alphabet (of size Q, say), and an inner
code Cin with Q codewords over a smaller alphabet (of size q, say),
to get a combined q-ary code that, loosely speaking, inherits the good
features of both the outer and inner codes. These were introduced by
Forney [15] in a classic work. The basic idea is very natural: to encode a
message using the concatenated code, we rst encode it using Cout , and
then in turn encode each of the resulting symbols into the corresponding
codeword of Cin. Since there are Q codewords in Cin, the encoding procedure is well defined. Note that the rate of the concatenated code
is the product of the rates of the outer and inner codes.
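In code, the encoding map looks as follows; this is a small illustrative sketch in which `encode_outer` and `encode_inner` are hypothetical stand-ins for the encoders of Cout and Cin:

```python
# Sketch of concatenated encoding. encode_outer and encode_inner are
# hypothetical encoders for the outer code Cout and inner code Cin.
def encode_concatenated(message, encode_outer, encode_inner):
    outer_codeword = encode_outer(message)  # N symbols over an alphabet of size Q
    # Encode each outer symbol with the inner code; the final q-ary codeword
    # is the juxtaposition of the inner codewords.
    return [sym for outer_symbol in outer_codeword
            for sym in encode_inner(outer_symbol)]
```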
The big advantage of concatenated codes for us is that we can get
a good list decodable code over a small alphabet (say, binary codes)
based on a good list decodable outer code over a large alphabet (like
a ReedSolomon code) and a suitable binary inner code. The block
length of the inner code is small enough to permit a brute-force search
for a good code in reasonable time.
Code concatenation works rather naturally in conjunction with list
recovering of the outer code to give algorithmic results for list decoding.
The received word for the concatenated code is broken into blocks corresponding to the inner encodings of the various outer symbols. These
blocks are list decoded, using a brute-force inner decoder, to produce
a small set of candidates for each symbol of the outer codeword. These
sets can then be used as input to a list recovering algorithm for the
outer code to complete the decoding. It is not difficult to prove the following based on the above algorithm:

Lemma 2.1. If the outer code is $(p_1, \ell, L)$-list-recoverable and the inner code is $(p_2, \ell)$-list-decodable, then the concatenated code is $(p_1 p_2, L)$-list-decodable.
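In code, the decoding flow behind Lemma 2.1 can be sketched as follows, with hypothetical decoders `list_decode_inner` (brute-force list decoding of one inner block, returning at most $\ell$ candidate outer symbols) and `list_recover_outer` (a list recovering algorithm for the outer code):

```python
# Sketch of decoding a concatenated code via list recovering.
def decode_concatenated(received, inner_len, list_decode_inner, list_recover_outer):
    # Break the received word into blocks, one per outer codeword position.
    blocks = [received[i:i + inner_len]
              for i in range(0, len(received), inner_len)]
    # Inner stage: a small candidate set of outer symbols for each position.
    candidate_sets = [list_decode_inner(block) for block in blocks]
    # Outer stage: find all outer codewords consistent with most of the sets.
    return list_recover_outer(candidate_sets)
```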
Code concatenation and list recovering will be important tools for
us in Chapter 7 where we will construct codes approaching capacity
over a fixed alphabet.
2.4
We recap basic facts and notation concerning finite fields. For any prime $p$, the set of integers modulo $p$ forms a field, which we denote by $\mathbb{F}_p$.

The ring of univariate polynomials in variable $X$ with coefficients from a field $\mathbb{F}$ is denoted by $\mathbb{F}[X]$. A polynomial $f(X)$ is said to be irreducible over $\mathbb{F}$ if $f(X) = r(X)s(X)$ for $r(X), s(X) \in \mathbb{F}[X]$ implies that either $r(X)$ or $s(X)$ is a constant polynomial. A polynomial is said to be monic if its leading coefficient is 1. The ring $\mathbb{F}[X]$ has unique factorization: every monic polynomial can be written uniquely as a product of monic irreducible polynomials.

If $h(X)$ is an irreducible polynomial of degree $e$ over $\mathbb{F}$, then the quotient ring $\mathbb{F}[X]/(h(X))$, consisting of polynomials modulo $h(X)$, is a finite field with $|\mathbb{F}|^e$ elements (just as $\mathbb{F}_p = \mathbb{Z}/(p)$ is a field, where $\mathbb{Z}$ is the ring of integers and $p$ is a prime). The field $\mathbb{F}[X]/(h(X))$ is called an extension field of degree $e$ over $\mathbb{F}$; the extension field also forms a vector space of dimension $e$ over $\mathbb{F}$.

The prime fields $\mathbb{F}_p$, and their extensions as defined above, yield all finite fields. The size of a finite field is thus always a prime power. The characteristic of a finite field equals $p$ if it is an extension of the prime field $\mathbb{F}_p$. Conversely, for every prime power $q$, there is a unique (up to isomorphism) finite field $\mathbb{F}_q$. We denote by $\mathbb{F}_q^*$ the set of nonzero elements of $\mathbb{F}_q$. It is known that $\mathbb{F}_q^*$ is a cyclic group (under the multiplication operation), generated by some $\gamma \in \mathbb{F}_q^*$, so that $\mathbb{F}_q^* = \{1, \gamma, \gamma^2, \ldots, \gamma^{q-2}\}$. Any such $\gamma$ is called a primitive element. There are in fact $\varphi(q-1)$ such primitive elements, where $\varphi(q-1)$ is the number of positive integers less than $q-1$ that are relatively prime to $q-1$.

We owe a lot to the following basic property of fields: Let $f(X) \in \mathbb{F}[X]$ be a nonzero polynomial of degree $d$. Then $f(X)$ has at most $d$ roots in $\mathbb{F}$.
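As a toy illustration of the quotient construction above, the following self-contained Python sketch builds $\mathbb{F}_4 = \mathbb{F}_2[X]/(X^2 + X + 1)$ and checks that the element $X$ is primitive; the representation of elements as coefficient lists is of course just one possible choice:

```python
# Arithmetic in F_p[X]/(h(X)); polynomials are coefficient lists, low degree first.
P = 2
H = [1, 1, 1]  # h(X) = 1 + X + X^2, irreducible over F_2

def poly_mul_mod(a, b):
    """Multiply two polynomials over F_P and reduce modulo h(X)."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % P
    while len(prod) >= len(H):            # reduce degree below deg(h)
        lead = prod.pop()
        if lead:
            shift = len(prod) - (len(H) - 1)
            for i, hi in enumerate(H[:-1]):
                prod[shift + i] = (prod[shift + i] - lead * hi) % P
    return prod

x = [0, 1]                 # the element X
powers = [[1]]             # X^0
for _ in range(2):
    powers.append(poly_mul_mod(powers[-1], x))
print(powers)  # [[1], [0, 1], [1, 1]]: all three nonzero elements, so X is primitive
```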
3
Combinatorics of List Decoding
In this chapter, we prove combinatorial results concerning list-decodable codes, and study the relation between the list decodability of a code and its other basic parameters such as minimum distance and rate. We will show that every code can be list decoded using small lists beyond half its minimum distance, up to a bound we call the Johnson radius. We will also prove existential results for codes that will highlight the sort of rate vs. list decoding radius trade-off one can hope for. Specifically, we will prove the existence of (p, L)-list-decodable codes of good rate and thereby pinpoint the capacity of list decoding. This then sets the stage for the goal of realizing or coming close to these trade-offs with explicit codes, as well as designing efficient algorithms for decoding up to the appropriate list decoding radius.
3.1
If a code has distance d, every Hamming ball of radius less than d/2 has at most one codeword. The list decoding radius for a list size of 1 thus equals $\lfloor (d-1)/2 \rfloor$. Could it be that already at radius slightly greater than d/2 we can have a large number of codewords within some Hamming
ball? The following theorem, a version of the classical Johnson bound, shows that this cannot happen until the radius is much larger than d/2. (Here $\Delta(\cdot,\cdot)$ denotes Hamming distance.)

Theorem 3.1. Let C be a q-ary code of block length n and minimum distance d. Let $r$ be a received word, and let $B$ be any subset of codewords of C. Define
$$\bar{d} = \mathop{\mathbb{E}}_{\{x,y\} \in \binom{B}{2}}[\Delta(x,y)] \qquad \text{and} \qquad \bar{e} = \mathop{\mathbb{E}}_{x \in B}[\Delta(r,x)].$$
Then,
$$|B| \;\le\; \frac{\frac{q}{q-1} \cdot \frac{\bar{d}}{n}}{\left(1 - \frac{q}{q-1} \cdot \frac{\bar{e}}{n}\right)^2 - \left(1 - \frac{q}{q-1} \cdot \frac{\bar{d}}{n}\right)}\,,$$
provided the denominator is positive.

The bound $n\left(1 - \frac{1}{q}\right)\left(1 - \sqrt{1 - \frac{q}{q-1} \cdot \frac{d}{n}}\right)$ is called the (q-ary) Johnson radius, and in every q-ary code of relative distance d/n, every Hamming ball of this radius is guaranteed to have few codewords.
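To get a quantitative feel for this bound, the following small script evaluates the q-ary Johnson radius for a few relative distances and compares it with the unique-decoding radius d/(2n) (an illustrative aid only):

```python
from math import sqrt

def johnson_radius(delta, q):
    """Fractional q-ary Johnson radius for relative distance delta = d/n."""
    return (1 - 1.0 / q) * (1 - sqrt(1 - (q / (q - 1.0)) * delta))

for delta in (0.1, 0.3, 0.5):
    print(delta / 2,                       # unique decoding radius
          johnson_radius(delta, q=2),      # binary Johnson radius
          johnson_radius(delta, q=256))    # large-alphabet Johnson radius
```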
Proof. (of Theorem 3.1) To keep the notation simple, we will assume that the alphabet is $\{0, 1, \ldots, q-1\}$ and that $r = 0^n$. Write $e = \bar{e}$ and $M = |B|$. Pick distinct codewords $x = (x_1, \ldots, x_n), y = (y_1, \ldots, y_n) \in B$ uniformly at random. We will obtain a lower bound (in terms of $e$ and $M$) on the expected number of coordinates where $x$ and $y$ agree. We know that this expectation is $n - \bar{d}$. The theorem will follow by comparing these two quantities.
For $i \in [n]$ and $\alpha \in [q]$, let $k_i(\alpha) = |\{x \in B : x_i = \alpha\}|$, so that $\sum_{\alpha \in [q]} k_i(\alpha) = M$ for each $i$. The number of (unordered) pairs of codewords of $B$ that agree in the $i$-th coordinate equals
$$\sum_{\alpha \in [q]} \binom{k_i(\alpha)}{2} = \binom{k_i(0)}{2} + \sum_{\alpha=1}^{q-1} \binom{k_i(\alpha)}{2} \;\ge\; \binom{k_i(0)}{2} + (q-1)\binom{\frac{M - k_i(0)}{q-1}}{2},$$
where the inequality follows from the convexity of the function $\binom{z}{2}$. Summing this over all coordinates $i \in [n]$, and using convexity once more together with $\sum_{i=1}^{n} k_i(0) = M(n - e)$, the total number of agreements, summed over all coordinates and all pairs, is at least
$$n\left[\binom{\bar{k}}{2} + (q-1)\binom{\frac{M-\bar{k}}{q-1}}{2}\right], \qquad \text{where } \bar{k} = \frac{M(n-e)}{n}.$$
On the other hand, this total equals $\binom{M}{2}(n - \bar{d}) \le \binom{M}{2}(n - d)$, since every pair of distinct codewords is at distance at least $d$. Comparing the two bounds and simplifying yields
$$M \;\le\; \frac{d/n}{\frac{q}{q-1} \cdot \frac{e^2}{n^2} - \frac{2e}{n} + \frac{d}{n}} \;\le\; \frac{nd}{(n-e)^2 - n(n-d)}$$
whenever the denominators are positive. The first expression is, after multiplying the numerator and denominator by $\frac{q}{q-1}$, the bound claimed in the theorem (with $d$ and $e$ in place of $\bar{d}$ and $\bar{e}$); the second, weaker form is independent of $q$.
3.2
We now turn to determining the trade-off between the rate of a code and its list decoding radius, i.e., the capacity of list decoding. Throughout, $q$ will be the alphabet size and $p$ the fractional error-correction radius. Let $B_q(0, pn)$ denote the q-ary Hamming ball of radius $pn$ centered at $0^n$, so that
$$|B_q(0, pn)| = \sum_{i=1}^{pn} \binom{n}{i} (q-1)^i. \tag{3.2}$$
It is known, see for example [75, Chapter 1], that for $0 < p < 1 - 1/q$ and growing $n$,
$$q^{H_q(p)n - o(n)} \le |B_q(0, pn)| \le q^{H_q(p)n}. \tag{3.3}$$
be fixed later in the proof. We will show that with high probability the resulting code will be $(p, L)$-list-decodable.

The probability that a fixed set of $(L+1)$ codewords all lie in a fixed Hamming sphere (in the space $[q]^n$) of radius $pn$ equals $(|B_q(0, pn)|/q^n)^{L+1}$. By (3.3), this probability is at most $q^{-(L+1)(1 - H_q(p))n}$. Therefore, by a union bound, the probability that some tuple of $(L+1)$ codewords all lie in some Hamming ball of radius $pn$ is at most
$$\binom{M}{L+1} \cdot q^n \cdot q^{-(L+1)(1 - H_q(p))n}.$$
If $M = q^{rn}$ for $r = 1 - H_q(p) - 1/(L+1)$, then the above probability is at most $1/(L+1)! < 1/3$. Also, among the $M$ chosen codewords there are at least $M/2$ distinct codewords with probability at least $1/2$. Hence, there exists a $(p, L)$-list-decodable code with $q^{(1 - H_q(p) - 1/(L+1))n}/2 \ge q^{(1 - H_q(p) - 1/L)n}$ distinct codewords, or in other words with rate at least $1 - H_q(p) - 1/L$, as claimed.
Remark 2. Using a variant of the above random coding argument together with expurgation, the rate lower bound can be improved slightly to $1 - H_q(p)(1 + 1/L)$; see [28, Theorem 5.5] for details.
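The rates appearing above are easy to evaluate numerically. The snippet below computes $H_q(p)$ from its standard formula and the two rate lower bounds just discussed (an illustrative aid only):

```python
from math import log

def Hq(p, q):
    """q-ary entropy: p*log_q(q-1) - p*log_q(p) - (1-p)*log_q(1-p)."""
    return p * log(q - 1, q) - p * log(p, q) - (1 - p) * log(1 - p, q)

p, q, L = 0.25, 2, 10
print(1 - Hq(p, q) - 1.0 / L)        # rate from the random coding argument
print(1 - Hq(p, q) * (1 + 1.0 / L))  # slightly better rate via expurgation
```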
The above two results imply that, over alphabets of size $q$, the optimal rate possible for list decoding to radius $p$ is $1 - H_q(p)$. The proof of Theorem 3.5 is non-constructive, and the big challenge is to construct an explicit code with rate close to capacity. It turns out one can also approach capacity using a linear code, as the following result, which first appeared implicitly in the work of Zyablov and Pinsker [79], shows. This result is also non-constructive and is proved by picking a random linear code of the stated rate. We skip the proof here; the reader may find the detailed proof in [28, Theorem 5.6]. The key point is that in any $L$-tuple of nonzero messages, there must be at least $\log_q(L+1)$ linearly independent messages, and these are mapped to completely independent codewords by a random linear code.
This yields $(p, L)$-list-decodable linear codes of rate at least
$$1 - H_q(p) - \frac{1}{\log_q(L+1)}.$$
Recall that the $q$-ary entropy function satisfies
$$1 - H_q(p) = 1 - p\log_q(q-1) - \frac{H(p)}{\log q},$$
where $H(x)$ is the binary entropy function and $\log x$ denotes logarithm to the base 2. In the limit of large $q$, the list decoding capacity thus approaches $1 - p$. In other words, we can list decode a code of rate $R$ up to a fraction of errors approaching $1 - R$, as mentioned in Chapter 1. Since $1 - H_q(p) \ge 1 - p - \frac{1}{\log q}$, we can get within $\varepsilon$ of capacity, i.e., achieve a rate of $1 - p - \varepsilon$ for list decoding up to a fraction $p$ of errors,
4
Decoding Reed-Solomon Codes
4.1
We begin with a simple setting, first considered in [5], that will motivate
the main approach underlying the list decoding algorithm. Under list
decoding, if we get a received word that has two closest codewords,
we need to return both those codewords. In this section, we develop
an algorithm that, given received words that are formed by mixing
up two codewords, recovers those codewords. In a sense, this is the
simplest form in which a list decoding style question arises.
4.1.1
distinct field elements. Let us assume $k < n/2$ for this section. Suppose we are given two unknown codewords, corresponding to two unknown polynomials $p_1(X)$ and $p_2(X)$, in a scrambled fashion as follows: For each $i = 1, 2, \ldots, n$, we are given a pair $(a_i, b_i)$ such that either $a_i = p_1(\alpha_i)$ and $b_i = p_2(\alpha_i)$, or $a_i = p_2(\alpha_i)$ and $b_i = p_1(\alpha_i)$. The goal is to recover the polynomials $p_1(X)$ and $p_2(X)$ from this data.

For each $i$, since we know both $p_1(\alpha_i)$ and $p_2(\alpha_i)$ in some order, we can compute $p_1(\alpha_i) + p_2(\alpha_i) = a_i + b_i$, as well as $p_1(\alpha_i)p_2(\alpha_i) = a_i b_i$.
We now consider a related question, where for each codeword position $i$, we are given a value $y_i$ which equals either $p_1(\alpha_i)$ or $p_2(\alpha_i)$ (and we do not know which value we are given). Given such a mixture of two codewords, our goal is to identify $p_1(X)$ and $p_2(X)$. Now clearly, we may only be given the value $p_1(\alpha_i)$ for all $i$, and in this case we have no information about $p_2(X)$. Under the assumption $k < n/6$, the algorithm below will identify both the polynomials if they are both well represented in the following sense: for both $j = 1, 2$, we have $p_j(\alpha_i) = y_i$ for at least $1/3$ of the $\alpha_i$'s.

The following simple observation sets the stage for the algorithm: For each $i \in [n]$, $(y_i - p_1(\alpha_i))(y_i - p_2(\alpha_i)) = 0$. In other words, the polynomial $Q(X, Y) = (Y - p_1(X))(Y - p_2(X))$ satisfies $Q(\alpha_i, y_i) = 0$ for every $i \in [n]$.
Since we seek a polynomial $Q(X, Y)$ that vanishes at all the pairs $(\alpha_i, y_i)$, we can require that $Q(X, Y)$ be of the form
$$Q(X, Y) = Y^2 + Y \sum_{j=0}^{k} q_{1j} X^j + \sum_{j=0}^{2k} q_{2j} X^j. \tag{4.1}$$
One can show that if $p(X)$ is a polynomial of degree at most $k$ such that $p(\alpha_i) = y_i$ for at least $n/3$ values of $i \in [n]$, then $Q(X, p(X)) \equiv 0$,
4.2
We seek a nonzero bivariate polynomial $Q(X, Y)$ of $(1, k)$-weighted degree at most $D$, i.e., of the form
$$Q(X, Y) = \sum_{j=0}^{\lfloor D/k \rfloor} \sum_{\ell=0}^{D - jk} q_{\ell j} X^{\ell} Y^{j}. \tag{4.2}$$
The conditions $Q(\alpha_i, y_i) = 0$ for all $i \in [n]$ give a system of $n$ homogeneous linear equations in the unknowns $q_{\ell j}$. A nonzero solution to this system is guaranteed to exist if the number of unknowns, say $U$, exceeds the number $n$ of equations (and if a nonzero solution exists, we can clearly find one in polynomial time by solving the above linear system). We turn to estimating $U$:
$$U = \sum_{j=0}^{\lfloor D/k \rfloor} \sum_{\ell=0}^{D - jk} 1 = \sum_{j=0}^{\lfloor D/k \rfloor} (D - jk + 1) = \left(\left\lfloor \frac{D}{k} \right\rfloor + 1\right)(D+1) - \frac{k}{2}\left\lfloor \frac{D}{k} \right\rfloor\left(\left\lfloor \frac{D}{k} \right\rfloor + 1\right) \;\ge\; \frac{(D+1)(D+2)}{2k}.$$
Thus if $\binom{D+2}{2} > kn$, we have $U > n$, and a nonzero solution exists to the linear system.
The above discussion motivates the following algorithm for polynomial reconstruction:

Step 1: (Interpolation) Let $D = \lfloor \sqrt{2kn} \rfloor$. Find a nonzero polynomial $Q(X, Y)$ of $(1, k)$-weighted degree at most $D$ satisfying $Q(\alpha_i, y_i) = 0$ for $i = 1, 2, \ldots, n$. (Lemma 4.4 guarantees the success of this step.)

Step 2: (Root finding/Factorization) Find all degree $\le k$ polynomials $p(X)$ such that $Q(X, p(X)) \equiv 0$. For each such polynomial, check if $p(\alpha_i) = y_i$ for at least $t$ values of $i \in [n]$, and if so, include $p(X)$ in the output list. (Lemma 4.3 guarantees that this step will find all relevant polynomials with agreement $t > D = \lfloor \sqrt{2kn} \rfloor$.)
The following records the performance of this algorithm. The claim about the output list size follows since the number of factors $Y - p(X)$ of $Q(X, Y)$ is at most the degree of $Q(X, Y)$ in $Y$, which is at most $D/k$.

Theorem 4.5. The above algorithm solves the polynomial reconstruction problem in polynomial time if the agreement parameter $t$ satisfies $t > \sqrt{2kn}$. The size of the list output by the algorithm never exceeds $\sqrt{2n/k}$.
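As a toy illustration of Theorem 4.5, the following self-contained Python sketch implements both steps over the small field $\mathbb{F}_{13}$ with $k = 1$, using Gaussian elimination mod p for the interpolation step and brute-force search over all candidate polynomials in place of a proper root-finding algorithm:

```python
# Toy end-to-end implementation of the two-step decoder over F_13
# (an illustrative sketch: k = 1, brute-force root finding in Step 2).
P = 13                       # small prime field F_P
K = 1                        # decode degree <= K message polynomials
ALPHAS = list(range(1, P))   # n = 12 evaluation points

def null_space_vector(rows, ncols):
    """Gaussian elimination mod P; returns a nonzero kernel vector."""
    pivot_rows = {}                      # pivot column -> reduced row
    for r in rows:
        r = r[:]
        for c in range(ncols):
            if r[c] and c in pivot_rows:
                piv = pivot_rows[c]
                f = r[c] * pow(piv[c], P - 2, P) % P
                r = [(rj - f * pj) % P for rj, pj in zip(r, piv)]
        lead = next((c for c in range(ncols) if r[c]), None)
        if lead is not None:
            pivot_rows[lead] = r
    free = next(c for c in range(ncols) if c not in pivot_rows)
    v = [0] * ncols
    v[free] = 1                          # set one free variable to 1
    for c in sorted(pivot_rows, reverse=True):
        piv = pivot_rows[c]
        s = sum(piv[j] * v[j] for j in range(c + 1, ncols)) % P
        v[c] = (-s) * pow(piv[c], P - 2, P) % P
    return v

def decode(points, t):
    n = len(points)
    D = int((2 * K * n) ** 0.5)          # D = floor(sqrt(2kn))
    # Step 1: interpolate a nonzero Q(X,Y) of (1,K)-weighted degree <= D.
    monomials = [(a, b) for b in range(D // K + 1) for a in range(D - K * b + 1)]
    rows = [[pow(x, a, P) * pow(y, b, P) % P for (a, b) in monomials]
            for (x, y) in points]
    Q = list(zip(monomials, null_space_vector(rows, len(monomials))))
    # Step 2: brute force over all degree <= 1 polynomials p(X) = c0 + c1*X.
    out = []
    for c0 in range(P):
        for c1 in range(P):
            # Q(X, p(X)) has degree <= D < P, so vanishing on all of F_P
            # means it is identically zero.
            if all(sum(co * pow(x, a, P) * pow((c0 + c1 * x) % P, b, P)
                       for (a, b), co in Q) % P == 0 for x in range(P)):
                if sum(1 for (x, y) in points if (c0 + c1 * x) % P == y) >= t:
                    out.append((c0, c1))
    return out

# A received word mixing two codewords, each agreeing on 6 of n = 12 points:
f = lambda x: (2 + 3 * x) % P
g = lambda x: (5 + 7 * x) % P
points = [(a, f(a) if a % 2 else g(a)) for a in ALPHAS]
print(decode(points, t=5))   # both (2, 3) and (5, 7) appear in the list
```

In this example the received word mixes two polynomials that each agree with it on $6 > \sqrt{2 \cdot 1 \cdot 12} \approx 4.9$ positions, so both survive the agreement check and are returned, as Theorem 4.5 promises.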
The above implies that a Reed-Solomon code of rate $R$ can be list decoded to a fraction $1 - \sqrt{2R}$ of errors using lists of size $O(\sqrt{1/R})$. In particular, for low-rate codes, we can efficiently correct close to 100% of errors! This qualitative feature is extremely useful in many areas of complexity theory, such as constructions of hardcore predicates from one-way functions, randomness extractors and pseudorandom generators, hardness amplification, transformations from worst-case to average-case hardness, etc.; we point to the surveys [29, 70, 74] and [28, Chapter 12] for further information.
4.3
4.3.1
Fig. 4.1 Example 1: The set of 14 input points. We assume that the center-most point is
the origin and assume a suitable scaling of the other points.
Fig. 4.2 A degree-4 fit through the 14 points. The curve is given by the equation $Y^4 - X^4 - Y^2 + X^2 = 0$.
Fig. 4.4 The five lines that pass through at least 4 of the 10 points.
4.3.2
We now generalize Sudan's decoding algorithm by allowing for multiplicities at the interpolation points. This generalization is due to Guruswami and Sudan [40].
The coefficients of $Q^{(i)}(X, Y) = Q(X + \alpha_i, Y + y_i)$ can clearly be expressed as linear combinations of the coefficients $q_{\ell j}$ of $Q(X, Y)$ (where $Q(X, Y)$ is expressed as in (4.2)). Thus the condition that $Q(X, Y)$ has a zero of multiplicity $r$ at $(\alpha_i, y_i)$ can be expressed as a system of $\binom{r+1}{2}$ homogeneous linear equations in the unknowns $q_{\ell j}$, one equation for each pair $(j_1, j_2)$ of nonnegative integers with $j_1 + j_2 < r$. In all, for all $n$ pairs $(\alpha_i, y_i)$, we get $n\binom{r+1}{2}$ homogeneous linear equations. The rest of the argument follows the proof of Lemma 4.4; the only change is that $n\binom{r+1}{2}$ replaces $n$ as the number of equations.
The above discussion motivates the following algorithm using interpolation with multiplicities for polynomial reconstruction (the parameter $r \ge 1$ equals the multiplicity used at each point):

Step 1: (Interpolation) Let $D = \lfloor \sqrt{knr(r+1)} \rfloor$. Find a nonzero polynomial $Q(X, Y)$ of $(1, k)$-weighted degree at most $D$ such that $Q(X, Y)$ has a zero of multiplicity $r$ at $(\alpha_i, y_i)$ for each $i = 1, 2, \ldots, n$. (Lemma 4.7 guarantees the success of this step.)

Step 2: (Root finding/Factorization) Find all degree $\le k$ polynomials $p(X)$ such that $Q(X, p(X)) \equiv 0$. For each such polynomial, check if $p(\alpha_i) = y_i$ for at least $t$ values of $i \in [n]$, and if so, include $p(X)$ in the output list. (Lemma 4.6 guarantees that this step will find all relevant polynomials with agreement $t > D/r$.)
The following records the performance of this algorithm. Again, the size of the output list never exceeds the degree of $Q(X, Y)$ in $Y$, which is at most $D/k$.
Note that for every rate $R$, $0 < R < 1$, the decoding radius $1 - \sqrt{R}$ exceeds the best decoding radius $(1 - R)/2$ that we can hope for with unique decoding.
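For a quick numerical sense of the gains, the following snippet tabulates the three radii (illustrative only):

```python
from math import sqrt

for R in (0.1, 0.25, 0.5, 0.75):
    unique = (1 - R) / 2                 # half-the-distance unique decoding
    sudan = max(1 - sqrt(2 * R), 0.0)    # Theorem 4.5, no multiplicities
    gs = 1 - sqrt(R)                     # with multiplicities
    print(R, round(unique, 3), round(sudan, 3), round(gs, 3))
```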
1 Let $x = \sqrt{knr(r+1)}$. The condition $t > D/r$ with $D = \lfloor x \rfloor$ is implied by $t > \lfloor x/r \rfloor$, since $\lfloor x \rfloor / r \le x/r < \lfloor x/r \rfloor + 1$.
4.4
4.4.1
Soft-decision decoding
List recovering dealt with the case when for each position we had a
set of more than one candidate symbols. More generally, we could be
given a weight for each of the candidates, with the goal being to find all codewords with good weighted agreement, summed over all positions. The weight for position $i$ and symbol $\alpha$ would presumably be a measure of the confidence of $\alpha$ being the $i$-th symbol of the
actual codeword that was transmitted. Making use of such weights in
the decoding is called soft-decision decoding (the weights constitute
soft information). Note that list recovering is just a special case when
for each position the weights for some symbols equal 1 and the rest
equal 0. Soft-decision decoding is important in practice as the field elements corresponding to each position are obtained by some sort of demodulation of real-valued signals, and soft-decision decoding can retain more of the information from this process compared with hard decoding, which loses a lot of information by quantizing the signal to a
single symbol. It is also useful in decoding concatenated codes, where
the inner decoder can provide weights along with the choices it outputs,
which can then be used by a soft-decision decoder for the outer code.
As mentioned in [40], the multiplicity based interpolation lends itself naturally to a soft-decision version, since the multiplicity required at a point can encode the importance of that point. Given weights $w_{i,\alpha}$ for positions $i \in [n]$ and field elements $\alpha \in \mathbb{F}$, we set the multiplicity of the point $(\alpha_i, \alpha)$ to be proportional to $w_{i,\alpha}$. This leads to the following claim, which is explicit for instance in [28, Chapter 6]:
Theorem 4.11. (Soft-decision decoding of RS codes) Consider a Reed-Solomon code of block length $n$ and dimension $k+1$ over a field $\mathbb{F}$. Let $\alpha_1, \ldots, \alpha_n \in \mathbb{F}$ be the evaluation points used for the encoding. Let $\varepsilon > 0$ be an arbitrary constant. For each $i \in [n]$ and $\alpha \in \mathbb{F}$, let $w_{i,\alpha}$ be a non-negative rational number. Then, there exists a deterministic algorithm with runtime $\mathrm{poly}(n, |\mathbb{F}|, 1/\varepsilon)$ that, when given as input the weights $w_{i,\alpha}$ for $i \in [n]$ and $\alpha \in \mathbb{F}$, finds a list of all polynomials $p(X) \in \mathbb{F}[X]$ of degree at most $k$ that satisfy
$$\sum_{i=1}^{n} w_{i, p(\alpha_i)} \;\ge\; \sqrt{k \sum_{i=1}^{n} \sum_{\alpha \in \mathbb{F}} w_{i,\alpha}^2} \;+\; \varepsilon \max_{i,\alpha} w_{i,\alpha}. \tag{4.4}$$
Koetter and Vardy [51] developed a front end that chooses weights
that are optimal in a certain sense as inputs to the above algorithm,
based on the channel observations and the channel transition probability matrix. This yields a soft-decision decoding algorithm for RS codes that has led to substantial improvements in practice.
4.5
Deterministic algorithm
by $h(X)$ till we are left with $Q_0(X, Y)$ that is not divisible by $h(X)$. Then, viewing $Q_0(X, Y)$ as a polynomial in $Y$ with coefficients in $\mathbb{F}_q[X]$, reduce the coefficients modulo $h(X)$ to get a nonzero univariate polynomial $T(Y) \in \tilde{\mathbb{F}}[Y]$, where $\tilde{\mathbb{F}}$ is the extension field $\mathbb{F}_q[X]/(h(X))$ of degree $q-1$ over $\mathbb{F}_q$. The desired polynomials $f(X)$ now all occur amongst the roots of $T(Y)$ that lie in $\tilde{\mathbb{F}}$, all of which can be found using Berlekamp's algorithm in time polynomial in $\deg(T)$ and $q$.

We will make even more crucial use of polynomials over the extension field $\mathbb{F}_q[X]/(h(X))$ in Chapter 6.
4.6
5
Graph-Based List-Decodable Codes
5.1
The notion of list recovering (recall Definition 2.2) plays a crucial role in the graph-based constructions. The first step is to reduce list decoding to the problem of list recovering with a much smaller noise fraction (but with large input list sizes). In this section, we will present the details of such a reduction. The reduction uses the notion of a
5.2
In the erasure noise model, certain symbols are lost during transmission, and the rest are received intact. The receiver knows the locations of the erasures as well as of the received symbols. The goal is to decode the erased symbols.
decoding procedure consists of two cases. The first case occurs when the set $I$ has size at least $n/10$. In this case, we know at least 10% of the symbols of $c$, and thus we can recover $c$ using the linear-time erasure-decoding algorithm for the code $C$. It remains to consider the second case, when the size of the set $I$ is smaller than $n/10$. Consider any $i \notin I$. Observe that, for all $(i, j) \in E$, all sets $L(i, j)$ must be equal to $K_i$. The set $K_i$ contains two distinct symbols that are candidates for $c_i$. Note that although for each $c$ only one of these symbols is correct, each symbol in $K_i$ can be correct for some codeword $c$. From now on, we will consider each $K_i$ (respectively $S_j$) as an ordered pair of two symbols $(K_i)_0$ and $(K_i)_1$ (respectively $(S_j)_0$ and $(S_j)_1$), for an arbitrary ordering.

To recover $c$ from the $K_i$'s, we create an auxiliary graph $H$. Intuitively, the graph $H$ is obtained by splitting each node in $G$ into two nodes, each corresponding to one of the two decoding options, and then putting edges between compatible options. Formally, $H = (A', B', E')$ is a bipartite graph such that $A' = A \times \{0,1\}$ and $B' = B \times \{0,1\}$. For each $i \notin I$ and $(i, j) \in E$, let $k$ be such that $\Gamma_k(j) = i$ (i.e., $j$ is the $k$-th neighbor of $i$). Then the edge set $E'$ contains the edges $\{(i, 0), (j, t)\}$ and $\{(i, 1), (j, 1-t)\}$, where $t \in \{0,1\}$ is such that $(K_i)_0 = ((S_j)_t)_k$.

Define $V(c) = \{(i, t) : i \notin I,\ c_i = (K_i)_t\}$, i.e., the subgraph of $H$ that corresponds to the codeword $c$. In other words, $V(c)$ contains the nodes in $A'$ that are compatible with $c$. The key fact, easily proved by induction, is that if $(i, t) \in V(c)$, and $(i', t') \in A'$ is reachable from $(i, t)$ in $H$, then $(i', t') \in V(c)$. Hence $V(c)$ will be the union of the vertices in $A'$ that belong to some subset of connected components of $H$. The fault-tolerance property of $G$ can be shown to imply that one of these connected components must contain at least $n/10$ vertices. Therefore, if we enumerate all large connected components of $H$, one of them will give us a large subset $S'$ of $V(c)$ (of size at least $n/10$). Given $S'$, we can recover $c$ from a vector $x$ such that $x_i$ is equal to $(K_i)_t$ if $(i, t) \in S'$, and is declared as an erasure otherwise (note that the fraction of erasures is at most 0.9).

Since the graph $H$ has only $O(n)$ edges, all its connected components, and in particular the large ones, can be found in $O(n)$ time. Thus, the whole decoding process can be performed in linear time.
5.3
We now turn to the list recovering problem when the input list size is greater than 2, and more significantly when we allow a small fraction of lists to be erroneous. The following result is shown in [33]:
Theorem 5.2. For every integer $\ell \ge 1$, there exist $R > 0$, $\varepsilon > 0$, and a finite alphabet $\Sigma$ for which there is an explicit family of codes of rate $R(\ell)$ over alphabet $\Sigma$ that are encodable as well as $(\varepsilon, \ell, L)$-list-recoverable in linear time.
Combining the above with Lemma 5.1, one gets the following result
of Guruswami and Indyk [33] on linear-time list-decodable codes for
correcting any desired constant fraction of errors.
Theorem 5.3. For every $\varepsilon$, $0 < \varepsilon < 1$, there exists an explicit family of $(1 - \varepsilon, O(1/\varepsilon))$-list-decodable codes of positive rate $R(\varepsilon) > 0$ that can be encoded as well as list decoded from a fraction $(1 - \varepsilon)$ of errors in time linear in the block length.2
Below we will sketch the ideas behind the proof of Theorem 5.2.
Both expanders and the notion of list recovering play a prominent role
throughout the construction and proof in [33].
The construction of $(\varepsilon, \ell, L)$-list-recoverable codes proceeds recursively, using a construction of $(\varepsilon_1, \ell_1, L_1)$-list-recoverable codes ($\ell$ will recursively depend on $\ell_1$). For the recursion, it is convenient to construct a slightly stronger object; for a fixed small $\gamma$, we will construct $(\gamma, \varepsilon, \ell, L)$-list-recoverable codes that have the following property: Given a collection of sets $S_i$, all but a fraction $\gamma$ of which satisfy $|S_i| \le \ell$, there are at most $L$ codewords whose $i$-th symbol belongs to $S_i$ for at least a fraction $(1 - \varepsilon)$ of the locations. (The case $\gamma = 0$ corresponds to $(\varepsilon, \ell, L)$-list-recovering.)
2 The complexity is linear in the unit-cost RAM model; see [33] for details.
5.4
5.4.1
Graph-based constructions have also been useful in obtaining good list-decodable codes over smaller alphabets compared with purely algebraic codes.
For example, consider using the construction scheme of Definition 5.2 with the following components $C$, $G$:

$C$ is a rate $\Omega(\varepsilon)$ code that is $(1/2, O(1/\varepsilon), O(1/\varepsilon))$-list recoverable. Such a code can be constructed via an appropriate concatenation involving a rate $\Omega(\varepsilon)$ outer RS code, and a good list-recoverable inner code found by brute-force search (see [31] for details).

$G$ is an $n \times n$ bipartite graph of degree $d = O(1/\varepsilon)$ with the property that any subset of a fraction $\varepsilon$ of nodes on the right
Juxtaposed codes
Extractor-based codes
Part II
6
Folded Reed-Solomon Codes
This part of the survey gives some exciting recent progress that achieves the capacity of list decoding over large alphabets. In this chapter, we present a simple variant of Reed-Solomon codes called folded Reed-Solomon codes, for which we can beat the $1 - \sqrt{R}$ decoding radius we achieved for RS codes in Chapter 4. In fact, by choosing parameters suitably, we can decode close to the optimal fraction $1 - R$ of errors with rate $R$. In the next chapter, we will discuss techniques that let us achieve a similar result over an alphabet of constant size that depends only on the distance to list decoding capacity.
The starting point for the above capacity-achieving result is the breakthrough work of Parvaresh and Vardy [62], who described a novel variant of Reed-Solomon codes together with a new decoding algorithm. While the new decoding algorithm led to improvements over the decoding radius of $1 - \sqrt{R}$, it only did so for low rates (specifically, for $R < 1/16$). Subsequently, Guruswami and Rudra [37] proved that yet another variant of Reed-Solomon codes, namely folded RS codes that are compressed versions of certain Parvaresh-Vardy codes, are able to leverage the PV algorithm, essentially as is, but on codes of high rate. Together, this gives explicit codes with polynomial time decoding algorithms that achieve list decoding capacity.
In this chapter, we will describe this combined code and algorithm. We note that this presentation deviates significantly from the historical development in the original papers [37, 62], in that we are using the benefit of hindsight to give a self-contained, and hopefully simpler, presentation. The last section of this chapter contains more comprehensive bibliographic notes on the original development of this material.
6.1
Consider a Reed-Solomon code $C = \mathrm{RS}_{\mathbb{F},\mathbb{F}^*}[n, k]$ consisting of evaluations of degree $\le k$ polynomials over $\mathbb{F}$ at the set $\mathbb{F}^*$ of nonzero elements of $\mathbb{F}$. Let $q = |\mathbb{F}| = n + 1$. Let $\gamma$ be a generator of the multiplicative group $\mathbb{F}^*$, and let the evaluation points be ordered as $1, \gamma, \gamma^2, \ldots, \gamma^{n-1}$. Using all nonzero field elements as evaluation points is one of the most commonly used instantiations of Reed-Solomon codes.
Let $m \ge 1$ be an integer parameter called the folding parameter. For ease of presentation, we will assume that $m$ divides $n = q - 1$.
Definition 6.1. (Folded Reed-Solomon code) The $m$-folded version of the RS code $C$, denoted $\mathrm{FRS}_{\mathbb{F},\gamma,m,k}$, is a code of block length $N = n/m$ over $\mathbb{F}^m$. The encoding of a message $f(X)$, a polynomial over $\mathbb{F}$ of degree at most $k$, has as its $j$-th symbol, for $0 \le j < n/m$, the $m$-tuple $(f(\gamma^{jm}), f(\gamma^{jm+1}), \ldots, f(\gamma^{jm+m-1}))$. In other words, the codewords of $C' = \mathrm{FRS}_{\mathbb{F},\gamma,m,k}$ are in one-one correspondence with those of the RS code $C$ and are obtained by bundling together consecutive $m$-tuples of symbols in codewords of $C$.
We illustrate the above construction for the choice $m = 4$ in Figure 6.1. The polynomial $f(X)$ is the message, whose Reed-Solomon encoding consists of the values of $f$ at $x_0, x_1, \ldots, x_{n-1}$ where $x_i = \gamma^i$. Then, we perform a folding operation by bundling together tuples of four symbols to give a codeword of length $n/4$ over the alphabet $\mathbb{F}^4$.
Note that the folding operation does not change the rate $R$ of the original Reed-Solomon code. The relative distance of the folded RS code also meets the Singleton bound and is at least $1 - R$.
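The folding map itself is straightforward to implement; here is a minimal sketch, with `rs_encode` a hypothetical RS encoder producing the $n$ evaluations in the order $1, \gamma, \gamma^2, \ldots, \gamma^{n-1}$:

```python
def fold(rs_codeword, m):
    """Bundle consecutive m-tuples of RS symbols into single symbols over F^m."""
    n = len(rs_codeword)
    assert n % m == 0
    return [tuple(rs_codeword[j * m:(j + 1) * m]) for j in range(n // m)]

# codeword = rs_encode(f)       # n = q - 1 evaluations of f at powers of gamma
# folded = fold(codeword, 4)    # block length n/4 over the alphabet F^4
```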
6.2
Since folding seems like such a simplistic operation, and the resulting code is essentially just an RS code but viewed as a code over a larger alphabet, let us now understand why it can possibly give hope to correct more errors compared with the bound for RS codes.
Consider the above example with folding parameter m = 4. First of
all, decoding the folded RS code up to a fraction p of errors is certainly
not harder than decoding the RS code up to the same fraction p of
errors. Indeed, we can unfold the received word of the folded RS
code and treat it as a received word of the original RS code and run
the RS list decoding algorithm on it. The resulting list will certainly
include all folded RS codewords within distance p of the received word,
and it may include some extra codewords which we can, of course, easily
prune.
In fact, decoding the folded RS code is a strictly easier task. To see why, say we want to correct a fraction $1/4$ of errors. Then, if we use the RS code, our decoding algorithm ought to be able to correct an error pattern which corrupts every fourth symbol in the RS encoding of $f(X)$ (i.e., corrupts $f(x_{4i})$ for $0 \le i < n/4$). However, after the folding operation, this error pattern corrupts every one of the symbols over the larger alphabet $\mathbb{F}^4$, and thus need not be corrected. In other words, for the same fraction of errors, the folding operation reduces the total number of error patterns that need to be corrected, since the channel has less flexibility in how it may distribute the errors.
It is of course far from clear how one may exploit this to actually correct more errors. To this end, algebraic ideas that exploit the specific nature of the folding and the relationship between a polynomial $f(X)$ and its shifted counterpart $f(\gamma X)$ will be used. These will become clear once we describe our algorithms later in the chapter.
We note that the above simplification of the channel is not attained for free, since the alphabet size increases after the folding operation. For folding parameter $m$ that is an absolute constant, the increase in alphabet size is moderate and the alphabet remains polynomially large in the block length. (Recall that the RS code has an alphabet size that is linear in the block length.) Still, having an alphabet size that is a large polynomial is somewhat unsatisfactory. Fortunately, our alphabet reduction techniques in the next chapter can handle polynomially large alphabets, so this does not pose a big problem.
6.3
6.3.1
We begin with some basic definitions and facts concerning trivariate polynomials, which are straightforward extensions of the bivariate counterparts.
Lemma 6.1. Let $\{(\alpha_i, y_{i1}, y_{i2})\}_{i=1}^{n}$ be an arbitrary set of $n$ triples from $\mathbb{F}^3$. Let $Q(X, Y_1, Y_2) \in \mathbb{F}[X, Y_1, Y_2]$ be a nonzero polynomial of $(1, k, k)$-weighted degree at most $D$ that has a zero of multiplicity $r$ at $(\alpha_i, y_{i1}, y_{i2})$ for every $i \in [n]$. Let $f(X), g(X)$ be polynomials of degree at most $k$ such that for at least $t > D/r$ values of $i \in [n]$, we have $f(\alpha_i) = y_{i1}$ and $g(\alpha_i) = y_{i2}$. Then, $Q(X, f(X), g(X)) \equiv 0$.

Proof. The proof is very similar to that of Lemma 4.6. If we define $R(X) = Q(X, f(X), g(X))$, then $R(X)$ is a univariate polynomial of degree at most $D$, and for every $i \in [n]$ for which $f(\alpha_i) = y_{i1}$ and $g(\alpha_i) = y_{i2}$, $(X - \alpha_i)^r$ divides $R(X)$. Therefore if $rt > D$, then $R(X)$ has more roots (counting multiplicities) than its degree, and so it must be the zero polynomial.
Lemma 6.2. Given an arbitrary set of $n$ triples $\{(\alpha_i, y_{i1}, y_{i2})\}_{i=1}^{n}$ from $\mathbb{F}^3$ and an integer parameter $r \ge 1$, there exists a nonzero polynomial $Q(X, Y_1, Y_2)$ over $\mathbb{F}$ of $(1, k, k)$-weighted degree at most $D$ such that $Q(X, Y_1, Y_2)$ has a zero of multiplicity $r$ at $(\alpha_i, y_{i1}, y_{i2})$ for all $i \in [n]$, provided
$$\frac{D^3}{6k^2} > n \binom{r+2}{3}.$$
Moreover, we can find such a $Q(X, Y_1, Y_2)$ in time polynomial in $n$, $r$ by solving a system of homogeneous linear equations over $\mathbb{F}$.

Proof. This is just the obvious trivariate extension of Lemma 4.7. The condition that $Q(X, Y_1, Y_2)$ has a zero of multiplicity $r$ at a point amounts to $\binom{r+2}{3}$ homogeneous linear conditions in the coefficients of $Q$. The number of monomials in $Q(X, Y_1, Y_2)$ equals the number, say $N_3(k, D)$, of triples $(i, j_1, j_2)$ of nonnegative integers which obey $i + kj_1 + kj_2 \le D$. One can show that the number $N_3(k, D)$ is at least as large as the volume of the three-dimensional region $\{x + ky_1 + ky_2 \le D \mid x, y_1, y_2 \ge 0\} \subset \mathbb{R}^3$ [62]. An easy calculation shows that the latter volume equals $\frac{D^3}{6k^2}$. Hence, if $\frac{D^3}{6k^2} > n\binom{r+2}{3}$, then the number of unknowns exceeds the number of equations, and we are guaranteed a nonzero solution.
6.3.2
Let us now see how trivariate interpolation can be used in the context of decoding the folded RS code $C' = \mathrm{FRS}_{\mathbb{F},\gamma,m,k}$ of block length $N = (q-1)/m$. (Throughout this section, we denote $q = |\mathbb{F}|$ and $n = q - 1$.) Given a received word $z \in (\mathbb{F}^m)^N$ for $C'$ that needs to be list decoded, we define $y \in \mathbb{F}^n$ to be the corresponding unfolded received word. (Formally, let the $j$-th symbol of $z$ be $(z_{j,0}, \ldots, z_{j,m-1})$ for $0 \le j < N$. Then $y$ is defined by $y_{jm+l} = z_{j,l}$ for $0 \le j < N$ and $0 \le l < m$.)

Suppose $f(X)$ is a polynomial whose encoding agrees with $z$ on at least $t$ locations. Then, here is an obvious but important observation:

For at least $t(m-1)$ values of $i$, $0 \le i < n$, both the equalities $f(\gamma^i) = y_i$ and $f(\gamma^{i+1}) = y_{i+1}$ hold.

Define the notation $g(X) = f(\gamma X)$. Therefore, if we consider the $n$ triples $(\gamma^i, y_i, y_{i+1}) \in \mathbb{F}^3$ for $i = 0, 1, \ldots, n-1$ (with the convention $y_n = y_0$), then for at least $t(m-1)$ triples, we have $f(\gamma^i) = y_i$ and $g(\gamma^i) = y_{i+1}$. This suggests that by interpolating a polynomial $Q(X, Y_1, Y_2)$ through these $n$ triples and employing Lemma 6.1, we can hope that $f(X)$ will satisfy $Q(X, f(X), f(\gamma X)) \equiv 0$, and then somehow use this to find $f(X)$. We formalize this in the following lemma. The proof follows immediately from the preceding discussion and Lemma 6.1.
Proof. By Lemma 6.3, we know that any $f(X)$ whose encoding agrees with $z$ on $t$ or more locations will be output in Step 2, provided $t > \frac{D}{(m-1)r}$. For the choice of $D$ in (6.1), this condition is met for the choice
$$t = 1 + \left\lfloor \frac{1}{m-1} \left( k^2 n \left(1 + \frac{1}{r}\right) \left(1 + \frac{2}{r}\right) \right)^{1/3} \right\rfloor.$$
The decoding radius is equal to $N - t$, and recalling that $n = mN$, we get the bound claimed in the lemma.
The rate of the folded Reed-Solomon code is $R = (k+1)/n > k/n$, and so the fraction of errors corrected is $1 - \frac{m}{m-1} R^{2/3}$. Letting the parameter $m$ grow, we can approach a decoding radius of $1 - R^{2/3}$.
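The trade-off is easy to tabulate. The snippet below evaluates the trivariate bound and, anticipating the $(s+1)$-variate generalization of Section 6.5 (whose radius behaves as $1 - \frac{m}{m-s+1} R^{s/(s+1)}$, ignoring lower-order terms), shows the radius approaching the capacity $1 - R$ as $m$ and $s$ grow:

```python
def frs_radius(R, m, s):
    """Decoding radius of (s+1)-variate folded RS decoding, up to o(1) terms."""
    return 1 - (m / (m - s + 1.0)) * R ** (s / (s + 1.0))

R = 0.5
for m, s in ((4, 2), (16, 2), (64, 8), (256, 16)):
    print(m, s, round(frs_radius(R, m, s), 4))
# The values climb toward 1 - R = 0.5 as m and s increase.
```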
6.4
Root-finding step
In light of the above discussion, the only missing piece in our decoding algorithm is an efficient way to solve the following trivariate root-finding type problem:

Given a nonzero polynomial $Q(X, Y_1, Y_2)$ with coefficients from a finite field $\mathbb{F}$ of size $q$, a primitive element $\gamma$ of the field $\mathbb{F}$, and an integer parameter $k < q - 1$, find a list of all polynomials $f(X)$ of degree at most $k$ such that $Q(X, f(X), f(\gamma X)) \equiv 0$.
The following simple algebraic lemma is at the heart of our solution to this problem.

Lemma 6.5. Let $\mathbb{F}$ be the field $\mathbb{F}_q$ of size $q$, and let $\gamma$ be a primitive element that generates its multiplicative group. Then we have the
roots in $\tilde{\mathbb{F}}$ of $R(Y_1)$.

Now $R(Y_1)$ is a nonzero polynomial, since $R(Y_1) = 0$ iff $Y_2 - Y_1^q$ divides $T(Y_1, Y_2)$, and this cannot happen as $T(Y_1, Y_2)$ has degree less than $q$ in $Y_1$. The degree of $R(Y_1)$ is at most $dq$, where $d$ is the total degree of $Q(X, Y_1, Y_2)$. The characteristic of $\tilde{\mathbb{F}}$ is at most $q$, and its degree over the underlying prime field is at most $q \log q$. Therefore, we can find all roots of $R(Y_1)$ by a deterministic algorithm running in time polynomial in $d$, $q$ [6] (see discussion in Section 4.5.2). Each of the roots will be a polynomial in $\mathbb{F}[X]$ of degree less than $q - 1$. Once we find all the roots, we prune the list and only output those roots $f(X)$ that have degree at most $k$ and satisfy $Q_0(X, f(X), f(\gamma X)) = 0$.
With this, we have a polynomial time implementation of the algorithm trivariate-FRS-decoder. There is the technicality that the degree of $Q(X, Y_1, Y_2)$ in $Y_1$ should be less than $q$. This degree is at most $D/k$, which by the choice of $D$ in (6.1) is at most $(r+3)\sqrt[3]{n/k} < (r+3)q^{1/3}$. For a fixed $r$ and growing $q$, the degree is much smaller than $q$. (In fact,
6.5
Proof. The first part follows from (i) a simple lower bound on the number of monomials $X^a Y_1^{b_1} \cdots Y_s^{b_s}$ with $a + k(b_1 + b_2 + \cdots + b_s) \le D$, which gives the number of coefficients of $Q(X, Y_1, \ldots, Y_s)$, and (ii) an estimation of the number of $(s+1)$-variate monomials of total degree less than $r$, which gives the number of interpolation conditions per $(s+1)$-tuple.

The second part is similar to the proof of Lemma 6.3. If $f(X)$ has agreement on at least $t$ locations of $z$, then for at least $t(m - s + 1)$ of the $(s+1)$-tuples $(\gamma^i, y_i, y_{i+1}, \ldots, y_{i+s-1})$, we have $f(\gamma^{i+j}) = y_{i+j}$
6.6
List recovering
6.7
Two independent works by Coppersmith and Sudan [11] and Bleichenbacher, Kiayias, and Yung [7] considered the variant of RS codes where the message consists of two (or more) independent polynomials over $\mathbb{F}$, and the encoding consists of the joint evaluation of these polynomials at elements of $\mathbb{F}$ (so this defines a code over $\mathbb{F}^2$).1 A naive way to decode
these codes, which are also called interleaved Reed-Solomon codes, would be to recover the two polynomials individually, by running separate instances of the RS decoder. Of course, this gives no gain over the performance of RS codes. The hope in these works was that something can possibly be gained by exploiting that errors in the two polynomials happen at synchronized locations. However, these works could not give any improvement over the $1 - \sqrt{R}$ bound known for RS codes for worst-case errors. Nevertheless, for random errors, where each error replaces the correct symbol by a uniformly random field element, they were able to correct well beyond a fraction $1 - \sqrt{R}$ of errors. In fact, as the order of interleaving (i.e., number of independent polynomials)
1 The resulting code is in fact just a Reed-Solomon code where the evaluation points belong to the subfield $\mathbb{F}$ of the extension field over $\mathbb{F}$ of degree two.
$1 - \sqrt{R}$ bound for Reed-Solomon codes. The key obstacle in improving this bound was the following: for the case when the messages are pairs $(f(X), g(X))$ of polynomials, two algebraically independent relations were needed to identify both $f(X)$ and $g(X)$. The interpolation method could only provide one such relation in general (of the form $Q(X, f(X), g(X)) = 0$ for a trivariate polynomial $Q(X, Y, Z)$). This still left too much ambiguity in the possible values of $(f(X), g(X))$. (The approach in [61] was to find several interpolation polynomials, but there was no guarantee that they were not all algebraically dependent.)
Then, in [62], Parvaresh and Vardy put forth the ingenious idea of obtaining the extra algebraic relation essentially for free by enforcing it as an a priori condition satisfied at the encoder. Specifically, instead of letting the second polynomial $g(X)$ be an independent degree $k$ polynomial, their insight was to make it correlated with $f(X)$ by a specific algebraic condition, such as $g(X) = f(X)^d \bmod h(X)$ for some integer $d$ and an irreducible polynomial $h(X)$ of degree $k + 1$.

Then, once we have the interpolation polynomial $Q(X, Y, Z)$, $f(X)$ can be found as described in this chapter: Reduce the coefficients of $Q(X, Y, Z)$ modulo $h(X)$ to get a polynomial $T(Y, Z)$ with coefficients from $\mathbb{F}[X]/(h(X))$ and then find roots of the univariate polynomial $T(Y, Y^d)$. This was the key idea in [62] to improve the $1 - \sqrt{R}$ decoding radius for rates less than $1/16$. For rates $R \to 0$, their decoding radius approached $1 - O(R \log(1/R))$.
This modification does not come for free, however. In particular, since one sends at least twice as much information as in the original RS code, there is no way to construct codes with rate more than 1/2 in the PV scheme. If we use $s \ge 2$ correlated polynomials for the encoding, we incur a factor $1/s$ loss in the rate. This proves quite expensive, and as a result the improvements over RS codes offered by these codes are only manifest at very low rates.
The central idea in the work of Guruswami and Rudra on list decoding folded RS codes [37] was to avoid this rate loss by making the correlated polynomial $g(X)$ essentially identical to the first (say $g(X) = f(\gamma X)$). Then the evaluations of $g(X)$ can be inferred as a simple cyclic shift of the evaluations of $f(X)$, so intuitively there is no need to explicitly include those too in the encoding. The folded RS encoding of $f(X)$ compresses all the needed information, without any extra redundancy for $g(X)$. In particular, from a received word that agrees with the folded RS encoding of $f(X)$ in many places, we can infer a received word (with symbols in $\mathbb{F}^2$) that matches the values of both $f(X)$ and $f(\gamma X) = g(X)$ in many places, and then run the decoding algorithm of Parvaresh and Vardy.
The terminology of folded RS codes was coined in [52], where an algorithm to correct random errors in such codes was presented (for a noise model similar to the one used in [7, 11] that was mentioned earlier). The motivation was to decode RS codes from many random phased burst errors. Our decoding algorithm for folded RS codes can also be likewise viewed as an algorithm to correct beyond the $1 - \sqrt{R}$ bound for RS codes if errors occur in large, phased bursts (the actual errors can be adversarial).
7
Achieving Capacity over Bounded Alphabets
The capacity-achieving codes from the previous chapter have two main shortcomings: (i) their alphabet size is a large polynomial in the block length, and (ii) the bound on worst-case list size as well as decoding time complexity grows as $n^{\Omega(1/\varepsilon)}$, where $\varepsilon$ is the distance to capacity. In this chapter, we will remedy the alphabet size issue (Section 7.2). We begin by using the folded RS codes in a concatenation scheme to get good list-decodable binary codes.
7.1
The optimal list recoverability of the folded RS codes discussed in Section 6.6 plays a crucial role in establishing the following result concerning list decoding binary codes. The decoding radius achieved matches the standard product bound on the designed relative distance of binary concatenated codes, namely the product of the relative distance of an outer MDS code with the relative distance of an inner code that meets the Gilbert-Varshamov bound.
Fig. 7.1 Error-correction radius $\rho$ of our algorithm for binary codes plotted against the rate $R$. The best possible trade-off, i.e., capacity, is $\rho = H^{-1}(1 - R)$, and is also plotted.
7.2
$O(\varepsilon)$), the errors in decoding the few deviant inner blocks can be handled when list recovering $C_1$. We skip the formal details, which are not hard to work out and follow along the same lines as the arguments in [35].
8
Concluding Thoughts
We have surveyed the topic of list decoding with a focus on the recent
advances in decoding algorithms for ReedSolomon codes and close
variants, culminating with a presentation of how to achieve the list
decoding capacity over large alphabets. We conclude by mentioning
some interesting open questions and directions for future work.
8.1
8.2
was the case in the work of Parvaresh and Vardy [62], since they had complete flexibility in picking the degree $d$ (it just had to be larger than some absolute constant) and they defined the correlated polynomial to be $g(X) = f(X)^d \bmod h(X)$ (instead of $f(\gamma X)$).
8.3
The motivation of the above questions on achieving list decoding capacity is primarily a coding-theoretic one. Resolving these questions may
not directly impact any of the applications of list decoding outside
coding theory, for example, the various complexity-theoretic applications. However, codes are by now a central combinatorial object in
the toolkit of computer science, and they are intertwined with many
other fundamental pseudorandom objects such as expander graphs
and extractors. Therefore, any new idea to construct substantially better codes could potentially have broader impact. As an example of this
phenomenon, we mention the recent work [44], which proved that the Parvaresh-Vardy codes yield excellent condensers that achieve near-optimal compression of a weak random source while preserving essentially all of its entropy. The compressed source has a high enough entropy rate to enable easy extraction of almost all its randomness using previously known methods. Together, this yields simple and self-contained randomness extractors that are optimal up to constant factors, matching the previously best known construction due to [57] (which was quite complicated and built upon a long line of previous work).
Therefore, the algebraic ideas underlying the recent developments
in achieving list decoding capacity have already found powerful applications in areas outside coding theory. In the other direction, pseudorandom constructs have already yielded many new developments in
coding theory, such as expander based codes [1,31,33,66,68], extractor
codes [26, 72], and codes from the XOR lemma [47, 73]. It is our hope
that exploring these connections in greater depth might lead to further
interesting progress in coding theory; for example, it is interesting to
see if insights from some exciting recent combinatorial techniques, such
8.4
Combinatorial questions
8.5
8.6
New vistas
Acknowledgments
The author thanks Madhu Sudan and the anonymous reviewer for a
careful reading of the manuscript and for their very useful comments
on improving the quality and coverage of the presentation.
References
[1] N. Alon, J. Bruck, J. Naor, M. Naor, and R. Roth, Construction of asymptotically good low-rate error-correcting codes through pseudo-random graphs, IEEE Transactions on Information Theory, vol. 38, pp. 509–516, 1992.
[2] N. Alon and F. R. K. Chung, Explicit construction of linear sized tolerant networks, Discrete Mathematics, vol. 72, pp. 15–19, 1988.
[3] N. Alon and M. Luby, A linear time erasure-resilient code with nearly optimal recovery, IEEE Transactions on Information Theory, vol. 42, no. 6, pp. 1732–1736, 1996.
[4] N. Alon and J. Spencer, The Probabilistic Method. John Wiley and Sons, Inc.,
1992.
[5] S. Ar, R. Lipton, R. Rubinfeld, and M. Sudan, Reconstructing algebraic functions from mixed data, SIAM Journal on Computing, vol. 28, no. 2, pp. 488–511, 1999.
[6] E. Berlekamp, Factoring polynomials over large finite fields, Mathematics of Computation, vol. 24, pp. 713–735, 1970.
[7] D. Bleichenbacher, A. Kiayias, and M. Yung, Decoding of interleaved Reed-Solomon codes over noisy data, in Proceedings of the 30th International Colloquium on Automata, Languages and Programming, pp. 97–108, 2003.
[8] V. M. Blinovsky, Bounds for codes in the case of list decoding of finite volume, Problems of Information Transmission, vol. 22, no. 1, pp. 7–19, 1986.
[9] V. M. Blinovsky, Code bounds for multiple packings over a nonbinary finite alphabet, Problems of Information Transmission, vol. 41, no. 1, pp. 23–32, 2005.
[10] D. Boneh, Finding smooth integers in short intervals using CRT decoding, in Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pp. 265–272, 2000.
[11] D. Coppersmith and M. Sudan, Reconstructing curves in three (and higher) dimensional spaces from noisy data, in Proceedings of the 35th Annual ACM Symposium on Theory of Computing, pp. 136–142, June 2003.
[12] I. Dumer, D. Micciancio, and M. Sudan, Hardness of approximating the minimum distance of a linear code, IEEE Transactions on Information Theory, vol. 49, no. 1, pp. 22–37, 2003.
[13] P. Elias, List decoding for noisy channels, Technical Report 335, Research
Laboratory of Electronics, MIT, 1957.
[14] P. Elias, Error-correcting codes for list decoding, IEEE Transactions on Information Theory, vol. 37, pp. 5–12, 1991.
[15] G. D. Forney, Concatenated Codes. MIT Press, Cambridge, MA, 1966.
[16] P. Gemmell and M. Sudan, Highly resilient correctors for multivariate polynomials, Information Processing Letters, vol. 43, no. 4, pp. 169–174, 1992.
[17] O. Goldreich and L. Levin, A hard-core predicate for all one-way functions, in Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pp. 25–32, May 1989.
[18] O. Goldreich, D. Ron, and M. Sudan, Chinese remaindering with errors, IEEE Transactions on Information Theory, vol. 46, no. 5, pp. 1330–1338, July 2000.
[19] O. Goldreich, R. Rubinfeld, and M. Sudan, Learning polynomials with queries: The highly noisy case, in Proceedings of the 36th Annual IEEE Symposium on Foundations of Computer Science, pp. 294–303, 1995.
[20] O. Goldreich, R. Rubinfeld, and M. Sudan, Learning polynomials with queries: The highly noisy case, SIAM Journal on Discrete Mathematics, vol. 13, no. 4, pp. 535–570, November 2000.
[21] P. Gopalan, R. Lipton, and Y. Ding, Error correction against computationally
bounded adversaries, Theory of Computing Systems, to appear.
[22] V. Guruswami, Iterative decoding of low-density parity check codes (A Survey), CoRR, cs.IT/0610022, 2006. Appears in Issue 90 of the Bulletin of the
EATCS.
[23] V. Guruswami, List decoding with side information, in Proceedings of the 18th IEEE Conference on Computational Complexity (CCC), pp. 300–309, July 2003.
[24] V. Guruswami, Limits to list decodability of linear codes, in Proceedings of the 34th ACM Symposium on Theory of Computing, pp. 802–811, 2002.
[25] V. Guruswami, List decoding from erasures: Bounds and code constructions, IEEE Transactions on Information Theory, vol. 49, no. 11, pp. 2826–2833, 2003.
[26] V. Guruswami, Better extractors for better codes?, in Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pp. 436–444, June 2004.
[27] V. Guruswami, Error-correcting codes and expander graphs, SIGACT News, pp. 25–41, September 2004.
[28] V. Guruswami, List decoding of error-correcting codes, Lecture Notes in
Computer Science, no. 3282, Springer, 2004.
[29] V. Guruswami, List decoding in pseudorandomness and average-case complexity, in IEEE Information Theory Workshop, March 2006.
[30] V. Guruswami, J. Håstad, M. Sudan, and D. Zuckerman, Combinatorial bounds for list decoding, IEEE Transactions on Information Theory, vol. 48, no. 5, pp. 1021–1035, 2002.
[31] V. Guruswami and P. Indyk, Expander-based constructions of efficiently decodable codes, in Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science, pp. 658–667, 2001.
[32] V. Guruswami and P. Indyk, Near-optimal linear-time codes for unique decoding and new list-decodable codes over smaller alphabets, in Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), pp. 812–821, 2002.
[33] V. Guruswami and P. Indyk, Linear-time encodable and list decodable codes, in Proceedings of the 35th Annual ACM Symposium on Theory of Computing (STOC), pp. 126–135, June 2003.
[34] V. Guruswami and P. Indyk, Linear-time list decoding in error-free settings, in Proceedings of the 31st International Colloquium on Automata, Languages and Programming (ICALP), pp. 695–707, July 2004.
[35] V. Guruswami and P. Indyk, Linear-time encodable/decodable codes with near-optimal rate, IEEE Transactions on Information Theory, vol. 51, no. 10, pp. 3393–3400, October 2005.
[36] V. Guruswami and A. Patthak, Correlated algebraic-geometric codes: Improved list decoding over bounded alphabets, in Proceedings of the 47th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 227–238, October 2006. Journal version to appear in Mathematics of Computation.
[37] V. Guruswami and A. Rudra, Explicit capacity-achieving list-decodable codes, in Proceedings of the 38th Annual ACM Symposium on Theory of Computing, pp. 1–10, May 2006.
[38] V. Guruswami and A. Rudra, Limits to list decoding Reed-Solomon codes, IEEE Transactions on Information Theory, vol. 52, no. 8, August 2006.
[39] V. Guruswami, A. Sahai, and M. Sudan, Soft-decision decoding of Chinese remainder codes, in Proceedings of the 41st IEEE Symposium on Foundations of Computer Science, pp. 159–168, 2000.
[40] V. Guruswami and M. Sudan, Improved decoding of Reed-Solomon and algebraic-geometric codes, IEEE Transactions on Information Theory, vol. 45, pp. 1757–1767, 1999.
[41] V. Guruswami and M. Sudan, List decoding algorithms for certain concatenated codes, in Proceedings of the 32nd Annual ACM Symposium on Theory of Computing (STOC), pp. 181–190, 2000.
[42] V. Guruswami and M. Sudan, Decoding concatenated codes using soft information, in Proceedings of the 17th Annual IEEE Conference on Computational Complexity (CCC), pp. 148–157, 2002.
[59] R. J. McEliece and L. Swanson, On the decoder error probability for Reed-Solomon codes, IEEE Transactions on Information Theory, vol. 32, no. 5, pp. 701–703, 1986.
[60] S. Micali, C. Peikert, M. Sudan, and D. A. Wilson, Optimal error correction against computationally bounded noise, in Proceedings of the 2nd Theory of Cryptography Conference (TCC), pp. 1–16, 2005.
[61] F. Parvaresh and A. Vardy, Multivariate interpolation decoding beyond the
Guruswami-Sudan radius, in Proceedings of the 42nd Allerton Conference on
Communication, Control and Computing, 2004.
[62] F. Parvaresh and A. Vardy, Correcting errors beyond the Guruswami-Sudan radius in polynomial time, in Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science, pp. 285–294, 2005.
[63] W. W. Peterson, Encoding and error-correction procedures for Bose-Chaudhuri codes, IEEE Transactions on Information Theory, vol. 6, pp. 459–470, 1960.
[64] J. Radhakrishnan, Proof of q-ary Johnson bound, 2006. Personal communication.
[65] C. E. Shannon, A mathematical theory of communication, Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, 1948.
[66] M. Sipser and D. Spielman, Expander codes, IEEE Transactions on Information Theory, vol. 42, no. 6, pp. 1710–1722, 1996.
[67] A. Smith, Scrambling adversarial errors using few random bits, optimal information reconciliation, and better private codes, Cryptology ePrint Archive,
Report 2006/020, http://eprint.iacr.org/, 2006.
[68] D. Spielman, Linear-time encodable and decodable error-correcting codes, IEEE Transactions on Information Theory, vol. 42, no. 6, pp. 1723–1732, 1996.
[69] M. Sudan, Decoding of Reed-Solomon codes beyond the error-correction bound, Journal of Complexity, vol. 13, no. 1, pp. 180–193, 1997.
[70] M. Sudan, List decoding: Algorithms and applications, SIGACT News, vol. 31, pp. 16–27, 2000.
[71] M. Sudan, Ideal error-correcting codes: Unifying algebraic and number-theoretic algorithms, in Proceedings of AAECC-14: The 14th Symposium on Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, pp. 36–45, November 2001.
[72] A. Ta-Shma and D. Zuckerman, Extractor codes, IEEE Transactions on Information Theory, vol. 50, no. 12, pp. 3015–3025, 2004.
[73] L. Trevisan, List-decoding using the XOR lemma, in Proceedings of the 44th IEEE Symposium on Foundations of Computer Science, pp. 126–135, 2003.
[74] L. Trevisan, Some applications of coding theory in computational complexity, Quaderni di Matematica, vol. 13, pp. 347–424, 2004.
[75] J. H. van Lint, Introduction to coding theory, Graduate Texts in Mathematics, vol. 86, 3rd Edition, Springer-Verlag, Berlin, 1999.
[76] J. von zur Gathen, Modern Computer Algebra. Cambridge University Press,
1999.