
Foundations and Trends® in Theoretical Computer Science
Vol. 2, No. 2 (2006) 107–195
© 2006 V. Guruswami
DOI: 10.1561/0400000007

Algorithmic Results in List Decoding


Venkatesan Guruswami
Department of Computer Science & Engineering, University of Washington
Seattle WA 98195, USA, venkat@cs.washington.edu

Abstract
Error-correcting codes are used to cope with the corruption of data
by noise during communication or storage. A code uses an encoding
procedure that judiciously introduces redundancy into the data to produce an associated codeword. The redundancy built into the codewords
enables one to decode the original data even from a somewhat distorted
version of the codeword. The central trade-off in coding theory is the
one between the data rate (amount of non-redundant information per
bit of codeword) and the error rate (the fraction of symbols that could
be corrupted while still enabling data recovery). The traditional decoding algorithms did as badly at correcting any error pattern as they
would do for the worst possible error pattern. This severely limited the
maximum fraction of errors those algorithms could tolerate. In turn,
this was the source of a big hiatus between the error-correction performance known for probabilistic noise models (pioneered by Shannon)
and what was thought to be the limit for the more powerful, worst-case
noise models (suggested by Hamming).
In the last decade or so, there has been much algorithmic progress
in coding theory that has bridged this gap (and in fact nearly eliminated it for codes over large alphabets). These developments rely on an error-recovery model called list decoding, wherein for the pathological error patterns, the decoder is permitted to output a small list of
candidates that will include the original message. This book introduces
and motivates the problem of list decoding, and discusses the central
algorithmic results of the subject, culminating with the recent results
on achieving list decoding capacity.

Part I

General Literature

1 Introduction

1.1 Codes and noise models

Error-correcting codes enable reliable transmission of information over a noisy communication channel. The idea behind error-correcting codes
is to encode the message to be transmitted into a longer, redundant
string (called a codeword) and then transmit the codeword over the
noisy channel. The redundancy is judiciously chosen in order to enable
the receiver to decode the transmitted codeword even from a somewhat
distorted version of the codeword. Naturally, the larger the amount of
noise (quantified appropriately, according to the specific channel noise
model) one wishes to correct, the greater the redundancy that needs
to be introduced during encoding. A convenient measure of the redundancy is the rate of an error-correcting code, which is the ratio of the
number of information bits in the message to the number of bits in the
codeword. The larger the rate, the less redundant the encoding.
The trade-off between the rate and the amount of noise that can be corrected is a fundamental one, and understanding and optimizing the precise trade-off is one of the central objectives of coding theory. The
optimal rate for which reliable communication is possible on a given
noisy channel is typically referred to as capacity. The challenge is to construct codes with rate close to capacity, together with efficient
algorithms for encoding and error correction (decoding).
The underlying model assumed for the channel noise crucially governs the rate at which one can communicate while tolerating noise. One of the simplest models is the binary symmetric channel; here the channel flips each bit independently with a certain cross-over probability p. It is well known that the capacity of this channel equals 1 − H(p), where H(x) = −x log2 x − (1 − x) log2(1 − x) is the binary entropy function. In other words, there are codes of rate up to 1 − H(p) that achieve probability of miscommunication approaching 0 (for large message lengths), and for rates above 1 − H(p), no such codes exist.
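As a quick numerical illustration of the capacity formula above, the following sketch (function names are ours, not the survey's) tabulates 1 − H(p) for a few cross-over probabilities:

```python
import math

def binary_entropy(p: float) -> float:
    """H(p) = -p*log2(p) - (1-p)*log2(1-p), with H(0) = H(1) = 0."""
    if p == 0.0 or p == 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity 1 - H(p) of the binary symmetric channel with cross-over p."""
    return 1 - binary_entropy(p)

# The capacity drops to 0 as p approaches 1/2 (the output is pure noise).
for p in (0.01, 0.11, 0.25, 0.5):
    print(f"p = {p:.2f}: capacity = {bsc_capacity(p):.4f}")
```

Note, for instance, that at p = 0.11 the capacity is very close to 1/2.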
The above was a stochastic model of the channel, wherein we took
an optimistic view that we knew the precise probabilistic behavior of
the channel. This stochastic approach was pioneered by Shannon in
his landmark 1948 paper that marked the birth of the field of information theory [65]. An alternate, more combinatorial approach, put
forth by Hamming [46], models the channel as a jammer or adversary
that can corrupt the codeword arbitrarily, subject to a bound on the
total number of errors it can cause. This is a stronger noise model since
one has to deal with worst-case or adversarial, as opposed to typical,
noise patterns. Codes and algorithms designed for worst-case noise are
more robust and less sensitive to inaccuracies in modeling the precise channel behavior (in fact, they obviate the need for such precise
modeling!).
This survey focuses on the worst-case noise model. Our main objective is to highlight that even against adversarial channels, one can achieve the information-theoretically optimal trade-off between rate and fraction of decodable errors, matching the performance possible against weaker, stochastic noise models. This is shown for an error recovery model called list decoding, wherein for the pathological, worst-case noise patterns, the decoder is permitted to output a small list of candidate messages that will include the correct message. We next motivate the list decoding problem, and discuss how it offers the hope of achieving capacity against worst-case errors.

Remark 1. [Arbitrarily varying channel] The stochastic noise model assumes knowledge of the precise probability law governing the channel. The worst-case model takes a conservative, pessimistic view of the power of the channel, assuming only a limit on the total amount of noise. A hybrid model called the Arbitrarily Varying Channel (AVC) has also been proposed to study communication under channel uncertainty. Here the channel is modeled as a jammer which can select from a family of strategies (corresponding to different probability laws), and the sequence of selected strategies, and hence the channel law, is not known to the sender. The strategy can in general vary arbitrarily from symbol to symbol, and the goal is to do well against the worst possible sequence. A less powerful model is that of the compound channel, where the jammer has a choice of strategies, but the chosen channel law does not change during the transmission of the various symbols of the codeword. AVCs have been the subject of much research; the reader can find a good introduction to this topic, as well as numerous pointers to the extensive literature, in a survey by Lapidoth and Narayan [54]. To the author's understanding, it seems that much of the work has been of a non-constructive flavor, driven by the information-theoretic motivation of determining the capacity under different AVC variants. There has been less focus on explicit constructions of codes or related algorithmic issues.

1.2 List decoding: Context and motivation

Given a received word r, which is a distorted version of some codeword c, the decoding problem strives to find the original codeword c. The natural error recovery approach is to place one's bet on the codeword that has the highest likelihood of being the one that was transmitted, conditioned on receiving r. This task is called Maximum Likelihood Decoding (MLD), and is viewed as the holy grail in decoding. MLD amounts to finding the codeword closest to r under an appropriate distance measure on distortions (for which a larger distortion is less likely than a smaller one). In this survey, we will measure distortion by the Hamming metric, i.e., the distance between two strings x, y ∈ Σ^n is the number of coordinates i ∈ {1, 2, . . . , n} for which x_i ≠ y_i. MLD


thus amounts to finding the codeword closest to the received word in Hamming distance. No approach substantially faster than a brute-force search is known for MLD for any non-trivial code family. One therefore settles for less ambitious goals in the quest for efficient algorithms. A natural relaxed goal, called Bounded Distance Decoding (BDD), would be to perform decoding in the presence of a bounded number of errors. That is, we assume at most a fraction p of symbols are corrupted by the channel, and aim to solve the MLD problem under this promise. In other words, we are only required to find the closest codeword when there is a codeword not too far away (within distance pn) from the received word.
In this setting, the basic trade-off question is: What is the largest fraction of errors one can correct using a family of codes of rate R? Let C : Σ^{Rn} → Σ^n be the encoding function of a code of rate R (here n is the block length of the code, and Σ is the alphabet to which codeword symbols belong). Now, a simple pigeonholing argument implies there must exist x ≠ y such that the codewords C(x) and C(y) agree on the first Rn − 1 positions. In turn, this implies that when C(x) is transmitted, the channel could distort it to a received word r that is equidistant from both C(x) and C(y), and differs from each of them in about a fraction (1 − R)/2 of positions. Thus, unambiguous bounded distance decoding becomes impossible for error fractions exceeding (1 − R)/2.
However, the above is not a compelling reason to be pessimistic about correcting larger amounts of noise. This is due to the fact that received words such as r reflect a pathological case. The way Hamming spheres pack in high-dimensional space, even for p much larger than (1 − R)/2 (and in fact for p → 1 − R) there exist codes of rate R (over a large alphabet Σ) for which the following holds: for most error patterns e that corrupt fewer than a fraction p of symbols, when a codeword c gets distorted into z by the error pattern e, there will be no codeword besides c within Hamming distance pn of z.¹ Thus, for typical noise
¹ This claim holds with high probability for a random code drawn from a natural ensemble. In fact, the proof of Shannon's capacity theorem for q-ary symmetric channels can be viewed in this light. For Reed–Solomon codes, which will be our main focus later on, this claim has been shown to hold, see [19, 58, 59].

patterns one can hope to correct many more errors than the above limit faced by the worst-case error pattern. However, since we assume a worst-case noise model, we do have to deal with bad received words such as r. List decoding provides an elegant formulation to deal with worst-case errors without compromising the performance for typical noise patterns: the idea is that in the worst case, the decoder may output multiple answers. Formally, the decoder is required to output a list of all codewords that differ from the received word in a fraction p of symbols.

Certainly returning a small list of possibilities is better and more useful than simply giving up and declaring a decoding failure. Even if one deems receiving multiple answers as a decoding failure, as mentioned above, for many error patterns in the target noise range, the decoder will output a unique answer, and we did not have to model the channel stochastics to design our code or algorithm! It may also be possible to pick the correct codeword from the list, in case of multiple answers, using some semantic context or side information (see [23]). Also, if in the output list there is a unique closest codeword, we can also output that as the maximum likelihood choice. In general, list decoding is a stronger error-recovery model than outputting just the closest codeword(s), since we require that the decoder output all the close codewords (and we can always prune the list as needed). For several applications, such as concatenated code constructions and also those in complexity theory, having the entire list adds more power to the decoding primitive than deciding solely on the closest codeword(s).
Some other channel and decoding models. We now give pointers
to some other relaxed models where one can perform unique decoding
even when the number of errors exceeds half the minimum Hamming
distance between two codewords. We already mentioned one model
where an auxiliary channel can be used to send a small amount of side
information which can be used to disambiguate the list [23]. Another
model that allows one to identify the correct message with high probability is one where the sender and recipient share a secret random key,
see [53] and a simplified version in [67].


Finally, there has been work where the noisy channel is modeled
as a computationally bounded adversary (as opposed to an all-powerful
adversary), that must introduce the errors in time polynomial in the
block length. This is a very appealing model since it is a reasonable
hypothesis that natural processes can be implemented by efficient computation, and therefore real-world channels are, in fact, computationally bounded. The computationally bounded channel model was put
forth by Lipton [56]. Under standard cryptographic assumptions, it
has been shown that in the private key model where the sender and
recipient share a secret random seed, it is possible to decode correctly
from error rates higher than half-the-minimum-distance bound [21,48].
Recently, similar results were established in a much simpler cryptographic setting, assuming only that one-way functions exist, and that
the sender has a public key known to the receiver (and possibly to the
channel as well) [60].

1.3 The potential of list decoding

The number of codewords within Hamming distance pn of the worst-case received word r is clearly a lower bound on the runtime of any list decoder that corrects a fraction p of errors. Therefore, in order for a polynomial time list decoding algorithm to exist, the underlying codes must have the a priori combinatorial guarantee of being p-list-decodable, namely every Hamming ball of radius pn has a small number, say L(n), of codewords for some polynomially bounded function L(·).² This packing constraint poses a combinatorial upper bound on the rate of the code; specifically, it is not hard to prove that we must have R ≤ 1 − p, or otherwise the worst-case list size will grow faster than any polynomial in the block length n.

Remarkably, this simple upper bound can actually be met. In other words, for every p, 0 < p < 1, there exist codes of rate R = 1 − p − o(1) which are p-list-decodable. That is, non-constructively we can show the existence of codes of rate R that offer the potential of list decoding up to a fraction of errors approaching 1 − R. We will refer to the quantity 1 − R as the list decoding capacity. Note that the list decoding
² Throughout the survey, we will be dealing with the asymptotics of a family of codes of increasing block lengths with some fixed rate.

capacity is twice the fraction of errors that one could decode if we insisted on a unique answer always: quite a substantial gain! Since the message has Rn symbols, information-theoretically we need at least a fraction R of correct symbols at the receiving end to have any hope of recovering the message. Note that this lower bound applies even if we somehow knew the locations of the errors and could discard those misleading symbols. With list decoding, therefore, we can potentially reach this information-theoretic limit and decode as long as we receive slightly more than Rn correct symbols (the correct symbols can be located arbitrarily in the received word, with arbitrary noise affecting the remaining positions).

To realize this potential, however, we need an explicit description of such capacity-achieving list-decodable codes, and an efficient algorithm to perform list decoding up to the capacity (the combinatorics only guarantees that every Hamming ball of a certain radius has a small number of codewords, but does not suggest any efficient algorithm to actually find those codewords). The main technical result in this survey will achieve precisely this objective: we will give explicit codes of rate R with a polynomial time list decoding algorithm for a fraction 1 − R − ε of errors, for any desired ε > 0.
The above description was deliberately vague on the size of the alphabet Σ. The capacity 1 − R for codes of rate R applies in the limit of large alphabet size. It is also of interest to ask how well list decoding performs for codes over a fixed alphabet size q. For the binary (q = 2) case, to correct a fraction p of errors, list decoding offers the potential of communicating at rates up to 1 − H(p). This is exactly the capacity of the binary symmetric channel with cross-over probability p that we discussed earlier. With list decoding, therefore, we can deal with worst-case errors without any loss in rate. For binary codes, this remains a non-constructive result, and constructing explicit codes that achieve list decoding capacity remains a challenging goal.
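To make the gain over unique decoding concrete, here is a small sketch (the helper names are ours) comparing the unique-decoding radius (1 − R)/2 with the list-decoding capacity 1 − R in the large-alphabet limit:

```python
def unique_decoding_radius(R: float) -> float:
    """Largest error fraction correctable with a guaranteed unique answer."""
    return (1 - R) / 2

def list_decoding_capacity(R: float) -> float:
    """Error fraction approachable with small lists (large-alphabet limit)."""
    return 1 - R

for R in (0.25, 0.5, 0.75):
    u = unique_decoding_radius(R)
    l = list_decoding_capacity(R)
    print(f"rate {R:.2f}: unique radius {u:.3f}, list decoding radius {l:.3f}")
```

At every rate the list-decoding radius is exactly twice the unique-decoding radius, which is the factor-of-two gain described above.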

1.4 The origins of list decoding

List decoding was proposed in the late 1950s by Elias [13] and
Wozencraft [78]. Curiously, the original motivation in [13] for formulating list decoding was to prove matching upper and lower bounds on


the decoding error probability under maximum likelihood decoding on the binary symmetric channel. In particular, Elias showed that, when
the decoder is allowed to output a small list of candidate codewords
and a decoding error is declared only when the original codeword is not
on the output list, the average error probability of all codes is almost
as good as that of the best code, and in fact almost all codes are almost
as good as the best code. Despite its origins in the Shannon stochastic
school, it is interesting that list decoding ends up being the right notion
to realize the true potential of coding in the Hamming combinatorial
school, against worst-case errors.
Even though the notion of list decoding dates back to the late 1950s, it was revived with an algorithmic focus only recently, beginning with the Goldreich–Levin algorithm [17] for list decoding Hadamard codes, and Sudan's algorithm in the mid 1990s for list decoding Reed–Solomon codes [69]. It is worth pointing out that this modern revival of list decoding was motivated by questions in computational complexity theory. The Goldreich–Levin work was motivated by constructing hard-core predicates, which are of fundamental interest in complexity theory and cryptography. The motivation for decoding Reed–Solomon and related polynomial-based codes was (at least partly) establishing worst-case to average-case reductions for problems such as the permanent. These and other more recent connections between coding theory (and specifically, list decoding) and complexity theory are surveyed in [29, 70, 74] and [28, Chapter 12].

1.5 Scope and organization of the book

The goal of this survey is to obtain algorithmic results in list decoding. The main technical focus will be on giving a complete presentation of the recent algebraic results achieving list decoding capacity. We will only provide pointers or brief descriptions for other works on list decoding.

The survey is divided into two parts. The first part (Chapters 1–5) covers the general literature, and the second part focuses on achieving list decoding capacity. The author's Ph.D. dissertation [28] provides a more comprehensive treatment of list decoding. In comparison with [28], most of Chapter 5 and the entire Part II of this survey discuss material developed since [28].
We now briefly discuss the main technical contents of the various chapters. The basic terminology and definitions are described in Chapter 2. Combinatorial results which identify the potential of list decoding in an existential, non-constructive sense are presented in Chapter 3. In particular, these results will establish the capacity of list decoding (over large alphabets) to be 1 − R. We begin the quest for explicitly and algorithmically realizing the potential of list decoding in Chapter 4, which discusses a list decoding algorithm for Reed–Solomon (RS) codes; the algorithm is based on bivariate polynomial interpolation. We conclude the first part with a brief discussion in Chapter 5 of algorithmic results for list decoding certain codes based on expander graphs.

In Chapter 6, we discuss folded Reed–Solomon codes, which are RS codes viewed as a code over a larger alphabet. We present a decoding algorithm for folded RS codes that uses multivariate interpolation plus some other algebraic ideas concerning finite fields. This lets us approach list decoding capacity. Folded RS codes are defined over a polynomially large alphabet, and in Chapter 7 we discuss techniques that let us bring down the alphabet size to a constant independent of the block length. We conclude with some notable open questions in Chapter 8.

2 Definitions and Terminology

2.1 Basic coding terminology

We review the terminology that will be needed and used throughout the survey. Facility with basic algebra concerning finite fields and field extensions is assumed, though we recap some basic notation and facts at the end of this chapter. Some comfort with probabilistic and combinatorial arguments, and analysis of algorithms, would be a plus.

Let Σ be a finite alphabet. An error-correcting code, or simply code, over the alphabet Σ, is a subset of Σ^n for some positive integer n. The elements of the code are referred to as codewords. The number n is called the block length of the code. If |Σ| = q, we say that the code is q-ary, with the term binary used for the q = 2 case. Note that the alphabet size q may be a function of the block length. Associated with an error-correcting code C is an encoding function E : {1, 2, . . . , |C|} → Σ^n that maps a message to its associated codeword. Sometimes, we find it convenient to abuse notation and identify the encoding function with the code, viewing the code itself as a map from messages to codewords.

The rate of a q-ary error-correcting code is defined to be (log_q |C|)/n. The rate measures the amount of actual information transmitted per
bit of channel use. The larger the rate, the less redundant the encoding. The minimum distance (or simply distance) of a code C is equal to the minimum Hamming distance between two distinct codewords of C. It is often convenient to measure distance by a normalized quantity in the range [0, 1] called the minimum relative distance (or simply relative distance), which equals the ratio of the minimum distance to the block length. A large relative distance bestows good error-correction potential on a code. Indeed, if a codeword is corrupted in less than a fraction δ/2 of the positions, where δ is the relative distance, then it may be correctly recovered as the unique codeword that is closest to the received word in Hamming distance.
A word on the asymptotics is due. We will be interested in families of codes with increasing block lengths, all of which have rate bounded below by an absolute constant R > 0 (i.e., the rate does not tend to 0 as the block length grows). Moreover, we would also like the relative distance of all codes in the family to be bounded below by some absolute constant δ > 0. If the alphabet of the code family is fixed (independent of the block length), such code families are said to be asymptotically good. There is clearly a trade-off between the rate R and relative distance δ. The Singleton bound is a simple bound which says that R ≤ 1 − δ. This bound is achieved by certain codes (called MDS codes) over large alphabets (with size growing with the block length), but there are tighter bounds known for codes over alphabets of fixed constant size. For example, for q-ary codes, the Plotkin bound states that R ≤ 1 − (q/(q−1))δ for 0 ≤ δ < 1 − 1/q, and the rate must vanish for δ ≥ 1 − 1/q, so that one must have δ < 1 − 1/q in order to have positive rate. The Gilbert–Varshamov bound asserts that there exist q-ary codes with rate R ≥ 1 − Hq(δ), where Hq(x) is the q-ary entropy function defined as Hq(x) = x logq(q − 1) − x logq x − (1 − x) logq(1 − x). The Elias–Bassalygo upper bound states that R ≤ 1 − Hq(J(δ, q)), where J(δ, q) = (1 − 1/q)(1 − √(1 − qδ/(q−1))). We will run into the quantity J(δ, q) in Chapter 3 when we discuss the Johnson bound on list decoding radius.
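These bounds are easy to tabulate numerically; the following sketch (our own helper names, not from the survey) encodes each bound as a function of the relative distance δ:

```python
import math

def Hq(x: float, q: int) -> float:
    """q-ary entropy: x*log_q(q-1) - x*log_q(x) - (1-x)*log_q(1-x)."""
    if x == 0.0:
        return 0.0
    def lg(y: float) -> float:
        return math.log(y, q)
    return x * lg(q - 1) - x * lg(x) - (1 - x) * lg(1 - x)

def johnson_radius(delta: float, q: int) -> float:
    """J(delta, q) = (1 - 1/q) * (1 - sqrt(1 - q*delta/(q-1)))."""
    return (1 - 1 / q) * (1 - math.sqrt(1 - q * delta / (q - 1)))

def singleton(delta: float) -> float:                  # rate upper bound
    return 1 - delta

def plotkin(delta: float, q: int) -> float:            # rate upper bound
    return max(0.0, 1 - q * delta / (q - 1))

def gilbert_varshamov(delta: float, q: int) -> float:  # achievable rate
    return 1 - Hq(delta, q)

def elias_bassalygo(delta: float, q: int) -> float:    # rate upper bound
    return 1 - Hq(johnson_radius(delta, q), q)

# For binary codes at delta = 0.1, the achievable GV rate sits below the
# Elias-Bassalygo upper bound, which in turn is below the Singleton bound.
d, q = 0.1, 2
print(gilbert_varshamov(d, q), elias_bassalygo(d, q), singleton(d))
```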
A particularly important subclass of codes are linear codes. A linear code is defined over an alphabet that is a finite field, say F. A linear code of block length n is simply a subspace of F^n (viewed as a vector space over F). The dimension of C as a vector space is called the dimension of the code. Note that if the dimension of C is k then |C| = |F|^k and the rate of C equals k/n. The encoding function of a linear code can be viewed as a linear transformation E : F^k → F^n, where E(x) = Gx for a matrix G ∈ F^{n×k} called the generator matrix.
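Encoding in a linear code is thus just a matrix-vector product over the field. A minimal sketch over F_2, with a made-up [6, 3] generator matrix chosen for illustration:

```python
# Generator matrix G (n x k = 6 x 3) of a toy [6, 3] binary linear code:
# the first three rows copy the message bits, the last three are parities.
G = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 1],
]

def encode(G, x):
    """E(x) = Gx over F_2: codeword bit i is the inner product <G_i, x> mod 2."""
    return [sum(g * xi for g, xi in zip(row, x)) % 2 for row in G]

print(encode(G, [1, 0, 1]))  # one codeword of this rate-1/2 code
```

Since the map is linear, the sum (mod 2) of two codewords is again a codeword.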

2.2 Definitions for list decoding

The problem of list decoding a code C of block length n up to a fraction p of errors (or radius p) is the following: given a received word, output the list of all codewords c ∈ C within Hamming distance pn from it. To perform this task in worst-case polynomial time for a family of codes, we need the a priori combinatorial guarantee that the output list size will be bounded by a polynomial in the block length, irrespective of which word is received. This motivates the following definition.
Definition 2.1. ((p, L)-list-decodability) For 0 < p < 1 and an integer L ≥ 1, a code C ⊆ Σ^n is said to be list decodable up to a fraction p of errors with list size L, or more succinctly (p, L)-list-decodable, if for every y ∈ Σ^n, the number of codewords c ∈ C within Hamming distance pn from y is at most L. For a function L̃ : Z+ → Z+ and 0 < p < 1, a family of codes is said to be (p, L̃)-list-decodable if every code C in the family is (p, L̃(n))-list-decodable, where n is the block length of C. When the function L̃ takes on the constant value L for all block lengths, we simply say that the family is (p, L)-list-decodable.
Note that a code being (p, 1)-list-decodable is equivalent to saying that its relative distance is greater than 2p. We will also need the following definition concerning a generalization of list decoding.
Definition 2.2. (List recovering) For 0 ≤ p < 1 and integers 1 ≤ ℓ ≤ L, a code C ⊆ Σ^n is said to be (p, ℓ, L)-list-recoverable if for all sequences of subsets S_1, S_2, . . . , S_n ⊆ Σ with each S_i satisfying |S_i| ≤ ℓ, there are at most L codewords c = (c_1, . . . , c_n) ∈ C with the property that c_i ∈ S_i for at least (1 − p)n values of i ∈ {1, 2, . . . , n}. The value ℓ is referred to as the input list size. A similar definition applies to families of codes.
Note that a code being (p, 1, L)-list-recoverable is the same as it being (p, L)-list-decodable. For (p, ℓ, L)-list-recovering with ℓ > 1, even a noise fraction p = 0 leads to non-trivial problems. In this noise-free (i.e., p = 0) case, for each location we are given ℓ possibilities, one of which is guaranteed to match the codeword, and the objective is to find all codewords for which this property holds. A special case of such noise-free list recovering is when ℓ codewords are given in scrambled order and the goal is to recover all of them. We will consider this toy decoding problem for Reed–Solomon codes in Chapter 4, and in fact it will be the stepping stone for our list decoding algorithms.
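Both definitions can be checked by brute force for tiny codes; a sketch (exponential in general, for illustration only):

```python
def hamming(x, y):
    """Number of positions in which the two words differ."""
    return sum(a != b for a, b in zip(x, y))

def list_decode(code, received, p):
    """All codewords within Hamming distance p*n of the received word."""
    n = len(received)
    return [c for c in code if hamming(c, received) <= p * n]

def list_recover(code, sets, p):
    """All codewords c with c_i in S_i for at least (1 - p)*n positions i."""
    n = len(sets)
    return [c for c in code
            if sum(ci in Si for ci, Si in zip(c, sets)) >= (1 - p) * n]

# The length-4 binary repetition code: a received word midway between the
# two codewords forces an output list of size 2 at radius p = 1/2.
code = [(0, 0, 0, 0), (1, 1, 1, 1)]
print(list_decode(code, (0, 0, 1, 1), 0.5))
# List recovering with input list size 2 at every position and p = 0
# returns every codeword consistent with the sets.
print(list_recover(code, [{0, 1}] * 4, 0.0))
```

Note that `list_recover` with all sets of size 1 coincides with `list_decode`, matching the remark that (p, 1, L)-list-recovering equals (p, L)-list-decoding.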

2.3 Useful code families

We now review some of the code constructions that will be heavily used in this survey. We begin with the class of Reed–Solomon codes, which are an important, classical family of algebraic error-correcting codes.
Definition 2.3. A Reed–Solomon code, RS_{F,S}[n, k], is parameterized by integers n, k satisfying 1 ≤ k ≤ n, a finite field F of size at least n, and a tuple S = (α_1, α_2, . . . , α_n) of n distinct elements from F. The code is described as a subset of F^n as:

RS_{F,S}[n, k] = {(p(α_1), p(α_2), . . . , p(α_n)) | p(X) ∈ F[X] is a polynomial of degree ≤ k}.

In other words, the message is viewed as a polynomial, and it is encoded by evaluating the polynomial at n distinct field elements α_1, . . . , α_n. The resulting code is linear of dimension k + 1, and its minimum distance equals n − k, which is the best possible for dimension k + 1 (attains the Singleton bound).
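The encoding map is plain polynomial evaluation. A minimal sketch over the prime field F_17 (the modulus and parameters here are illustrative; practical deployments typically use fields of characteristic 2):

```python
P = 17  # field size; must be at least the block length n

def poly_eval(coeffs, x):
    """Evaluate sum_j coeffs[j] * x^j at x modulo P, via Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def rs_encode(msg, points):
    """Encode a message (k+1 polynomial coefficients) by evaluating at n points."""
    return [poly_eval(msg, a) for a in points]

points = list(range(10))                 # n = 10 distinct elements of F_17
codeword = rs_encode([5, 3, 1], points)  # p(X) = 5 + 3X + X^2, so k = 2
print(codeword)
```

Two distinct polynomials of degree at most 2 agree on at most 2 of the 10 points, so any two codewords here differ in at least n − k = 8 positions.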
Concatenated codes: Reed–Solomon codes are defined over a large alphabet, one of size at least the block length of the code. A simple, but powerful technique called code concatenation can be used to construct codes over smaller alphabets starting with codes over a larger alphabet.
The basic idea behind code concatenation is to combine two codes, an outer code C_out over a larger alphabet (of size Q, say), and an inner code C_in with Q codewords over a smaller alphabet (of size q, say), to get a combined q-ary code that, loosely speaking, inherits the good features of both the outer and inner codes. These were introduced by Forney [15] in a classic work. The basic idea is very natural: to encode a message using the concatenated code, we first encode it using C_out, and then in turn encode each of the resulting symbols into the corresponding codeword of C_in. Since there are Q codewords in C_in, the encoding procedure is well defined. Note that the rate of the concatenated code is the product of the rates of the outer and inner codes.
The big advantage of concatenated codes for us is that we can get
a good list decodable code over a small alphabet (say, binary codes)
based on a good list decodable outer code over a large alphabet (like
a Reed–Solomon code) and a suitable binary inner code. The block
length of the inner code is small enough to permit a brute-force search
for a good code in reasonable time.
Code concatenation works rather naturally in conjunction with list recovering of the outer code to give algorithmic results for list decoding. The received word for the concatenated code is broken into blocks corresponding to the inner encodings of the various outer symbols. These blocks are list decoded, using a brute-force inner decoder, to produce a small set of candidates for each symbol of the outer codeword. These sets can then be used as input to a list recovering algorithm for the outer code to complete the decoding. It is not difficult to prove the following based on the above algorithm:

Lemma 2.1. If the outer code is (p_1, ℓ, L)-list-recoverable and the inner code is (p_2, ℓ)-list-decodable, then the concatenated code is (p_1·p_2, L)-list-decodable.
Code concatenation and list recovering will be important tools for us in Chapter 7 where we will construct codes approaching capacity over a fixed alphabet.

2.4 Basic finite field algebra
We recap basic facts and notation concerning finite fields. For any prime p, the set of integers modulo p forms a field, which we denote by Fp.

The ring of univariate polynomials in variable X with coefficients from a field F is denoted by F[X]. A polynomial f(X) is said to be irreducible over F if f(X) = r(X)s(X) for r(X), s(X) ∈ F[X] implies that either r(X) or s(X) is a constant polynomial. A polynomial is said to be monic if its leading coefficient is 1. The ring F[X] has unique factorization: every monic polynomial can be written uniquely as a product of monic irreducible polynomials.

If h(X) is an irreducible polynomial of degree e over F, then the quotient ring F[X]/(h(X)), consisting of polynomials modulo h(X), is a finite field with |F|^e elements (just as Fp = Z/(p) is a field, where Z is the ring of integers and p is a prime). The field F[X]/(h(X)) is called an extension field of degree e over F; the extension field also forms a vector space of dimension e over F.

The prime fields Fp, and their extensions as defined above, yield all finite fields. The size of a finite field is thus always a prime power. The characteristic of a finite field equals p if it is an extension of the prime field Fp. Conversely, for every prime power q, there is a unique (up to isomorphism) finite field Fq. We denote by Fq* the set of nonzero elements of Fq. It is known that Fq* is a cyclic group (under the multiplication operation), generated by some γ ∈ Fq*, so that Fq* = {1, γ, γ², ..., γ^(q-2)}. Any such γ is called a primitive element. There are in fact φ(q-1) such primitive elements, where φ(q-1) is the number of positive integers less than q-1 that are relatively prime to q-1.

We owe a lot to the following basic property of fields: Let f(X) ∈ F[X] be a nonzero polynomial of degree d. Then f(X) has at most d roots in F.
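To make the recap concrete, here is a small self-contained check (my own example, not from the text) that builds the degree-2 extension F_9 = F_3[X]/(X^2 + 1) and verifies that F_9* is cyclic with φ(8) = 4 primitive elements:

```python
# X^2 + 1 is irreducible over F_3 (neither 0, 1, nor 2 squares to -1 = 2 mod 3),
# so F_3[X]/(X^2 + 1) is a field with 3^2 = 9 elements, each written a + b*X.
P = 3

def mul(u, v):
    """Multiply a + bX and c + dX modulo X^2 + 1 (so X^2 = -1)."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

elements = [(a, b) for a in range(P) for b in range(P)]
nonzero = [e for e in elements if e != (0, 0)]

def order(g):
    """Multiplicative order of g in F_9*."""
    x, k = g, 1
    while x != (1, 0):
        x, k = mul(x, g), k + 1
    return k

# F_9* is cyclic of size 8; the primitive elements are those of order 8,
# and there are phi(8) = 4 of them.
primitives = [g for g in nonzero if order(g) == 8]
```

The polynomial X itself turns out to have order 4 here, so it is not primitive, while 1 + X is.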

3 Combinatorics of List Decoding

In this chapter, we prove combinatorial results concerning list-decodable codes, and study the relation between the list decodability of a code and its other basic parameters such as minimum distance and rate. We will show that every code can be list decoded using small lists beyond half its minimum distance, up to a bound we call the Johnson radius. We will also prove existential results for codes that will highlight the sort of rate vs. list decoding radius trade-off one can hope for. Specifically, we will prove the existence of (p, L)-list-decodable codes of good rate and thereby pinpoint the capacity of list decoding. This then sets the stage for the goal of realizing or coming close to these trade-offs with explicit codes, as well as designing efficient algorithms for decoding up to the appropriate list decoding radius.

3.1 The Johnson bound

If a code has distance d, every Hamming ball of radius less than d/2 has at most one codeword. The list decoding radius for a list size of 1 thus equals ⌊(d-1)/2⌋. Could it be that already at radius slightly greater than d/2 we can have a large number of codewords within some Hamming ball? We will prove a result called the Johnson bound which rules out such a phenomenon. It highlights the potential of list decoding with small lists up to a radius, which we call the Johnson radius, that is much larger than d/2. In turn, this raises the algorithmic challenge of decoding well-known codes such as Reed–Solomon codes up to the Johnson radius, a task we will undertake in the next chapter.

The Johnson bound is a classical bound in coding theory and is at the heart of the Elias–Bassalygo bound on rate as a function of relative distance. The Johnson bound was stated and proved in the context of list decoding, in a form similar to that presented here, in [20, Section 4.1]. A simpler geometric proof of the Johnson bound appears in [28, Chapter 3]. Here, for the sake of variety, we present a combinatorial proof that has not appeared in this form before. The proof was shown to us by Jaikumar Radhakrishnan [64]. In the following, we use Δ(a, b) to denote the Hamming distance between strings a and b. We use B(r, e) (or B_q(r, e) if we want to make the alphabet size explicit) to denote the Hamming ball of radius e centered at r. For a set S, the notation $\binom{S}{2}$ stands for all subsets of S of size 2.
Theorem 3.1. (Johnson bound) Suppose r ∈ [q]^n, and B ⊆ [q]^n. Let

$$d = \mathop{\mathbb{E}}_{\{x,y\}\in\binom{B}{2}}[\Delta(x,y)]; \qquad e = \mathop{\mathbb{E}}_{x\in B}[\Delta(r,x)].$$

Then,

$$|B| \;\le\; \frac{\frac{q}{q-1}\cdot\frac{d}{n}}{\left(1-\frac{q}{q-1}\cdot\frac{e}{n}\right)^{2} - \left(1-\frac{q}{q-1}\cdot\frac{d}{n}\right)}\,,$$

provided the denominator is positive.

Corollary 3.2. Let C be any q-ary code of block length n and minimum distance $d = \left(1-\frac{1}{q}\right)(1-\gamma)n$ for some γ ∈ (0, 1). Let $e = \left(1-\frac{1}{q}\right)(1-\varepsilon)n$ for some ε ∈ (0, 1) be an integer. Suppose ε² > γ. Then, for all r ∈ [q]^n,

$$|B_q(r,e)\cap C| \;\le\; \frac{1-\gamma}{\varepsilon^2-\gamma}\,.$$


Proof. Let B = B_q(r, e) ∩ C. Define γ′ and ε′ by

$$\mathop{\mathbb{E}}_{\{x,y\}\in\binom{B}{2}}[\Delta(x,y)] = \left(1-\frac{1}{q}\right)(1-\gamma')n; \qquad \mathop{\mathbb{E}}_{x\in B}[\Delta(r,x)] = \left(1-\frac{1}{q}\right)(1-\varepsilon')n.$$

Since the pairwise distances within B are at least d and the distances from r are at most e, we have γ′ ≤ γ < ε² ≤ ε′², and by Theorem 3.1,

$$|B_q(r,e)\cap C| \;\le\; \frac{1-\gamma'}{\varepsilon'^2-\gamma'} \;\le\; \frac{1-\gamma'}{\varepsilon^2-\gamma'} \;\le\; \frac{1-\gamma}{\varepsilon^2-\gamma}\,,$$

where the last step uses that $x \mapsto \frac{1-x}{\varepsilon^2-x}$ is increasing for x < ε².

The bound $n\left(1-\frac{1}{q}\right)\left(1-\sqrt{1-\frac{q}{q-1}\cdot\frac{d}{n}}\right)$ is called the (q-ary) Johnson radius; in every q-ary code of relative distance d/n, every Hamming ball of this radius is guaranteed to have few codewords.
Proof. (of Theorem 3.1) To keep the notation simple, we will assume that the alphabet is {0, 1, ..., q-1} and that r = 0^n. Let M = |B|. Pick distinct codewords x = (x_1, ..., x_n), y = (y_1, ..., y_n) ∈ B uniformly at random. We will obtain a lower bound (in terms of e and M) on the expected number of coordinates where x and y agree. We know that this expectation is n - d. The theorem will follow by comparing these two quantities.

For i ∈ [n] and α ∈ [q], let k_i(α) = |{x ∈ B : x_i = α}|. Note that $\sum_{\alpha\in[q]} k_i(\alpha) = M$. Also, k_i(0) is the number of codewords in B that agree with r at location i. Thus, for i = 1, 2, ..., n,

$$\Pr[x_i = y_i] \;=\; \binom{M}{2}^{-1}\sum_{\alpha\in[q]}\binom{k_i(\alpha)}{2} \;=\; \binom{M}{2}^{-1}\left[\binom{k_i(0)}{2} + \sum_{\alpha=1}^{q-1}\binom{k_i(\alpha)}{2}\right] \;\ge\; \binom{M}{2}^{-1}\left[\binom{k_i(0)}{2} + (q-1)\binom{\frac{M-k_i(0)}{q-1}}{2}\right],$$

using Jensen's inequality. Then, the expected number of coordinates where x and y agree is

$$\sum_{i=1}^{n}\Pr[x_i = y_i] \;\ge\; \binom{M}{2}^{-1}\sum_{i=1}^{n}\left[\binom{k_i(0)}{2} + (q-1)\binom{\frac{M-k_i(0)}{q-1}}{2}\right] \;\ge\; n\,\binom{M}{2}^{-1}\left[\binom{k}{2} + (q-1)\binom{\frac{M-k}{q-1}}{2}\right], \tag{3.1}$$

using Jensen's inequality again, where $k = \frac{1}{n}\sum_i k_i(0) = M\cdot\frac{n-e}{n}$. This expectation is exactly n - d. Thus,

$$n - d \;\ge\; n\,\binom{M}{2}^{-1}\left[\binom{k}{2} + (q-1)\binom{\frac{M-k}{q-1}}{2}\right],$$

i.e.,

$$M(M-1)\left(1-\frac{d}{n}\right) \;\ge\; k(k-1) + (M-k)\left(\frac{M-k}{q-1}-1\right).$$

Substituting $1-\frac{e}{n}$ for $\frac{k}{M}$, the above is equivalent to

$$(M-1)\left(1-\frac{d}{n}\right) \;\ge\; \left(1-\frac{e}{n}\right)^{2}M + \frac{e^2}{(q-1)n^2}\,M - 1,$$

which upon rearranging gives

$$M \;\le\; \frac{\frac{q}{q-1}\cdot\frac{d}{n}}{\left(1-\frac{q}{q-1}\cdot\frac{e}{n}\right)^{2} - \left(1-\frac{q}{q-1}\cdot\frac{d}{n}\right)}\,.$$

We can also state the following alphabet-independent version of the Johnson bound:

Theorem 3.3. (Large alphabet) Let C be a q-ary code of block length n and distance d. Suppose r ∈ [q]^n and (n - e)² > n(n - d). Then,

$$|B(r,e)\cap C| \;\le\; \frac{nd}{(n-e)^2 - n(n-d)}\,.$$

In particular, if (n - e)² ≥ n(n - d) + 1, then |B(r, e) ∩ C| ≤ n².


Proof. The denominator of the upper bound on |B| in Theorem 3.1 equals

$$\frac{q}{q-1}\left(\frac{q}{q-1}\cdot\frac{e^2}{n^2} - \frac{2e}{n} + \frac{d}{n}\right) \;\ge\; \frac{q}{q-1}\left(\left(1-\frac{e}{n}\right)^{2} - \left(1-\frac{d}{n}\right)\right).$$

Hence it follows that $|B| \le \frac{nd}{(n-e)^2 - n(n-d)}$.

Theorem 3.3 says that a code of relative distance δ can be list decoded up to a fraction $(1-\sqrt{1-\delta})$ of errors with polynomial size lists. Note that for δ → 1, the fraction of errors approaches 1, whereas with unique decoding, we can never correct more than a fraction 1/2 of errors.
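As a quick sanity check (mine, not part of the survey), Theorem 3.3 can be verified numerically by brute force on a small random binary code, over every center and every radius for which the bound applies:

```python
import itertools
import random

random.seed(1)
n = 9
codewords = random.sample(list(itertools.product([0, 1], repeat=n)), 12)
d = min(sum(a != b for a, b in zip(x, y))
        for x, y in itertools.combinations(codewords, 2))  # minimum distance

def check_johnson():
    """True iff every applicable ball obeys |B(r,e) ∩ C| <= nd / ((n-e)^2 - n(n-d))."""
    for r in itertools.product([0, 1], repeat=n):          # every center
        for e in range(n + 1):
            denom = (n - e) ** 2 - n * (n - d)
            if denom <= 0:                                 # theorem needs (n-e)^2 > n(n-d)
                continue
            in_ball = sum(sum(a != b for a, b in zip(r, c)) <= e for c in codewords)
            if in_ball * denom > n * d:                    # integer arithmetic, no rounding
                return False
    return True
```

The check passes for any code, of course; the point of running it is to see for which radii e the hypothesis (n - e)² > n(n - d) actually holds.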
The reader may wonder whether the Johnson bound is tight, or
whether it may be possible to improve it and show that for every
code of a certain relative distance, Hamming balls of radius somewhat larger than the Johnson radius still have polynomially many
codewords. It turns out that purely as a function of the distance, the
Johnson bound is the best possible. That is, there exist codes which
have super-polynomially many codewords in a ball of radius slightly
bigger than the Johnson radius. For general codes, this is shown by
an easy random coding argument which picks, randomly and independently, several codewords of weight slightly greater than the Johnson
radius [20, Section 4.3]. For linear codes, the result is harder to prove,
and was shown for the binary case in [24].
We remark that, in a certain precise sense, it is known that for most codes the Johnson bound is not tight and the list decoding radius for polynomial-sized lists far exceeds the Johnson radius. This follows from random coding arguments used in the proofs of Theorems 3.5 and 3.6 below.

3.2 The capacity of list decoding

We now turn to determining the trade-off between the rate of a code and its list decoding radius, i.e., the capacity of list decoding. Throughout, q will be the alphabet size and p the fractional error-correction


radius. The function H_q(x) denotes the q-ary entropy function, given by

$$H_q(x) = x\log_q(q-1) - x\log_q x - (1-x)\log_q(1-x). \tag{3.2}$$

The significance of the entropy function for us is the fact that asymptotically $q^{H_q(p)n}$ is a very good estimate of the volume $|B_q(0,pn)|$ of the q-ary Hamming ball of radius pn. Note that $|B_q(0,pn)| = \sum_{i=0}^{pn}\binom{n}{i}(q-1)^i$. It is known, see for example [75, Chapter 1], that for 0 < p < 1 - 1/q and growing n,

$$q^{H_q(p)n - o(n)} \;\le\; |B_q(0,pn)| \;\le\; q^{H_q(p)n}. \tag{3.3}$$
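The estimate (3.3) is easy to check numerically; the sketch below (parameters mine) compares the exact ball volume with $q^{H_q(p)n}$. For the lower bound I use the standard finite-n form $q^{H_q(p)n}/(n+1)$, one concrete way to instantiate the $q^{H_q(p)n - o(n)}$ statement.

```python
import math

def Hq(x, q):
    """The q-ary entropy function of (3.2), for 0 < x < 1."""
    return (x * math.log(q - 1, q) - x * math.log(x, q)
            - (1 - x) * math.log(1 - x, q))

def ball_volume(n, radius, q):
    """Exact volume of a q-ary Hamming ball of the given radius."""
    return sum(math.comb(n, i) * (q - 1) ** i for i in range(radius + 1))

q, n, p = 3, 60, 0.4            # note p < 1 - 1/q and pn is an integer
vol = ball_volume(n, int(p * n), q)
estimate = q ** (Hq(p, q) * n)  # upper bound of (3.3)
```

At these parameters the exact volume sits within a small polynomial factor of the entropy estimate, as (3.3) promises.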

We begin with a simple upper bound on the list decoding radius.

Theorem 3.4. Let q ≥ 2 be an integer, 0 < p < 1 - 1/q, and ε > 0. Then for all large enough n, if C is a q-ary code of block length n and rate 1 - H_q(p) + ε, then there is a center r ∈ [q]^n such that |B_q(r, pn) ∩ C| ≥ q^{εn/2}. In other words, C cannot be list decoded up to a fraction p of errors with polynomial sized lists.

Proof. Pick a random r ∈ [q]^n and consider the random variable Z = |B_q(r, pn) ∩ C|. Clearly, $\mathbb{E}[Z] = |C|\cdot\frac{|B_q(0,pn)|}{q^n}$, which is at least $q^{\varepsilon n - o(n)}$ using (3.3). Therefore, there must exist a center r such that $|B_q(r,pn)\cap C| \ge q^{\varepsilon n - o(n)} \ge q^{\varepsilon n/2}$.

It turns out that the above simple bound is in fact tight and a rate approaching 1 - H_q(p) can in fact be attained for correcting a fraction p of errors, as shown below.
Theorem 3.5. [14] For integers q, L ≥ 2 and every p ∈ (0, 1 - 1/q), there exists a family of (p, L)-list-decodable q-ary error-correcting codes of rate R satisfying R ≥ 1 - H_q(p) - 1/L.

Proof. The proof follows a standard random coding argument. Fix a large enough block length n; for simplicity assume that e = pn is an integer. The idea is to pick (with replacement) a random code of block length n consisting of M codewords, where M is a parameter that will be fixed later in the proof. We will show that with high probability the resulting code will be (p, L)-list decodable.

The probability that a fixed set of (L+1) codewords all lie in a fixed Hamming sphere (in the space [q]^n) of radius pn equals $(|B_q(0,pn)|/q^n)^{L+1}$. By (3.3), this probability is at most $q^{-(L+1)(1-H_q(p))n}$. Therefore, by a union bound, the probability that some tuple of (L+1) codewords all lie in some Hamming ball of radius pn is at most

$$\binom{M}{L+1}\, q^{n}\, q^{-(L+1)(1-H_q(p))n}.$$

If $M = q^{rn}$ for $r = 1 - H_q(p) - 1/(L+1)$, then the above probability is at most 1/(L+1)! < 1/3. Also, among the M chosen codewords there are at least M/2 distinct codewords with probability at least 1/2. Hence, there exists a (p, L)-list-decodable code with $q^{(1-H_q(p)-1/(L+1))n}/2 \ge q^{(1-H_q(p)-1/L)n}$ distinct codewords, or in other words with rate at least 1 - H_q(p) - 1/L, as claimed.
Remark 2. Using a variant of the above random coding argument together with expurgation, the rate lower bound can be improved slightly to 1 - H_q(p)(1 + 1/L); see [28, Theorem 5.5] for details.
The above two results imply that, over alphabets of size q, the optimal rate possible for list decoding to radius p is 1 - H_q(p). The proof of Theorem 3.5 is non-constructive, and the big challenge is to construct an explicit code with rate close to capacity. It turns out one can also approach capacity using a linear code, as the following result, which first appeared implicitly in the work of Zyablov and Pinsker [79], shows. This result is also non-constructive and is proved by picking a random linear code of the stated rate. We skip the proof here; the reader may find the detailed proof in [28, Theorem 5.6]. The key point is that in any L-tuple of distinct nonzero messages, there must be at least log_q(L+1) linearly independent messages, and these are mapped to completely independent codewords by a random linear code.

Theorem 3.6. For every prime power q ≥ 2, every p ∈ (0, 1 - 1/q), and every integer L ≥ 2, there exists a family of (p, L)-list-decodable q-ary linear error-correcting codes of rate

$$R \;\ge\; 1 - H_q(p) - \frac{1}{\log_q(L+1)}\,.$$

Remark 3. In order for the rate to be within ε of the capacity, the list size needed for linear codes as per Theorem 3.6 is exponentially worse than for general codes. This is not known to be inherent, and we suspect that it is not the case. We conjecture that a trade-off similar to Theorem 3.5 can also be obtained using linear codes. For the binary case q = 2, this was shown in [30] via a subtle use of the semi-random method. Generalizing this claim to larger alphabets has remained open. It is also known that for each fixed constant L, the rate of (p, L)-list-decodable binary codes is strictly less than 1 - H(p) [8], and so unbounded list size is needed to achieve the capacity of list decoding. A similar result should hold over all alphabets; for q-ary alphabets with q > 2 this has been shown assuming the convexity of a certain function [9]. Without any assumption, it is shown in [9] that for any fixed L, the rate of (p, L)-list decodable q-ary codes becomes zero for p strictly less than 1 - 1/q.
3.2.1 Implication for large alphabets

We now inspect the behavior of the list decoding capacity 1 - H_q(p) as the alphabet size q grows. We have

$$1 - H_q(p) = 1 - p + p\log_q\frac{q}{q-1} - \frac{H(p)}{\log q}\,,$$

where H(x) is the binary entropy function and log x denotes logarithm to the base 2. In the limit of large q, the list decoding capacity thus approaches 1 - p. In other words, we can list decode a code of rate R up to a fraction of errors approaching 1 - R, as mentioned in Chapter 1. Since $1 - H_q(p) \ge 1 - p - \frac{1}{\log q}$, we can get within ε of capacity, i.e., achieve a rate of 1 - p - ε for list decoding up to a fraction p of errors, over an alphabet of size $2^{O(1/\varepsilon)}$. Moreover, a list size of O(1/ε) suffices for this result. Conversely, if we fix p and let ε → 0, then an alphabet size of $2^{\Omega(1/\varepsilon)}$ is necessary in order to achieve a rate of 1 - p - ε for list decoding up to a fraction p of errors. We record this discussion below:

Corollary 3.7. Let 0 < R < 1, and let ε > 0 be sufficiently small. Then there exists a family of error-correcting codes of rate R over an alphabet of size $2^{O(1/\varepsilon)}$ that are (1 - R - ε, O(1/ε))-list-decodable. Conversely, for a fixed p and ε → 0, if there exists a family of q-ary (p, L)-list-decodable codes of rate 1 - p - ε, then $q \ge 2^{\Omega(1/\varepsilon)}$ (here the constant in the Ω(1/ε) depends on p).
The central goal in this survey will be to achieve the above result constructively, i.e., give explicit codes of rate R and efficient algorithms to list decode them up to a fraction (1 - R - ε) of errors. We will meet this goal, except we need a list size that is much larger than the existential O(1/ε) bound above (the alphabet size of our codes will be $2^{O(1/\varepsilon^4)}$, which is not much worse compared to the $2^{\Omega(1/\varepsilon)}$ lower bound).

4 Decoding Reed–Solomon Codes

In this chapter, we will present a list decoding algorithm for Reed–Solomon codes that can decode up to the Johnson radius, namely a fraction $1-\sqrt{1-\delta}$ of errors, where δ is the relative distance. Since δ = 1 - R for RS codes of rate R, this enables us to correct a fraction $1-\sqrt{R}$ of errors with rate R.

4.1 A toy problem: Decoding a mixture of two codewords

We begin with a simple setting, first considered in [5], that will motivate the main approach underlying the list decoding algorithm. Under list decoding, if we get a received word that has two closest codewords, we need to return both those codewords. In this section, we develop an algorithm that, given received words that are formed by mixing up two codewords, recovers those codewords. In a sense, this is the simplest form in which a list decoding style question arises.
4.1.1 Scrambling two codewords

Let C = RS_{F,S}[n, k] be the Reed–Solomon code obtained by evaluating degree ≤ k polynomials over a field F at a set S = {α_1, α_2, ..., α_n} of n distinct field elements. Let us assume k < n/2 for this section. Suppose we are given two unknown codewords, corresponding to two unknown polynomials p1(X) and p2(X), in a scrambled fashion as follows: For each i = 1, 2, ..., n, we are given a pair (a_i, b_i) such that either a_i = p1(α_i) and b_i = p2(α_i), or a_i = p2(α_i) and b_i = p1(α_i). The goal is to recover the polynomials p1(X) and p2(X) from this data.

For each i, since we know both p1(α_i) and p2(α_i) in some order, we can compute p1(α_i) + p2(α_i) = a_i + b_i, as well as p1(α_i)·p2(α_i) = a_i·b_i. In other words, we know the values of the polynomials S(X) := p1(X) + p2(X) and P(X) := p1(X)p2(X) at all the α_i's. Now the degree of both S(X) and P(X) is at most 2k < n. Therefore, using polynomial interpolation, we can completely determine both these polynomials from their values at all the α_i's.

Now consider the bivariate polynomial Q(X, Y) = Y² - S(X)Y + P(X). We just argued that we can compute the polynomial Q(X, Y). But clearly Q(X, Y) = (Y - p1(X))(Y - p2(X)), and therefore we can find p1(X) and p2(X) by factorizing the bivariate polynomial Q(X, Y) into its irreducible factors. In fact, we only need to find the roots of Q(X, Y) (treated as a polynomial in Y with coefficients from the ring F[X]). This task can be accomplished in polynomial time (details appear in Section 4.5).
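The interpolation phase of this toy decoder is short enough to write out. The sketch below (over F_101, a prime field and polynomials of my own choosing) recovers S = p1 + p2 and P = p1·p2 from the unordered pairs; the remaining step, extracting p1 and p2 as the Y-roots of Y² - S(X)Y + P(X), is the root-finding discussed above and is omitted here.

```python
P_MOD = 101  # any prime larger than n works; 101 is an arbitrary choice

def poly_mul(a, b):
    """Multiply two coefficient vectors (lowest degree first) mod P_MOD."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P_MOD
    return out

def lagrange(points):
    """Coefficients of the unique polynomial through the points, mod P_MOD."""
    m = len(points)
    result = [0] * m
    for i, (xi, yi) in enumerate(points):
        basis, denom = [1], 1
        for j, (xj, _) in enumerate(points):
            if j != i:
                basis = poly_mul(basis, [(-xj) % P_MOD, 1])
                denom = denom * (xi - xj) % P_MOD
        coef = yi * pow(denom, -1, P_MOD) % P_MOD
        for t, c in enumerate(basis):
            result[t] = (result[t] + coef * c) % P_MOD
    return result

def ev(coeffs, x):
    return sum(c * pow(x, i, P_MOD) for i, c in enumerate(coeffs)) % P_MOD

p1, p2 = [1, 2, 3], [5, 0, 1]   # 1 + 2X + 3X^2 and 5 + X^2, so k = 2
xs = list(range(1, 8))           # n = 7 > 2k evaluation points
# unordered pairs {p1(x), p2(x)}: the order is swapped at odd positions
pairs = [(ev(p2, x), ev(p1, x)) if x % 2 else (ev(p1, x), ev(p2, x)) for x in xs]

S = lagrange([(x, (a + b) % P_MOD) for x, (a, b) in zip(xs, pairs)])  # = p1 + p2
Pr = lagrange([(x, a * b % P_MOD) for x, (a, b) in zip(xs, pairs)])   # = p1 * p2
```

The recovered S and Pr come out as the coefficient vectors of p1 + p2 and p1·p2 (padded with zeros), independent of how each pair was ordered.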
4.1.2 Mixing two codewords

We now consider a related question, where for each codeword position i, we are given a value y_i which equals either p1(α_i) or p2(α_i) (and we do not know which value we are given). Given such a mixture of two codewords, our goal is to identify p1(X) and p2(X). Now clearly, we may only be given the value p1(α_i) for all i, and in this case we have no information about p2(X). Under the assumption k < n/6, the algorithm below will identify both the polynomials if they are both well represented in the following sense: for both j = 1, 2, we have p_j(α_i) = y_i for at least 1/3 of the α_i's.

The following simple observation sets the stage for the algorithm: For each i ∈ [n], (y_i - p1(α_i))(y_i - p2(α_i)) = 0. In other words, the polynomial Q(X, Y) = (Y - p1(X))(Y - p2(X)) satisfies Q(α_i, y_i) = 0 for every i ∈ [n].

Unlike the previous section, we are now no longer able to construct the polynomial Q(X, Y) directly by computing the sum and product of p1(X) and p2(X). Instead, we will directly interpolate a polynomial Q̂(X, Y) that satisfies Q̂(α_i, y_i) = 0 for all i ∈ [n], with the hope that this will help us to find Q(X, Y). Of course, doing this without any restriction on the degree of Q̂ is useless (for instance, we could then just take $\hat{Q}(X,Y) = \prod_{i=1}^{n}(X-\alpha_i)$, which reveals no information about p1(X) or p2(X)).

From the existence of Q(X, Y) = (Y - p1(X))(Y - p2(X)), which vanishes at all the pairs (α_i, y_i), we can require that Q̂(X, Y) be of the form

$$\hat{Q}(X,Y) = Y^2 + Y\sum_{j=0}^{k} q_{1j}X^{j} + \sum_{j=0}^{2k} q_{2j}X^{j}. \tag{4.1}$$

The conditions Q̂(α_i, y_i) = 0 pose a system of n linear equations in the variables q_{1j}, q_{2j}. From the existence of Q(X, Y), we know this system has a solution. Therefore, we can interpolate a Q̂(X, Y) of the form (4.1) by solving this linear system. The following simple lemma shows the utility of any such polynomial Q̂ that we find.
Lemma 4.1. If p(X) is a polynomial of degree at most k < n/6 such that p(α_i) = y_i for at least n/3 values of i ∈ [n], then Q̂(X, p(X)) ≡ 0, or in other words Y - p(X) is a factor of Q̂(X, Y).

Proof. Define the univariate polynomial R(X) := Q̂(X, p(X)). Let S = {i | p(α_i) = y_i, 1 ≤ i ≤ n}. For each i ∈ S, we have R(α_i) = Q̂(α_i, p(α_i)) = Q̂(α_i, y_i) = 0. Since the α_i's are distinct, R(X) has at least |S| ≥ n/3 roots. On the other hand, the degree of R(X) is at most 2k < n/3. The polynomial R(X) has more roots than its degree, which implies that it must be the zero polynomial.

Corollary 4.2. If each received symbol y_i equals either p1(α_i) or p2(α_i), and moreover y_i = p_j(α_i) for at least n/3 values of i ∈ [n] for each j = 1, 2, then the interpolated polynomial Q̂(X, Y) equals (Y - p1(X))(Y - p2(X)).

Therefore, once we find Q̂(X, Y), we can recover the solutions p1(X) and p2(X) by a root-finding step.

4.2 Reed–Solomon list decoding

We now turn to the problem of list decoding Reed–Solomon codes. In this section, we will describe an algorithm due to Sudan [69] that decodes close to a fraction 1 of errors for low rates. Let C_RS be an [n, k+1, n-k]_q Reed–Solomon code over a field F of size q ≥ n, with a degree ≤ k polynomial p(X) ∈ F[X] being encoded as (p(α_1), p(α_2), ..., p(α_n)). The problem of decoding such an RS code up to e errors reduces to the following polynomial reconstruction problem with agreement parameter t = n - e:

Problem 4.1. (Polynomial reconstruction)
Input: Integers k, t, and n distinct pairs {(α_i, y_i)}_{i=1}^{n} where α_i, y_i ∈ F.
Output: A list of all polynomials p(X) ∈ F[X] of degree at most k which satisfy p(α_i) = y_i for at least t values of i ∈ [n].
Note that the polynomial reconstruction problem is more general than RS list decoding since the α_i's are not required to be distinct.

Definition 4.1. ((1, k)-weighted degree) For a polynomial Q(X, Y) ∈ F[X, Y], its (1, k)-weighted degree is defined to be the maximum value of ℓ + kj taken over all monomials X^ℓ Y^j that occur with a nonzero coefficient in Q(X, Y).

The following fact generalizes Lemma 4.1 and has an identical proof:

Lemma 4.3. Suppose Q(X, Y) is a nonzero polynomial with (1, k)-weighted degree at most D satisfying Q(α_i, y_i) = 0 for every i ∈ [n]. Let p(X) be a polynomial of degree at most k such that p(α_i) = y_i for at least t > D values of i ∈ [n]. Then Q(X, p(X)) ≡ 0, or in other words Y - p(X) is a factor of Q(X, Y).
In view of the above lemma, if we could interpolate a nonzero polynomial Q(X, Y) of (1, k)-weighted degree less than t satisfying Q(α_i, y_i) = 0 for all i ∈ [n], then we can find all polynomials that have agreement at least t with the points amongst the factors of Q(X, Y). In Section 4.1.2, we knew that such a polynomial Q(X, Y) of (1, k)-weighted degree 2k existed since (Y - p1(X))(Y - p2(X)) was an explicit example of such a polynomial. Note that once we are assured of the existence of such a polynomial, we can find one by solving a linear system. But now that the y_i's are arbitrary, how do we argue about the existence of a Q-polynomial of certain (1, k)-weighted degree? It turns out a simple counting argument is all that it takes to guarantee this.
Lemma 4.4. Given an arbitrary set of n pairs {(α_i, y_i)}_{i=1}^{n} from F × F, there exists a nonzero polynomial Q(X, Y) of (1, k)-weighted degree at most D satisfying Q(α_i, y_i) = 0 for all i ∈ [n] provided $\binom{D+2}{2} > kn$. Moreover, we can find such a Q(X, Y) in polynomial time by solving a linear system over F.

Proof. A polynomial Q(X, Y) with (1, k)-weighted degree at most D can be expressed as

$$Q(X,Y) = \sum_{j=0}^{\lfloor D/k\rfloor}\sum_{\ell=0}^{D-jk} q_{\ell j}\,X^{\ell}Y^{j}. \tag{4.2}$$

The conditions Q(α_i, y_i) = 0 for all i ∈ [n] give a system of n homogeneous linear equations in the unknowns q_{ℓj}. A nonzero solution to this system is guaranteed to exist if the number of unknowns, say U, exceeds the number n of equations (and if a nonzero solution exists, we can clearly find one in polynomial time by solving the above linear system). We turn to estimating U.

$$U = \sum_{j=0}^{\lfloor D/k\rfloor}\sum_{\ell=0}^{D-jk} 1 = \sum_{j=0}^{\lfloor D/k\rfloor}(D - jk + 1) = (D+1)\left(\left\lfloor\frac{D}{k}\right\rfloor+1\right) - \frac{k}{2}\left\lfloor\frac{D}{k}\right\rfloor\left(\left\lfloor\frac{D}{k}\right\rfloor+1\right) \;\ge\; \frac{(D+1)(D+2)}{2k}.$$

Thus if $\binom{D+2}{2} > kn$, we have U > n, and a nonzero solution exists to the linear system.
The above discussion motivates the following algorithm for polynomial reconstruction, for an agreement parameter $t > \sqrt{2kn}$:

Step 1: (Interpolation) Let $D = \lfloor\sqrt{2kn}\rfloor$. Find a nonzero polynomial Q(X, Y) of (1, k)-weighted degree at most D satisfying Q(α_i, y_i) = 0 for i = 1, 2, ..., n. (Lemma 4.4 guarantees the success of this step.)

Step 2: (Root finding/Factorization) Find all degree ≤ k polynomials p(X) such that Q(X, p(X)) ≡ 0. For each such polynomial, check if p(α_i) = y_i for at least t values of i ∈ [n], and if so, include p(X) in the output list. (Lemma 4.3 guarantees that this step will find all relevant polynomials with agreement $t > D = \lfloor\sqrt{2kn}\rfloor$.)

The following records the performance of this algorithm. The claim about the output list size follows since the number of factors Y - p(X) of Q(X, Y) is at most the degree of Q(X, Y) in Y, which is at most ⌊D/k⌋.

Theorem 4.5. The above algorithm solves the polynomial reconstruction problem in polynomial time if the agreement parameter t satisfies $t > \sqrt{2kn}$. The size of the list output by the algorithm never exceeds $\sqrt{2n/k}$.
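Both steps can be carried out directly at small scale. The following runnable sketch is a toy instance of my own (k = 1, n = 13 points over F_13, agreement t = 6, hence D = ⌊√(2kn)⌋ = 5): Step 1 finds a nonzero vector in the null space of the interpolation constraints, and Step 2 finds the lines dividing Q by brute force over all q² candidates, which is feasible at this size.

```python
P = 13  # the field F_13
D = 5   # = floor(sqrt(2*k*n)) for k = 1, n = 13

def nullspace_vector(rows, ncols):
    """A nonzero v with rows . v = 0 (mod P), via Gauss-Jordan elimination."""
    rows = [r[:] for r in rows]
    pivots, r = {}, 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, P)
        rows[r] = [x * inv % P for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                f = rows[i][c]
                rows[i] = [(x - f * y) % P for x, y in zip(rows[i], rows[r])]
        pivots[c], r = r, r + 1
    free = next(c for c in range(ncols) if c not in pivots)  # exists: ncols > rank
    v = [0] * ncols
    v[free] = 1
    for c, ri in pivots.items():
        v[c] = (-rows[ri][free]) % P
    return v

monomials = [(l, j) for j in range(D + 1) for l in range(D + 1 - j)]  # total degree <= D

def decode_lines(points, t):
    # Step 1: interpolate a nonzero Q with Q(x_i, y_i) = 0 for all points
    rows = [[pow(x, l, P) * pow(y, j, P) % P for (l, j) in monomials] for x, y in points]
    q = nullspace_vector(rows, len(monomials))
    Q = lambda x, y: sum(c * pow(x, l, P) * pow(y, j, P)
                         for c, (l, j) in zip(q, monomials)) % P
    # Step 2: Y - (a + bX) divides Q iff Q(x, a+bx) = 0 for all 13 field
    # elements (its degree is <= 5 < 13); keep lines with agreement >= t
    found = []
    for a in range(P):
        for b in range(P):
            if all(Q(x, (a + b * x) % P) == 0 for x in range(P)):
                if sum((a + b * x) % P == y for x, y in points) >= t:
                    found.append((a, b))
    return found

# two lines with agreement >= 6 planted among 13 points, plus one stray point
pts = ([(x, (3 + 2 * x) % P) for x in range(6)] +
       [(x, (7 + 10 * x) % P) for x in range(6, 12)] + [(12, 5)])
```

On this instance the output is exactly the two planted lines, Y = 3 + 2X and Y = 7 + 10X, as Lemma 4.3 guarantees.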
The above implies that a Reed–Solomon code of rate R can be list decoded to a fraction $1-\sqrt{2R}$ of errors using lists of size $O(\sqrt{1/R})$. In particular, for low-rate codes, we can efficiently correct close to a fraction 100% of errors! This qualitative feature is extremely useful in many areas of complexity theory, such as constructions of hardcore predicates from one-way functions, randomness extractors and pseudorandom generators, hardness amplification, transformations from worst-case to average-case hardness, etc.; we point to the surveys [29, 70, 74] and [28, Chapter 12] for further information.



Remark 4. (Unique decoding) By interpolating a nonzero polynomial of the form Q(X, Y) = A(X)Y + B(X), where the degrees of A(X), B(X) are at most ⌊(n-k-1)/2⌋ and ⌊(n+k-1)/2⌋, respectively, we can recover the unique polynomial f(X) of degree at most k, if one exists, that is a solution to the polynomial reconstruction problem for agreement parameter t > (n+k)/2. In fact, in this case the solution polynomial is just -B(X)/A(X). The correctness follows from the following two facts:

(i) Let e(X) be the error-locator polynomial of degree at most n - t < (n-k)/2 that has roots at all the error locations. Then the choice A(X) = e(X) and B(X) = -f(X)e(X) gives a nonzero polynomial Q(X, Y) meeting the degree restrictions and satisfying the interpolation conditions, and

(ii) the (1, k)-weighted degree of such a polynomial Q(X, Y) is at most ⌊(n-k-1)/2⌋ + k = ⌊(n+k-1)/2⌋ < t.

This gives a polynomial time unique decoding algorithm for RS codes that corrects up to a fraction (1 - R)/2 of errors. This form of the algorithm is due to Gemmell and Sudan [16], based on the approach of Welch and Berlekamp [77]. The first polynomial time algorithm for unique decoding RS codes was discovered as early as 1960 by Peterson [63].
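The unique decoder of this remark fits in a few dozen lines. Below is a sketch on a toy instance of my own choosing (F_13, k = 2, n = 13, so up to 5 errors are correctable); it interpolates Q(X, Y) = A(X)Y + B(X) by finding a nonzero solution of the homogeneous linear system, then outputs -B(X)/A(X) by exact polynomial division.

```python
P = 13  # the field F_13; the message is a polynomial of degree <= k = 2

def nullspace_vector(rows, ncols):
    """A nonzero v with rows . v = 0 (mod P), via Gauss-Jordan elimination."""
    rows = [r[:] for r in rows]
    pivots, r = {}, 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c]), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        inv = pow(rows[r][c], -1, P)
        rows[r] = [x * inv % P for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c]:
                f = rows[i][c]
                rows[i] = [(x - f * y) % P for x, y in zip(rows[i], rows[r])]
        pivots[c], r = r, r + 1
    free = next(c for c in range(ncols) if c not in pivots)
    v = [0] * ncols
    v[free] = 1
    for c, ri in pivots.items():
        v[c] = (-rows[ri][free]) % P
    return v

def wb_decode(points, k):
    n = len(points)
    dA, dB = (n - k - 1) // 2, (n + k - 1) // 2
    # one homogeneous equation A(x)*y + B(x) = 0 per received point
    rows = [[y * pow(x, j, P) % P for j in range(dA + 1)] +
            [pow(x, j, P) for j in range(dB + 1)] for x, y in points]
    v = nullspace_vector(rows, dA + dB + 2)
    A, negB = v[:dA + 1], [(-c) % P for c in v[dA + 1:]]
    # f = -B/A; the division is exact whenever a valid f exists
    degA = max(i for i, c in enumerate(A) if c)
    inv = pow(A[degA], -1, P)
    quot = [0] * (len(negB) - degA)
    for i in range(len(negB) - 1, degA - 1, -1):
        q = negB[i] * inv % P
        quot[i - degA] = q
        for j in range(degA + 1):
            negB[i - degA + j] = (negB[i - degA + j] - q * A[j]) % P
    return quot

f = [4, 3, 1]  # message polynomial 4 + 3X + X^2
codeword = [(x, sum(c * pow(x, i, P) for i, c in enumerate(f)) % P) for x in range(13)]
received = [(x, (y + 5) % P if x in (1, 4, 6, 9, 11) else y) for x, y in codeword]  # 5 errors
```

Here any nonzero solution of the linear system works: A must vanish at every error location, so -B/A returns the message polynomial exactly.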

4.3 Improved decoding via interpolation with multiplicities

4.3.1 Geometric motivation for approach

We now illustrate how the above interpolation-based decoding works using geometric examples. This will also motivate the idea of using multiplicities in the interpolation stage to improve the performance of the decoding algorithm.

To present the examples, we work over the field R of real numbers. The collection of pairs {(α_i, y_i) : 1 ≤ i ≤ n} thus consists of n points in the plane. We will illustrate how the algorithm finds all polynomials

Fig. 4.1 Example 1: The set of 14 input points. We assume that the center-most point is
the origin and assume a suitable scaling of the other points.

of degree k = 1, or in other words, lines, that pass through at least a certain number t of the n points.

Example 1: For the first example, we take n = 14 and t = 5. The 14 points on the plane are as in Figure 4.1.

We want to find all lines that pass through at least 5 of the above 14 points. Since k = 1, the (1, k)-weighted degree of a bivariate polynomial is simply its total degree. The first step of the algorithm must fit a nonzero polynomial Q(X, Y) such that Q(α_i, y_i) = 0 for all 14 points. By Lemma 4.4, we can find such a polynomial of total degree 4.

One choice of a degree 4 polynomial that passes through the above 14 points is Q(X, Y) = Y⁴ - X⁴ - Y² + X². To see this pictorially, let us plot the curve of all points on the plane where Q has zeroes. This gives Figure 4.2 below.

Note that the two relevant lines that pass through at least 5 points emerge in the picture. Algebraically, this corresponds to the fact that Q(X, Y) factors as Q(X, Y) = (X² + Y² - 1)(Y + X)(Y - X), and the last two factors correspond to the two lines that are the solutions. The fact that the above works correctly, i.e., the fact that the relevant lines must be factors of any degree 4 fit through the 14 points, is a consequence of Lemma 4.3 applied to this example (with the choice D = 4 and t = 5). Geometrically, this corresponds to the fact that if a line intersects a degree 4 curve in more than 4 points then the line must in fact be a factor of the curve.

Fig. 4.2 A degree 4 fit through the 14 points. The curve is given by the equation: Y⁴ - X⁴ - Y² + X² = 0.

Fig. 4.3 Example 2: The set of 10 input points.

Example 2: For the second example, consider the 10 points in the plane as in Figure 4.3. We want to find all lines that pass through at least 4 of the above 10 points. If we only wanted lines with agreement at least 5, the earlier method of fitting a degree 4 curve is guaranteed to work. Figure 4.4 shows the set L of all the lines that pass through at least 4 of the given points. Note that there are five lines that must be output. Therefore, if we hope to find all these as factors of some curve, that curve must have degree at least 5. But then Lemma 4.3 does not apply since the agreement parameter t = 4 is less than 5, the degree of Q(X, Y).

Fig. 4.4 The five lines that pass through at least 4 of the 10 points.

The example illustrates an important phenomenon which gives the cue for an improved approach. Each of the 10 points has two lines in L that pass through it. In turn, this implies that if we hope to find all these lines as factors of some curve, that curve must pass through each point at least twice! Now we cannot expect a generic curve that is interpolated through a set of points to pass through each of them more than once. This suggests that we should make passing through each point multiple times an explicit requirement on the interpolated polynomial. Of course, this stronger property cannot be enforced for free and would require an increase in the degree of the interpolated polynomial. But luckily it turns out that one can pass through each point twice with less than a two-fold increase in degree (a factor of roughly $\sqrt{3}$ suffices), and the trade-off between multiplicities guaranteed vs. degree increase is a favorable one. This motivates the improved decoding algorithm presented in the next section.

4.3.2 Algorithm for decoding up to the Johnson radius

We now generalize Sudan's decoding algorithm by allowing for multiplicities at the interpolation points. This generalization is due to Guruswami and Sudan [40].



Definition 4.2. [Multiplicity of zeroes] A polynomial Q(X, Y) is said to have a zero of multiplicity r ≥ 1 at a point (α, β) ∈ F² if Q(X + α, Y + β) has no monomial of degree less than r with a nonzero coefficient. (The degree of the monomial X^i Y^j equals i + j.)

The following lemma is the generalization of Lemma 4.3 that takes multiplicities into account.

Lemma 4.6. Let Q(X, Y) be a nonzero polynomial of (1, k)-weighted
degree at most D that has a zero of multiplicity r at (α_i, y_i) for every i
∈ [n]. Let p(X) be a polynomial of degree at most k such that p(α_i) = y_i
for at least t > D/r values of i ∈ [n]. Then, Q(X, p(X)) ≡ 0, or in other
words Y − p(X) is a factor of Q(X, Y).
Proof. For Q, p as in the statement of the lemma, define R(X) =
Q(X, p(X)). The degree of R(X) is at most D, the (1, k)-weighted
degree of Q(X, Y). Let i ∈ [n] be such that p(α_i) = y_i. Define the
polynomial Q^(i)(X, Y) := Q(X + α_i, Y + y_i). Now

    R(X) = Q(X, p(X)) = Q^(i)(X − α_i, p(X) − y_i)
         = Q^(i)(X − α_i, p(X) − p(α_i)).                    (4.3)

Since Q(X, Y) has a zero of multiplicity r at (α_i, y_i), Q^(i)(X, Y) has
no monomials of total degree less than r. Now, X − α_i clearly divides
p(X) − p(α_i). Therefore, every term in Q^(i)(X − α_i, p(X) − p(α_i)) is
divisible by (X − α_i)^r. It follows from (4.3) that R(X) is divisible by
(X − α_i)^r.
Since there are at least t values of i ∈ [n] for which p(α_i) = y_i, we get
that R(X) is divisible by a polynomial of degree at least rt. If rt > D,
this implies that R(X) ≡ 0.
We now state the analog of the interpolation lemma when we want
the desired multiplicities at the input pairs (α_i, y_i). Note that setting r = 1,
we recover exactly Lemma 4.4.


Lemma 4.7. Given an arbitrary set of n pairs {(α_i, y_i)}_{i=1}^n from F ×
F and an integer parameter r ≥ 1, there exists a nonzero polynomial
Q(X, Y) of (1, k)-weighted degree at most D such that Q(X, Y) has a
zero of multiplicity r at (α_i, y_i) for all i ∈ [n], provided C(D + 2, 2) >
kn · C(r + 1, 2), where C(a, b) denotes the binomial coefficient "a choose b".
Moreover, we can find such a Q(X, Y) in time polynomial in n, r by
solving a linear system over F.

Proof. Fix an i ∈ [n]. The coefficient of a particular monomial X^{j1} Y^{j2}
of Q^(i)(X, Y) := Q(X + α_i, Y + y_i) can clearly be expressed as a linear combination of the coefficients q_j of Q(X, Y) (where Q(X, Y) is
expressed as in (4.2)). Thus the condition that Q(X, Y) has a zero of
multiplicity r at (α_i, y_i) can be expressed as a system of C(r + 1, 2) homogeneous linear equations in the unknowns q_j, one equation for each
pair (j1, j2) of nonnegative integers with j1 + j2 < r. In all, for all n
pairs (α_i, y_i), we get n · C(r + 1, 2) homogeneous linear equations. The rest of
the argument follows the proof of Lemma 4.4; the only change is that
n · C(r + 1, 2) replaces n for the number of equations.
The above discussion motivates the following algorithm using interpolation with multiplicities for polynomial reconstruction (the parameter r ≥ 1 is the multiplicity required at each interpolation point):

Step 1: (Interpolation) Let D = ⌊√(knr(r + 1))⌋. Find a
nonzero polynomial Q(X, Y) of (1, k)-weighted degree
at most D such that Q(X, Y) has a zero of multiplicity r at (α_i, y_i) for each i = 1, 2, ..., n. (Lemma 4.7
guarantees the success of this step.)

Step 2: (Root finding/Factorization) Find all polynomials p(X) of degree at most k such that Q(X, p(X)) ≡ 0. For each such
polynomial, check if p(α_i) = y_i for at least t values
of i ∈ [n], and if so, include p(X) in the output list.
(Lemma 4.6 guarantees that this step will find all relevant polynomials with agreement t > D/r.)
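Both steps can be exercised end to end on the five-lines configuration of Fig. 4.4. The sketch below is a pure-Python toy (our own code, not the survey's: the interpolation step solves the homogeneous linear system of Lemma 4.7 by dense Gaussian elimination over F_13, and the root-finding step is replaced by brute-force search over all candidate polynomials, which is viable only because the field is tiny; Section 4.5 describes proper root finding):

```python
from itertools import product
from math import comb

P = 13  # a small prime; the toy instance below lives in F_13

def nullspace_vector(rows, ncols):
    """A nonzero v with rows . v = 0 over F_P, found by Gaussian
    elimination; one exists since there are fewer equations than unknowns."""
    rows = [r[:] for r in rows]
    pivots, rank = {}, 0
    for col in range(ncols):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        inv = pow(rows[rank][col], P - 2, P)   # inverse via Fermat
        rows[rank] = [x * inv % P for x in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                f = rows[i][col]
                rows[i] = [(x - f * y) % P for x, y in zip(rows[i], rows[rank])]
        pivots[col], rank = rank, rank + 1
    free = next(c for c in range(ncols) if c not in pivots)
    v = [0] * ncols
    v[free] = 1
    for col, row in pivots.items():
        v[col] = -rows[row][free] % P
    return v

def interpolate(points, k, r, D):
    """Step 1 (Lemma 4.7): a nonzero Q of (1, k)-weighted degree <= D
    with a zero of multiplicity r at every (alpha, y) in points."""
    monos = [(a, b) for b in range(D // k + 1) for a in range(D - k * b + 1)]
    rows = [[comb(a, j1) * pow(alpha, a - j1, P)
             * comb(b, j2) * pow(y, b - j2, P) % P
             if a >= j1 and b >= j2 else 0
             for (a, b) in monos]
            for alpha, y in points
            for j1 in range(r) for j2 in range(r - j1)]   # all j1 + j2 < r
    v = nullspace_vector(rows, len(monos))
    return {m: c for m, c in zip(monos, v) if c}

def reconstruct(points, k, r, D, t):
    """Step 2, by brute force over all P^(k+1) candidate polynomials f:
    keep f with Q(X, f(X)) = 0 and agreement at least t."""
    Q = interpolate(points, k, r, D)
    found = []
    for coeffs in product(range(P), repeat=k + 1):
        f = lambda x: sum(c * pow(x, e, P) for e, c in enumerate(coeffs)) % P
        # Q(X, f(X)) has degree <= D < P, so it is the zero polynomial
        # iff it vanishes at every point of F_P
        ok = all(sum(c * pow(x, a, P) * pow(f(x), b, P)
                     for (a, b), c in Q.items()) % P == 0
                 for x in range(P))
        if ok and sum(f(alpha) == y for alpha, y in points) >= t:
            found.append(coeffs)
    return found
```

On the ten intersection points of the five lines y = mx + m^2 (m = 0, ..., 4), running reconstruct with k = 1, r = 2, D = 7, t = 4 returns exactly the five lines; this is the regime where simple interpolation (r = 1) fails, as the example showed.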
The following records the performance of this algorithm. Again, the
size of the output list never exceeds the degree of Q(X, Y) in Y, which
is at most ⌊D/k⌋.



Theorem 4.8. The above algorithm, with multiplicity parameter r ≥
1, solves the polynomial reconstruction problem in polynomial time if
the agreement parameter t satisfies t > ⌊√(kn(1 + 1/r))⌋.¹ The size of the
list output by the algorithm never exceeds √(nr(r + 1)/k).

Corollary 4.9. A Reed–Solomon code of rate R can be list decoded
in polynomial time up to a fraction 1 − √((1 + ε)R) of errors using lists
of size O(1/(ε√R)).
By letting the multiplicity r grow with n, we can decode as long as
the agreement parameter satisfies t > √(kn). Indeed, if t² > kn, picking

    r = 1 + ⌊(kn − t)/(t² − kn)⌋   and   D = rt − 1,

both the conditions t > D/r and C(D + 2, 2) > kn · C(r + 1, 2) are
satisfied, and thus the decoding algorithm successfully finds all polynomials with agreement at least t. The number of such polynomials is
at most D/k ≤ rt/k ≤ nt ≤ n².
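The arithmetic behind these parameter choices can be checked mechanically. The following small script (our own exhaustive check, not from the text) confirms that the stated r and D satisfy both the agreement condition of Lemma 4.6 and the counting condition of Lemma 4.7 for all small n, k and every t > √(kn):

```python
from math import isqrt

def gs_params(n, k, t):
    """Multiplicity r and weighted degree D chosen once t^2 > kn."""
    assert t * t > k * n
    r = 1 + (k * n - t) // (t * t - k * n) if k * n > t else 1
    return r, r * t - 1

# exhaustive check over small parameter settings
for n in range(2, 40):
    for k in range(1, n):
        for t in range(isqrt(k * n) + 1, n + 1):
            r, D = gs_params(n, k, t)
            assert r >= 1 and t > D / r                 # Lemma 4.6 hypothesis
            # C(D+2, 2) > kn * C(r+1, 2), the condition of Lemma 4.7
            assert (D + 1) * (D + 2) > k * n * r * (r + 1)
```

For the five-lines example (n = 10, k = 1, t = 4), this choice gives r = 2 and D = 7.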
Theorem 4.10. [40] The polynomial reconstruction problem with n
input pairs, degree k, and agreement parameter t can be solved in
polynomial time whenever t > √(kn). Further, at most n² polynomials
will ever be output by the algorithm.
We conclude that an RS code of rate R can be list decoded
up to a fraction 1 − √R of errors. This equals the Johnson radius
1 − √(1 − δ) of the code, since the relative distance δ of an RS code of
rate R equals 1 − R. This is the main result of this chapter. Note
that for every rate R, 0 < R < 1, the decoding radius 1 − √R exceeds
the best decoding radius (1 − R)/2 that we can hope for with unique
decoding.

¹ Let x = √(knr(r + 1)). The condition t > D/r with D = ⌊x⌋ is implied by t > ⌊x/r⌋, since
⌊x⌋/r ≤ x/r < ⌊x/r⌋ + 1.

Remark 5. [Role of multiplicities] Using multiplicities in the interpolation led to the improvement of the decoding radius to match the
Johnson radius 1 − √R, and gave an improvement over unique decoding
for every rate R. We want to stress that the idea of using multiplicities
plays a crucial role in the final result achieving capacity. The improvement it gives over a version that uses only simple zeroes is substantial
for the capacity-approaching codes, and in fact it seems crucial to get
any improvement over unique decoding (let alone achieve capacity) for
rates R > 1/2. Also, multiplicities can be used to naturally encode the
relative importance of different symbol positions, and this plays a crucial role in soft-decision decoding, which is an important problem in
practice; see Section 4.4.2 below.

4.4

Extensions: List recovering and soft decoding

4.4.1

List recovering Reed–Solomon codes

In the polynomial reconstruction problem with input pairs (α_i, y_i), the
field elements α_i need not be distinct. Therefore, Theorem 4.10 in fact
gives a list recovering algorithm for Reed–Solomon codes. (Recall Definition 2.2, where we discussed the notion of list recovering of codes.)
Specifically, given input lists of size at most ℓ for each position of an
RS code of block length n and dimension k + 1, the algorithm can
find all codewords with agreement on more than √(knℓ) positions. In
other words, for any integer ℓ ≥ 1, an RS code of rate R and block
length n can be (p, ℓ, O(n²ℓ²))-list-recovered in polynomial time when
p < 1 − √(Rℓ). Note that the algorithm can do list recovery, even in the
noise-free (p = 0) case, only when R < 1/ℓ. In fact, as shown in [38],
there are inherent combinatorial reasons why 1/ℓ is a limit on the rate
for list recovering certain RS codes, so this is not a shortcoming of just
the above algorithm.
Later on, for our capacity-approaching codes, we will not have this
strong limitation, and the rate R for list recovering can be a constant
independent of ℓ (and in fact can approach 1 as the error fraction
p → 0). This strong list recovering property will be crucially used in
concatenation schemes that enable the reduction of the alphabet size
to a constant.
4.4.2

Soft-decision decoding

List recovering dealt with the case when for each position we had a
set of more than one candidate symbol. More generally, we could be
given a weight for each of the candidates, with the goal being to find
all codewords with good weighted agreement, summed over all positions. The weight for position i and symbol α would presumably be
a measure of the confidence of symbol α being the i-th symbol of the
actual codeword that was transmitted. Making use of such weights in
the decoding is called soft-decision decoding (the weights constitute
soft information). Note that list recovering is just a special case when
for each position the weights for some symbols equal 1 and the rest
equal 0. Soft-decision decoding is important in practice, as the field
elements corresponding to each position are obtained by some sort of
demodulation of real-valued signals, and soft-decision decoding can
retain more of the information from this process compared with hard
decoding, which loses a lot of information by quantizing the signal to a
single symbol. It is also useful in decoding concatenated codes, where
the inner decoder can provide weights along with the choices it outputs,
which can then be used by a soft-decision decoder for the outer code.

As mentioned in [40], the multiplicity based interpolation lends itself
naturally to a soft-decision version, since the multiplicity required at
a point can encode the importance of that point. Given weights w_{i,α}
for positions i ∈ [n] and field elements α ∈ F, we set the multiplicity of
the point (α_i, α) to be proportional to w_{i,α}. This leads to the following
claim, which is explicit for instance in [28, Chapter 6]:
Theorem 4.11. (Soft-decision decoding of RS codes) Consider a
Reed–Solomon code of block length n and dimension k + 1 over a field
F. Let α_1, ..., α_n ∈ F be the evaluation points used for the encoding.
Let ε > 0 be an arbitrary constant. For each i ∈ [n] and α ∈ F, let w_{i,α}
be a non-negative rational number. Then, there exists a deterministic
algorithm with runtime poly(n, |F|, 1/ε) that, when given as input the
weights w_{i,α} for i ∈ [n] and α ∈ F, finds a list of all polynomials p(X) ∈
F[X] of degree at most k that satisfy

    Σ_{i=1}^n w_{i,p(α_i)} ≥ √( k · Σ_{i=1}^n Σ_{α∈F} w_{i,α}² ) + ε · max_{i,α} w_{i,α}.        (4.4)
Koetter and Vardy [51] developed a front end that chooses weights
that are optimal in a certain sense as inputs to the above algorithm,
based on the channel observations and the channel transition probability matrix. This has led to a soft-decision decoding algorithm for RS
codes that gives substantial improvements in practice.

4.5

Root finding for bivariate polynomials

We conclude this chapter by briefly describing how to efficiently (in
time polynomial in k, q) solve the bivariate root finding problem that
we encountered in the RS list decoding algorithm:

    Given a bivariate polynomial Q(X, Y) ∈ F_q[X, Y] and
    an integer k, find a list of all polynomials f(X) ∈
    F_q[X] of degree at most k for which Y − f(X) divides
    Q(X, Y).
4.5.1

A simple randomized algorithm

We discuss a simple randomized method, described in [5, Section 4.2],
that reduces the above problem to univariate polynomial factorization.
It suffices to give an algorithm that either finds a polynomial f(X) of
degree at most k such that Y − f(X) is a factor of Q(X, Y), or concludes
that none exists. We can then run this algorithm, divide Q(X, Y) by
Y − f(X) if such a factor is found, and recurse on the quotient.
Pick a monic polynomial r(X) ∈ F_q[X] of degree k + 1 by picking
its non-leading coefficients uniformly at random from F_q. Compute the
polynomial R(X) := Q(X, r(X)). If R(X) = 0, then divide Q(X, Y) by
Y − r(X), and repeat the process with the resulting quotient polynomial. Otherwise, using a univariate polynomial factorization algorithm
(such as Berlekamp's randomized factoring algorithm [6]), compute


the list E_1(X), E_2(X), ..., E_t(X) of all the monic irreducible factors
of R(X) that have degree k + 1.
Now suppose f(X) has degree at most k and Y − f(X) is a factor
of Q(X, Y). Then, clearly r(X) − f(X) is a monic polynomial of degree
k + 1 dividing the univariate polynomial R(X). Hence, if r(X) − f(X)
is irreducible, then it must equal one of the E_j(X)'s. In this case, we
can simply check if Y − r(X) + E_j(X) divides Q(X, Y) for each j =
1, 2, ..., t, and if so output the corresponding polynomial.
The algorithm obviously never errs if Q(X, Y) has no factor of the
form Y − f(X) with degree(f) ≤ k. Conversely, if there is such a factor Y − f(X), the algorithm succeeds if and only if r(X) − f(X) is irreducible. The crucial point is that this happens with probability at least
(1/(k + 1)) · (1 − 1/q) ≥ 1/(2(k + 1)) [55, p. 84]. Upon running O(k log k) independent
trials of this procedure, the algorithm will succeed in finding the factor
with probability at least 1 − 1/k.
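The reduction is easy to prototype. The sketch below is our own toy code over F_5 (polynomials are low-degree-first coefficient lists); it replaces the call to a univariate factorization algorithm by brute force over all monic polynomials of degree k + 1 dividing R(X). Because of this shortcut the toy version succeeds on every trial; the randomness of r(X) only matters in the real algorithm, where one works from the list of irreducible factors of R(X):

```python
import random
from itertools import product

p = 5  # toy field F_5

def trim(a):
    while a and a[-1] == 0:
        a.pop()
    return a

def padd(a, b):
    return trim([(x + y) % p for x, y in
                 zip(a + [0] * (len(b) - len(a)), b + [0] * (len(a) - len(b)))])

def pmul(a, b):
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % p
    return trim(out)

def pmod(a, b):
    """Remainder of a modulo a monic polynomial b."""
    a = a[:]
    while len(a) >= len(b):
        c, s = a[-1], len(a) - len(b)
        for i, y in enumerate(b):
            a[i + s] = (a[i + s] - c * y) % p
        trim(a)
    return a

def substitute(Q, f):
    """Q(X, f(X)) by Horner's rule in Y, where Q = [Q_0(X), Q_1(X), ...]
    represents Q(X, Y) = sum_j Q_j(X) Y^j."""
    acc = []
    for Qj in reversed(Q):
        acc = padd(pmul(acc, f), Qj)
    return acc

def root_finding_trial(Q, k):
    """One trial: pick a random monic r(X) of degree k + 1, form
    R(X) = Q(X, r(X)), and for every monic degree-(k+1) divisor E of R
    test the candidate f = r - E, kept iff Y - f(X) really divides Q,
    i.e. Q(X, f(X)) = 0."""
    r = [random.randrange(p) for _ in range(k + 1)] + [1]
    R = substitute(Q, r)
    if not R:
        return []   # unlucky: Y - r(X) itself divides Q; divide out, retry
    found = []
    for tail in product(range(p), repeat=k + 1):
        E = list(tail) + [1]
        if not pmod(R, E):
            f = trim([(x - y) % p for x, y in zip(r, E)])
            if not substitute(Q, f):
                found.append(tuple(f))
    return found
```

For Q(X, Y) = (Y − X)(Y − 2X), a single trial recovers both roots X and 2X (as the coefficient lists (0, 1) and (0, 2)).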
4.5.2

Deterministic algorithm

To get a deterministic algorithm, we will reduce the above bivariate root
finding problem to a univariate root finding problem over an extension
field F_{q^e} with e < q, under the assumption that k < q − 1. Recall that
in our application, k + 1 is the dimension of the RS code, and so it is
safe to assume this upper bound on k.
For univariate polynomial factorization (and hence also root finding) of a polynomial f(X) ∈ F_{p^t}[X], where p is the characteristic of the
field, Berlekamp [6] (see also [76, Exercise 14.40]) gave a deterministic
algorithm with running time polynomial in degree(f), t, and p (this was
achieved via a deterministic polynomial-time reduction from factoring
in F_{p^t}[X] to root finding in F_p[X]). Combined with our reduction, we
get a deterministic algorithm for the above bivariate root finding problem in time polynomial in q, k, as desired.
The reduction is straightforward. We first need to find h(X) ∈ F_q[X]
of degree k + 1 or higher that is irreducible over F_q. We can describe
an explicit choice: h(X) = X^{q−1} − γ, where γ is any generator of the
cyclic multiplicative group F_q^* (and can be found by a brute-force search
in F_q). The algorithm proceeds as follows. Repeatedly divide Q(X, Y)


by h(X) till we are left with Q_0(X, Y) that is not divisible by h(X).
Then, viewing Q_0(X, Y) as a polynomial in Y with coefficients in F_q[X],
reduce the coefficients modulo h(X) to get a nonzero univariate polynomial T(Y) ∈ F̃[Y], where F̃ is the extension field F_q[X]/(h(X)) of
degree q − 1 over F_q. The desired polynomials f(X) now all occur
amongst the roots of T(Y) that lie in F̃, all of which can be found
using Berlekamp's algorithm in time polynomial in degree(T) and q.
We will make even more crucial use of polynomials over the extension field F_q[X]/(h(X)) in Chapter 6.

4.6

Related results on algebraic list decoding

The principle behind the Reed–Solomon list decoding algorithm is in
fact quite general, and can be used to decode other algebraic codes.
RS codes are an instance (possibly the most important one) of a class
of codes called redundant residue codes. In these codes, the message
is encoded by a collection of its residues modulo certain divisors. For
example, in RS codes, the message polynomial f(X) is encoded by its
residues modulo M_i(X), where M_i(X) = X − α_i for i = 0, 1, ..., n − 1.
If the degree of f(X) is k, the first k + 1 residues suffice to uniquely
determine the message. The remaining are redundant residues included
in the codeword for protection against errors. The natural parallel of
RS codes in the number-theoretic world are the Chinese Remainder
codes. If p_1 < p_2 < ··· < p_n are distinct primes, and k < n, one can
encode an integer m, 0 ≤ m < ∏_{i=1}^k p_i, using its residues modulo the p_i's
as (m mod p_1, m mod p_2, ..., m mod p_n). Once again, any k residues
suffice to identify m and the rest are redundant residues. The algebraic
ideas underlying the (soft-decision) list decoding algorithm for Reed–
Solomon codes can be used to list decode Chinese Remainder codes (up
to its respective Johnson bound). For details on this, see [39] (and the
earlier works [10, 18]).
Reed–Solomon codes are a specific instance of a general family of
codes called algebraic-geometric (AG) codes. Underlying an AG code
is an algebraic curve, a linear space L of functions (from the so-called
function field associated with the curve) that correspond to the messages, and a set S of rational points on the curve where each function
in L has no poles. A message is encoded by evaluating the associated
function at each rational point in S. Reed–Solomon codes correspond
to the special case when the curve is the affine line, and the functions
used for encoding correspond to rational functions that have at most a
certain number of poles at the point at infinity and no poles elsewhere
(this class corresponds precisely to low-degree polynomials).
The affine line over F_q has q + 1 rational points (including the point
at infinity). In turn this means that the alphabet size of RS codes must
grow with the block length. The benefit of AG codes is that in general,
algebraic curves over F_q can have many more rational points, in fact
as many points as one needs. Thus one can define an infinite family of
codes of increasing block lengths over a fixed alphabet F_q. The quality
of the codes depends on the quality of the curve: as a curve contains
more and more rational points, it must necessarily get more twisted
and complicated, a phenomenon which is quantified by the genus of the
curve. The best codes in this framework are obtained using curves with
the best trade-off between number of rational points and the genus.
The distance property of AG codes follows from the fact that a nonzero
function cannot have more zeroes than poles (the fact that a nonzero
polynomial cannot have more roots than its degree is a special case of
this).
The RS list decoding algorithm described in this chapter can be generalized to work for any algebraic-geometric code [40]. The complexity
of the algorithm is polynomial assuming availability of a polynomial
amount of pre-processed information about the code [43]. For a family
of AG codes that achieve the best rate vs. distance trade-off, it was
recently shown how to compute the required pre-processed information
in polynomial time [36].
Even more generally, there is an abstract algebraic view of the
decoding algorithm in the language of rings and ideals. The algorithms
for RS codes, Chinese Remainder codes, and AG codes become just
specific instantiations of this general algorithmic scheme for specific
choices of the underlying ring and ideals. Details of the general algorithm for any ideal-based code that satisfies certain abstract axioms
can be found in [28, Chapter 7] and [71].

5
Graph-Based List-Decodable Codes

In this chapter, we briefly survey some non-algebraic, graph-theoretic
approaches to constructing list-decodable codes and decoding them.
Besides providing an interesting alternate approach for list decoding,
these results also give linear-time encodable and list-decodable codes.
The rate vs. error-correction radius trade-off, however, is not optimized
and is significantly worse than what can be achieved with algebraic
codes such as RS codes. The results of this chapter are not needed for
Part II of the survey, so we will be content with stating the main results
and definitions, and giving very high level descriptions of the central
ideas. We will also provide pointers to the literature where further
details on the proofs can be found.

5.1

Reducing list decoding to list recovering

The notion of list recovering (recall Definition 2.2) plays a crucial role
in the graph-based constructions. The first step is to reduce list decoding to the problem of list recovering with a much smaller noise fraction (but with large input list sizes). In this section, we will present
the details of such a reduction. The reduction uses the notion of a
Ramanujan graph:
Definition 5.1. An undirected d-regular graph is said to be a Ramanujan expander if the second largest eigenvalue of its adjacency matrix in
absolute value is at most 2√d.


We will also use the following operation to combine a bipartite graph
with a code to produce a code over a larger alphabet. This operation
was first used in [1] to amplify the distance of codes, and has been put
to good algorithmic use in many recent works beginning with [31].
Definition 5.2. Given a code C ⊆ Σ^n and a bipartite graph G =
(A, B, E) with A = {1, 2, ..., n}, B = {1, 2, ..., m}, and with right degree
d, the code G(C) ⊆ (Σ^d)^m is defined as follows. For any x ∈ Σ^n, first
define G(x) to be a vector y ∈ (Σ^d)^m created in the following way: For
j ∈ B and k = 1, ..., d, let Γ_k(j) ∈ A be the k-th neighbor of j (as per
some arbitrary ordering of the neighbors of each node). The j-th symbol y_j of y is defined as ⟨x_{Γ_1(j)}, ..., x_{Γ_d(j)}⟩. In other words, we send
a copy of each symbol x_i along all edges going out of the vertex i ∈ A,
and the symbol y_j is obtained by concatenating all symbols received
by j ∈ B. The code G(C) is now obtained by taking all vectors G(c)
for c ∈ C.
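In code, the map x ↦ G(x) is a simple gather along adjacency lists. The sketch below is our own illustration, with the hypothetical convention that neighbors[j] lists Γ_1(j), ..., Γ_d(j) as 0-based indices into x:

```python
def spread(x, neighbors):
    """G(x) from Definition 5.2: the j-th output symbol is the tuple of
    x-symbols sitting at the (ordered) neighbors of right vertex j."""
    return [tuple(x[i] for i in nbrs) for nbrs in neighbors]

def graph_code(C, neighbors):
    """The code G(C): apply the symbol-spreading map to every codeword."""
    return [spread(c, neighbors) for c in C]
```

Each input symbol is thus replicated along its edges, so the alphabet grows from Σ to Σ^d while the block length becomes the number of right vertices.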
The reduction is described formally below. Note that the error fraction is reduced from (1 − 1/ℓ) (which is close to 1 for large ℓ) to γ for
an arbitrarily small constant γ > 0 (at the expense of a larger alphabet
size and lower rate).

Lemma 5.1. [33] Let ε, γ ∈ (0, 1) be arbitrary constants. Let C be
a code of block length n and rate r over alphabet Σ. Also suppose
that C is (γ, ℓ, L)-list-recoverable in time T(n). Further, assume that
there exists a d-regular Ramanujan expander H on n vertices for d ≥
4/(ε²γ). Then there exists a code C′ of block length n and rate r/d over
alphabet Σ^d which is explicitly specified given C, H, and which is (1 −
1/ℓ − ε, L)-list-decodable in time O(T(n) + n). Furthermore, C′ is linear-time encodable if C is.

5.2. A toy list recovering problem

153

Proof. Let G = (A, B, E) be an n × n bipartite graph that is the double
cover of H (i.e., A, B are both copies of the vertex set of H, and we connect u ∈ A with v ∈ B if (u, v) is an edge of H). The code C′ will just be
G(C) described in Definition 5.2. The claims about block length, rate,
alphabet size, and encoding time follow immediately. Using the fact
that H is Ramanujan and d ≥ 4/(ε²γ), and the pseudorandom properties of expander graphs (see for instance [4, Chapter 9, Section 2]), one
can show the following:

(**) For any S ⊆ B, |S| ≥ (1/ℓ + ε)|B|, the fraction of i ∈ A which
have at most a fraction 1/ℓ of their neighbors inside the set S
is at most γ.

The list decoding algorithm for C′ proceeds as follows. Given a string
y ∈ (Σ^d)^n, for each i ∈ A, create a set S_i consisting of the ℓ
most frequent elements in the multiset T_i of elements defined as:
T_i = {a_k : (i, j) ∈ E; y_j = ⟨a_1, a_2, ..., a_d⟩; Γ_k(j) = i}.

Let c′ ∈ C′ be a codeword such that c′_j = y_j for at least (1/ℓ + ε)n
of the positions j. Let c ∈ C be the codeword for which c′ = G(c). By
Property (**), we have c_i ∉ S_i for at most a fraction γ of the sets S_i.
Thus we can find c, and hence c′, by running the (γ, ℓ, L)-list-recovering
algorithm for C with the sets S_i as input. The output list size is at
most L, and the running time is T(n) plus the time to compute the
lists, which is at most O(nd) = O(n) word operations when d is a fixed
constant depending only on ε, γ.
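The only computational content of this list-building step is a frequency count per left vertex. A minimal sketch, under our own hypothetical data layout (y[j] is the received d-tuple at right vertex j, and back_edges[i] lists the pairs (j, k) with Γ_k(j) = i):

```python
from collections import Counter

def candidate_sets(y, back_edges, ell):
    """For each left vertex i, gather the multiset T_i of symbols that
    the received word y carries back along the double cover's edges,
    and keep the ell most frequent as the input list S_i."""
    return [{a for a, _ in Counter(y[j][k] for j, k in edges).most_common(ell)}
            for edges in back_edges]
```

Each resulting set S_i is then handed to the assumed (γ, ℓ, L)-list-recovering algorithm for C, exactly as in the proof above.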

5.2

A toy list recovering problem

In light of Lemma 5.1, if we can construct (γ, ℓ, L)-list-recoverable codes
of positive rate for some γ = γ_ℓ > 0, then we can also construct (1 −
2/ℓ, L)-list-decodable codes of positive rate R_ℓ > 0. Furthermore, if the
list recovering algorithm runs in linear time, we will also have a
linear-time list-decoding algorithm for correcting a fraction (1 − 2/ℓ)
of errors.

In this section, we illustrate the basic approach toward such a result
on list recovering by considering a toy problem, namely constructing
a family of codes of positive rate that are (0, 2, L)-list-recoverable (for
some constant L) in linear time (we can actually achieve L = 2, but we
will not be concerned with this in the following description). This is
arguably the simplest list recovering setting: there is no noise, i.e., we
are guaranteed that all codeword symbols lie in the respective input
sets, and the sets have the smallest level of ambiguity (they identify
the codeword symbol as one of two possible values). Below we closely
follow the description in [33, Section 3.2].
The (0, 2, L)-list-recoverable code C′ will be the code G(C) based
on the following two components:

(1) A code C ⊆ Σ^n that has relative distance greater than 0.9,
has constant rate, and which can be decoded from 90% erasures in linear time.¹ Such a code has been constructed in [3].

(2) An n × n bipartite graph G = (A, B, E) with the following
fault tolerance property: if we delete n/10 nodes in A, then
the square of the remaining graph has a connected component containing more than, say, n/10 nodes in A. It was
shown in [2] that such a graph can be obtained by taking
a double cover of a Ramanujan graph with sufficiently high
(constant) degree d.
Our goal is to show that given sets S_1, S_2, ..., S_n, we can reconstruct
the list of all codewords c ∈ C such that G(c)_j ∈ S_j, j = 1, ..., n, in
linear time. For each i ∈ A, j ∈ B, we define L(i, j) to be the set of
symbols in Σ that S_j suggests as potential symbols for c_i. More
formally, L(i, j) contains the symbols a_k such that ⟨a_1, ..., a_d⟩ ∈ S_j and
Γ_k(j) = i.
Define K_i = ∩_{(i,j)∈E} L(i, j). We assume that all K_i's are non-empty,
since otherwise no codeword compatible with the S_j's exists. Define I to
be the set of indices i ∈ A such that K_i has size 1. For such i's, denote
the symbol in K_i by x_i.
Let c be a codeword of C such that G(c)_j ∈ S_j, j = 1, ..., n. Our goal
is to find each such c. Clearly, for all i ∈ I, we must have c_i = x_i. The
decoding procedure consists of two cases. The first case occurs when
the set I has size at least n/10. In this case, we know at least 10% of
the symbols of c, and thus we can recover c using the linear-time erasure-decoding algorithm for the code C. It remains to consider the second
case, when the size of the set I is smaller than n/10. Consider any
i ∉ I. Observe that, for all (i, j) ∈ E, all sets L(i, j) must be equal to
K_i. The set K_i contains two distinct symbols that are candidates for c_i.
Note that although for each c only one of these symbols is correct, each
symbol in K_i can be correct for some codeword c. From now on, we will
consider each K_i (respectively S_j) as an ordered pair of two symbols (K_i)_0
and (K_i)_1 (respectively (S_j)_0 and (S_j)_1), for an arbitrary ordering.

¹ Under the erasure noise model, certain symbols are lost during transmission, and the rest
are received intact. The receiver knows the locations of the erasures as well as of the
received symbols. The goal is to decode the erased symbols.
To recover c from the K_i's, we create an auxiliary graph H̃.
Intuitively, the graph H̃ is obtained by splitting each node in
G into two nodes, each corresponding to one of the two decoding options, and
then putting edges between compatible options. Formally, H̃ =
(A′, B′, E′) is a bipartite graph such that A′ = A × {0, 1} and B′ =
B × {0, 1}. For each i ∉ I, and (i, j) ∈ E, let k be such that Γ_k(j) = i
(i.e., i is the k-th neighbor of j). Then the edge set E′ contains the
edges {(i, 0), (j, t)} and {(i, 1), (j, 1 − t)}, where t ∈ {0, 1} is such that
(K_i)_0 = ((S_j)_t)_k.

Define V(c) = {(i, t) : i ∉ I, c_i = (K_i)_t}, i.e., the subgraph of H̃
that corresponds to the codeword c. In other words, V(c) contains the
nodes in A′ that are compatible with c. The key fact, easily proved by
induction, is that if (i, t) ∈ V(c), and (i′, t′) ∈ A′ is reachable from (i, t)
in H̃, then (i′, t′) ∈ V(c). Hence V(c) will be the union of the vertices
in A′ that belong to some subset of connected components of H̃. The
fault-tolerance property of G can be shown to imply that one of these
connected components must contain at least n/10 vertices. Therefore, if
we enumerate all large connected components of H̃, one of them will
give us a large subset S′ of V(c) (of size at least n/10). Given S′, we can
recover c from a vector x such that x_i is equal to (K_i)_t if (i, t) ∈ S′, and
is declared as an erasure otherwise (note that the fraction of erasures
is at most 0.9).
Since the graph H̃ has only O(n) edges, all its connected components, and in particular the large ones, can be found in O(n) time.
Thus, the whole decoding process can be performed in linear time. In
addition, encoding C′ can be done in linear time as well if C is linear-time encodable.
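The component enumeration that drives this decoder is ordinary union-find over the split vertices. A self-contained sketch (our own code, with vertices given as hypothetical (node, option) pairs):

```python
def components(vertices, edges):
    """Connected components via union-find with path halving. Vertices
    are (v, t) pairs: a node v of the double cover together with a
    decoding option t in {0, 1}; edges join options that are consistent."""
    parent = {v: v for v in vertices}

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]   # path halving
            u = parent[u]
        return u

    for u, v in edges:
        parent[find(u)] = find(v)
    comps = {}
    for u in vertices:
        comps.setdefault(find(u), set()).add(u)
    return sorted(comps.values(), key=len, reverse=True)
```

On a graph where the compatible options form disjoint chains, this returns the large components from which a subset of V(c), and then c itself, is read off by the erasure decoder.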

5.3

List recovering from small but positive noise rate

We now turn to the list recovering problem when the input list size is
greater than 2, and, more significantly, when we allow a small fraction
of the lists to be erroneous. The following result is shown in [33]:

Theorem 5.2. For every integer ℓ ≥ 1, there exist R_ℓ > 0, γ_ℓ > 0, and
a finite alphabet Σ_ℓ for which there is an explicit family of codes of
rate R_ℓ over alphabet Σ_ℓ that are encodable as well as (γ_ℓ, ℓ, ℓ)-list-recoverable in linear time.
Combining the above with Lemma 5.1, one gets the following result
of Guruswami and Indyk [33] on linear-time list-decodable codes for
correcting any desired constant fraction of errors.

Theorem 5.3. For every ε, 0 < ε < 1, there exists an explicit family
of (1 − ε, O(1/ε))-list-decodable codes of positive rate R(ε) > 0 that
can be encoded as well as list decoded from a fraction (1 − ε) of errors
in time linear in the block length.²
Below we will sketch the ideas behind the proof of Theorem 5.2.
Both expanders and the notion of list recovering play a prominent role
throughout the construction and proof in [33].

The construction of (γ, ℓ, ℓ)-list-recoverable codes proceeds recursively,
using a construction of (γ_1, ℓ − 1, ℓ − 1)-list-recoverable codes (γ will
recursively depend on γ_1). For the recursion, it is convenient to construct a slightly stronger object; for a fixed small β, we will construct
(γ, β, ℓ, ℓ)-list-recoverable codes that have the following property: Given
a collection of sets S_i, all but a fraction β of which satisfy |S_i| ≤ ℓ, there
are at most ℓ codewords whose i-th symbol belongs to S_i for at least
a fraction (1 − γ) of the locations. (The case β = 0 corresponds to
(γ, ℓ, ℓ)-list-recovering.)

² The complexity is linear in the unit cost RAM model; see [33] for details.


A code C which is (γ, β, ℓ, ℓ)-list-recoverable is constructed as C =
G_2(G_1(C_1)), where both G_1 and G_2 are n × n bipartite constant-degree
expanders with appropriate properties (say with degrees d_1, d_2 that
could depend on ℓ). Let G_1 = (X, Y, E_1) and G_2 = (Y, Z, E_2), so that
we identify the left side of G_2 with the right side of the bipartition
of G_1 in a natural way. Clearly, linear-time encodability as well as
positive rate is maintained in this process, since d_1, d_2 are constants.
The alphabet of C is Σ = Σ_1^{d_1 d_2}, where Σ_1 is the alphabet of C_1.
The list recovering of C proceeds in two steps/cases as described
below. Note that the input is a collection of sets S_1, S_2, ..., S_n with
each S_i ⊆ Σ, where all but βn of them have at most ℓ elements. In
the first step, each i ∈ Y collects a set L_2(i, j) ⊆ Σ_1^{d_1} of at most ℓ
possible symbols from the set S_j for each of its d_2 neighbors j, and then
computes K_i = ∩_{(i,j)∈E_2} L_2(i, j). If β is chosen sufficiently small, then for
each candidate codeword that has to be output, most of the K_i's will
contain the correct symbol. Clearly |K_i| ≤ ℓ for all i. Suppose that in fact
at least βn of the K_i's satisfy |K_i| ≤ ℓ − 1. In this case, intuitively we
have made progress, since the amount of ambiguity in those symbols
has gone down from ℓ to ℓ − 1. However, this happens only for a small
fraction of symbols, but this can be transferred to a large (say, 1 − γ_1)
fraction of symbols of C_1 using the expander G_1 (the specific property
needed for G_1 will thus be that any fraction β of the nodes in Y are adjacent
to at least a (1 − γ_1) fraction of the nodes in X). We are now in a position
to apply the list recovering algorithm for C_1 to finish the decoding.
The more difficult and interesting case is when |K_i| < ℓ for less
than a fraction β of the nodes i ∈ Y. This means that for most nodes
i ∈ Y, all the sets L_2(i, j) are in fact equal to K_i (and contain ℓ
distinct elements). One can then define a certain kind of consistency
graph H̃ = (Y × {1, 2, ..., ℓ}, Z × {1, 2, ..., ℓ}, Ẽ), whose vertex set corresponds to the ℓ possible choices for the correct symbol at each node of
Y and Z. The edge set is defined based on which symbol of L_2(i, j) matches
the r-th symbol of K_i for r = 1, 2, ..., ℓ (as per some fixed ordering of
the elements in each K_i and L_2(i, j)). The problem of finding a consistent codeword can then be cast as finding a very dense subgraph of H̃
that internally closely resembles a copy of the expander G_2. Using the
fact that this subgraph resembles G_2 and that G_2 is a high quality
Ramanujan expander, Guruswami and Indyk [33] demonstrate how
spectral techniques can be used to find such subgraphs, and thus all
the solution codewords, in O(n log n) time. (This also suffices to get
a linear-time decoding algorithm in the unit cost RAM model using
codes over a growing alphabet.) The detailed formalism of the above
vague description, as well as the technical details of the spectral partitioning, are quite complicated. We will therefore refrain from trying to
explain the details further and instead point the interested reader to
the paper [33] (see also [27, Section 6]).
We note that since ℓ reduces by 1 at each step, and there is at
least a constant factor loss in rate at each recursive step, the best rate
one can hope for using the above approach is 2^(−O(ℓ)). However, due
to the intricacies of the spectral partitioning step and the resulting
dependencies amongst the various parameters, the rate achieved is in
fact worse and is only 2^(−2^(O(ℓ))). For the case ε = 0, Guruswami and
Indyk [34] present a more efficient recursive construction of linear-time
codes that achieve a rate of 1/ℓ^(O(1)) (this is accomplished by reducing
ℓ by a constant multiplicative factor, as opposed to the above additive
amount, at each step).

5.4 Other graph-based list decoding results

5.4.1 Constructions over smaller alphabets

Graph-based constructions have also been useful in obtaining good
list-decodable codes over smaller alphabets compared with purely
algebraic codes.
For example, consider using the construction scheme of Definition 5.2
with the following components C, G:

C is a rate Ω(ε) code that is (1/2, O(1/ε), O(1/ε))-list
recoverable. Such a code can be constructed via an appropriate
concatenation involving a rate Ω(ε) outer RS code, and a
good list-recoverable inner code found by brute-force search
(see [31] for details).

G is an n × n bipartite graph of degree d = O(1/ε) with the
property that any subset of an ε fraction of nodes on the right
have a neighborhood that spans more than half the nodes on
the left.
It is shown in [31] that this yields (1 − ε, O(1/ε))-list-decodable codes
of rate Ω(ε²). This matches the rate achieved by RS codes (up to a
constant factor), but the advantage is that the alphabet size can be
made a constant that depends only on ε. (Certain algebraic–geometric
codes achieve a similar result, but this relies on much deeper mathematics
and a complicated decoding algorithm.)
5.4.2 Juxtaposed codes

A different method to achieve a similar goal as above, in fact with
even better alphabet sizes (for a slight worsening of the rate), was
developed in [32]. The authors used the name "juxtaposed codes" for
the resulting constructions. The basic idea behind juxtaposed codes is
to encode the same message using several concatenated codes, each one
of which works well for some error distribution, and then juxtapose
symbols from these codes together to obtain a new code over a larger
alphabet. More concretely, in this approach, multiple Reed–Solomon
codes (of varying rates) are concatenated with several different inner
codes (of varying rate and list decodability). Corresponding to each
Reed–Solomon and inner code pair, we get one concatenated codeword,
and the final encoding of a message is obtained by juxtaposing together
the symbols from the various individual concatenated codewords.
The purpose of using multiple concatenated codes is that, depending
on the distribution of errors in the received word, the portions of it
corresponding to a significant fraction of a particular inner encoding
will have relatively few errors (the specific inner code for which this
happens will depend on the level of non-uniformity in the distribution of
errors). These can then be decoded to provide useful information about
a large fraction of symbols to the decoder of the corresponding outer
Reed–Solomon code. Essentially, depending on how (non-)uniformly
the errors are distributed, a certain concatenated code kicks in and
enables recovery of the message. The use of multiple concatenated codes
reduces the rate compared to the expander based constructions, but it
turns out this technique gives gains in the alphabet size. For example,
for any integer a ≥ 1, one can construct codes list-decodable up to a
fraction (1 − ε) of errors with rate Ω(ε^(2(1+1/a))) over an alphabet of
size O(1/ε^a). For a ≥ 3, this even beats the alphabet size (of O(1/ε⁴))
achieved by algebraic–geometric codes, albeit with a worse rate. We
refer the reader to the original paper [32] for further details.
5.4.3 Extractor-based codes

Using certain randomness extractors as the list-recoverable code in the
general scheme of Lemma 5.1, in [26] the author obtained explicit
list-decodable codes of close-to-optimal rate to correct a fraction (1 − ε)
of errors in sub-exponential time (specifically, the achieved rate was
Ω(ε/log^(O(1))(1/ε)), and recall that the best possible rate one could hope
for is Ω(ε)). This result was the first to beat the quadratic (rate Ω(ε²))
barrier for list decoding from a fraction (1 − ε) of errors. (As we saw
in the previous chapter, Reed–Solomon codes achieve a rate Ω(ε²) with
polynomial time list decoding. Moreover, this is the best that can be
achieved by appealing only to the Johnson bound on list decoding
radius.) Though the result of [26] is obsolete in light of the
capacity-achieving codes we will construct in Part II of this survey, it
hints that expander-based constructions, perhaps in conjunction with
other coding techniques, have the potential of not only yielding faster
algorithms, but also good trade-offs between rate and decoding radius.

Part II

Achieving List Decoding Capacity

6 Folded Reed–Solomon Codes

This part of the survey gives some exciting recent progress that achieves
the capacity of list decoding over large alphabets. In this chapter, we
present a simple variant of Reed–Solomon codes called folded
Reed–Solomon codes for which we can beat the 1 − √R decoding radius we
achieved for RS codes in Chapter 4. In fact, by choosing parameters
suitably, we can decode close to the optimal fraction 1 − R of errors
with rate R. In the next chapter, we will discuss techniques that let us
achieve a similar result over an alphabet of constant size that depends
only on the distance to list decoding capacity.

The starting point for the above capacity-achieving result is the
breakthrough work of Parvaresh and Vardy [62] who described a novel
variant of Reed–Solomon codes together with a new decoding algorithm.
While the new decoding algorithm led to improvements over
the decoding radius of 1 − √R, it only did so for low rates (specifically,
for R < 1/16). Subsequently, Guruswami and Rudra [37] proved
that yet another variant of Reed–Solomon codes, namely folded RS
codes that are compressed versions of certain Parvaresh–Vardy codes,
are able to leverage the PV algorithm, essentially as is, but on codes

of high rate. Together, this gives explicit codes with polynomial time
decoding algorithms that achieve list decoding capacity.

In this chapter, we will describe this combined code and algorithm.
We note that this presentation deviates significantly from the historical
development in the original papers [37, 62], in that we are using the
benefit of hindsight to give a self-contained, and hopefully simpler,
presentation. The last section of this chapter contains more comprehensive
bibliographic notes on the original development of this material.

6.1 Description of folded codes

Consider a Reed–Solomon code C = RS_{F,F*}[n, k] consisting of
evaluations of degree ≤ k polynomials over F at the set F* of nonzero
elements of F. Let q = |F| = n + 1. Let γ be a generator of the multiplicative
group F*, and let the evaluation points be ordered as 1, γ, γ², ..., γ^(n−1).
Using all nonzero field elements as evaluation points is one of the most
commonly used instantiations of Reed–Solomon codes.

Let m ≥ 1 be an integer parameter called the folding parameter. For
ease of presentation, we will assume that m divides n = q − 1.
Definition 6.1. (Folded Reed–Solomon code) The m-folded version
of the RS code C, denoted FRS_{F,γ,m,k}, is a code of block length
N = n/m over F^m. The encoding of a message f(X), a polynomial over
F of degree at most k, has as its j-th symbol, for 0 ≤ j < n/m, the m-tuple
(f(γ^(jm)), f(γ^(jm+1)), ..., f(γ^(jm+m−1))). In other words, the codewords of
C′ = FRS_{F,γ,m,k} are in one-one correspondence with those of the RS
code C and are obtained by bundling together consecutive m-tuples of
symbols in codewords of C.
We illustrate the above construction for the choice m = 4 in Figure 6.1.
The polynomial f(X) is the message, whose Reed–Solomon encoding
consists of the values of f at x0, x1, ..., x_(n−1) where xi = γ^i. Then, we
perform a folding operation by bundling together tuples of four symbols
to give a codeword of length n/4 over the alphabet F⁴.

Note that the folding operation does not change the rate R of the
original Reed–Solomon code. The relative distance of the folded RS
code also meets the Singleton bound and is at least 1 − R.
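To make the folding operation concrete, here is a small sketch in Python. The field F_13, the generator γ = 2, and the message polynomial below are our own toy choices for illustration, not parameters from the text:

```python
# Toy illustration of Definition 6.1 (assumed parameters: F_13, gamma = 2, m = 4).
q = 13                 # field size, so n = q - 1 = 12 evaluation points
gamma = 2              # 2 generates the multiplicative group of F_13
n, m = q - 1, 4
f = [3, 1, 4]          # message polynomial f(X) = 3 + X + 4*X^2, degree k = 2

def evaluate(poly, x, q):
    """Evaluate poly (coefficient list, lowest degree first) at x over F_q."""
    return sum(c * pow(x, i, q) for i, c in enumerate(poly)) % q

# Reed-Solomon encoding: evaluations of f at 1, gamma, gamma^2, ..., gamma^(n-1).
rs_codeword = [evaluate(f, pow(gamma, i, q), q) for i in range(n)]

# Folding: bundle m consecutive symbols into one symbol over F_13^m.
folded = [tuple(rs_codeword[j * m:(j + 1) * m]) for j in range(n // m)]
print(len(folded))     # N = n/m = 3 folded symbols, each a 4-tuple over F_13
```

Unfolding simply reverses the bundling, which is why list decoding the folded code is never harder than list decoding C itself.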


Fig. 6.1 Folding of the Reed–Solomon code with parameter m = 4.

6.2 Why might folding help?

Since folding seems like such a simplistic operation, and the resulting
code is essentially just a RS code but viewed as a code over a larger
alphabet, let us now understand why it can possibly give hope to correct
more errors compared with the bound for RS codes.

Consider the above example with folding parameter m = 4. First of
all, decoding the folded RS code up to a fraction p of errors is certainly
not harder than decoding the RS code up to the same fraction p of
errors. Indeed, we can "unfold" the received word of the folded RS
code and treat it as a received word of the original RS code and run
the RS list decoding algorithm on it. The resulting list will certainly
include all folded RS codewords within distance p of the received word,
and it may include some extra codewords which we can, of course, easily
prune.

In fact, decoding the folded RS code is a strictly easier task. To
see why, say we want to correct a fraction 1/4 of errors. Then, if we
use the RS code, our decoding algorithm ought to be able to correct an
error pattern which corrupts every fourth symbol in the RS encoding of
f(X) (i.e., corrupts f(x_(4i)) for 0 ≤ i < n/4). However, after the folding
operation, this error pattern corrupts every one of the symbols over
the larger alphabet F⁴, and thus need not be corrected. In other words,
for the same fraction of errors, the folding operation reduces the total
number of error patterns that need to be corrected, since the channel
has less flexibility in how it may distribute the errors.


It is of course far from clear how one may exploit this to actually
correct more errors. To this end, algebraic ideas that exploit the specific
nature of the folding and the relationship between a polynomial f(X)
and its shifted counterpart f(γX) will be used. These will become clear
once we describe our algorithms later in the chapter.

We note that the above simplification of the channel is not attained
for free, since the alphabet size increases after the folding operation.
For folding parameter m that is an absolute constant, the increase in
alphabet size is moderate and the alphabet remains polynomially large
in the block length. (Recall that the RS code has an alphabet size that
is linear in the block length.) Still, having an alphabet size that is a
large polynomial is somewhat unsatisfactory. Fortunately, our alphabet
reduction techniques in the next chapter can handle polynomially large
alphabets, so this does not pose a big problem.

6.3 Trivariate interpolation based decoding

In the bivariate interpolation based decoding algorithm for RS codes,
the key factor driving the agreement parameter t needed for the decoding
to be successful was the ((1, k)-weighted) degree D of the polynomial
Q(X, Y). Our quest for an improved algorithm will be based on
trying to lower this degree D by using more degrees of freedom in the
interpolation. Specifically, we will try to use trivariate interpolation
of a polynomial Q(X, Y1, Y2) through n points in F³. As we will see,
this enables performing the interpolation with D = O((k²n)^(1/3)), which
is much smaller than the Ω(√(kn)) bound for bivariate interpolation.
In principle, this could lead to an algorithm that works for agreement
fraction R^(2/3) instead of R^(1/2). Of course, additional ideas are needed
to make this approach work, and we now turn to developing such an
algorithm in detail.

6.3.1 Facts about trivariate interpolation

We begin with some basic definitions and facts concerning trivariate
polynomials, which are straightforward extensions of the bivariate
counterparts.



Definition 6.2. For a polynomial Q(X, Y1, Y2) ∈ F[X, Y1, Y2], its
(1, k, k)-weighted degree is defined to be the maximum value of
ℓ + k·j1 + k·j2 taken over all monomials X^ℓ · Y1^(j1) · Y2^(j2) that occur
with a nonzero coefficient in Q(X, Y1, Y2).
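Definition 6.2 translates directly into code. The sketch below uses our own convention of representing Q sparsely as a dictionary from exponent triples to coefficients:

```python
def weighted_degree(Q, k):
    """(1, k, k)-weighted degree of Q, given as {(l, j1, j2): coefficient}.

    Each monomial X^l * Y1^j1 * Y2^j2 contributes l + k*j1 + k*j2."""
    return max(l + k * j1 + k * j2 for (l, j1, j2), c in Q.items() if c != 0)

# Example: Q(X, Y1, Y2) = X^5 + X*Y1^2 + Y2, with k = 3.
Q = {(5, 0, 0): 1, (1, 2, 0): 1, (0, 0, 1): 1}
print(weighted_degree(Q, 3))  # max(5, 1 + 2*3, 3) = 7
```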

Definition 6.3. (Multiplicity of zeroes) A polynomial Q(X, Y1, Y2)
over F is said to have a zero of multiplicity r ≥ 1 at a point (α, β1, β2) ∈
F³ if Q(X + α, Y1 + β1, Y2 + β2) has no monomial of degree less than
r with a nonzero coefficient. (The degree of the monomial X^i · Y1^(j1) · Y2^(j2)
equals i + j1 + j2.)

Lemma 6.1. Let {(αi, yi1, yi2) : 1 ≤ i ≤ n} be an arbitrary set of n triples
from F³. Let Q(X, Y1, Y2) ∈ F[X, Y1, Y2] be a nonzero polynomial of
(1, k, k)-weighted degree at most D that has a zero of multiplicity r at
(αi, yi1, yi2) for every i ∈ [n]. Let f(X), g(X) be polynomials of degree
at most k such that for at least t > D/r values of i ∈ [n], we have
f(αi) = yi1 and g(αi) = yi2. Then, Q(X, f(X), g(X)) ≡ 0.

Proof. The proof is very similar to that of Lemma 4.6. If we define
R(X) = Q(X, f(X), g(X)), then R(X) is a univariate polynomial of
degree at most D, and for every i ∈ [n] for which f(αi) = yi1 and
g(αi) = yi2, (X − αi)^r divides R(X). Therefore if rt > D, then R(X)
has more roots (counting multiplicities) than its degree, and so it must
be the zero polynomial.

Lemma 6.2. Given an arbitrary set of n triples {(αi, yi1, yi2) : 1 ≤ i ≤ n}
from F³ and an integer parameter r ≥ 1, there exists a nonzero polynomial
Q(X, Y1, Y2) over F of (1, k, k)-weighted degree at most D such that
Q(X, Y1, Y2) has a zero of multiplicity r at (αi, yi1, yi2) for all i ∈ [n],
provided D³/(6k²) > n·(r+2 choose 3). Moreover, we can find such a
Q(X, Y1, Y2) in time polynomial in n, r by solving a system of
homogeneous linear equations over F.


Proof. This is just the obvious trivariate extension of Lemma 4.7. The
condition that Q(X, Y1, Y2) has a zero of multiplicity r at a point
amounts to (r+2 choose 3) homogeneous linear conditions in the
coefficients of Q. The number of monomials in Q(X, Y1, Y2) equals the
number, say N3(k, D), of triples (i, j1, j2) of nonnegative integers which
obey i + k·j1 + k·j2 ≤ D. One can show that the number N3(k, D) is at
least as large as the volume of the three-dimensional region
{x + k·y1 + k·y2 ≤ D : x, y1, y2 ≥ 0} ⊂ R³ [62]. An easy calculation
shows that the latter volume equals D³/(6k²). Hence, if D³/(6k²) >
n·(r+2 choose 3), then the number of unknowns exceeds the number of
equations, and we are guaranteed a nonzero solution.
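The dimension count in this proof is easy to verify numerically. The sketch below (our own illustration, with arbitrarily chosen sample values of n, k, r) enumerates N3(k, D) exactly and finds the smallest D for which the unknowns outnumber the n·(r+2 choose 3) constraints:

```python
from math import comb

def num_monomials(k, D):
    """N_3(k, D): triples (i, j1, j2) of nonnegative ints with i + k*j1 + k*j2 <= D."""
    return sum(D - k * (j1 + j2) + 1
               for j1 in range(D // k + 1)
               for j2 in range(D // k - j1 + 1))

n, k, r = 100, 10, 2                  # arbitrary sample parameters
constraints = n * comb(r + 2, 3)      # (r+2 choose 3) conditions per point
D = 1
while num_monomials(k, D) <= constraints:
    D += 1
print(D, num_monomials(k, D), constraints)  # smallest D guaranteeing a nonzero Q
```

For these parameters the volume bound D³/(6k²) slightly underestimates the exact count, so the enumeration certifies a somewhat smaller D than the closed-form condition would.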

6.3.2 Using trivariate interpolation for Folded RS codes

Let us now see how trivariate interpolation can be used in the context of
decoding the folded RS code C′ = FRS_{F,γ,m,k} of block length
N = (q − 1)/m. (Throughout this section, we denote q = |F| and n = q − 1.)
Given a received word z ∈ (F^m)^N for C′ that needs to be list decoded,
we define y ∈ F^n to be the corresponding unfolded received word.
(Formally, let the j-th symbol of z be (z_(j,0), ..., z_(j,m−1)) for 0 ≤ j < N.
Then y is defined by y_(jm+l) = z_(j,l) for 0 ≤ j < N and 0 ≤ l < m.)
Suppose f(X) is a polynomial whose encoding agrees with z on at
least t locations. Then, here is an obvious but important observation:

For at least t(m − 1) values of i, 0 ≤ i < n, both the
equalities f(γ^i) = yi and f(γ^(i+1)) = y_(i+1) hold.

Define the notation g(X) = f(γX). Therefore, if we consider the n
triples (γ^i, yi, y_(i+1)) ∈ F³ for i = 0, 1, ..., n − 1 (with the convention
yn = y0), then for at least t(m − 1) triples, we have f(γ^i) = yi and
g(γ^i) = y_(i+1). This suggests that interpolating a polynomial Q(X, Y1, Y2)
through these n triples and employing Lemma 6.1, we can hope that
f(X) will satisfy Q(X, f(X), f(γX)) = 0, and then somehow use this to
find f(X). We formalize this in the following lemma. The proof follows
immediately from the preceding discussion and Lemma 6.1.
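In code, the unfolding map and the construction of the interpolation triples look as follows. This is a toy sketch with hypothetical parameters F_13, γ = 2, m = 4, and an arbitrary made-up received word z:

```python
q, m = 13, 4           # assumed toy parameters: field F_13, folding m = 4
gamma = 2              # a generator of the multiplicative group of F_13
n = q - 1
N = n // m

# An arbitrary received word z: N symbols, each an m-tuple over F_13.
z = [(8, 5, 0, 11), (1, 1, 2, 3), (7, 0, 9, 4)]

# Unfolding: y_{j*m + l} = z_{j,l}.
y = [z[j][l] for j in range(N) for l in range(m)]

# The n interpolation triples (gamma^i, y_i, y_{i+1}), with the convention y_n = y_0.
triples = [(pow(gamma, i, q), y[i], y[(i + 1) % n]) for i in range(n)]
print(len(y), len(triples))  # 12 unfolded symbols, 12 triples in F^3
```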



Lemma 6.3. Let z ∈ (F^m)^N and let y ∈ F^n be the unfolded version
of z. Let Q(X, Y1, Y2) be any nonzero polynomial over F of (1, k, k)-weighted
degree at most D which has a zero of multiplicity r at (γ^i, yi, y_(i+1))
for i = 0, 1, ..., n − 1. Let t be an integer such that t > D/((m − 1)r).
Then every polynomial f(X) ∈ F[X] of degree at most k whose encoding
according to FRS_{F,γ,m,k} agrees with z on at least t locations satisfies
Q(X, f(X), f(γX)) ≡ 0.
Lemmas 6.2 and 6.3 motivate the following approach to list decoding
the folded RS code FRS_{F,γ,m,k}. Here z ∈ (F^m)^N is the received word
and y = (y0, y1, ..., y_(n−1)) ∈ F^n is its unfolded version. The algorithm
uses an integer multiplicity parameter r ≥ 1, and is intended to work
for an agreement parameter 1 ≤ t ≤ N.

Algorithm trivariate-FRS-decoder:

Step 1 (Trivariate interpolation) Define the degree parameter

    D = ⌊(k²·n·r(r + 1)(r + 2))^(1/3)⌋ + 1.    (6.1)

Interpolate a nonzero polynomial Q(X, Y1, Y2) with coefficients
from F with the following two properties: (i) Q
has (1, k, k)-weighted degree at most D, and (ii) Q has a
zero of multiplicity r at (γ^i, yi, y_(i+1)) for i = 0, 1, ..., n − 1
(where yn = y0). (Lemma 6.2 guarantees the feasibility
of this step as well as its computability in time polynomial
in n, r.)

Step 2 (Trivariate root-finding) Find a list of all degree
≤ k polynomials f(X) ∈ F[X] such that Q(X, f(X),
f(γX)) ≡ 0. Output those whose encoding agrees with
z on at least t locations.

Ignoring the time complexity of Step 2 or the size of the output
list for now, we can already claim the following concerning the
error-correction performance of this strategy.


Lemma 6.4. The algorithm trivariate-FRS-decoder successfully list
decodes the folded Reed–Solomon code FRS_{F,γ,m,k} up to a radius

    N − (m/(m − 1)) · ((1 + 1/r)(1 + 2/r) · k²/n²)^(1/3) · N − 2.

Proof. By Lemma 6.3, we know that any f(X) whose encoding agrees
with z on t or more locations will be output in Step 2, provided
t > D/((m − 1)r). For the choice of D in (6.1), this condition is met for
the choice

    t = ⌊(1/(m − 1)) · (k²·n·(1 + 1/r)(1 + 2/r))^(1/3) + 1/((m − 1)r)⌋ + 1.

The decoding radius is equal to N − t, and recalling that n = mN, we
get the bound claimed in the lemma.

The rate of the folded Reed–Solomon code is R = (k + 1)/n > k/n,
and so the fraction of errors corrected is 1 − (m/(m − 1)) · R^(2/3).
Letting the parameter m grow, we can approach a decoding radius of
1 − R^(2/3).

6.4 Root-finding step

In light of the above discussion, the only missing piece in our decoding
algorithm is an efficient way to solve the following trivariate
root-finding type problem:

Given a nonzero polynomial Q(X, Y1, Y2) with coefficients
from a finite field F of size q, a primitive element
γ of the field F, and an integer parameter k < q − 1,
find a list of all polynomials f(X) of degree at most k
such that Q(X, f(X), f(γX)) ≡ 0.

The following simple algebraic lemma is at the heart of our solution to
this problem.
Lemma 6.5. Let F be the field F_q of size q, and let γ be a primitive
element that generates its multiplicative group. Then we have the
following two facts:

(1) The polynomial h(X) := X^(q−1) − γ is irreducible over F.
(2) Every polynomial f(X) ∈ F[X] of degree less than q − 1 satisfies
f(γX) = f(X)^q mod h(X).
Proof. The fact that h(X) = X^(q−1) − γ is irreducible over F_q follows
from a known, precise characterization of all irreducible binomials, i.e.,
polynomials of the form X^a − c; see for instance [55, Chapter 3,
Section 5]. For completeness, and since this is an easy special case, we
now prove this fact. Suppose h(X) is not irreducible and some irreducible
polynomial f(X) ∈ F[X] of degree b, 1 ≤ b < q − 1, divides it. Let ζ be
a root of f(X) in the extension field F_(q^b). We then have ζ^(q^b − 1) = 1.
Also, f(ζ) = 0 implies ζ^(q−1) = γ. These equations together imply
γ^((q^b − 1)/(q−1)) = 1. Now, γ is primitive in F_q, so that γ^m = 1 iff m is
divisible by (q − 1). We conclude that q − 1 must divide
1 + q + q² + ··· + q^(b−1). This is, however, impossible since
1 + q + q² + ··· + q^(b−1) ≡ b (mod (q − 1)) and 0 < b < q − 1. This
contradiction proves that h(X) has no such factor of degree less than
q − 1, and is therefore irreducible.

For the second part, we have the simple but useful identity f(X)^q =
f(X^q) that holds for all polynomials in F_q[X]. Therefore, f(X)^q −
f(γX) = f(X^q) − f(γX). The latter polynomial is clearly divisible
by X^q − γX, and thus also by X^(q−1) − γ. Hence f(X)^q ≡ f(γX)
(mod h(X)), which implies that f(X)^q mod h(X) = f(γX) since the
degree of f(γX) is less than q − 1.
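Part (2) is easy to sanity-check computationally. The sketch below (our own check, not from the text) works in F_5 with the primitive element γ = 2, reduces modulo h(X) = X⁴ − 2, and confirms the identity f(γX) = f(X)^q mod h(X) for one sample f:

```python
# Numerical check of Lemma 6.5(2) in F_5 with gamma = 2 (primitive mod 5).
q, gamma = 5, 2

def polymulmod(a, b, q, gamma):
    """Multiply a, b in F_q[X], reduce modulo h(X) = X^(q-1) - gamma.

    Coefficient lists are lowest degree first; X^(q-1) is replaced by gamma."""
    prod = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            prod[i + j] = (prod[i + j] + ai * bj) % q
    while len(prod) >= q:                 # X^(q-1+e) -> gamma * X^e
        c = prod.pop()
        prod[len(prod) - (q - 1)] = (prod[len(prod) - (q - 1)] + gamma * c) % q
    return prod + [0] * (q - 1 - len(prod))

f = [1, 3, 0, 2]                          # f(X) = 1 + 3X + 2X^3, degree < q - 1
lhs = [(c * pow(gamma, i, q)) % q for i, c in enumerate(f)]   # f(gamma*X)
rhs = [1, 0, 0, 0]                        # build f(X)^q mod h(X) by q multiplications
for _ in range(q):
    rhs = polymulmod(rhs, f, q, gamma)
print(lhs == rhs)  # True
```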
Armed with this lemma, we are ready to tackle the trivariate
root-finding problem. This is in analogy with Section 4.5.2.
Lemma 6.6. There is a deterministic algorithm that on input a finite
field F of size q, a primitive element γ of the field F, a nonzero
polynomial Q(X, Y1, Y2) ∈ F[X, Y1, Y2] of degree less than q in Y1, and an
integer parameter k < q − 1, outputs a list of all polynomials f(X) of
degree at most k satisfying the condition Q(X, f(X), f(γX)) ≡ 0. The
algorithm has runtime polynomial in q.


Proof. Let h(X) = X^(q−1) − γ. We know by Lemma 6.5 that h(X)
is irreducible. We first divide out the largest power of h(X) that
divides Q(X, Y1, Y2) to obtain Q0(X, Y1, Y2), where Q(X, Y1, Y2) =
h(X)^b · Q0(X, Y1, Y2) for some b ≥ 0 and h(X) does not divide
Q0(X, Y1, Y2). Clearly, if f(X) satisfies Q(X, f(X), f(γX)) = 0, then
Q0(X, f(X), f(γX)) = 0 as well, so we will work with Q0 instead of Q.
Let us view Q0(X, Y1, Y2) as a polynomial T0(Y1, Y2) with coefficients
from F[X]. Further, reduce each of the coefficients modulo h(X) to
get a polynomial T(Y1, Y2) with coefficients from the extension field
F̃ := F[X]/(h(X)) (this is a field since h(X) is irreducible over F). We
note that T(Y1, Y2) is a nonzero polynomial since Q0(X, Y1, Y2) is not
divisible by h(X).

In view of Lemma 6.5, it suffices to find degree ≤ k polynomials
f(X) satisfying Q0(X, f(X), f(X)^q) mod h(X) = 0. In turn, this
means it suffices to find elements Γ ∈ F̃ satisfying T(Γ, Γ^q) = 0. If we
define the univariate polynomial R(Y1) := T(Y1, Y1^q), this is equivalent
to finding all Γ ∈ F̃ such that R(Γ) = 0, or in other words the roots in
F̃ of R(Y1).

Now R(Y1) is a nonzero polynomial since R(Y1) = 0 iff Y2 − Y1^q
divides T(Y1, Y2), and this cannot happen as T(Y1, Y2) has degree
less than q in Y1. The degree of R(Y1) is at most dq where d is the
total degree of Q(X, Y1, Y2). The characteristic of F̃ is at most q, and
its degree over the underlying prime field is at most q log q. Therefore,
we can find all roots of R(Y1) by a deterministic algorithm running in
time polynomial in d, q [6] (see discussion in Section 4.5.2).
Each of the roots will be a polynomial in F[X] of degree less than
q − 1. Once we find all the roots, we prune the list and only output
those roots f(X) that have degree at most k and satisfy Q0(X, f(X),
f(γX)) = 0.
With this, we have a polynomial time implementation of the algorithm
trivariate-FRS-decoder. There is the technicality that the degree
of Q(X, Y1, Y2) in Y1 should be less than q. This degree is at most D/k,
which by the choice of D in (6.1) is at most (r + 3)·(n/k)^(1/3) <
(r + 3)·q^(1/3). For a fixed r and growing q, the degree is much smaller
than q. (In fact, for constant rate codes, the degree is a constant
independent of n.) By letting m, r grow in Lemma 6.4, and recalling
that the running time is polynomial in n, r, we can conclude the
following main result of this section.
Theorem 6.7. For every ε > 0 and R, 0 < R < 1, there is a family of
m-folded Reed–Solomon codes for m = O(1/ε) which have rate at least
R and which can be list decoded up to a fraction 1 − (1 + ε)·R^(2/3) of
errors in time polynomial in the block length and 1/ε.

6.5 Codes approaching list decoding capacity

Given that trivariate interpolation improved the decoding radius
achievable with rate R from 1 − R^(1/2) to 1 − R^(2/3), it is natural to
attempt to use higher order interpolation to improve the decoding
radius further. In this section, we discuss the (quite straightforward)
technical changes needed for such a generalization.

Consider again the m-folded RS code C′ = FRS_{F,γ,m,k} where
F = F_q. Let s be an integer in the range 1 ≤ s ≤ m. We will develop a
decoding algorithm based on interpolating an (s + 1)-variate polynomial
Q(X, Y1, Y2, ..., Ys). The definitions of the (1, k, k, ..., k)-weighted
degree (with k repeated s times) of Q and the multiplicity at a point
(α, β1, β2, ..., βs) ∈ F^(s+1) are straightforward extensions of
Definitions 6.2 and 6.3.
As before, let y = (y0, y1, ..., y_(n−1)) be the unfolded version of the
received word z ∈ (F^m)^N of the folded RS code that needs to be
decoded. For convenience, define y_j = y_(j mod n) for j ≥ n. Following
the algorithm trivariate-FRS-decoder, for suitable integer parameters
D, r, the interpolation phase of the (s + 1)-variate FRS decoder will fit
a nonzero polynomial Q(X, Y1, ..., Ys) with the following properties:

(1) It has (1, k, k, ..., k)-weighted degree at most D.
(2) It has a zero of multiplicity r at (γ^i, yi, y_(i+1), ..., y_(i+s−1)) for
i = 0, 1, ..., n − 1.


The following is a straightforward generalization of Lemmas 6.2 and 6.3.

Lemma 6.8.

(1) Provided D^(s+1)/((s + 1)!·k^s) > n·(r+s choose s+1), a nonzero
polynomial Q(X, Y1, ..., Ys) with the above stated properties
exists and moreover can be found in time polynomial in n
and r^s.
(2) Let t be an integer such that t > D/((m − s + 1)r). Then every
polynomial f(X) ∈ F[X] of degree at most k whose encoding
according to FRS_{F,γ,m,k} agrees with the received word
z on at least t locations satisfies Q(X, f(X), f(γX), ...,
f(γ^(s−1) X)) ≡ 0.

Proof. The first part follows from (i) a simple lower bound on the
number of monomials X^a · Y1^(b1) ··· Ys^(bs) with a + k(b1 + b2 + ··· + bs) ≤
D, which gives the number of coefficients of Q(X, Y1, ..., Ys), and (ii) an
estimation of the number of (s + 1)-variate monomials of total degree
less than r, which gives the number of interpolation conditions per
(s + 1)-tuple.

The second part is similar to the proof of Lemma 6.3. If f(X) has
agreement on at least t locations of z, then for at least t(m − s + 1)
of the (s + 1)-tuples (γ^i, yi, y_(i+1), ..., y_(i+s−1)), we have f(γ^(i+j)) = y_(i+j)
for j = 0, 1, ..., s − 1. As in Lemma 6.1, we conclude that R(X) :=
Q(X, f(X), f(γX), ..., f(γ^(s−1) X)) has a zero of multiplicity r at γ^i for
each such (s + 1)-tuple. Also, by design R(X) has degree at most D.
Hence if t(m − s + 1)·r > D, then R(X) has more zeroes (counting
multiplicities) than its degree, and thus R(X) ≡ 0.
Note the lower bound condition on D above is met with the choice

    D = ⌊(k^s · n · r(r + 1) ··· (r + s))^(1/(s+1))⌋ + 1.    (6.2)
The task of finding a list of all degree ≤ k polynomials f(X) ∈ F[X]
satisfying Q(X, f(X), f(γX), ..., f(γ^(s−1) X)) = 0 can be solved using
ideas similar to the proof of Lemma 6.6. First, by dividing out by h(X)
enough times, we can assume that not all coefficients of Q(X, Y1, ..., Ys),
viewed as a polynomial in Y1, ..., Ys with coefficients in F[X], are
divisible by h(X). We can then go modulo h(X) to get a nonzero
polynomial T(Y1, Y2, ..., Ys) over the extension field F̃ = F[X]/(h(X)). Now,
by Lemma 6.5, we have f(γ^j X) = f(X)^(q^j) mod h(X) for every j ≥ 1.
Therefore, the task at hand reduces to the problem of finding all roots
Γ ∈ F̃ of the polynomial R(Y1), where R(Y1) = T(Y1, Y1^q, ..., Y1^(q^(s−1))).
There is the risk that R(Y1) is the zero polynomial, but it is easily
seen that this cannot happen if the total degree of T is less than q.
This will be the case since the total degree is at most D/k, which is at
most (r + s)·(n/k)^(1/(s+1)) ≪ q.

The degree of the polynomial R(Y1) is at most q^s, and therefore
all its roots in F̃ can be found in q^(O(s)) time. We conclude that the
root-finding step can be accomplished in polynomial time.
The algorithm works for agreement t > D/((m − s + 1)r), which for the
choice of D in (6.2) is satisfied if

    t ≥ (1 + s/r) · (k^s · n)^(1/(s+1)) / (m − s + 1) + 2.

Recalling that the block length of the code is N = n/m and the rate is
(k + 1)/n, the algorithm can decode a fraction of errors approaching

    1 − (1 + s/r) · (m/(m − s + 1)) · R^(s/(s+1)),    (6.3)

using lists of size at most q^s. By picking r, m large enough compared
with s, the decoding radius can be made larger than 1 − (1 + ε)·R^(s/(s+1))
for any desired ε > 0. We state this result formally below.
Theorem 6.9. For every ε > 0, integer s ≥ 1 and 0 < R < 1, there
is a family of m-folded Reed–Solomon codes for m = O(s/ε), which
have rate at least R and which can be list decoded up to a fraction
1 − (1 + ε)·R^(s/(s+1)) of errors in time (Nm)^(O(s)), where N is the block
length of the code. The alphabet size of the code as a function of the
block length N is (Nm)^(O(m)).
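To see how fast the bound (6.3) approaches capacity, one can simply evaluate it. The sketch below is our own numerical illustration, with R = 1/2 and with r and m chosen large relative to s:

```python
# Decoding radius per (6.3) for the (s+1)-variate decoder, as s grows.
def radius(R, s, r, m):
    """1 - (1 + s/r) * (m/(m - s + 1)) * R^(s/(s+1)), the bound of (6.3)."""
    return 1 - (1 + s / r) * (m / (m - s + 1)) * R ** (s / (s + 1))

R = 0.5
for s in [1, 2, 4, 8, 16]:
    # Take r and m to be 100*s so that the (1 + s/r) and m/(m-s+1) losses stay small.
    print(s, round(radius(R, s, r=100 * s, m=100 * s), 4))
# The printed values climb toward list decoding capacity 1 - R = 0.5.
```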
In the limit of large s (specifically, for s = Θ((1/ε) log(1/R))), the
decoding radius approaches the list decoding capacity 1 − R, leading
to our main result of this chapter.


Theorem 6.10. [Explicit capacity-approaching codes] For every ε > 0
and 0 < R < 1, there is a family of folded Reed–Solomon codes which
have rate at least R and which can be list decoded up to a fraction
1 − R − ε of errors in time (N/ε²)^(O((1/ε) log(1/R))), where N is the block
length of the code. The alphabet size of the code as a function of the
block length N is (N/ε²)^(O(1/ε²)).

Remark 6. (Improvement in decoding radius) It is possible to
slightly improve the bound of (6.3) to 1 − (1 + s/r) · (mR/(m − s + 1))^(s/(s+1))
with essentially no effort. The idea is to use only a fraction (m − s + 1)/m
of the n (s + 1)-tuples for interpolation. Specifically, we omit the tuples
(γ^i, yi, y_(i+1), ..., y_(i+s−1)) where i mod m > m − s in the interpolation
conditions. The number of (s + 1)-tuples for which we have agreement
remains at least t(m − s + 1), since we only counted agreements on
tuples (γ^i, yi, y_(i+1), ..., y_(i+s−1)) for 0 ≤ i mod m ≤ m − s. However, the
number of interpolation conditions is now reduced to N(m − s + 1) =
n(m − s + 1)/m. This translates into the stated improvement in error
correction radius. For clarity of presentation, we simply chose to use
all n tuples for interpolation.

6.6 List recovering

Exactly as mentioned in Section 4.4.1 for the case of RS codes, the
above algorithms for folded RS codes work seamlessly for the list
recovering setting. The performance of the decoder is dictated by the
number n of interpolation points, and we did not use the fact that the
first coordinates of those tuples were distinct. Therefore, generalizing
the bound of (6.3), for any integer ℓ ≥ 1, the m-folded RS code of rate
R can be (p, ℓ, q^s)-list-recovered for
p = 1 − (1 + s/r) · (m/(m − s + 1)) · (ℓ·R^s)^(1/(s+1)).
For every choice of s, ℓ, and ε > 0, letting r, m grow, we can make
p ≥ 1 − (1 + ε)·ℓ^(1/(s+1))·R^(s/(s+1)). Using the choice
s = Θ((1/ε) log(ℓ/R)), we can achieve p ≥ 1 − R − ε, which is thus
independent of ℓ! Therefore, the achievable rate for list recovering (for
large enough alphabet size) depends only on the fraction of erroneous
input lists but not on the size of the input lists. This feature will be
crucially used in the next chapter to reduce the alphabet size needed
to achieve capacity in Theorem 6.10.
Theorem 6.11. (Achieving capacity for list recovering) For
every 0 < R < 1, integer ℓ ≥ 2 and all small enough ε > 0, there
is a family of folded Reed–Solomon codes (over fields of any
desired characteristic) which have rate at least R and which
can be (1 − R − ε, ℓ, ((N log ℓ)/ε²)^(O((1/ε) log(ℓ/R))))-list-recovered in time
((N log ℓ)/ε²)^(O((1/ε) log(ℓ/R))), where N is the block length of the code.
The alphabet size of the code as a function of the block length N is
((N log ℓ)/ε²)^(O((1/ε²) log ℓ)).

We can also state a further extension to soft-decision decoding of
folded RS codes. We skip the details here, which can be found in [37].

6.7 Bibliography and remarks

Two independent works by Coppersmith and Sudan [11] and Bleichenbacher, Kiayias, and Yung [7] considered the variant of RS codes where the message consists of two (or more) independent polynomials over F, and the encoding consists of the joint evaluation of these polynomials at elements of F (so this defines a code over F^2).¹ A naive way to decode these codes, which are also called interleaved Reed-Solomon codes, would be to recover the two polynomials individually, by running separate instances of the RS decoder. Of course, this gives no gain over the performance of RS codes. The hope in these works was that something can possibly be gained by exploiting the fact that errors in the two polynomials happen at synchronized locations. However, these works could not give any improvement over the 1 − √R bound known for RS codes for worst-case errors. Nevertheless, for random errors, where each error replaces the correct symbol by a uniformly random field element, they were able to correct well beyond a fraction 1 − √R of errors. In fact, as the order of interleaving (i.e., the number of independent polynomials)

¹ The resulting code is in fact just a Reed-Solomon code where the evaluation points belong to the subfield F of the extension field over F of degree two.


grows, the radius approaches the optimal value 1 − R. This model of random errors is not very practical or interesting in a coding-theoretic setting, though the algorithms are interesting from an algebraic viewpoint.
The algorithm of Coppersmith and Sudan bears an intriguing relation to multivariate interpolation. Multivariate interpolation essentially amounts to finding a non-trivial linear dependence among the rows of a certain matrix (that consists of the evaluations of appropriate monomials at the interpolation points). The algorithm in [11] instead finds a non-trivial linear dependence among the columns of this same matrix! The positions corresponding to columns not involved in this dependence are erased (they correspond to error locations) and the codeword is recovered from the remaining symbols using erasure decoding.
In [61], Parvaresh and Vardy gave a heuristic decoding algorithm for these interleaved RS codes based on multivariate interpolation. However, the provable performance of these codes coincided with the 1 − √R bound for Reed-Solomon codes. The key obstacle in improving this bound was the following: for the case when the messages are pairs (f(X), g(X)) of polynomials, two algebraically independent relations were needed to identify both f(X) and g(X). The interpolation method could only provide one such relation in general (of the form Q(X, f(X), g(X)) = 0 for a trivariate polynomial Q(X, Y, Z)). This still left too much ambiguity in the possible values of (f(X), g(X)). (The approach in [61] was to find several interpolation polynomials, but there was no guarantee that they were not all algebraically dependent.)
Then, in [62], Parvaresh and Vardy put forth the ingenious idea of obtaining the extra algebraic relation essentially for free by enforcing it as an a priori condition satisfied at the encoder. Specifically, instead of letting the second polynomial g(X) be an independent degree k polynomial, their insight was to make it correlated with f(X) by a specific algebraic condition, such as g(X) = f(X)^d mod h(X) for some integer d and an irreducible polynomial h(X) of degree k + 1. Then, once we have the interpolation polynomial Q(X, Y, Z), f(X) can be found as described in this chapter: reduce the coefficients of Q(X, Y, Z) modulo h(X) to get a polynomial T(Y, Z) with coefficients from F[X]/(h(X)) and then find the roots of the univariate polynomial

T(Y, Y^d). This was the key idea in [62] to improve the 1 − √R decoding radius for rates less than 1/16. For rates R → 0, their decoding radius approached 1 − O(R log(1/R)).
The modification in using independent polynomials does not come for free, however. In particular, since one sends at least twice as much information as in the original RS code, there is no way to construct codes with rate more than 1/2 in the PV scheme. If we use s ≥ 2 correlated polynomials for the encoding, we incur a factor 1/s loss in the rate. This proves quite expensive, and as a result the improvements over RS codes offered by these codes are only manifest at very low rates.
The central idea in the work of Guruswami and Rudra on list decoding folded RS codes [37] was to avoid this rate loss by making the correlated polynomial g(X) essentially identical to the first (say g(X) = f(γX)). Then the evaluations of g(X) can be inferred as a simple cyclic shift of the evaluations of f(X), so intuitively there is no need to explicitly include those too in the encoding. The folded RS encoding of f(X) compresses all the needed information, without any extra redundancy for g(X). In particular, from a received word that agrees with the folded RS encoding of f(X) in many places, we can infer a received word (with symbols in F^2) that matches the values of both f(X) and f(γX) = g(X) in many places, and then run the decoding algorithm of Parvaresh and Vardy.
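The cyclic-shift fact used here is easy to verify concretely. The following toy computation (our illustration, with an arbitrary small prime field; this is not the actual decoder) checks that the evaluations of g(X) = f(γX) at the points γ^0, γ^1, . . . are a cyclic shift of the evaluations of f(X):

```python
# Toy check over F_17: gamma = 3 generates the multiplicative group, so
# evaluating g(X) = f(gamma * X) at the points gamma^0, gamma^1, ... simply
# shifts the evaluation table of f by one position (illustrative values).
p = 17
gamma = 3
coeffs = [2, 5, 7, 11]   # an arbitrary message polynomial f, low degree first

def evaluate(poly, x, mod):
    """Evaluate poly at x over the integers mod `mod`."""
    return sum(c * pow(x, i, mod) for i, c in enumerate(poly)) % mod

points = [pow(gamma, i, p) for i in range(p - 1)]        # gamma^0 .. gamma^15
f_evals = [evaluate(coeffs, x, p) for x in points]
g_evals = [evaluate(coeffs, (gamma * x) % p, p) for x in points]

# g's evaluation table is f's table cyclically shifted by one position.
print(g_evals == f_evals[1:] + f_evals[:1])  # True
```

This is why the folded encoding of f already contains the evaluations of g(X) = f(γX): no extra symbols need to be transmitted.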
The terminology of folded RS codes was coined in [52], where an algorithm to correct random errors in such codes was presented (for a noise model similar to the one used in [7, 11] that was mentioned earlier). The motivation was to decode RS codes from many random phased burst errors. Our decoding algorithm for folded RS codes can likewise be viewed as an algorithm to correct beyond the 1 − √R bound for RS codes if errors occur in large, phased bursts (the actual errors can be adversarial).

7 Achieving Capacity over Bounded Alphabets

The capacity-achieving codes from the previous chapter have two main shortcomings: (i) their alphabet size is a large polynomial in the block length, and (ii) the bound on the worst-case list size as well as the decoding time complexity grows as n^{Ω(1/ε)}, where ε is the distance to capacity. In this chapter, we will remedy the alphabet size issue (Section 7.2). We begin by using the folded RS codes in a concatenation scheme to get good list-decodable binary codes.

7.1 Binary codes decodable up to Zyablov bound

The optimal list recoverability of the folded RS codes discussed in Section 6.6 plays a crucial role in establishing the following result concerning list decoding binary codes. The decoding radius achieved matches the standard product bound on the designed relative distance of binary concatenated codes, namely the product of the relative distance of an outer MDS code with the relative distance of an inner code that meets the Gilbert-Varshamov bound.



Theorem 7.1. For all 0 < R, r < 1 and all ε > 0, there is a polynomial time constructible family of binary linear codes of rate at least R · r that can be list decoded in polynomial time up to a fraction (1 − R)H^{-1}(1 − r) − ε of errors.
Proof. [Sketch] The idea is to use an appropriate concatenation scheme with an outer code C1 and a binary linear inner code C2. For C1, we will use a folded RS code over a field of characteristic 2 as guaranteed by Theorem 6.11, with the following properties: (i) the rate of C1 is at least R, (ii) it can be (1 − R − ε, ℓ, L(N))-list-recovered in polynomial (N^{O(1/ε^2)}, where N is the block length) time for ℓ = 10/ε, and (iii) its alphabet size is 2^M for M = O(ε^{-3} log N). The code C2 will be a binary linear code of dimension M (so that it can be concatenated with C1) and rate at least r which is (ρ, ℓ)-list decodable for ρ = H^{-1}(1 − r) − ε. Such a code is known to exist via a random coding argument that employs the semi-random method [30]. Also, a greedy construction of such a code by constructing its M basis elements in turn is known and takes 2^{O(M)} time. We conclude that the necessary inner code can be constructed in N^{O(1/ε^3)}, and thus polynomial, time. Note that the concatenated code is a binary linear code of rate at least R · r.
The decoding algorithm proceeds in a natural way. Given a received word, we break it up into blocks corresponding to the various inner encodings by C2. Each of these blocks is list decoded up to a radius ρ, returning a list of at most ℓ possible candidates for each outer codeword symbol. The outer code is then (1 − R − ε, ℓ, L(N))-list recovered using these lists as input. To argue about the fraction of errors this algorithm corrects, we note that the algorithm fails to recover a codeword only if on more than a fraction (1 − R − ε) of the inner blocks, the codeword differs from the received word on more than a fraction ρ of the symbols. It follows that the algorithm correctly list decodes up to a radius ρ(1 − R − ε) = (1 − R − ε)(H^{-1}(1 − r) − ε).
Optimizing over the choice of the inner and outer code rates r, R in the above result, we can decode up to the so-called Zyablov bound, depicted in Figure 7.1.


[Figure 7.1 appears here: a plot of the error-correction radius ρ (y-axis) against the rate R (x-axis), showing the Zyablov bound alongside the binary list decoding capacity.]

Fig. 7.1 Error-correction radius of our algorithm for binary codes plotted against the rate R. The best possible trade-off, i.e., capacity, is ρ = H^{-1}(1 − R), and is also plotted.
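For concreteness, the Zyablov trade-off can be evaluated numerically. The following sketch (our own illustration, not from the text) computes, for a given overall rate R0 = R · r, the radius max over r in (R0, 1) of (1 − R0/r) · H^{-1}(1 − r), using a bisection-based inverse of the binary entropy function:

```python
import math

def H(x):
    """Binary entropy function (in bits)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def H_inv(y, tol=1e-12):
    """Inverse of H restricted to [0, 1/2], computed by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if H(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def zyablov_radius(overall_rate, steps=2000):
    """max over the inner rate r of (1 - overall_rate/r) * H^{-1}(1 - r)."""
    best = 0.0
    for i in range(1, steps):
        r = overall_rate + (1 - overall_rate) * i / steps
        best = max(best, (1 - overall_rate / r) * H_inv(1 - r))
    return best

# The Zyablov radius is small but positive for every overall rate in (0, 1).
print(round(zyablov_radius(0.5), 3))
```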

Remark 7. The construction time of the codes of Theorem 7.1 is n^{Ω(1/ε^3)}, where n is the block length (this complexity bound arises due to the search for an appropriate inner code). A uniformly constructive family of binary codes, i.e., with construction time f(ε)n^c for some c independent of ε, would be more desirable. The same applies to the decoding complexity.

7.2 Approaching capacity over constant-sized alphabets

We now show how to approach the list decoding capacity 1 − R with an alphabet size that is a constant depending only on the distance to capacity.
Theorem 7.2. (Main) For every R, 0 < R < 1, and every ε > 0, there is a polynomial time constructible family of codes over an alphabet of size 2^{O(ε^{-4} log(1/ε))} that have rate at least R and which can be list decoded up to a fraction (1 − R − ε) of errors in polynomial time.



Proof. [Sketch] The theorem can be proved using the code construction scheme used in [3, 35] for linear-time unique-decodable codes with optimal rate, with different components appropriate for list decoding plugged in. We only highlight the high level ideas and spare the details, which can be found in [35, 37].
As in the above result for binary codes, the list recoverability of folded RS codes will play a crucial role. To motivate the construction, imagine the following concatenation scheme to combine an outer code C1 with an inner code C2. For C1, we use a folded RS code of rate close to 1 that can be (0, ℓ, L)-list-recovered for ℓ = Θ(1/ε). Here we are exploiting the fact, mentioned in Section 6.6, that the size ℓ of the input lists does not figure in the price we pay in terms of rate.
For C2, we will use a (1 − R − ε/2, O(1/ε))-list decodable code with near-optimal rate, say rate at least (R + ε/4). As mentioned in Corollary 3.7, such a (non-linear) code can be shown to exist over an alphabet of size 2^{O(1/ε)} using random coding arguments. A naive brute-force search for such a code, however, is too expensive. Fortunately, a similar guarantee holds for a random code drawn from a much smaller ensemble that was introduced and called pseudolinear codes in [31] (see also [28, Section 9.3]). This enables polynomial time construction of such an inner code C2. The number of codewords in C2 equals the alphabet size of the folded RS code C1, which is polynomial in the block length (for each fixed ε). Therefore, C2 can be list decoded by a brute-force search over all its codewords in polynomial time. The overall rate of the concatenated code is at least R since the rate of C1 is very close to 1.
Now suppose a fraction (1 − R − ε) of errors occur on the symbols of the concatenated code. Assume an ideal scenario where these errors are uniformly distributed amongst all the inner blocks, i.e., no block internally suffers from more than a fraction (1 − R − ε/2) of errors. Then a brute-force list decoding of each of the inner blocks can be used to return a list of at most ℓ = O(1/ε) symbols for each position of C1. By the assumption on the distribution of errors, each of these lists will contain the correct symbol. Therefore, running the assumed list recovering algorithm for C1 will recover the correct codeword.


Of course, the above assumption on the equitable distribution of errors amongst the blocks is not valid. The key idea is to somehow simulate such a distribution. For this purpose, the symbols of the concatenated codewords are further redistributed using an expander graph to produce a codeword over a larger alphabet (but with no further loss in rate). The pseudorandom properties of the expander ensure that for every error pattern affecting a fraction (1 − R − ε) of positions, once symbols are distributed backwards, most (say, a fraction 1 − O(ε)) of the blocks corresponding to inner codewords of C2 incur at most a fraction (1 − R − ε/2) of errors. If instead of a (0, ℓ, L)-list-recoverable code C1, we use a list-recoverable code that can tolerate a small O(ε) fraction of erroneous lists (which is still possible with a very high rate of 1 − O(ε)), the errors in decoding the few deviant inner blocks can be handled when list recovering C1.
We skip the formal details that are not hard to work out and follow
along the same lines as the arguments in [35].
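A rough accounting of the rate in this scheme (our own sketch; the constant c below is an assumption, and the precise parameter choices are in [35, 37]) shows why nothing is lost:

```latex
% Outer code C_1: rate 1 - c\varepsilon, list-recoverable even when an
% O(\varepsilon) fraction of the input lists are erroneous (Theorem 6.11).
% Inner code C_2: rate R + \varepsilon/4.  The expander redistribution only
% regroups symbols, so it costs no rate.  Hence
\[
  \text{overall rate} \;=\; (1 - c\varepsilon)
      \Bigl(R + \frac{\varepsilon}{4}\Bigr) \;\ge\; R
  \quad\Longleftrightarrow\quad
  c \;\le\; \frac{1}{4\,(R + \varepsilon/4)} ,
\]
% so any c \le 1/5 suffices for all 0 < R < 1 and \varepsilon \le 1, while
% the decoding radius remains 1 - R - \varepsilon.
```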

8 Concluding Thoughts

We have surveyed the topic of list decoding with a focus on the recent advances in decoding algorithms for Reed-Solomon codes and close variants, culminating with a presentation of how to achieve list decoding capacity over large alphabets. We conclude by mentioning some interesting open questions and directions for future work.

8.1 Improving list size of capacity-achieving codes

To list decode a fraction 1 − R − ε of errors with rate R, the decoding complexity as well as the worst-case list size needed are n^{Ω(1/ε)}. It remains a challenge to reduce both of these, and in particular, to achieve a list size bound that is independent of n. Recall that the existential results of Section 3.2.1 imply that a bound of O(1/ε) suffices.
The large bound on list size for decoding folded RS codes arises due to the large degree (the field size q) of the algebraic relation between f(X) and its shift f(γX). The bound on the list size was derived from the degree of the final univariate polynomial whose roots contain all the solution messages. The degree of this polynomial grew like q^{s-1}, leading to the large bound mentioned above. If the degree of the relation


can be made an absolute constant independent of q, then we can attain the goal of constant list size.¹ This motivates the concrete question: does there exist a γ ∈ F_q such that f(X) and f(γX), viewed as elements (via a natural injective map) of some extension field F̃ over F_q, satisfy P(f(X), f(γX)) = 0 for a nonzero polynomial P(Y, Z) ∈ F̃[Y, Z] of degree an absolute constant? In [37], it is argued that if P(Y, Z) has degree 1 in Z, i.e., has the form Z − p(Y), then p(Y) must have degree at least q in Y. Therefore, such a polynomial must be nonlinear in both Y and Z.
Another potential avenue for improving the complexity and list size is via a generalization of folded RS codes to folded algebraic-geometric (AG) codes. In [36], the authors define correlated AG codes, and describe list decoding algorithms for those codes, based on a generalization of the Parvaresh-Vardy approach to the general class of algebraic-geometric codes (of which RS codes are a special case). However, relating folded AG codes to correlated AG codes as we did for RS codes requires bijections on the set of rational points of the underlying algebraic curve that have some special, hard to guarantee, property. This step seems like a highly intricate algebraic task, especially so in the interesting asymptotic setting of a family of asymptotically good AG codes over a fixed alphabet.

8.2 Achieving capacity over small alphabets

One of the daunting challenges in the subject is to construct explicit binary codes along with polynomial time decoding algorithms that achieve the (binary) list decoding capacity. The question remains wide open over any fixed alphabet size.
Perhaps a somewhat less daunting challenge is to explicitly achieve the list decoding capacity of binary (or q-ary for small q) codes for the erasure channel. In the erasure model, some codeword symbols are not received, the rest are received without error, and the receiver knows
¹ This was the case in the work of Parvaresh and Vardy [62], since they had complete flexibility in picking the degree d (it just had to be larger than some absolute constant) and they defined the correlated polynomial to be g(X) = f(X)^d mod h(X) (instead of f(γX)).



the locations corresponding to the symbols received. With rate R, the existential results indicate that one can list decode from a fraction 1 − R − ε of erasures, but the best constructive results are far from this bound. For linear codes, the erasure decoding problem is easy algorithmically, so the question becomes a mainly combinatorial one. We refer the reader to [25] for more context on list decoding under erasures.

8.3 Applications outside coding theory

The motivation for the above questions on achieving list decoding capacity is primarily a coding-theoretic one. Resolving these questions may not directly impact any of the applications of list decoding outside coding theory, for example, the various complexity-theoretic applications. However, codes are by now a central combinatorial object in the toolkit of computer science, and they are intertwined with many other fundamental pseudorandom objects such as expander graphs and extractors. Therefore, any new idea to construct substantially better codes could potentially have broader impact. As an example of this phenomenon, we mention the recent work [44], which proved that the Parvaresh-Vardy codes yield excellent condensers that achieve near-optimal compression of a weak random source while preserving essentially all of its entropy. The compressed source has a high enough entropy rate to enable easy extraction of almost all its randomness using previously known methods. Together, this yields simple and self-contained randomness extractors that are optimal up to constant factors, matching the previously best known construction due to [57] (which was quite complicated and built upon a long line of previous work).
Therefore, the algebraic ideas underlying the recent developments in achieving list decoding capacity have already found powerful applications in areas outside coding theory. In the other direction, pseudorandom constructs have already yielded many new developments in coding theory, such as expander-based codes [1, 31, 33, 66, 68], extractor codes [26, 72], and codes from the XOR lemma [47, 73]. It is our hope that exploring these connections in greater depth might lead to further interesting progress in coding theory; for example, it is interesting to see if insights from some exciting recent combinatorial techniques, such

as the zigzag graph product used to construct expanders, can be used


to develop novel methods to construct codes.

8.4 Combinatorial questions

Several fundamental combinatorial aspects of list decoding are still not well understood. For example, to get within ε of list decoding capacity (say for binary codes), what is the list size needed (as a function of ε)? Can one show that it is Ω(1/ε), matching the simple random coding argument of Section 3.2? More generally, techniques to lower bound the required list size are of interest; see [45] for recent work showing an Ω(1/ε^2) lower bound for list decoding a fraction (1 − 1/q − ε) of errors with q-ary codes.
The large discrepancy between the strength of random coding
results for linear and general codes, as outlined in Remark 3, merits
further investigation.
The following seems to be another intricate combinatorial task: construct an explicit code and an explicit center with exponentially many codewords in a Hamming ball of normalized radius less than the relative distance around it. In addition to the intrinsic interest in such bad list decoding configurations, this has potential applications in derandomizing the hardness results for approximating the minimum distance of a linear code [12].

8.5 Decoding concatenated codes using soft information

The approach to construct binary concatenated codes list decodable up to the Zyablov bound uses the list recovering property of the outer folded RS code. The inner decoder returns a list of symbols for each position of the outer folded RS code, but there is no weighting of the various symbols in such a list. Associating such weights and passing soft information that can be exploited by the outer decoder (Section 4.4.2) is a promising approach for improved decoding of concatenated codes. An approach based on convex optimization [49, 50] has been proposed to find the best choice of weights to use in concatenated decoding schemes. The involved computations quickly become infeasible for general, large inner codes, though experimentation with


small inner codes having well-understood coset weight distributions is
possible. Analytically, however, it remains a big challenge to understand
good choices of weights and establish provable bounds on performance
of such decoders. Some analytic work, for rather special or tailor-made
inner codes, appears in [41, 42].

8.6 New vistas

Much of the success in constructing list decodable codes with good rate has hinged on algebraic ideas. In particular, most results depend directly or indirectly on the list decoding algorithm for Reed-Solomon codes or its variants. We now discuss other vistas in which one might look for codes that depart from the algebraic paradigm. Several connections between list-decodable codes and pseudorandomness have emerged in recent years, and in particular a basic tool in the latter topic called the XOR lemma can be used to construct list-decodable codes [73]. However, the rate of such codes has so far been sub-constant in the block length.
In Chapter 5, we discussed a combinatorial construction of list-decodable codes based on expander graphs due to [33]. These codes have positive rate and can be list decoded from a (1 − ε) fraction of errors, for any desired design parameter ε > 0, by a linear time combinatorial algorithm. The rate of these codes, while positive, is rather small, and not competitive with the algebraic constructions. It remains to be seen if this can be improved with more sophisticated ideas. Some limited progress was made in [34] for the case of erasures, and list recovering with zero error (when all the input lists are guaranteed to contain the respective codeword symbol).
Progress on list decoding of new families of graph-based codes such as low-density parity-check (LDPC) codes would be very exciting. Such codes have been the source of substantial progress in getting close to capacity with very efficient algorithms for basic stochastic channels such as the binary erasure channel, the AWGN channel, and the binary symmetric channel (for basic details and pointers, see, for instance, the survey [22]). Whether algorithmic tools can be developed to list decode LDPC or related codes remains an intriguing question.

Acknowledgments

The author thanks Madhu Sudan and the anonymous reviewer for a
careful reading of the manuscript and for their very useful comments
on improving the quality and coverage of the presentation.


References

[1] N. Alon, J. Bruck, J. Naor, M. Naor, and R. Roth, Construction of asymptotically good low-rate error-correcting codes through pseudo-random graphs, IEEE Transactions on Information Theory, vol. 38, pp. 509-516, 1992.
[2] N. Alon and F. R. K. Chung, Explicit construction of linear sized tolerant networks, Discrete Mathematics, vol. 72, pp. 15-19, 1988.
[3] N. Alon and M. Luby, A linear time erasure-resilient code with nearly optimal recovery, IEEE Transactions on Information Theory, vol. 42, no. 6, pp. 1732-1736, 1996.
[4] N. Alon and J. Spencer, The Probabilistic Method. John Wiley and Sons, Inc., 1992.
[5] S. Ar, R. Lipton, R. Rubinfeld, and M. Sudan, Reconstructing algebraic functions from mixed data, SIAM Journal on Computing, vol. 28, no. 2, pp. 488-511, 1999.
[6] E. Berlekamp, Factoring polynomials over large finite fields, Mathematics of Computation, vol. 24, pp. 713-735, 1970.
[7] D. Bleichenbacher, A. Kiayias, and M. Yung, Decoding of interleaved Reed-Solomon codes over noisy data, in Proceedings of the 30th International Colloquium on Automata, Languages and Programming, pp. 97-108, 2003.
[8] V. M. Blinovsky, Bounds for codes in the case of list decoding of finite volume, Problems of Information Transmission, vol. 22, no. 1, pp. 7-19, 1986.
[9] V. M. Blinovsky, Code bounds for multiple packings over a nonbinary finite alphabet, Problems of Information Transmission, vol. 41, no. 1, pp. 23-32, 2005.

[10] D. Boneh, Finding smooth integers in short intervals using CRT decoding, in Proceedings of the 32nd Annual ACM Symposium on Theory of Computing, pp. 265-272, 2000.
[11] D. Coppersmith and M. Sudan, Reconstructing curves in three (and higher) dimensional spaces from noisy data, in Proceedings of the 35th Annual ACM Symposium on Theory of Computing, pp. 136-142, June 2003.
[12] I. Dumer, D. Micciancio, and M. Sudan, Hardness of approximating the minimum distance of a linear code, IEEE Transactions on Information Theory, vol. 49, no. 1, pp. 22-37, 2003.
[13] P. Elias, List decoding for noisy channels, Technical Report 335, Research Laboratory of Electronics, MIT, 1957.
[14] P. Elias, Error-correcting codes for list decoding, IEEE Transactions on Information Theory, vol. 37, pp. 5-12, 1991.
[15] G. D. Forney, Concatenated Codes. MIT Press, Cambridge, MA, 1966.
[16] P. Gemmell and M. Sudan, Highly resilient correctors for multivariate polynomials, Information Processing Letters, vol. 43, no. 4, pp. 169-174, 1992.
[17] O. Goldreich and L. Levin, A hard-core predicate for all one-way functions, in Proceedings of the 21st Annual ACM Symposium on Theory of Computing, pp. 25-32, May 1989.
[18] O. Goldreich, D. Ron, and M. Sudan, Chinese remaindering with errors, IEEE Transactions on Information Theory, vol. 46, no. 5, pp. 1330-1338, July 2000.
[19] O. Goldreich, R. Rubinfeld, and M. Sudan, Learning polynomials with queries: The highly noisy case, in Proceedings of the 36th Annual IEEE Symposium on Foundations of Computer Science, pp. 294-303, 1995.
[20] O. Goldreich, R. Rubinfeld, and M. Sudan, Learning polynomials with queries: The highly noisy case, SIAM Journal on Discrete Mathematics, vol. 13, no. 4, pp. 535-570, November 2000.
[21] P. Gopalan, R. Lipton, and Y. Ding, Error correction against computationally bounded adversaries, Theory of Computing Systems, to appear.
[22] V. Guruswami, Iterative decoding of low-density parity check codes (a survey), CoRR, cs.IT/0610022, 2006. Appears in Issue 90 of the Bulletin of the EATCS.
[23] V. Guruswami, List decoding with side information, in Proceedings of the 18th IEEE Conference on Computational Complexity (CCC), pp. 300-309, July 2003.
[24] V. Guruswami, Limits to list decodability of linear codes, in Proceedings of the 34th ACM Symposium on Theory of Computing, pp. 802-811, 2002.
[25] V. Guruswami, List decoding from erasures: Bounds and code constructions, IEEE Transactions on Information Theory, vol. 49, no. 11, pp. 2826-2833, 2003.
[26] V. Guruswami, Better extractors for better codes?, in Proceedings of the 36th Annual ACM Symposium on Theory of Computing (STOC), pp. 436-444, June 2004.
[27] V. Guruswami, Error-correcting codes and expander graphs, SIGACT News, pp. 25-41, September 2004.

[28] V. Guruswami, List decoding of error-correcting codes, Lecture Notes in Computer Science, no. 3282, Springer, 2004.
[29] V. Guruswami, List decoding in pseudorandomness and average-case complexity, in IEEE Information Theory Workshop, March 2006.
[30] V. Guruswami, J. Håstad, M. Sudan, and D. Zuckerman, Combinatorial bounds for list decoding, IEEE Transactions on Information Theory, vol. 48, no. 5, pp. 1021-1035, 2002.
[31] V. Guruswami and P. Indyk, Expander-based constructions of efficiently decodable codes, in Proceedings of the 42nd Annual IEEE Symposium on Foundations of Computer Science, pp. 658-667, 2001.
[32] V. Guruswami and P. Indyk, Near-optimal linear-time codes for unique decoding and new list-decodable codes over smaller alphabets, in Proceedings of the 34th Annual ACM Symposium on Theory of Computing (STOC), pp. 812-821, 2002.
[33] V. Guruswami and P. Indyk, Linear-time encodable and list decodable codes, in Proceedings of the 35th Annual ACM Symposium on Theory of Computing (STOC), pp. 126-135, June 2003.
[34] V. Guruswami and P. Indyk, Linear-time list decoding in error-free settings, in Proceedings of the 31st International Colloquium on Automata, Languages and Programming (ICALP), pp. 695-707, July 2004.
[35] V. Guruswami and P. Indyk, Linear-time encodable/decodable codes with near-optimal rate, IEEE Transactions on Information Theory, vol. 51, no. 10, pp. 3393-3400, October 2005.
[36] V. Guruswami and A. Patthak, Correlated algebraic-geometric codes: Improved list decoding over bounded alphabets, in Proceedings of the 47th IEEE Symposium on Foundations of Computer Science (FOCS), pp. 227-238, October 2006. Journal version to appear in Mathematics of Computation.
[37] V. Guruswami and A. Rudra, Explicit capacity-achieving list-decodable codes, in Proceedings of the 38th Annual ACM Symposium on Theory of Computing, pp. 1-10, May 2006.
[38] V. Guruswami and A. Rudra, Limits to list decoding Reed-Solomon codes, IEEE Transactions on Information Theory, vol. 52, no. 8, August 2006.
[39] V. Guruswami, A. Sahai, and M. Sudan, Soft-decision decoding of Chinese remainder codes, in Proceedings of the 41st IEEE Symposium on Foundations of Computer Science, pp. 159-168, 2000.
[40] V. Guruswami and M. Sudan, Improved decoding of Reed-Solomon and algebraic-geometric codes, IEEE Transactions on Information Theory, vol. 45, pp. 1757-1767, 1999.
[41] V. Guruswami and M. Sudan, List decoding algorithms for certain concatenated codes, in Proceedings of the 32nd Annual ACM Symposium on Theory of Computing (STOC), pp. 181-190, 2000.
[42] V. Guruswami and M. Sudan, Decoding concatenated codes using soft information, in Proceedings of the 17th Annual IEEE Conference on Computational Complexity (CCC), pp. 148-157, 2002.


[43] V. Guruswami and M. Sudan, On representations of algebraic-geometric


codes, IEEE Transactions on Information Theory, vol. 47, no. 4, pp. 1610
1613, May 2001.
[44] V. Guruswami, C. Umans, and S. Vadhan, Extractors and condensers from
univariate polynomials, Electronic Colloquium on Computational Complexity,
Report TR06-134, October 2006.
[45] V. Guruswami and S. Vadhan, A lower bound on list size for list decoding,
in Proceedings of the 9th International Workshop on Randomization and Computation (RANDOM), pp. 318329, 2005.
[46] R. W. Hamming, Error detecting and error correcting codes, Bell System
Technical Journal, vol. 29, pp. 147–160, April 1950.
[47] R. Impagliazzo, R. Jaiswal, and V. Kabanets, Approximately list-decoding
direct product codes and uniform hardness amplification, in Proceedings of the
47th IEEE Symposium on Foundations of Computer Science, October 2006.
[48] K. Jain and R. Venkatesan, Efficient code construction via cryptographic assumptions, in Proceedings of the 41st Annual Allerton Conference on Communication, Control, and Computing, 2003.
[49] R. Koetter, On optimal weight assignments for multivariate interpolation list-decoding, in IEEE Information Theory Workshop, March 2006.
[50] R. Koetter and A. Vardy, Soft decoding of Reed-Solomon codes and optimal weight assignments, in ITG Fachtagung, Berlin, Germany, January 2002.
[51] R. Koetter and A. Vardy, Algebraic soft-decision decoding of Reed-Solomon
codes, IEEE Transactions on Information Theory, vol. 49, no. 11, pp. 2809–2825, 2003.
[52] V. Y. Krachkovsky, Reed-Solomon codes for correcting phased error bursts,
IEEE Transactions on Information Theory, vol. 49, no. 11, pp. 2975–2984,
November 2003.
[53] M. Langberg, Private codes or succinct random codes that are (almost) Perfect, in Proceedings of the 45th IEEE Symposium on Foundations of Computer
Science, pp. 325–334, 2004.
[54] A. Lapidoth and P. Narayan, Reliable communication under channel uncertainty, IEEE Transactions on Information Theory, vol. 44, no. 6, October
1998.
[55] R. Lidl and H. Niederreiter, Introduction to Finite Fields and Their Applications. Cambridge University Press, 1986.
[56] R. J. Lipton, A new approach to information theory, in Proceedings of the
11th Annual Symposium on Theoretical Aspects of Computer Science (STACS),
pp. 699–708, 1994.
[57] C.-J. Lu, O. Reingold, S. P. Vadhan, and A. Wigderson, Extractors: Optimal
up to constant factors, in Proceedings of the 35th Annual ACM Symposium
on Theory of Computing, pp. 602–611, 2003.
[58] R. J. McEliece, On the average list size for the Guruswami-Sudan decoder,
in 7th International Symposium on Communications Theory and Applications
(ISCTA), July 2003.
[59] R. J. McEliece and L. Swanson, On the decoder error probability for Reed-Solomon codes, IEEE Transactions on Information Theory, vol. 32, no. 5, pp. 701–703, 1986.
[60] S. Micali, C. Peikert, M. Sudan, and D. A. Wilson, Optimal error correction
against computationally bounded noise, in Proceedings of the 2nd Theory of
Cryptography Conference (TCC), pp. 1–16, 2005.
[61] F. Parvaresh and A. Vardy, Multivariate interpolation decoding beyond the
Guruswami-Sudan radius, in Proceedings of the 42nd Allerton Conference on
Communication, Control and Computing, 2004.
[62] F. Parvaresh and A. Vardy, Correcting errors beyond the Guruswami-Sudan
radius in polynomial time, in Proceedings of the 46th Annual IEEE Symposium
on Foundations of Computer Science, pp. 285–294, 2005.
[63] W. W. Peterson, Encoding and error-correction procedures for Bose-Chaudhuri codes, IEEE Transactions on Information Theory, vol. 6, pp. 459–470, 1960.
[64] J. Radhakrishnan, Proof of q-ary Johnson bound, personal communication, 2006.
[65] C. E. Shannon, A mathematical theory of communication, Bell System Technical Journal, vol. 27, pp. 379–423, 623–656, 1948.
[66] M. Sipser and D. Spielman, Expander codes, IEEE Transactions on Information Theory, vol. 42, no. 6, pp. 1710–1722, 1996.
[67] A. Smith, Scrambling adversarial errors using few random bits, optimal information reconciliation, and better private codes, Cryptology ePrint Archive,
Report 2006/020, http://eprint.iacr.org/, 2006.
[68] D. Spielman, Linear-time encodable and decodable error-correcting
codes, IEEE Transactions on Information Theory, vol. 42, no. 6, pp. 1723–1732, 1996.
[69] M. Sudan, Decoding of Reed-Solomon codes beyond the error-correction
bound, Journal of Complexity, vol. 13, no. 1, pp. 180–193, 1997.
[70] M. Sudan, List decoding: Algorithms and applications, SIGACT News,
vol. 31, pp. 16–27, 2000.
[71] M. Sudan, Ideal error-correcting codes: Unifying algebraic and number-theoretic algorithms, in Proceedings of AAECC-14: The 14th Symposium on Applied Algebra, Algebraic Algorithms and Error-Correcting Codes, pp. 36–45,
November 2001.
[72] A. Ta-Shma and D. Zuckerman, Extractor codes, IEEE Transactions on
Information Theory, vol. 50, no. 12, pp. 3015–3025, 2004.
[73] L. Trevisan, List-decoding using the XOR lemma, in Proceedings of the 44th
IEEE Symposium on Foundations of Computer Science, pp. 126–135, 2003.
[74] L. Trevisan, Some applications of coding theory in computational complexity,
Quaderni di Matematica, vol. 13, pp. 347–424, 2004.
[75] J. H. van Lint, Introduction to coding theory, Graduate Texts in Mathematics, vol. 86, 3rd Edition, Springer-Verlag, Berlin, 1999.
[76] J. von zur Gathen and J. Gerhard, Modern Computer Algebra. Cambridge University Press, 1999.
[77] L. R. Welch and E. R. Berlekamp, Error correction of algebraic block codes, US Patent Number 4,633,470, December 1986.
[78] J. M. Wozencraft, List decoding, Quarterly Progress Report, Research Laboratory of Electronics, MIT, vol. 48, pp. 90–95, 1958.
[79] V. V. Zyablov and M. S. Pinsker, List cascade decoding, Problems of Information Transmission, vol. 17, no. 4, pp. 29–34, 1981 (in Russian); pp. 236–240, 1982 (English translation).
