Universal Latin Square Codes
General - What is error correction ?
Error Detecting Codes (aka EDC, CRC, digest) are techniques to ascertain if a block of symbols (like a text or a file) is received error-free after transmission over a noisy
channel.
Error Correcting Codes (aka ECC, Forward Error Correction) address the much more challenging task of removing the errors and obtaining a replica of the original block.
These techniques, developed since 1950, are now applied in a variety of digital transmission systems and applications.
It has been soon recognized that the problem of storing data on a storage device
(such as a hard disk, a CD-ROM) and correcting device errors is essentially the same.
The prevailing ECC terminology is, however, still borrowed from transmission systems
(source, channel, noise, transmission,...)
Note that in many cases (like TCP/IP) the protocol that governs transmission is a
sophisticated two-way protocol, so that one may use Error Detection only and request a
retransmission of blocks in error. Alternatively, one may use both Errror Detection and
Error Correction and request retransmission of the rare blocks that cannot be corrected.
In other cases (like transmission from a space probe or reading from a CD-ROM)
error correction has no real alternatives.
A variety of techniques has been proposed for error correction. The basic idea is to
transmit the payload (the information symbols) followed by some redundant check symbols
(aka checksums) that are used by a decoder to spot and correct (most) errors.
The channel error probability p is thus reduced to a lower residual probability pb, which
can be very close to zero.
A code that transmits a total of n symbols (a codeword) of which k are payload has a rate of k/n.
Slower transmission (rate <1) is the main price paid for nearly error-free transmission and
a code defines a trade-off between the two.
Some applications (like transmission of program files, block-encrypted data) have no room for errors, while others (like transmission of uncompressed bitmap images) are quite tolerant.
The transmitted symbols can be taken from any alphabet of D symbols: for example
they can be bits (D=2) or bytes (D=256) or text symbols.
In actual practice, D is typically chosen to be a power of 2.
To wrap it up, practical applications of Error Correcting Codes range from mobile phones
to CD-ROM and DVD, from telemetry to radio links, from Internet protocols to satellite
broadcasting.
In the last 50 years, while the theory and practice of codes made great progresses, the scenario of applications has also changed dramatically, from a world of scarse resources to a world of plenty, in terms of data volumes, transmission line throughput, storage capacity, CPU power and so forth.
Desirable qualities of an ECC
Why is an ECC better than the next ?
The answer lies in several user requirements, which in turn depend on the
application. But the main aims are high rate and low residual probability pb.
While accepting this trade-off between pb and rate, if a code A has both higher rate
and lower pb than code B (for the same p), we conclude that A outperforms B.
Then there are many minor criteria (which are more application dependent), e.g.
- Cost, time and complexity of decoding (encoding being typically very light)
- Delay -
The cleaned-up data block is available only after the last of n symbols has been received
and the decoding operation is completed.
If you are downloading a huge application from Internet, this is irrelevant because
you have no use for the file before download is completed.
If you are sending commands to a real-time device, the delay of a single block may become
an issue.
- Protocol advantages -
If your ECC is just a component in a complex transmission protocol, you may add new requirements. If the protocol is designed to send packets of q symbols plus overhead
and control information, you may want, for example, ECC blocks to fit nicely into packets and require that packet-symbols be the same or multiples of ECC-symbols.
- CODEC advantages -
If you are developing a dedicated hardware for decoding (a CODEC) you want a good match between
your CPU features and what the decoding algorithm uses (CPU word length, word- and byte-oriented instructions, supported numeric formats,...). You also prefer decoding algoritms that lend themselves to parallel operations and are easily reconfigured.
- Bursts -
The basic model assumes that errors on the stream of symbols are statistically independent ---and codes are designed accordingly. But there are situations where this model is far from accurate.
Imagine that electrical discharges in the vicinity of your equipment cause error bursts:
k adjacent symbols may be corrupted with probability 0.5.
Similarly, a scratched CD-ROM may produce many errors in the same disk area.
One requirement is then that the code also behave well for error bursts of length k symbols.
- Can the code detect its own failure in decoding ? (e.g. flagging a whole block as
unreliable, or flagging certain bits as "erasures")
- In case of retransmission of an unrecoverable block, it may be wasteful and redundant to resend the same block again (payload and check symbols). Transmitting a different set of check symbols seems a more efficient use of the channel.
Can the code handle this and merge first and second blocks to attempt decoding ?
State of the art
Unfortunately, there is no single ECC technique for all situations.
Many families of ECC have been proposed over the years, under the names of
Golay, BCH, Reed Solomon, Goppa, convolutional, quadratic residue, turbo, LDPC codes ...
These constitute a constellation of clever solutions, each based on a different mathematical theory, suitable for a range of probabilities, implemented by totally unrelated programs or
hardware.
This leaves three types of problems.
- 1. Design phase
The task of the code expert in picking the right code for the user requirements
is very difficult, in spite of the richness of the choices.
It may take days even to answer simple and legitimate questions, like
- In my application D=2 p=1E-4. Can I obtain pb<1E-9 with rate 0.85 ?
- What pb can I expect, instead, if my rate is 0.75 ?
- What is the best known code for D=256, p from 0.02 to 0.04 and rate about 0.8 ?
- 2. On-the-fly reconfiguration
Suppose the designer wants the code to adapt dynamically to the situation, with
requests such as
- Can I switch to a different code when p increases or performance decreases ?
- Can I assign a different target pb to transmitted blocks of different nature ?
These are purely academic questions - in practice, the answer is NO for the sake of
design simplicity.
- 3. Performance
Both theory and experience suggest that long codes (with larger n) tend to perform
better than short codes. But long codes are hard to find and, more important, as n increases, their decoding cost soon becomes forbidding.
In the binary case (D=2) codes with n>1000 are in practice quite rare.
Universal Latin Square codes
Universal Latin Square codes (ULS) have been recently developed as an improvement on points
1. 2. 3.
They tend to be long codes (with n of hundreds or thousand of symbols) with a cost
of decoding that grows moderately with n.
The main point is that they are "universal", meaning that the same algorithm provides
a good solution for any choice of
- D, alphabet size
- n, code length
- p, channel error probability
- rate, k/n
- pb, residual error probability
In fact they provide poor solutions only at the boundary of this space: when
n is very low (say, below 100) and when p is very high (say, above 0.1).
The design phase becomes much easier: the designer chooses D, p and rate and finds
acceptable n and pb through few exploratory simulations.
This may take as little as half an hour.
Several sophistications and on-the-fly reconfigurations are possible, while the
algorithm remains the same: the problem reduces to synchronizing encoder and
decoder to using a new set of parameters --- and this must be done via the same noisy channel used for data transmission.
Advantages of Universal Latin Square codes
- Good performance. In several regions of the (D,p,rate,pb)-space these codes
outperform traditional codes. The reasons are complex but include the
high code length and the fact that, unlike many classic codes, the ULS code can be
tailored to a range of p.
- Code design is guided and fast.
- New degrees of freedom in dynamic reconfiguration (with the same decoding program or CODEC)
- In the case of block retransmission, transmission of new check symbols can be easily
implemented. In a different scenario, the channel can be used in idle times to transmit
a new set of check symbols of previous blocks.
- "Try harder" feature -
In special applications (when retransmission is out of the question) there may be a trade-off between cost of decoding and probability of correct block decoding.
A block that could not be correctly decoded by the standard routine may be submitted to
a more time-consuming analysis.
- The decoding of high volumes is faster with an increase in D (regarding several bits as
a single symbol, e.g. using D=16 or D=256)
- Bursts of errors are well handled, due to long codes and implicit interleaving
- Disadvantage: delay associated with the choice of n.
- Disadvantage: unlike several shorter, classical codes, ULS are too complex to have
exact "closed" formulas that link the relevant parameters. For example, a simulation
run must be used to find pb and its confidence interval.
When pb is very small (say, below 1E-10) it becomes very hard to simulate enough
cases to measure it with any accuracy and different estimation techniques are called for.
Tabular results of ULC performance (pdf)
Write to mauro.guazzo [at] gmail.com if you want to
- receive technical papers on the theory of ULS as soon as they are available
- submit a special case (D,p,rate) and have pb and code parameters computed
- develop a CODEC for ULS codes
- submit a question or a comment
Home page -
Contact