Talk:Reed–Solomon codes for coders

Some problems using this

Hi, I was trying to use this on a standard Data Matrix example I found in a textbook. Note that here the prime polynomial has to be set to

0x12d

Here is the example:

The correct message is

c = [142,164,186,114,25,5,88,102]

with the first three symbols being data and the last five being error-correction symbols. For this message, the syndrome calculation correctly gives 0. If I then change the message to

c~ = [255,164,186,114,25,7,88,102]

The syndrome gets calculated correctly if I use rs_calc_syndromes and change gf_pow(2,i) to gf_pow(2,i+1) in its loop (I guess this is an error in your code, can you confirm this?). The correctly calculated syndrome is then

s = [126,161,114,243,137]

From this I get the right error locator polynomial via rs_find_error_locator, which is

[90,132,1]

Using rs_find_errors I get [5,0], which are the correct positions, although I had to change gf_poly_eval(err_loc, gf_pow(2, i)) to gf_poly_eval(err_loc, gf_pow(2, -i)) in rs_find_errors (I guess this is another bug?). Still, using the correct syndrome and err_pos, rs_correct_errata(mesecc, synd, err_pos) does not correct the message properly, as the error magnitude seems to be wrong.

Can you help me get this example working?

You did not provide the parameters of the Galois field, the generator, etc. that are needed. I have personally tested the code here several times to ensure everything works fine. However, the code in the main article is a simplified example working only for the Galois field GF(2^8), generator 2 and a specific fcr, so if the parameters of your example are different, you need to adapt the code. And indeed, since you had to do +1 in rs_calc_syndromes, it seems that your fcr = 1, so you cannot use the example from the main page as-is (but your change in rs_find_errors is buggy; avoid changing i to -i without a very, very good reason). An extended example that supports different Galois fields, generators and fcr values is provided in the supplemental material, maybe you can try it out? --Lrq3000 (discusscontribs) 11:57, 16 June 2016 (UTC)Reply
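
For reference, a minimal sketch of what a syndrome computation parameterized by both generator and fcr can look like (gf_pow and gf_poly_eval as in the article; the extra parameters are the generalization discussed above, not the main article's exact signature):

def rs_calc_syndromes(msg, nsym, fcr=0, generator=2):
    # Syndrome j is the received codeword polynomial evaluated at generator^(fcr + j);
    # with fcr=0 and generator=2 this reduces to the article's gf_pow(2, i).
    return [gf_poly_eval(msg, gf_pow(generator, i + fcr)) for i in range(nsym)]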

Question about rs_correct_errata

Hi, Could someone explain, what do following lines in the rs_correct_errata function mean?

for i in range(0, len(pos)):              #loop
    x = gf_exp[pos[i]+256-len(msg)]       #reciprocal of error location Xj
    y = gf_poly_eval(p, x)                #omega evaluation of reciprocal Xj
    z = gf_poly_eval(q, gf_mul(x,x))      #??
    msg[pos[i]] ^= gf_div(y, gf_mul(x,z)) #??

in relationship to explanation of Forney's algorithm on wikipedia.org? all other lines of the function are clear.

Thanks.

jan

It's evaluating the error values.
Since we're in a field of characteristic 2, Λ' is Λ with the even-numbered coefficients set to zero. The code removes the even elements completely and then evaluates the resulting polynomial at x² instead of x. Bobmath (discusscontribs) 20:08, 15 April 2013 (UTC)Reply
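
For illustration, a minimal sketch of that trick (assuming the article's gf_mul helper and a least-degree-first coefficient list; the function name is illustrative):

def formal_derivative_eval(lam, x):
    # In characteristic 2, Lambda'(x) = sum over odd i of lam[i] * x^(i-1),
    # which equals Q(x^2), where Q collects the odd-indexed coefficients of Lambda.
    q = lam[1::2]               # odd-degree coefficients (lam[0] is the constant term)
    x2 = gf_mul(x, x)
    y = 0
    for coef in reversed(q):    # Horner evaluation of Q at the point x^2
        y = gf_mul(y, x2) ^ coef
    return y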

The magic number 0x11D (0b100011101)

Hello. Could someone explain how the value 0x11D was picked? This value was used in preparing the list objects gf_exp and gf_log. What if I want to use different list sizes such as 16 or 65536? How do I determine the correct modulo value for these lists?

For finding larger polynomials, check out Wolfram Alpha. Any of the primitive polynomials will do. Hjfreyer (discusscontribs) 15:15, 24 April 2015 (UTC)Reply

Thanks for your help - jccruz

I'm pretty sure 0x11b is the reducing polynomial of the field; basically, if you represent it as a polynomial, you cannot factor it. From what I read (I'm not entirely sure), you might be able to use any unfactorable polynomial of degree 8 as the reducing polynomial for the field, but just use 0x11b; maybe it was chosen for reasons we cannot understand. As for the list sizes, they must be 256 because they must contain an item for each of the elements in the field: since the field contains 256 elements (2^8), both lists must be that size. I think one of them is double that size to make the implementation of the multiplication algorithm easier and faster (read the comments in that algorithm). So unless you use a different finite field, use the exponent lists with sizes 256.

The math is a bit complicated; see w:Finite field. You can find a list of different "magic" numbers used by different codes in the zxing source.
I had the same question and looked into it. Added some explanation. Hjfreyer (discusscontribs) 15:13, 24 April 2015 (UTC)Reply
11d is not a magic number; out of the 30 possible prime polynomials for GF(256), only 16 of them (such as 11d) have a primitive element α = 1x + 0 = 2, which makes the math simpler. 11b is used by AES encryption; one of its α's is 1x + 1 = 3, which is a bit awkward. However, AES mostly just needs to calculate 1/x in this field, which can be done using a lookup table in software, or sub-field mapping (aka composite fields) in hardware to reduce gate count. Rcgldr (discusscontribs) 17:13, 30 January 2018 (UTC)Reply
In the case of some tape backup drives used for computers, the "magic number" is 0x187. Rcgldr (discusscontribs) 21:25, 31 January 2018 (UTC)Reply
Ok, so it makes sense when deriving the calculations manually, but do these numbers provide any advantage when the calculations are automated by a program? --Lrq3000 (discusscontribs) 00:25, 12 February 2018 (UTC)Reply
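
To make the thread above concrete, here is a self-contained sketch of how the gf_exp and gf_log tables can be built from a chosen prime polynomial (0x11d with generator 2 here; a sketch, not the article's exact code). For a larger field such as GF(2^16), the table size becomes 65536, the period 65535, and the prime polynomial has degree 16.

def gf_mult_noLUT(x, y, prim=0x11d):
    # carry-less ("Russian peasant") multiply, reduced modulo the prime polynomial
    r = 0
    while y:
        if y & 1:
            r ^= x
        y >>= 1
        x <<= 1
        if x & 0x100:   # degree-8 overflow: subtract (XOR) the prime polynomial
            x ^= prim
    return r

gf_exp = [0] * 512      # doubled size so gf_exp[i + j] needs no "% 255"
gf_log = [0] * 256
x = 1
for i in range(255):    # the 255 non-zero elements are 2^0 .. 2^254
    gf_exp[i] = x
    gf_log[x] = i
    x = gf_mult_noLUT(x, 2)
for i in range(255, 512):
    gf_exp[i] = gf_exp[i - 255]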

Some Mistakes I Think I Have Found in The Article

1: The generator number: In the article, they say if you multiply 2 by itself (in the finite field) over and over, you will eventually reach all numbers in the field except 0. From my experiments, that's wrong: you should instead use 3 or 5 or any other number from the list on this page: http://www.samiam.org/galois.html

It depends on the generating polynomial chosen. Most often, code designers choose a polynomial for which 2 is a primitive element. Bobmath (discusscontribs) 02:28, 14 November 2014 (UTC)Reply
Yes but indeed a lot of academic papers and books advise to carefully choose the base number for the generator polynomial, as to maximize separability. But I don't know too much about QR codes standards, so it may be that they did not follow those advices and you are required to use 2... Lrq3000 (discusscontribs) 01:45, 4 June 2015 (UTC)Reply
As mentioned by Bobmath, it depends on the polynomial. Here is a list of the 30 nine-bit polynomials in hex used for GF(2^8), along with the smallest α for each polynomial: 0x11b:0x03, 0x11d:0x02, 0x12b:0x02, 0x12d:0x02, 0x139:0x03, 0x13f:0x03, 0x14d:0x02, 0x15f:0x02, 0x163:0x02, 0x165:0x02, 0x169:0x02, 0x171:0x02, 0x177:0x03, 0x17b:0x0b, 0x187:0x02, 0x18b:0x0b, 0x18d:0x02, 0x19f:0x03, 0x1a3:0x03, 0x1a9:0x02, 0x1b1:0x07, 0x1bd:0x07, 0x1c3:0x02, 0x1cf:0x02, 0x1d7:0x07, 0x1dd:0x07, 0x1e7:0x02, 0x1f3:0x0d, 0x1f5:0x02, 0x1f9:0x03. In the case of 0x1f3, the smallest α = 0x0d. Normally one of the 16 fields where α = 2, with the fewest number of 1 bits in the generator polynomial, is chosen to simplify hardware. So although both 0x11d and 0x15f have the same "minimum" α = 2, 0x11d has five 1 bits versus 0x15f with seven 1 bits. Some devices use 0x187, which also has five 1 bits and "minimum" α = 2. Trivia: the bits of a field polynomial can be reversed, but for α ≠ 2, α may be different for the bit-reversed field; for example, 0x19f:0x03 and, bit-reversed, 0x1f3:0x0d. Rcgldr (discusscontribs) 01:20, 9 February 2018 (UTC)Reply
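
For the curious, a short brute-force sketch (reusing the gf_mult_noLUT helper from the sketch at the end of the previous section) that reproduces this list by finding the smallest primitive element for a given prime polynomial:

def smallest_alpha(prim):
    # smallest alpha whose powers enumerate all 255 non-zero elements of GF(2^8)
    for alpha in range(2, 256):
        x, seen = 1, set()
        for _ in range(255):
            seen.add(x)
            x = gf_mult_noLUT(x, alpha, prim)
        if len(seen) == 255:
            return alpha

print(hex(smallest_alpha(0x11d)))   # 0x2
print(hex(smallest_alpha(0x11b)))   # 0x3
print(hex(smallest_alpha(0x1f3)))   # 0xd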

Multiplication

I think I see an error in the multiplication example:

10001001 times 00101010 is 1011001111010, and not 1010001111010 (137 times 42 is 5754, and not 5242)

In the third line of your calculation, 2x⁸ disappears; what happens to it...? (2x⁸ = 512, exactly the difference between 5242 and 5754.)

Why don't you fix it? After all you have the ability to edit it and correct it. --Goldenburg111 14:41, 1 December 2014 (UTC)Reply

Sorry, I cannot "fix" it, because the 'addition replaced with exclusive-or' is 0000100010010 ^ 0010001001000 ^ 1000100100000 = 1010001111010 in the example, which seems to be correct. (Alas, where 2x⁸ occurs we have 1 ^ 0 ^ 1 = 0; why is there no carry to the x⁹ column?) I am an absolute noob and cannot resolve this contradiction.

Alright, I see. Possibly someone down the line can do it, or if no one pays attention to this I'll do it myself. Thanks :) --Goldenburg111 16:23, 1 December 2014 (UTC)Reply

On a side note, I have tried the "Python function which implements the polynomial multiplication" (as a rewrite in C) and it produces the same result, 137 * 42 = 5242 (which is wrong in my opinion). Bitwise operations in the Python function work exactly as in the multiplication example. Am I missing the point here, or is the function missing something?

The definition of "multiplication" here is fundamentally different from the algorithm from grade school we all know and love. It's correct that it gives different answers. Also note that in this setting, 8 - 1 = 9 (subtraction is carry-less, i.e. XOR: 8 ^ 1 = 9). Just a different set of rules. Hjfreyer (discusscontribs) 15:19, 24 April 2015 (UTC)Reply
I'd like to add that indeed your intuition is correct: additions and subtractions of GF(2^p) elements are carry-less. Don't ask me why, I can't give the specifics, but I have read it in a lot of academic books and articles. Lrq3000 (discusscontribs) 01:43, 4 June 2015 (UTC)Reply
I just checked the results, both programmatically and manually. The calculation is correct, and the result you've got is because you did not do the addition carry-less: if you do a normal addition inside the multiplication, you get 1011001111010, but if you do it carry-less like you should, you get 1010001111010. Note that doing with-carry addition might get you to the correct result after you do the modular reduction (polynomial division) for some numbers, but not for all. That's a pretty hard bug to catch, so be careful in your implementation of carry-less operations; they are mandatory.--Lrq3000 (discusscontribs) 18:42, 22 June 2015 (UTC)Reply
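
A minimal sketch (plain Python; cl_mul is an illustrative name) reproducing both numbers from this thread:

def cl_mul(x, y):
    # carry-less multiplication: partial products are combined with XOR, not +
    r, i = 0, 0
    while (y >> i) > 0:
        if y & (1 << i):
            r ^= x << i
        i += 1
    return r

print(bin(cl_mul(0b10001001, 0b00101010)))   # 0b1010001111010 = 5242 (carry-less)
print(137 * 42)                              # 5754 (grade-school, with carries)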

I get a different value for g_4(x)

When calculating g_4(x), I get a different coefficient for the x² term:

In hexadecimal, the coefficients would be: [ 0x01 0x0f 0x46 0x78 0x40 ]

Nope, just did it out by hand. His number is correct. Remember, the calculations are happening using the field operations. So, addition is replaced by XOR, and 14+56 = 54, not 70 as it does with normal arithmetic. Hjfreyer (discusscontribs) 20:26, 27 April 2015 (UTC)Reply
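
For anyone who wants to check this by machine, a short sketch (GF(2^8), prime polynomial 0x11d, generator α = 2; it reuses the gf_mult_noLUT helper sketched in the "magic number" section above) that recomputes g_4(x):

def gf_poly_mul(p, q):
    # polynomial product, coefficients stored major degree first
    r = [0] * (len(p) + len(q) - 1)
    for j in range(len(q)):
        for i in range(len(p)):
            r[i + j] ^= gf_mult_noLUT(p[i], q[j])   # addition of terms is XOR
    return r

g, a = [1], 1                  # a = alpha^0
for _ in range(4):             # g_4(x) = product of (x - alpha^i) for i = 0..3
    g = gf_poly_mul(g, [1, a])
    a = gf_mult_noLUT(a, 2)
print([hex(c) for c in g])     # ['0x1', '0xf', '0x36', '0x78', '0x40']

The x² coefficient comes out 0x36 = 54 = 14 XOR 56; doing 14 + 56 = 70 with normal addition is exactly what produces the 0x46 above.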

More importantly, though, why did he choose g_4(x) and not a higher number? Is that what's actually used in QR? Is that only for the scope of the example?

According to Wikipedia: "[In QR codes] Codewords are 8 bits long and use the Reed–Solomon error correction algorithm with four error correction levels." So yes, the given polynomial is the one used in the spec, I'm pretty sure. Hjfreyer (discusscontribs) 20:26, 27 April 2015 (UTC)Reply

The RS encoding example is a little confusing. I don't see how he gets the value that he's XOR-ing with. In the section above (the BCH part) he was XOR-ing with the generator polynomial (in binary), which made sense to me. Here, I'd expect he'd XOR with the generator polynomial as well (this time the RS polynomial that we've just computed). But no matter how I shift the generator polynomial back and forth, I can't get them to align so that it'd result in '12' in the left-most byte.

I can't see where he gets his 12 ee 2b 23 f4 from. As I understand it, and how I remember it from school, I should multiply the whole byte string (the generator) by 12, but that doesn't seem to work either.

You lost me a little bit with the specifics, but re-read the section on finite field arithmetic (which I recently beefed up), and try your calculations again using those operations. Don't worry, it's legitimately tricky. I studied this kind of math in college and it still hurts my brain sometimes. Hjfreyer (discusscontribs) 20:26, 27 April 2015 (UTC)Reply

Possible plagiarism (with more comments)

Another tutorial on Reed-Solomon with provided sourcecode, with more comments, can be found here: http://www.drdobbs.com/testing/error-correction-with-reed-solomon/240157266

Sourcecode: http://twimgs.com/ddj/images/article/2013/0613/ReedSolomon.py

This is probably plagiarism of this tutorial on Wikiversity, since the structure is identical and only variable names and comments are different (there are a lot of different ways to implement Reed-Solomon; you won't find any two implementations having the same structure and using the exact same algorithms). However, the comments and the more meaningful variable names can be useful as additional study material for readers. Lrq3000 (discusscontribs) 23:11, 3 June 2015 (UTC)Reply

The decoding part of the Dr. Dobb's tutorial seems to be highly similar to the decoding part of your tutorial, except for one thing: its _rsCorrect() function looks quite different from your rs_correct_errata() function. I made a simple performance test and found that _rsCorrect() is almost twice as fast as rs_correct_errata(). I do not know why there is such a difference, though.--Kenethcusten (discusscontribs) 06:40, 19 January 2017 (UTC)Reply
I wrote this before I did a major overhaul of the article (I was talking about Bobmath's original code), so this might explain the differences. In particular, rs_correct_errata() is drastically different. Also, it can correct both errors and erasures; I think _rsCorrect() can only correct errors (like Bobmath's original code), which might explain why rs_correct_errata() is slower. Finally, the code here is not optimized for speed but for readability (I have a version of this code that runs more than 10x faster in plain Python). --Lrq3000 (discusscontribs) 13:17, 6 February 2017 (UTC)Reply

Update 06/2015

I updated the tutorial with more background information and details, updated the code and added comments, and added possible enhancements and current research topics such as list decoding. I hope the tutorial is still logical and natural to read, but if not, feel free to fix the mistakes you see.

Also feel free to ask if you feel like there needs to be some more info on a topic. I may not answer all questions, but I will do my best. Lrq3000 (discusscontribs) 01:38, 4 June 2015 (UTC)Reply

Coders, help is needed!

Help us make this tutorial better by unifying the convention for polynomials ordering inside lists throughout the code and functions.

Let me explain: right now there is a lot of list reversing needed to make things work, and you never know the order of a polynomial, i.e., whether the first coefficient is the major degree or the constant term. This can be particularly frustrating for learners trying to understand what the code really does, and for expert coders as well when trying to debug.

Stating a clear convention from the start (e.g., least degree first, major degree last) and following it throughout the code would help a lot. There's no need to understand the maths, so if you are an experienced coder and want to help, you are welcome to do so! Lrq3000 (discusscontribs) 16:51, 13 June 2015 (UTC)Reply
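
As a concrete illustration of the pitfall (a sketch, with gf_mul as in the article), Horner evaluation must be written differently for the two orderings:

def poly_eval_msb_first(poly, x):
    # poly[0] is the highest-degree coefficient (the current convention in most of the code)
    y = poly[0]
    for coef in poly[1:]:
        y = gf_mul(y, x) ^ coef
    return y

def poly_eval_lsb_first(poly, x):
    # poly[0] is the constant term (the "least degree first" convention proposed above)
    y = 0
    for coef in reversed(poly):
        y = gf_mul(y, x) ^ coef
    return y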

For a Berlekamp Massey decoder, the order doesn't matter as long as it's consistent. For Sugiyama's adaptation of the extended Euclid algorithm, the polynomials should be most significant coefficient first, since it's similar to long division. However, hardware implementations often use a pair of shift registers that always shift left, so potential error locator polynomials run left to right, but potential "Omegas" run right to left, a pair of them in each register. Rcgldr (discusscontribs) 17:30, 30 January 2018 (UTC)Reply

Optimized for generator=2

I noticed that the code in this tutorial (but not the explanations) is optimized for a field generated with 2, which gives hardly comprehensible code like the following (from rs_generator_poly):

g = gf_poly_mul(g, [1, gf_exp[i]])

This isn't theoretically correct; the following is:

g = gf_poly_mul(g, [1, gf_pow(generator, i)])

However, since the precomputed log and anti-log tables are generated using 2, you get the following:

gf_pow(2, i) == gf_exp[(gf_log[2] * i) % 255] == gf_exp[(1 * i) % 255] == gf_exp[i]

Because of course log(2) == 1 if the base is 2.
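
For context, the table-based power function that this identity unrolls looks something like this (a sketch, assuming log/anti-log tables built with generator 2):

def gf_pow(x, power):
    # exponentiation via the log/anti-log tables; exponents are reduced mod 255
    return gf_exp[(gf_log[x] * power) % 255]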

Optimization explained! But it is not generalizable at all if you work with other implementations of the Reed-Solomon codec, so I will fix the tutorial by implementing generators in the code. Lrq3000 (discusscontribs) 23:56, 22 June 2015 (UTC)Reply

Update 10/2015

I updated the tutorial with better code, which removes some rare edge-case decoding errors that could happen, and also with a lot more comments and better mathematical clarity (I removed the optimizations that were very specific to this tutorial and that you could not find in any book; the goal here is that the tutorial should be clear and close to the mathematical equations, so that anyone can understand it and compare with what you can read in books).

I also tested the code thoroughly to make sure everything works and is correct, and I also added a universal Reed-Solomon codec, which is based on the code in the main article, in the appendix here: https://en.wikiversity.org/wiki/Reed%E2%80%93Solomon_codes_for_coders/Additional_information#Universal_Reed-Solomon_Codec

I hope it will help the readers work more confidently and better understand the Reed-Solomon algorithms. --Lrq3000 (discusscontribs) 19:04, 8 October 2015 (UTC)Reply

Why "% 255"?

GF(2^8) contains 256 values, from 0 to 255. If we mod a value by 255, we get a result in the range 0 to 254, which contains only 255 different values. In the code, I see many "% 255". Did you mean "% 256" or the equivalent "& 255"? -- by Kenethcusten on 15 January 2017.

Um... that's a good question. Actually, IIRC, I did some tests and discovered that the first and the last elements of the log table are the same. So basically, even though there are indeed 256 elements in the field, element 1 == element 256, so it can be skipped. This is anyway something you will see in all RS implementations. You should ask a mathematician for a more formal proof than my anecdotal evidence :) --Lrq3000 (discusscontribs) 13:22, 6 February 2017 (UTC)Reply
But I can confirm that if you use % 256, the maths won't work. You need to use nb_of_elements - 1, whatever field you are using. --Lrq3000 (discusscontribs) 13:23, 6 February 2017 (UTC)Reply
Log and antilog tables are based on powers of α, which don't include the value 0. Cyclic codes such as BCH-code-based implementations of Reed-Solomon have maximum length 255 for GF(256); non-cyclic codes such as the original view Reed-Solomon codes have maximum length 256. Rcgldr (discusscontribs) 21:36, 31 January 2018 (UTC)Reply
Thank you very much Rcgldr for the clarification, I think you should add this to the article, this is a very useful explanation :-) --Lrq3000 (discusscontribs) 02:50, 9 February 2018 (UTC)Reply
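
To see the 255-element cycle directly, here is a tiny check (reusing the carry-less gf_mult_noLUT helper sketched in earlier sections):

x, i = 1, 0
while True:
    x = gf_mult_noLUT(x, 2, 0x11d)   # next power of alpha = 2
    i += 1
    if x == 1:                       # we came back to alpha^0
        break
print(i)   # 255: alpha^255 == alpha^0 == 1, so exponents are reduced mod 255, not 256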

original view decoders

Somehow these got missed in the wiki article, even though Berlekamp Welch (1986) has its own wiki article, but the original article wasn't clear (it got a rewrite). The wiki article now also includes the Gao (extended Euclid) decoder for original view codewords. Rcgldr (discusscontribs) 17:22, 30 January 2018 (UTC)Reply

Principles of error correction - bad / misleading example

From the article: "If we now get five corruptions, for example on that ... * * * * 20 8 1 * ...". This is misleading, since it implies a correction could be made with 5 erasures, when the code can't even correct a 4-erasure case such as "t h * * 20 8 * *": it is equally distant from "t h i s 20 8 9 19" and "t h a t 20 8 1 20". Rcgldr (discusscontribs) 05:28, 1 February 2018 (UTC)Reply
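
A quick toy check of the tie (hypothetical representation, with None standing for an erasure):

received = ['t', 'h', None, None, 20, 8, None, None]
this_cw  = ['t', 'h', 'i', 's', 20, 8, 9, 19]
that_cw  = ['t', 'h', 'a', 't', 20, 8, 1, 20]

def mismatches(r, c):
    # count disagreements on the non-erased positions only
    return sum(1 for a, b in zip(r, c) if a is not None and a != b)

print(mismatches(received, this_cw), mismatches(received, that_cw))   # 0 0: a tie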

Why use QR codes as an example?

In my opinion, the section on QR codes takes up too much space in an article about Reed-Solomon coding. It would be sufficient to note the cases where it's used, similar to the Wiki Reed-Solomon article (CDs, DVDs, ...; no mention of hard drives' internal ECC?). Rcgldr (discusscontribs) 05:31, 1 February 2018 (UTC)Reply

Not all Reed Solomon codes are BCH codes

The original view Reed-Solomon codes are not a class of BCH codes, since they don't use a fixed generator polynomial; instead the polynomial is based on the message, and the encoder and decoder use the same set of evaluation points (although if the evaluation points are successive powers of α, the code becomes cyclic: a rotated codeword is still a "valid" codeword). The first original view "decoder" was impractical, and since the concurrently developed BCH codes had practical decoders, Reed-Solomon codes were switched to using BCH-style generator polynomials, making them both a class of BCH codes and a class of Reed-Solomon codes. However, in 1986 a practical original view decoder (time complexity O(n^3)) was developed by Berlekamp and Welch, and later, in 2002, an improved original view decoder was developed by Gao. There are also "list" decoders that can generate a list of potential original view polynomials in polynomial time, and work continues on these. These original view decoders are now included, along with examples, in the Wiki Reed-Solomon article. Rcgldr (discusscontribs) 05:42, 1 February 2018 (UTC)Reply

Thank you for your input, that's very interesting. I must admit I don't know much about the history of error correction codes, as I am self-taught (my background is in artificial intelligence; we don't study much signal processing). However, I can't find your edits in the history; can you give me a few links (or maybe you meant you updated the Wikipedia article, not the Wikiversity one)? BTW, if you have some info on how to implement a list decoder or a fast NTT decoder, **please** feel free to contribute to the article, as I would very much like to learn how to implement these approaches :-D --Lrq3000 (discusscontribs) 03:00, 9 February 2018 (UTC)Reply
The original view decoders and history have been updated in the wiki article wiki Reed Solomon. The wiki article also mentions list decoders, with perhaps one link; it seems the work on list decoders is still ongoing. One issue with the original view decoders is that they are slower than BCH view decoders. This becomes clear when you compare Gao's original view extended Euclid algorithm with Sugiyama's BCH view extended Euclid algorithm: Gao's method starts off with polynomials of degree n and n-1, while Sugiyama's starts off with degree n-k and n-k-1. The number of iterations for both decoders is the same as the number of errors, and with quadratic time complexity, the difference in time is ~(n/(n-k))². The Berlekamp Welch original view decoder has cubic time complexity, so it is significantly slower. The list decoders are slower still, and meant for "low rate" codes (smaller number of elements in a codeword), but attempt to handle cases beyond the normal error bound by finding a list of original view polynomials of degree k-1 that match "most" of the n elements in a received codeword with errors. This is the same as Reed and Solomon's original concept, but with optimized algorithms that can produce a list in polynomial time; as mentioned, though, they're too slow to be useful for RS(n,n-k) codes where n = 255, a common size for an 8-bit BCH code. Rcgldr (discusscontribs) 03:45, 9 February 2018 (UTC)Reply
I should mention that the list decoders sacrifice mis-correction detection in favor of a good chance of recovery. One example would be image or video transmission, where occasional mis-corrections would be preferred over retransmission of data. It's not clear to me what the motivation behind the original view decoder research is. If there were some advantage for these with a low rate code, such as RAID 6 on 10 or fewer hard drives, they could be used, but to my knowledge most or all RAID 6 error correction methods are based on BCH view RS codes. The main advantage is you get one more element: for GF(2^8) the original view codeword size can be 256, instead of the BCH limit of 255. Rcgldr (discusscontribs) 03:59, 9 February 2018 (UTC)Reply

General comments about the article

As noted above, original view Reed Solomon codes are not BCH codes, and slow but "practical" decoders for the original view encoding now exist, such as Berlekamp Welch (1986) and Gao (2002). Rcgldr (discusscontribs)

Polynomial addition and subtraction never involve carries or borrows between terms (coefficients) of a polynomial. Rcgldr (discusscontribs) 09:11, 9 February 2018 (UTC)Reply

Reed Solomon codes can be based on GF(p^m) for p ≥ 2 and m ≥ 1. The wiki Reed Solomon article uses GF(929) (929 is prime) as an example. As an unusual example, GF(3^2) based on 1x² + 2x + 2, with primitive α = 1x + 0, could be used. These are examples where addition and subtraction are not XOR operations and need to be kept track of (knowing when to subtract versus add). Rcgldr (discusscontribs) 09:10, 9 February 2018 (UTC)Reply
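
As a small illustration of that last point, a sketch of the prime-field case GF(929) used in the wiki article, where addition and subtraction are modular arithmetic rather than XOR (the function names are illustrative):

p = 929

def gf929_add(a, b): return (a + b) % p
def gf929_sub(a, b): return (a - b) % p   # a real subtraction, not XOR
def gf929_mul(a, b): return (a * b) % p

# e.g. gf929_add(928, 1) == 0 and gf929_sub(0, 1) == 928, unlike GF(2^8),
# where both addition and subtraction are the same XOR operation.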