Talk:μ-law algorithm
This article is rated C-class on Wikipedia's content assessment scale.
Talk
This article should be called "μ-law algorithm" with a lower-case μ. Unfortunately, the software currently prevents this, capitalizing the mu. Capital mu looks just like "M", which just looks wrong. -- Karada 23:19, 29 October 2005 (UTC)
- I changed it to "Mu-law algorithm", because a capital mu is evil. Not only does it look dumb, it isn't obvious to most people that it's a mu and not the Latin letter M. And nobody calls it M-law. - furrykef (Talk at me) 14:26, 8 April 2006 (UTC)
Graph/legend and text inconsistency
"Comparison with A-law: A special feature ... near zero sound pressure...": this is not expected from the analytical expression of F, and is not visible on the graph. Does it occur at a level below -80 dB? If so, a proper scaling of the axis should display the difference between the A and Mu laws.
In addition, the green line on the graph is not "A-law", but "No companding"... --Dgcrete 12:27, 22 August 2006 (UTC)
- I fixed the graph. Regarding the "special feature near zero" - this statement is false (assuming there is no mistake in the formulae). Near zero, the mu-law and A-law formulae differ only by a constant multiplier. This is because as x goes to zero, ln(1+x) becomes the same as x, hence the top halves of the Mu-law and A-law equations become the same. The same is true for the quantized versions of the equations.
- I will remove the "special feature near zero" comment.
- I've also added quantized points to the graph
- Ozhiker 22:18, 22 August 2006 (UTC)
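The limiting argument above can be checked numerically. The following is a minimal sketch, assuming the usual continuous definitions of both compressors with μ = 255 and A = 87.6; these parameter values and all function names are illustrative assumptions, not taken from this thread. It prints the ratio of the two curves for shrinking x next to the constant that results from replacing ln(1 + μx) with μx.

```python
import math

MU = 255.0   # assumed mu-law parameter
A = 87.6     # assumed A-law parameter


def mu_law(x, mu=MU):
    """Continuous mu-law compressor for -1 <= x <= 1."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


def a_law(x, a=A):
    """Continuous A-law compressor for -1 <= x <= 1."""
    ax = a * abs(x)
    if ax < 1.0:
        y = ax / (1.0 + math.log(a))                 # linear segment near zero
    else:
        y = (1.0 + math.log(ax)) / (1.0 + math.log(a))
    return math.copysign(y, x)


# Constant multiplier predicted by the small-x approximation ln(1 + u) ~ u.
LIMIT_RATIO = (MU / math.log1p(MU)) / (A / (1.0 + math.log(A)))


def ratio(x):
    """Ratio of the mu-law output to the A-law output at input x."""
    return mu_law(x) / a_law(x)


for x in (1e-3, 1e-4, 1e-5, 1e-6):
    print(f"x = {x:g}  ratio = {ratio(x):.5f}  predicted limit = {LIMIT_RATIO:.5f}")
```

The ratio approaches the predicted constant from below as x shrinks, which is what "differ only by a constant multiplier" means for the continuous curves. The sketch does not cover the quantized versions.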
Table of 14 bit linear values incorrect?
I think the table of u-law to 14 bit linear is slightly incorrect. e.g. the first value should be -8031 rather than -8159.
-8159 is the maximum negative input to an encoder, which encodes to 0, but 0 should decode to -8031 (which is the centre of the range of values that encode to 0)
I think the 14 bit values should just be the given 16 bit values divided by 4. (The 16 bit values appear to be correct) TimMorley 16:34, 25 May 2007 (UTC)
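The arithmetic behind this suggestion can be illustrated with a short sketch. It assumes the widely circulated table-free expansion that uses a bias of 0x84 (132) on a 16-bit scale; the function name is hypothetical, and the sketch has not been checked against every entry of the article's table.

```python
def ulaw_to_linear16(code):
    """Expand one 8-bit mu-law code word to a signed 16-bit-scale sample."""
    code = ~code & 0xFF                  # code words are stored bit-inverted
    exponent = (code >> 4) & 0x07        # segment number, 0..7
    mantissa = code & 0x0F               # step within the segment, 0..15
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if code & 0x80 else magnitude


sample16 = ulaw_to_linear16(0x00)
sample14 = sample16 // 4                 # exact: every output is a multiple of 4
print(sample16, sample14)                # -32124 -8031
```

Under this assumption, code word 0 expands to -32124 on the 16-bit scale, and dividing by 4 gives -8031, which agrees with the value proposed above.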
Odd reference
One of the two references is from Wikipedia itself. That seems a bit sketchy...? 24.4.102.10 11:28, 23 September 2007 (UTC)
Who is right?
The table in this article differs from the code given in [1]: Both 0x7F and 0xFF decode to 0. But the encoder in the article encodes 0 to 0xFF and -1 to 0x7F. --RokerHRO (talk) 08:11, 1 March 2010 (UTC)
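The two observations are compatible if the codec simply has two code words for zero. The sketch below assumes the common bias-132 encoder and its matching decoder on a 16-bit scale. It is not claimed to be the code linked at [1] or the exact encoder from the article, and the function names are hypothetical.

```python
BIAS = 0x84     # 132, added before locating the segment
CLIP = 32635    # largest magnitude that still fits in 15 bits after adding BIAS


def linear16_to_ulaw(sample):
    """Compress a signed 16-bit sample to an 8-bit mu-law code word."""
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS        # now in 132..32767
    exponent = magnitude.bit_length() - 8            # segment number, 0..7
    mantissa = (magnitude >> (exponent + 3)) & 0x0F  # step within the segment
    return ~(sign | (exponent << 4) | mantissa) & 0xFF


def ulaw_to_linear16(code):
    """Expand an 8-bit mu-law code word back to the 16-bit scale."""
    code = ~code & 0xFF
    magnitude = ((((code & 0x0F) << 3) + BIAS) << ((code >> 4) & 0x07)) - BIAS
    return -magnitude if code & 0x80 else magnitude


for sample in (0, -1):
    code = linear16_to_ulaw(sample)
    print(sample, hex(code), ulaw_to_linear16(code))
```

With these assumptions, 0 is encoded as 0xFF and -1 as 0x7F, and both code words expand to 0, so the round trip of -1 loses the sign of a value that is too small to be represented anyway.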
Encoding table
I created an encoding table from "unsigned linear 16 bit" values into µ-law 8 bit values using some self-written programs for data generation and pretty printing, and SoX for the µ-law encoding. As you can see, the octet 0x7F does not appear in this µ-law encoding table.
µ-law encoding table
u16                  µ-law  Range size
0x0000 ... 0x0487    00     1160 values
0x0488 ... 0x0887    01     1024 values
0x0888 ... 0x0C87    02     1024 values
0x0C88 ... 0x1087    03     1024 values
0x1088 ... 0x1487    04     1024 values
0x1488 ... 0x1887    05     1024 values
0x1888 ... 0x1C87    06     1024 values
0x1C88 ... 0x2087    07     1024 values
0x2088 ... 0x2487    08     1024 values
0x2488 ... 0x2887    09     1024 values
0x2888 ... 0x2C87    0A     1024 values
0x2C88 ... 0x3087    0B     1024 values
0x3088 ... 0x3487    0C     1024 values
0x3488 ... 0x3887    0D     1024 values
0x3888 ... 0x3C87    0E     1024 values
0x3C88 ... 0x4087    0F     1024 values
0x4088 ... 0x4287    10     512 values
0x4288 ... 0x4487    11     512 values
0x4488 ... 0x4687    12     512 values
0x4688 ... 0x4887    13     512 values
0x4888 ... 0x4A87    14     512 values
0x4A88 ... 0x4C87    15     512 values
0x4C88 ... 0x4E87    16     512 values
0x4E88 ... 0x5087    17     512 values
0x5088 ... 0x5287    18     512 values
0x5288 ... 0x5487    19     512 values
0x5488 ... 0x5687    1A     512 values
0x5688 ... 0x5887    1B     512 values
0x5888 ... 0x5A87    1C     512 values
0x5A88 ... 0x5C87    1D     512 values
0x5C88 ... 0x5E87    1E     512 values
0x5E88 ... 0x6087    1F     512 values
0x6088 ... 0x6187    20     256 values
0x6188 ... 0x6287    21     256 values
0x6288 ... 0x6387    22     256 values
0x6388 ... 0x6487    23     256 values
0x6488 ... 0x6587    24     256 values
0x6588 ... 0x6687    25     256 values
0x6688 ... 0x6787    26     256 values
0x6788 ... 0x6887    27     256 values
0x6888 ... 0x6987    28     256 values
0x6988 ... 0x6A87    29     256 values
0x6A88 ... 0x6B87    2A     256 values
0x6B88 ... 0x6C87    2B     256 values
0x6C88 ... 0x6D87    2C     256 values
0x6D88 ... 0x6E87    2D     256 values
0x6E88 ... 0x6F87    2E     256 values
0x6F88 ... 0x7087    2F     256 values
0x7088 ... 0x7107    30     128 values
0x7108 ... 0x7187    31     128 values
0x7188 ... 0x7207    32     128 values
0x7208 ... 0x7287    33     128 values
0x7288 ... 0x7307    34     128 values
0x7308 ... 0x7387    35     128 values
0x7388 ... 0x7407    36     128 values
0x7408 ... 0x7487    37     128 values
0x7488 ... 0x7507    38     128 values
0x7508 ... 0x7587    39     128 values
0x7588 ... 0x7607    3A     128 values
0x7608 ... 0x7687    3B     128 values
0x7688 ... 0x7707    3C     128 values
0x7708 ... 0x7787    3D     128 values
0x7788 ... 0x7807    3E     128 values
0x7808 ... 0x7887    3F     128 values
0x7888 ... 0x78C7    40     64 values
0x78C8 ... 0x7907    41     64 values
0x7908 ... 0x7947    42     64 values
0x7948 ... 0x7987    43     64 values
0x7988 ... 0x79C7    44     64 values
0x79C8 ... 0x7A07    45     64 values
0x7A08 ... 0x7A47    46     64 values
0x7A48 ... 0x7A87    47     64 values
0x7A88 ... 0x7AC7    48     64 values
0x7AC8 ... 0x7B07    49     64 values
0x7B08 ... 0x7B47    4A     64 values
0x7B48 ... 0x7B87    4B     64 values
0x7B88 ... 0x7BC7    4C     64 values
0x7BC8 ... 0x7C07    4D     64 values
0x7C08 ... 0x7C47    4E     64 values
0x7C48 ... 0x7C87    4F     64 values
0x7C88 ... 0x7CA7    50     32 values
0x7CA8 ... 0x7CC7    51     32 values
0x7CC8 ... 0x7CE7    52     32 values
0x7CE8 ... 0x7D07    53     32 values
0x7D08 ... 0x7D27    54     32 values
0x7D28 ... 0x7D47    55     32 values
0x7D48 ... 0x7D67    56     32 values
0x7D68 ... 0x7D87    57     32 values
0x7D88 ... 0x7DA7    58     32 values
0x7DA8 ... 0x7DC7    59     32 values
0x7DC8 ... 0x7DE7    5A     32 values
0x7DE8 ... 0x7E07    5B     32 values
0x7E08 ... 0x7E27    5C     32 values
0x7E28 ... 0x7E47    5D     32 values
0x7E48 ... 0x7E67    5E     32 values
0x7E68 ... 0x7E87    5F     32 values
0x7E88 ... 0x7E97    60     16 values
0x7E98 ... 0x7EA7    61     16 values
0x7EA8 ... 0x7EB7    62     16 values
0x7EB8 ... 0x7EC7    63     16 values
0x7EC8 ... 0x7ED7    64     16 values
0x7ED8 ... 0x7EE7    65     16 values
0x7EE8 ... 0x7EF7    66     16 values
0x7EF8 ... 0x7F07    67     16 values
0x7F08 ... 0x7F17    68     16 values
0x7F18 ... 0x7F27    69     16 values
0x7F28 ... 0x7F37    6A     16 values
0x7F38 ... 0x7F47    6B     16 values
0x7F48 ... 0x7F57    6C     16 values
0x7F58 ... 0x7F67    6D     16 values
0x7F68 ... 0x7F77    6E     16 values
0x7F78 ... 0x7F87    6F     16 values
0x7F88 ... 0x7F8F    70     8 values
0x7F90 ... 0x7F97    71     8 values
0x7F98 ... 0x7F9F    72     8 values
0x7FA0 ... 0x7FA7    73     8 values
0x7FA8 ... 0x7FAF    74     8 values
0x7FB0 ... 0x7FB7    75     8 values
0x7FB8 ... 0x7FBF    76     8 values
0x7FC0 ... 0x7FC7    77     8 values
0x7FC8 ... 0x7FCF    78     8 values
0x7FD0 ... 0x7FD7    79     8 values
0x7FD8 ... 0x7FDF    7A     8 values
0x7FE0 ... 0x7FE7    7B     8 values
0x7FE8 ... 0x7FEF    7C     8 values
0x7FF0 ... 0x7FF7    7D     8 values
0x7FF8 ... 0x7FFF    7E     8 values
µ-law encoding table
u16                  µ-law  Range size
0x8000 ... 0x8003    FF     4 values
0x8004 ... 0x800B    FE     8 values
0x800C ... 0x8013    FD     8 values
0x8014 ... 0x801B    FC     8 values
0x801C ... 0x8023    FB     8 values
0x8024 ... 0x802B    FA     8 values
0x802C ... 0x8033    F9     8 values
0x8034 ... 0x803B    F8     8 values
0x803C ... 0x8043    F7     8 values
0x8044 ... 0x804B    F6     8 values
0x804C ... 0x8053    F5     8 values
0x8054 ... 0x805B    F4     8 values
0x805C ... 0x8063    F3     8 values
0x8064 ... 0x806B    F2     8 values
0x806C ... 0x8073    F1     8 values
0x8074 ... 0x807B    F0     8 values
0x807C ... 0x808B    EF     16 values
0x808C ... 0x809B    EE     16 values
0x809C ... 0x80AB    ED     16 values
0x80AC ... 0x80BB    EC     16 values
0x80BC ... 0x80CB    EB     16 values
0x80CC ... 0x80DB    EA     16 values
0x80DC ... 0x80EB    E9     16 values
0x80EC ... 0x80FB    E8     16 values
0x80FC ... 0x810B    E7     16 values
0x810C ... 0x811B    E6     16 values
0x811C ... 0x812B    E5     16 values
0x812C ... 0x813B    E4     16 values
0x813C ... 0x814B    E3     16 values
0x814C ... 0x815B    E2     16 values
0x815C ... 0x816B    E1     16 values
0x816C ... 0x817B    E0     16 values
0x817C ... 0x819B    DF     32 values
0x819C ... 0x81BB    DE     32 values
0x81BC ... 0x81DB    DD     32 values
0x81DC ... 0x81FB    DC     32 values
0x81FC ... 0x821B    DB     32 values
0x821C ... 0x823B    DA     32 values
0x823C ... 0x825B    D9     32 values
0x825C ... 0x827B    D8     32 values
0x827C ... 0x829B    D7     32 values
0x829C ... 0x82BB    D6     32 values
0x82BC ... 0x82DB    D5     32 values
0x82DC ... 0x82FB    D4     32 values
0x82FC ... 0x831B    D3     32 values
0x831C ... 0x833B    D2     32 values
0x833C ... 0x835B    D1     32 values
0x835C ... 0x837B    D0     32 values
0x837C ... 0x83BB    CF     64 values
0x83BC ... 0x83FB    CE     64 values
0x83FC ... 0x843B    CD     64 values
0x843C ... 0x847B    CC     64 values
0x847C ... 0x84BB    CB     64 values
0x84BC ... 0x84FB    CA     64 values
0x84FC ... 0x853B    C9     64 values
0x853C ... 0x857B    C8     64 values
0x857C ... 0x85BB    C7     64 values
0x85BC ... 0x85FB    C6     64 values
0x85FC ... 0x863B    C5     64 values
0x863C ... 0x867B    C4     64 values
0x867C ... 0x86BB    C3     64 values
0x86BC ... 0x86FB    C2     64 values
0x86FC ... 0x873B    C1     64 values
0x873C ... 0x877B    C0     64 values
0x877C ... 0x87FB    BF     128 values
0x87FC ... 0x887B    BE     128 values
0x887C ... 0x88FB    BD     128 values
0x88FC ... 0x897B    BC     128 values
0x897C ... 0x89FB    BB     128 values
0x89FC ... 0x8A7B    BA     128 values
0x8A7C ... 0x8AFB    B9     128 values
0x8AFC ... 0x8B7B    B8     128 values
0x8B7C ... 0x8BFB    B7     128 values
0x8BFC ... 0x8C7B    B6     128 values
0x8C7C ... 0x8CFB    B5     128 values
0x8CFC ... 0x8D7B    B4     128 values
0x8D7C ... 0x8DFB    B3     128 values
0x8DFC ... 0x8E7B    B2     128 values
0x8E7C ... 0x8EFB    B1     128 values
0x8EFC ... 0x8F7B    B0     128 values
0x8F7C ... 0x907B    AF     256 values
0x907C ... 0x917B    AE     256 values
0x917C ... 0x927B    AD     256 values
0x927C ... 0x937B    AC     256 values
0x937C ... 0x947B    AB     256 values
0x947C ... 0x957B    AA     256 values
0x957C ... 0x967B    A9     256 values
0x967C ... 0x977B    A8     256 values
0x977C ... 0x987B    A7     256 values
0x987C ... 0x997B    A6     256 values
0x997C ... 0x9A7B    A5     256 values
0x9A7C ... 0x9B7B    A4     256 values
0x9B7C ... 0x9C7B    A3     256 values
0x9C7C ... 0x9D7B    A2     256 values
0x9D7C ... 0x9E7B    A1     256 values
0x9E7C ... 0x9F7B    A0     256 values
0x9F7C ... 0xA17B    9F     512 values
0xA17C ... 0xA37B    9E     512 values
0xA37C ... 0xA57B    9D     512 values
0xA57C ... 0xA77B    9C     512 values
0xA77C ... 0xA97B    9B     512 values
0xA97C ... 0xAB7B    9A     512 values
0xAB7C ... 0xAD7B    99     512 values
0xAD7C ... 0xAF7B    98     512 values
0xAF7C ... 0xB17B    97     512 values
0xB17C ... 0xB37B    96     512 values
0xB37C ... 0xB57B    95     512 values
0xB57C ... 0xB77B    94     512 values
0xB77C ... 0xB97B    93     512 values
0xB97C ... 0xBB7B    92     512 values
0xBB7C ... 0xBD7B    91     512 values
0xBD7C ... 0xBF7B    90     512 values
0xBF7C ... 0xC37B    8F     1024 values
0xC37C ... 0xC77B    8E     1024 values
0xC77C ... 0xCB7B    8D     1024 values
0xCB7C ... 0xCF7B    8C     1024 values
0xCF7C ... 0xD37B    8B     1024 values
0xD37C ... 0xD77B    8A     1024 values
0xD77C ... 0xDB7B    89     1024 values
0xDB7C ... 0xDF7B    88     1024 values
0xDF7C ... 0xE37B    87     1024 values
0xE37C ... 0xE77B    86     1024 values
0xE77C ... 0xEB7B    85     1024 values
0xEB7C ... 0xEF7B    84     1024 values
0xEF7C ... 0xF37B    83     1024 values
0xF37C ... 0xF77B    82     1024 values
0xF77C ... 0xFB7B    81     1024 values
0xFB7C ... 0xFFFF    80     1156 values
The table differs slightly from the more compact one in the article. Which is correct? --RokerHRO (talk) 22:05, 16 April 2010 (UTC)
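The programs used to build the table above are not shown, so the following is only a hypothetical sketch of how such a range table can be derived from any encoder: walk the inputs in order and merge consecutive values that map to the same code word. The names `code_ranges` and `toy_encode` are invented for illustration, and `toy_encode` is a placeholder quantiser, not µ-law. A real encoder (for example, output captured from SoX) would be passed in its place.

```python
from itertools import groupby


def code_ranges(encode, values):
    """Group consecutive inputs that share a code word.

    Returns a list of (first_input, last_input, code, count) tuples.
    """
    rows = []
    for code, run in groupby(values, key=encode):
        run = list(run)
        rows.append((run[0], run[-1], code, len(run)))
    return rows


def toy_encode(u16):
    """Placeholder quantiser (NOT mu-law): keeps only the top 4 bits."""
    return u16 >> 12


for first, last, code, count in code_ranges(toy_encode, range(0x10000))[:3]:
    print(f"0x{first:04X} ... 0x{last:04X}  {code:02X}  {count} values")
```

Running the same grouping over two different encoders and comparing the resulting rows is one way to locate exactly where two tables disagree.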
What is the rationale behind the choice μ = 255?
Why is the value of μ chosen to be 255? The article seems to suggest that this is because 255 = 2^n - 1, where n is the number of bits used to store the encoding, but I see no reason for this.
I guess that evidently, choosing μ = 255 works well, but to me, this seems like an ad hoc choice rather than a choice that has anything to do with the number of bits used. I think it seems more reasonable that the encoding (i.e., F(x)) should be independent of the number of bits used to store the encoding and rather be based on the maximum sound pressure you want to support and the sensitivity of the human auditory system as a function of the sound pressure (such that the encoding has more densely packed quantization levels for sound pressures where the auditory system is more sensitive to small changes in the sound pressure).
I can think of two reasons that led to the choice μ = 255:
- The expressions for the encoding and decoding can be evaluated efficiently when μ = 255. In particular, log2(1 + μ) = log2(1 + 255) = 8, so the division in the encoding can be performed very efficiently with just an arithmetic shift (presuming that the base of the logarithms is changed from e to 2). However, this is true for any μ = 2^(2^m) - 1, where m is a non-negative integer.
- Choosing μ = 255 happens to give reasonably good sound quality for all sound pressures that it should be possible to represent.
However, I see no connection between μ and the number of bits used to store the encoding, which the article suggests that there is.
So, is there such a connection? Or is the value 255 an ad hoc value chosen mostly for the reasons I listed? Or in other words, what is the rationale behind choosing μ = 255? —Kri (talk) 22:50, 17 June 2018 (UTC); edited 06:32, 21 June 2018 (UTC)
- Isn't that just to get the resulting 8-bit value of the input of the function? In other words μ specifies the maximum value of the output (minimum being 0), with the input's range being -1 to 1. —DIYeditor (talk) 01:55, 18 June 2018 (UTC)
- No, it's not. The output of F is still in the range -1 to +1, no matter what value you choose for μ. —Kri (talk) 19:27, 18 June 2018 (UTC)
- I tried to look through some technical papers and had trouble finding a straight answer for this. I think maybe part of it is as you say that 255 gives approximately the desired curve. As for the main reason: does the value 255 make the segments as defined in the g.711 recommendation (tables 2a and 2b) turn out as approximately powers of two intervals (less ~31)? This was a good call on your part, we need to clarify the article and address the part implying this is directly related to the 8-bit value. We need a RS that directly addresses this choice. —DIYeditor (talk) 01:05, 19 June 2018 (UTC)
- I also sifted through the G.711 recommendation but didn’t find any explanation for why μ = 255 is used.
- What is an RS? —Kri (talk) 07:10, 21 June 2018 (UTC)
- So is it okay if I remove the parenthesis "(8 bits)" since the choice μ = 255 and the number of bits used don't really seem to be related? —Kri (talk) 07:32, 2 July 2018 (UTC)
- Yes I agree with that. Does not appear to be directly related, it was original research by someone. —DIYeditor (talk) 08:30, 2 July 2018 (UTC)
- Done. —Kri (talk) 14:13, 5 July 2018 (UTC)
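Two of the points made in this thread are easy to check numerically: that the output of F stays within -1 to +1 whatever μ is, and that log2(1 + μ) is a whole number for μ = 255. The sketch below assumes the continuous form F(x) = sgn(x) ln(1 + μ|x|) / ln(1 + μ). The function name and the three sample values of μ are chosen here for illustration only.

```python
import math


def mu_law_compress(x, mu=255.0):
    """Continuous mu-law F(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu), for -1 <= x <= 1."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)


for mu in (15.0, 255.0, 65535.0):
    lo = mu_law_compress(-1.0, mu)      # output at the most negative input
    hi = mu_law_compress(1.0, mu)       # output at the most positive input
    print(f"mu = {mu:g}: F(-1) = {lo}, F(1) = {hi}, log2(1 + mu) = {math.log2(1 + mu)}")
```

The endpoints come out as -1 and +1 for every μ, so μ shapes the curve rather than setting the output range. This is consistent with the conclusion that μ = 255 is not tied to the 8-bit word length by the formula itself.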
Origin of μ and A in names?
I think it would be interesting for readers to learn about the origin / history of the µ and A in the terms µ-law and A-law. Do we have any WP:RS for this? --Matthiaspaul (talk) 18:01, 8 June 2021 (UTC)
Comparison with A-law
The "Comparison with A-law" section states that "The μ-law algorithm provides a slightly larger dynamic range than the A-law at the cost of worse proportional distortions for small signals." I think this may be backwards -- that is, A-law provides larger dynamic range and worse small-signal distortion than μ-law. The same (possibly erroneous) statement is in the A-law article. --99.121.214.126 (talk) 02:41, 18 December 2021 (UTC)
x-Axis of comparison plot confusingly labeled "Linear" but actually *logarithmic* (dBm0)
I'm looking at the comparison chart that is in both this μ-law page and the A-law page:
While it is an otherwise excellent chart, I have a frustration with the creator's (@Ozhiker) word choice of "Linear" in the x-Axis label, because really the x-axis is a *log* scale (dBm0 is a logarithmic unit). Hence the "No companding" graph is a straight line. Maybe the creator was trying to express something else with "Linear" (or am I misunderstanding something), but I think "Linear" just confuses readers. Better readability would be simply "Input Signal (dBm0)". It seems the chart was made with gnuplot and has instructions, so I could easily edit it or the svg if the creator is not around. But I want to make sure I'm not misunderstanding. Em3rgent0rdr (talk) 23:18, 26 June 2023 (UTC)
- I agree, "Input Signal (dBm0)" would be a better label. Linear in the current label likely refers to the fact that compression is a non-linear process so the input in that sense is linear and the output non-linear. ~Kvng (talk) 12:27, 29 June 2023 (UTC)
- fixed. Em3rgent0rdr (talk) 19:46, 29 June 2023 (UTC)
- Additional problem: The red (and blue?) line(s) seem(s) to extend beyond 0dB, up to maybe +2 or +3dB? That might be fine for an analogue amplification system or recording to tape, perhaps (like using dolby companding to extend the effective dynamic range of audio cassette), but it's not possible in the digital domain. 0dB is the maximum possible representable signal strength, both for input and output. Even if the encoding means you end up with internally calculated values exceeding this, wouldn't they either be clipped to 0dB, or the entire output scale normalised to shift the maximum output to be such (and all quieter values moving down a little in kind)? — Preceding unsigned comment added by 92.12.87.15 (talk) 17:53, 31 October 2023 (UTC)
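On the question of levels above 0 dB: dBm0 is a level relative to a reference test tone, not to digital full scale, so a full-scale signal can sit above 0 dBm0. A minimal sketch follows, assuming the commonly quoted G.711 figures of about +3.17 dBm0 (μ-law) and +3.14 dBm0 (A-law) for a full-scale sine. These constants and the function name are assumptions for illustration and are not stated anywhere in this thread.

```python
import math

MU_LAW_TMAX_DBM0 = 3.17   # assumed level of a full-scale sine, mu-law
A_LAW_TMAX_DBM0 = 3.14    # assumed level of a full-scale sine, A-law


def sine_level_dbm0(relative_amplitude, tmax_dbm0=MU_LAW_TMAX_DBM0):
    """Level of a sine whose peak is `relative_amplitude` of full scale (0 < a <= 1)."""
    return tmax_dbm0 + 20.0 * math.log10(relative_amplitude)


for a in (1.0, 0.5, 0.1):
    print(f"peak = {a:.2f} of full scale -> {sine_level_dbm0(a):+.2f} dBm0")
```

Under these assumptions a curve that ends about 3 dB above 0 on a dBm0 axis still ends at full scale. Whether the plotted chart uses this convention should be confirmed from its source script.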
Audio sample quality mismatch
The two samples given for "original" and "encoded" are so far apart in overall quality that it's impossible to make a judgement of the effect on the sound caused by the A-law encoding. The "encoded" version seems to be recorded at a much lower sample rate / frequency for a start, which compromises the apparent quality much more than the companding, and isn't an inherent effect of the codec either. Possibly somebody fed a CD quality voice synth output into something that encodes into Phone quality mu-law, without considering that it also reduces the sample rate to about 1/6th as part of the process?
Either the "encoded" version needs to be remade with all other aspects being the same as the original one, which isn't hard to achieve via the export / save menu options of any half decent audio processing program (and would sound somewhere between the original and a plain 8-bit PCM conversion), or the "original" needs to be preprocessed and presented in a state otherwise matching the current encoded one, except for being 16 (or 14?) bit PCM not 8-bit mu-law encoded (so would also sound rather muffled etc).
...and if whoever's supplying the audio samples doesn't understand what any of that means or why it's a problem, they should refrain from any further uploads until they've studied up.
(I have no account so can't do that I'm afraid, otherwise I'd have already replaced it myself, it's been more than 25 years since the first time I experimentally messed around with format conversions of that type so it'd be a piece of cake) 92.12.87.15 (talk) 18:00, 31 October 2023 (UTC)
Worst possible choice for speech samples
The speech samples are machine-generated, seemingly with a model that produces audio optimized for phone (or similar) operation, and on top of that it is not a great speech model in terms of clarity. How is one to compare quality if the original already takes the impairments for granted?
I'll replace these. MüllerMarcus (talk) 16:03, 21 December 2024 (UTC)
- There goes my conversion effort in response to the previous comment above. I assumed the pre-existing original was good enough as the only perceptible difference should be the raised noise floor with 8-bit samples anyway. I intentionally lowered the sample rate of the original to make the effects of sample coding easier to compare. But yeah, Harvard sentences spoken by a real human are definitely much better source material. – MwGamera (talk) 20:29, 21 December 2024 (UTC)