Talk:Benford's law/Archive 4
This is an archive of past discussions about Benford's law. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
explain "several orders of magnitude", correct sentence on final digit in Russian election analysis
The overview should clarify that for data to span several orders of magnitude means that the ratio of highest to lowest number in the data should be above 100. The number of orders of magnitude "spanned" is log(highest/lowest) with the logarithm taken to the base 10.
Also the discussion of Russian election data being too close to the Benford prediction incorrectly states that the final digit averaging to 4.5 is a prediction of Benford's law. It has nothing to do with Benford, the probability model for final digits is (almost always, and in this case) that all ten digits have equal probability. The average of 0,1,2,3,4,5,6,7,8,9 is 4.5. 73.89.25.252 (talk) 15:57, 9 November 2020 (UTC)
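For readers following this thread, the two quantities mentioned in the comment above can be computed directly. This is only a minimal sketch: the function name and the sample values are hypothetical and are not taken from the article or from any real data set.

```python
import math

def orders_of_magnitude_spanned(values):
    """Return log10(max/min) over the positive entries of `values`."""
    positive = [v for v in values if v > 0]
    if not positive:
        raise ValueError("need at least one positive value")
    return math.log10(max(positive) / min(positive))

# Hypothetical samples, chosen only for illustration.
span_narrow = orders_of_magnitude_spanned([300, 450, 999])       # ratio about 3.3
span_wide = orders_of_magnitude_spanned([12, 3_400, 560_000])    # ratio about 46,667

# Mean of a uniformly distributed final digit 0..9 (no appeal to Benford needed).
final_digit_mean = sum(range(10)) / 10
```

Under this definition the first sample spans about half an order of magnitude and the second between four and five, while the final-digit mean of 4.5 follows from the uniform model alone.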
- With regard to the log calculation, this is an approximation of the order of magnitude (or the order of magnitude is an approximation of the logarithm!), which I don't think is helpful in the context of this article. Instead, the article on order of magnitude is linked, so if a reader is confused they can read more about it there. With regard to the Russian election data, you are correct that the final digit isn't really covered by Benford. I've removed the reference to this, and left in the comment on the second digit, as this is covered by the generalisation. Awoma (talk) 16:15, 9 November 2020 (UTC)
- The log formula is not an approximation, it is exactly the definition of (the number or span of) "orders of magnitude" in this context. There is room to debate what "several" should mean but presumably at least 2, so a factor of at least 100. 73.89.25.252 (talk) 16:49, 9 November 2020 (UTC)
- It is an approximation. An order of magnitude is an integer, while a logarithm is in general not. Awoma (talk) 16:58, 9 November 2020 (UTC)
- The statements about Benford law working better over a span of multiple orders of magnitude are referring to a non-integer number of orders of magnitude between the lowest and highest data (or some weighted version of this measure accounting for the frequency distribution of the data and not only the range). Saying "this data covers 4.5 orders of magnitude, Benford should work", or "between 4 and 5 orders of magnitude" neither assumes nor contradicts your idea that orders of magnitude have to be integers. There are practically no data sets where the low and high values differ exactly by an integer power of 10, yet Benford applies none the less. The issue is not whether it's an integer but whether the log of the ratio is large enough. The log formula is not "approximating" anything, it is the actual quantity of interest for this purpose. 73.89.25.252 (talk) 17:25, 9 November 2020 (UTC)
- I'd recommend reading wikipedia's article on order of magnitude. It is an integer. When the article says "data spanning multiple orders of magnitude" it means there should be a decent proportion of data points from different orders of magnitude. For example, if there is a decent amount of data with order of magnitude 2, order of magnitude 3, and order of magnitude 4, then we might expect Benford to fit nicely. I don't think readers would be confused by this wording, and I certainly don't think any such confusion would be helped by presenting a logarithmic approximation. With regards there being practically no data sets that are suitable for Benford, that's just not true. The article gives the example of town populations (which span 6 orders of magnitude) and stock prices. Both great examples. Awoma (talk) 17:36, 9 November 2020 (UTC)
- The article order of magnitude is erroneous, as are many of your comments on this page. There is now a talk page discussion there that I started, detailing multiple problems with the article, and corroborated by comments of several others.
- There isn't actually a concept in current use of numbers having "order of magnitude 3" or phrases like that. If you got this idea from the Wikipedia article that would explain some of the communication problems here. However, your interpretation of "spanning multiple orders of magnitude" is practically not much different from a statement in terms of ratios.
- The material removed in your recent edits should be restored. Final digits is important as a contrast to Benford, showing it is not an all purpose model of digit probabilities (and Benford for the k-th digit from the left converges to uniform distribution for increasing k). 73.89.25.252 (talk) 06:48, 10 November 2020 (UTC)
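The parenthetical claim in the comment above (that the Benford distribution of the k-th digit from the left approaches uniform as k grows) can be checked numerically. The sketch below assumes the standard generalised formula, in which the probability of digit d in position k is the sum of log10(1 + 1/(10m + d)) over all (k-1)-digit prefixes m; the function name is hypothetical.

```python
import math

def benford_kth_digit(k):
    """Benford probabilities of digits 0..9 in position k (k = 1 is the leading digit)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if k == 1:
        # The leading digit cannot be 0.
        return [0.0] + [math.log10(1 + 1 / d) for d in range(1, 10)]
    lo, hi = 10 ** (k - 2), 10 ** (k - 1)   # all (k-1)-digit prefixes m
    return [
        sum(math.log10(1 + 1 / (10 * m + d)) for m in range(lo, hi))
        for d in range(10)
    ]

def distance_from_uniform(k):
    """Largest gap between any digit's probability and 1/10."""
    return max(abs(p - 0.1) for p in benford_kth_digit(k))
```

Calling distance_from_uniform for k = 1, 2, 3, 4 gives a shrinking sequence. This concerns only the idealised Benford distribution; it says nothing about final digits of integer-valued data, which is the separate point raised in the reply below.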
- This is getting slightly ridiculous. Your posts are seeming more about opposing anything I say, to the extent that when I agreed with your assessment on removing the final digit comment, you have now flipped on this and are saying it should be put back. Final digits being uniform is something which can be established with no appeal to Benford, and using far more rudimentary mathematics. Further, the generalisation of Benford has additional requirements on the domain of the data being studied. If the domain is natural (for example, winning raffle tickets over the course of many raffles), then the distribution of the n-th digit does not tend to uniform as n tends to infinity. Instead, there will exist some N such that for all n greater than N the distribution is 100% 0s. In the other direction, we would expect uniformity of the final digit in this case, and the concept of "final digit" only makes sense because the domain of the data is the natural numbers. If you are unhappy with the order of magnitude article, declaring it wrong, then would you be happy with the scientific notation article? This uses the exact same definition. Explanations of this concept proliferate online also, so if you are not happy with wikipedia feel free to use any of those. That you appear to be denying the very concept of orders of magnitude is bizarre. Awoma (talk) 08:59, 10 November 2020 (UTC)
- I never asked to remove the material on final digits, but to correct it by clarifying that 4.5 doesn't come from Benford's law.
- Your comments are being opposed because you insistently post nonsense ("damning errors" as the other IP put it) while arrogantly "correcting" statements by more knowledgeable editors and referring them to material that, unknown to you, is actually wrong. You also made several edits to the article that have made it worse. Why not let people who actually understand this stuff do the editing and discussion on this article? 73.89.25.252 (talk) 09:09, 10 November 2020 (UTC)
- Please keep in mind that we both want the article to be as best as it can possibly be. With that in mind, if you have suggestions on how to improve it, then make these. It is quite likely and natural that other editors will agree with some improvements and not agree with others. In this case, I agree with your point about the final digit. It comes from a fundamentally different reasoning than Benford's law. I do not agree that we should include the logarithmic approximation of orders of magnitude. There is no hard and fast rule relating the orders of magnitude to the expected degree of fit to Benford's law. It is merely that data spanning more orders of magnitude is expected to fit better. This is what is currently expressed in the article, and I think most readers would understand this well and not be confused. Awoma (talk) 09:43, 10 November 2020 (UTC)
misleading paragraph
"In terms of conventional probability density (referenced to a linear scale rather than log scale, i.e. P(x) dx rather than P(log x) d(log x)), the equivalent criterion is that Benford's law will be very accurately satisfied when P(x) is approximately proportional to 1/x over several orders-of-magnitude variation in x.[15]"
This statement is untrue, and also unreferenced (the apparent reference "[15]" discusses the meaning of the log of the probability.) Since the distribution P(x)=1/x follows Benford's law (assuming appropriate cutoff to maintain finite integral), a distribution that looks like 1/x will follow Benford's law, but this is not a criterion for following the law because distributions that look very different from 1/x can also follow Benford's law (i.e., 1/x is sufficient, but not necessary).
I am removing it, if for no other reason than that it is unreferenced; if there's a citation for it, go ahead and put it back (and add the cite). Skepticalgiraffe (talk) 13:30, 10 November 2020 (UTC)
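For reference, the sufficiency half of the point above (a 1/x density with a cutoff follows Benford's law) can be written out as a short derivation. It assumes, purely for convenience, that the cutoffs sit at integer powers of ten, 10^a and 10^b with a < b; that choice is an assumption of this sketch and not something stated in the removed paragraph.

```latex
p(x) = \frac{c}{x}, \quad 10^{a} \le x < 10^{b}, \qquad
c = \frac{1}{(b-a)\ln 10}

P(D = d)
  = \sum_{k=a}^{b-1} \int_{d\cdot 10^{k}}^{(d+1)\cdot 10^{k}} \frac{c}{x}\,dx
  = (b-a)\, c \,\ln\frac{d+1}{d}
  = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1,\dots,9
```

The derivation shows only that 1/x is sufficient. It does not show that it is necessary, which is the objection raised above.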
...and, now that I look at it, I'm going to remove the reference 15 as well. Text of [15] is: "This section discusses and plots probability distributions of the logarithms of a variable. This is not the same as taking a regular probability distribution of a variable, and simply plotting it on a log scale. Instead, one multiplies the distribution by a certain function. The log scale distorts the horizontal distances, so the height has to be changed also, in order for the area under each section of the curve to remain true to the original distribution. See, for example, [1]. Specifically: "
The reason to remove it is that this paragraph simply strays too far from the topic. 13:51, 10 November 2020 (UTC)
- Agreed. This statement is false, so it is unlikely such a citation will be found! Awoma (talk) 13:55, 10 November 2020 (UTC)
I think this book is appropriate to be added as another reference:
- Steven J. Miller (ed.): "Benford's Law: Theory and Applications", Princeton University Press, ISBN 978-0691147611 (May 2015).
Error on range-restricted data not following Benford law
The section on applicability of Benford law says that while population data conforms to the law, looking at villages with 300 to 999 inhabitants would not. This is misleading; the relative frequency of first digits (other than 1 and 2) would still follow the Benford prediction.
If there were a law that settlements could not be formed with fewer than 300 people, that would invalidate the Benford statistics by artificially producing an excess of villages with slightly more than 300 inhabitants. 73.89.25.252 (talk) 16:14, 9 November 2020 (UTC)
- This isn't true. Looking at villages with 300 to 999 inhabitants, we would expect all leading digits between 3 and 9 to appear equally often. Benford's law does not apply here due to the failure of the dataset to span multiple orders of magnitude. Awoma (talk) 16:17, 9 November 2020 (UTC)
- The statement was correct. We would absolutely not expect leading digits to appear equally often. We would expect the relative number of 3's and 9's (i.e. the ratio of frequencies) to be identical in the truncated and untruncated data, and the latter follow Benford. The truncation only affects the relative frequency of items kept in the data set compared to those excluded. 73.89.25.252 (talk) 17:03, 9 November 2020 (UTC)
- It's not clear to me what you mean by "truncated and untruncated data." This section of the article at current is correct. Looking at populations of villages with populations between 300 and 999, we expect all leading digits to occur at the same rate (apart from 1 and 2 which don't occur at all). You can see this from the fact that all populations in that range are equally likely, and there are exactly 100 different possible populations for each leading digit. Awoma (talk) 17:10, 9 November 2020 (UTC)
- No, we NEVER expect all values in a range to be equally probable, if they are formed by restricting a distribution, that (as you assumed) satisfies Benford's law, to a sub-interval. Those two statements contradict each other since the ratios of probabilities are unchanged by the restriction.
- It is incorrect to say Benford statistics don't apply to data in a range that doesn't have all possible first digits (solely for that reason; it could fail for other reasons). Benford is a "law" about the relative frequencies of digit patterns in settings that are approximately scale invariant and it makes perfect sense for data in ranges like 300-999. The sorts of arguments you are making about why it would work for data without restriction to a range but fail when so restricted do not make sense, they are just-so-stories that are likely to be wrong: if there's something causing Benford relative frequencies to apply to e.g. village populations as a whole there is no particular reason it would "know" to not work for some arbitrarily chosen interval of population sizes. As explained in the article you can understand Benford as a local density that is integrated to give the logarithm and usually if Benford works it's because the local version works, so integrate that from 3.000 to 9.999... to get a probability distribution. 73.89.25.252 (talk) 07:24, 10 November 2020 (UTC)
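The two positions in this thread rest on different assumptions about the underlying data, and each can be worked out explicitly for the range 300 to 999. The sketch below is illustrative only: it computes the leading-digit shares under a scale-invariant (1/x) density restricted to the range, and under a uniform distribution over the integers in the range. The function names are hypothetical, and nothing here settles which model describes real village populations.

```python
import math

def benford_restricted(first, last):
    """Benford leading-digit probabilities conditioned on the digit lying in first..last."""
    weights = {d: math.log10(1 + 1 / d) for d in range(first, last + 1)}
    total = sum(weights.values())
    return {d: w / total for d, w in weights.items()}

def uniform_restricted(lo, hi):
    """Leading-digit shares when every integer in lo..hi is equally likely."""
    counts = {}
    for n in range(lo, hi + 1):
        d = int(str(n)[0])
        counts[d] = counts.get(d, 0) + 1
    total = hi - lo + 1
    return {d: c / total for d, c in counts.items()}

scale_invariant_shares = benford_restricted(3, 9)   # 1/x density cut to [300, 1000)
uniform_shares = uniform_restricted(300, 999)       # every population equally likely
```

Under the first assumption, digit 3 gets a share of log10(4/3) / log10(10/3), roughly 24%, and digit 9 roughly 9%; under the second, each of the digits 3 to 9 gets exactly 1/7.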
- There are lots of examples where all values in a range are equally probable. This is called the uniform distribution. For example, winning raffle tickets are uniformly distributed over all participating tickets. The data set of winning raffle tickets across multiple raffles (where the size of each raffle is unknown) does fit Benford's law. Whereas if we only consider winning tickets between 300 and 999 (assuming all these numbers were present in each raffle, i.e. they had at least 1000 participants each time) then the distribution of leading digits in this range will be uniform still. This example comes from the excellent numberphile video on Benford's law which you may find interesting. Benford's law definitely does not apply to data where we have restricted the range in such a fashion. It doesn't even matter if all digits are present. If we only consider single-digit winning raffle tickets in the above example, then the leading digit data will reflect the underlying uniformity of the wider data. If we consider only winning raffle tickets up to 19 (assuming at least 19 tickets in every raffle) then our distribution would have 58% leading 1s (11 of the 19 tickets), and just over 5% for each of the other digits (1 of 19 each). I think you can see this intuitively - we're only considering winning tickets up to 19, and the majority of them start with a 1, whereas to have leading digit anything else you need exactly that number. If we bound at any set point like this, then we can actually work out the underlying distribution as it is simply equal to the distribution of leading digits less than our bound (note: the average result we get across all such bounds is Benford's law!). What if we have a different distribution, other than uniform? It has been commented that village populations will distribute in some other non-uniform fashion due to population dynamics. In that instance, all we have to do is multiply the distributions.
As a result, looking only at single digit data we will not see Benford but will see the underlying distribution for these values. Looking at 300-999 will not show Benford, but instead the underlying distribution of village populations for that range. This is something that clearly sits uncomfortably with you. If Benford's law works for the data, then why does it not work for an arbitrarily selected range? The answer is that it *does* work for an "arbitrarily selected" range, but the range 300-999 is far from arbitrary. In fact, we would see very different distributions depending on what range we picked (see the raffle example for an easier to understand case of this) and all those distributions will average out to Benford's law. Importantly, Benford's law works for the village data, and works for the uniform raffle data. It will also work for data sets which are distributed in just about any fashion - such is a strength of the law. You have an idea, then, of us ruling out a digit, and the remaining numbers still having the same relative frequencies. You are absolutely correct. If we look at village populations, and simply dismiss all villages with population starting with a 3, then the resulting distribution would indeed have the same relative distributions of leading digits - just with no 3s. Hope this helps! Awoma (talk) 09:23, 10 November 2020 (UTC)
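The raffle figures quoted above can be reproduced by simple counting. The following is a minimal sketch with a hypothetical function name; it checks only the leading-digit shares for tickets numbered 1 to N drawn uniformly, and does not attempt to verify the separate claim about averaging over all bounds.

```python
from fractions import Fraction

def leading_digit_shares(n_max):
    """Exact share of each leading digit among the tickets 1..n_max, drawn uniformly."""
    counts = [0] * 10
    for n in range(1, n_max + 1):
        counts[int(str(n)[0])] += 1
    return {d: Fraction(counts[d], n_max) for d in range(1, 10)}

shares_19 = leading_digit_shares(19)   # tickets 1..19
shares_9 = leading_digit_shares(9)     # single-digit tickets only
```

For tickets 1 to 19 the leading digit 1 has share 11/19 (about 58%) and every other digit has 1/19 (about 5.3%); for tickets 1 to 9 every digit has share 1/9.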
- None of that has any relevance to the discussion or the article. If population numbers of towns, in toto, satisfy Benford's law it is probably because the way in which towns are founded, grow, "reproduce" to seed new towns, and fluctuate in size can be reasonably modeled by a scale invariant process. If you run this process, stop it after a long enough time that the town sizes (either all of them, or up to some large size) conform to Benford, and then throw out the sizes outside the range 300-999, the ones in the range will also conform to Benford (as a prediction of relative frequencies of 3 vs 4 vs ... vs 9). This is why the text in the article should be changed: it is wrong.
- The example from Numberphile has no bearing on this question. It shows that if you use a non-scale invariant process, Benford probably will not hold anywhere on any particular interval. If it's true that there is some regularization (can you state the result precisely?) that extracts the Benford distribution out of the sequence 1,2,3... that's a cute and interesting result, but there isn't any reasonable way of viewing actual data sets as the result of that kind of limiting or averaging process, so it has nothing to say about why Benford would or would not apply in particular situations. 73.89.25.252 (talk) 06:46, 12 November 2020 (UTC)
- "If population numbers of towns, in toto, satisfy Benford's law it is probably because the way in which towns are founded" - This is possibly the misunderstanding at the heart of this issue. Benford's law tells us nothing about the distribution of the underlying data. It applies to population numbers of towns because they are spread across suitably many orders of magnitude, and do not follow any additional exceptional rule (such as a number-theoretic rule or upper limit). Town populations can be distributed normally, exponentially, uniformly, or in some really unusual way which we don't have a name for yet - Benford's law would be expected to work in all cases, because it is really a result on the number base used. That's why it is so strong, and why I am at liberty to assume uniform distribution. Doing so changes nothing about the goodness-of-fit for Benford, but is hopefully easier to follow. In particular, in this example you can hopefully see very easily that the winning raffle ticket data fits Benford for the whole data, but does not for the specific fixed ranges 1-9 or 300-999. When we imagine some other distribution, especially one which favours smaller numbers, it is easy to imagine restriction to the range 1-9 or 300-999 might fit some relative rewording of Benford, which seems to be what you expect would happen, which is why I think it would be helpful to stick to uniform distribution for this discussion. With this in mind, there's two facts I hope you can see. The first is that winning raffle tickets across raffles of unknown size are expected to fit Benford's law. The second is that the same exact data, restricted to the range 1-9, is expected to be uniform. If you study this example and really get your head around it I think at some point it will click into place :) The article currently reads "one can expect that Benford's law would apply to a list of numbers representing the populations of UK settlements. 
But if a "settlement" is defined as a village with population between 300 and 999, then Benford's law will not apply." This is correct. Awoma (talk) 07:49, 12 November 2020 (UTC)
- Awoma is correct. If a small section spanning less than an order of magnitude is excerpted from a larger distribution, that small section will not obey Benford's law even if the larger distribution does. Benford's law applies for sequences spanning several orders of magnitude. Skepticalgiraffe (talk) 03:26, 14 November 2020 (UTC)
- You are very wrong to expect that digits between 3 and 9 would appear equally often. The population of villages is absolutely not a uniform distribution. Although this might not be a case covered by the Benford law, the smaller numbers would still appear more often. 89.239.30.60 (talk) 21:12, 9 November 2020 (UTC)
- You are correct. I was assuming uniform distribution. If they are distributed in some other way, due to population dynamics or whatever, then that underlying distribution is what would be represented. This is not relevant to Benford's law though. Assuming uniform distribution, the data between 300 and 999 people does not fit Benford, while the data for all settlements on earth would still fit Benford. Awoma (talk) 21:37, 9 November 2020 (UTC)
Requested lead change
This edit request has been answered.
remove "by Hill " in the intro. not needed. — Preceding unsigned comment added by 101.100.139.52 (talk) 05:06, 6 December 2020 (UTC)
- done and I also removed the second citation on this sentence which references one of Newcomb's original writings on this topic and doesn't relate to the content. Awoma (talk) 09:18, 6 December 2020 (UTC)
Benford, QAnon, and the 2020 election
Following the 2020 United States presidential election result, a number of QAnon folks have been promoting a theory on social media that the failure of voting numbers for Biden to match Benford is a demonstration of likely electoral fraud. This is likely why there has been a big increase in interest in this page, and in particular the electoral fraud section. The short answer is no. These claims are baseless, and come from a misapplication of Benford's law to particular cities in a county, or wards in a city, as opposed to all counties/cities in the US (which is how Benford detected possible fraud in Iran. If you do this analysis in the US you find that yes, all the numbers fit Benford perfectly). Of course, this cannot be posted in the article as it would constitute original research, but it is worth keeping a close eye on the article as there may be misleading edits made in support of the conspiracy theory over the next few days. Awoma (talk) 09:46, 8 November 2020 (UTC)
- I've seen this circulating on social media, but there is no indication that it or its circulation are tied to QAnon-ists. Clearly, Trump supporters are much more interested in the possibility of a "smoking gun of fraud" than Biden supporters, and QAnonists post at a higher rate online than the average Trump supporter, but the signal is being boosted by a much broader group of people than QAnon. The idea of election fraud is neither proven nor a meritless conspiracy theory at this point, but it will be investigated (Pennsylvania legislature has already requested an audit) and we will know better soon enough. Pre-emptively classifying any Benford election-related activity on this page as supporting a conspiracy theory, or QAnon, or disinformation, etc is not reasonable at this time without analyzing the actual edits. 73.89.25.252 (talk) 06:59, 9 November 2020 (UTC)
- Definitely. I am not suggesting that the activity on this page is to promote disinformation, but it is worth editors being aware of the growing disinformation on social media surrounding this topic which may threaten the article. Edits should be scrutinised to ensure they are not made in promotion of the above. Awoma (talk) 08:55, 9 November 2020 (UTC)
"a misapplication of Benford's law to particular cities in a county, or wards in a city, as opposed to all counties/cities in the US" So what is the basis for your claim that Benford's law does not apply to these? 95.202.161.202 (talk) 14:48, 8 November 2020 (UTC)
- In order to be confident in applying Benford's law, you need the data to span multiple orders of magnitude. For example, with country populations, there are very large countries and very small countries. Something which doesn't do this, such as adult human heights (or votes for the winning candidate across wards in a city which has standardised ward size) will not be expected to fit Benford's law very well. This is nicely explained in the article actually. Awoma (talk) 15:13, 8 November 2020 (UTC)
- Lol! Care to try again? "Benford's law tends to apply most accurately to data that span several orders of magnitude."
- Note those words: "tends to", "most accurately". It doesn't say "does not apply at all".
- But more damning for your case than that: "As a rule of thumb, the more orders of magnitude that the data evenly covers, the more accurately Benford's law applies. For instance, one can expect that Benford's law would apply to a list of numbers representing the populations of UK settlements. But if a "settlement" is defined as a village with population between 300 and 999, then Benford's law will not apply."
- So there's even a given example of when it is too small: 300-999. Most of the places contested using Benford's law are much larger than this, certainly every one of them in Michigan, where no county is below 10000 votes afaik.
- And one example of what has been analysed and disputed using Benford's law is the city of Chicago, another being the city of Milwaukee. Are you now going to claim such datasets are too small? I hope, and am going to assume, you are arguing in good faith and are simply completely ignorant of what is being disputed. 95.202.161.202 (talk) 15:48, 8 November 2020 (UTC)
- There's a lot of issues being mixed up here. You say the article doesn't say "does not apply at all" but the article does give specific examples where Benford's law does not apply. In each of these, it is because the data being considered does not span more than a single order of magnitude. I think you have misunderstood the settlement example. The issue there is not that settlements are too small but that they only span a single order of magnitude. If one took the population of areas without access to postal facilities, for instance, then the average size of such an area would be less than the average size of a settlement, but the resulting data would quite likely span many orders of magnitude, and as a result we would expect Benford's law to be a good fit for this data. The issue with wards in Michigan, then, is not that they are too small - they may have many thousands of residents - but that they are all very similarly sized. Data which ranges between 1 and 4,000 we would expect to fit Benford's law reasonably well. Data which ranges between 7 million and 8 million, obviously, would not. The final comment about the size of the datasets is irrelevant. That's not the issue here. If you tally the total savings of people on your street, then you might have a dataset of only 50 elements, but we would expect it to fit Benford well. On the other hand, height of adults is a dataset with billions of elements, and it fits Benford poorly. I hope that helped, and will do my best to clarify any other questions you have on this, but I do think much of what I've said here is explained well (better than I can manage) in the article. It would be best to read that first! Awoma (talk) 16:04, 8 November 2020 (UTC)
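The ranges mentioned in the comment above can be compared numerically. The following is only a sketch under a strong assumption that is not in the discussion: the data are taken to be log-uniform on each range, which real populations or vote counts need not be. The function names are illustrative.

```python
import math

# Benford first-digit probabilities: P(d) = log10(1 + 1/d), d = 1..9.
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def first_digit_probs(low, high):
    """Exact first-digit probabilities for log-uniform data on [low, high]."""
    a, b = math.log10(low), math.log10(high)
    probs = []
    for d in range(1, 10):
        total = 0.0
        # In each decade k, digit d occupies [k + log10(d), k + log10(d + 1)).
        for k in range(math.floor(a), math.floor(b) + 1):
            lo = max(a, k + math.log10(d))
            hi = min(b, k + math.log10(d + 1))
            total += max(0.0, hi - lo)
        probs.append(total / (b - a))
    return probs

def distance_from_benford(low, high):
    """Total variation distance between the two first-digit distributions."""
    probs = first_digit_probs(low, high)
    return 0.5 * sum(abs(p - q) for p, q in zip(probs, BENFORD))

# Ranges quoted in the thread (illustrative, not real data).
for low, high in [(1, 1000), (1, 4000), (300, 600), (7_000_000, 8_000_000)]:
    print(low, high, round(distance_from_benford(low, high), 3))
```

Under this assumption a range covering a whole number of decades, such as 1 to 1000, matches Benford up to floating-point rounding, while a range inside a single decade, such as 7 million to 8 million, cannot, because only one leading digit ever occurs.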
- I find it extremely suspect that Trump's totals respect Benford in those specific swing states and the third-party candidates respect Benford in those swing states, but Biden's data does not. — Preceding unsigned comment added by Flynnwasframed (talk • contribs) 21:51, 8 November 2020 (UTC)
- This happens because the losing candidate's votes will span more orders of magnitude than the winning candidate's. 0-300 votes is 3 orders of magnitude, whereas 300-600 votes is just 1. As a result, it is expected that the losing candidate's data would fit Benford's law better than the winning candidate's data. Hope this helps. Awoma (talk) 22:23, 8 November 2020 (UTC)
- That isn't something you expect to happen unless there are wild variations in the ratio of winner to loser votes in the different precincts. Biden will get more votes than Trump in black neighborhoods of Philadelphia, but in fair conditions there will not be orders of magnitude variation in Biden to Trump vote ratio. Where this does happen, like the 2012 election where Romney got exactly 0 votes out of around 20000 in some precincts, that itself is highly suspicious, since even third parties and write-ins get some votes when 20000 people vote, the demographics are not 100 percent uniform and so on.
- And the exit-polled ratios were not that extreme in 2020. 1:7 among blacks, 1:2 among Hispanics and LGBT. Data here https://twitter.com/CharlesMBlow/status/1323975456668979200 73.89.25.252 (talk) 07:54, 9 November 2020 (UTC)
- I can't see anything relevant here. "Wild variations in the ratio of winner to loser votes" isn't important - Benford's law applies regardless of the underlying distribution, so long as the data is sufficiently spread over orders of magnitude. Across wards in a precinct, the winning candidate's votes will not be spread in this way, whereas the losing candidate's votes will be. This does not mean the winning candidate enjoys orders of magnitude more votes than the losing candidate on average, or at all. Comments about Romney not getting votes, which you consider suspicious, aren't relevant to this page. Awoma (talk) 08:55, 9 November 2020 (UTC)
- The point was that unless the precincts are tiny (small N), the number of votes to the winner and loser will be roughly proportional to each other, or rather the ratio does not vary too much, therefore winner and loser's numbers will span a similarly broad range on a log scale (i.e., about the same number of "orders of magnitude" of fluctuation in the winner's and loser's numbers). It's not likely to be 0 to 300, more like 40 to 200 (L) versus say 250 to 1000 (W). There might very well be an explanation of the Benford patterns people are talking about but just saying "winner and loser" is not it.
- As empirical evidence of this, multiple third-party candidates who lost worse than Trump had Benford-shaped distributions where Biden did not; https://github.com/cjph8914/2020_benfords . Whatever is going on cannot be dismissed based on winner or loser status; the two (or more) aren't that different for Benford purposes. 73.89.25.252 (talk) 09:23, 9 November 2020 (UTC)
- I think you might benefit from reading the article and making sure you understand the concepts being discussed, importantly magnitude and logarithms, where Benford's law fits and doesn't fit, and why. In your preferred ranges, 40 to 200 and 250 to 1000, the winning candidate still spans a single order of magnitude while the losing candidate spans 2, so we would still expect Benford to fit the losing candidate but not necessarily the winning candidate. What's more, the ratio between winning and losing candidate's votes in those ranges is substantially greater than it was in my ranges, so this change doesn't even solve the problem you're suggesting exists. What's more, there is no such problem. You can make the ratio as big or small as you like. The losing candidate will always be better spread over orders of magnitude than winning candidate, because that's what smaller numbers do. You mention third-party candidates, which is a perfect example. Third-party candidates got very few votes, and so range over more orders of magnitude, and we would expect them to fit Benford well. Looking at the data, that's exactly what we see. Hope this helps. Awoma (talk) 09:57, 9 November 2020 (UTC)
- That simply is not what "orders of magnitude" means here.
- If the numbers (of votes in a precinct) are in the range from A to B, they span exactly log(B/A) orders of magnitude, where the logarithm is to the base 10. From 40 to 200 is a factor of 5, less than one order of magnitude, and 250 to 1000 slightly smaller than that. My point is not that these are realistic intervals for votes or that they necessarily obey Benford statistics, but that the ratio Max/Min will be similar for the winner and loser under fair conditions, not off by orders of magnitude, i.e., large factors like 10 or 100.
- Again we have data on this, from another forensic analysis of the Biden-Trump race https://twitter.com/APhilosophae/status/1325593635996512257 . When bundles of Biden and Trump votes come in during the vote tally, the vast majority of the time they are within a small factor (like 2) of their numbers in the whole state or region. Unless the wards have very small total votes AND extreme levels of polarization AND this combination happens often in the data set, it will not be the case that the loser's data span a couple more orders of magnitude than the winner's and if they do those large fluctuations will happen infrequently enough to not have much effect on the Benford histogram. 73.89.25.252 (talk) 10:46, 9 November 2020 (UTC)
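The ratio-based definition given in the comment above, log10(B/A), can be evaluated for the ranges quoted in this thread. A minimal sketch; the function name is illustrative and the ranges are the hypothetical ones from the discussion, not real vote data.

```python
import math

def orders_of_magnitude(low, high):
    """Span in decades, log10(high / low), per the ratio definition above."""
    if low <= 0 or high < low:
        raise ValueError("need 0 < low <= high")
    return math.log10(high / low)

# Ranges quoted in this thread (illustrative, not real vote data).
for low, high in [(40, 200), (250, 1000), (300, 600), (1, 300)]:
    print(f"{low}-{high}: {orders_of_magnitude(low, high):.3f}")
```

With this definition, 40 to 200 spans about 0.699 decades, 250 to 1000 about 0.602, 300 to 600 about 0.301, and 1 to 300 about 2.477. A lower bound of 0 is rejected because the ratio is undefined there.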
- I'm really not here to engage with links to twitter conspiracy theories, but also that link has nothing to do with Benford's law! Looking at the data merely confirms what I'm saying. Biden's results by ward stick largely to a single order of magnitude, while Trump's range across 3. Awoma (talk) 11:03, 9 November 2020 (UTC)
- The link has data that (regardless of any "conspiracy theories") demonstrate the preceding point about B and T data being essentially proportional to each other, so the same span of orders of magnitude.
- What this discussion indicates for the article is that apparently not everyone understands the business about orders of magnitude and it needs to be stated clearly that it concerns (base 10 logarithms of) ratios. You are misinterpreting the relevant "orders of magnitude" to be log_10 of the difference, (B - A), and not the ratio B/A. I did not realize until this discussion that such an interpretation was likely, but now that it appears there is no reason not to elaborate a bit in the article. 73.89.25.252 (talk) 16:41, 9 November 2020 (UTC)
- I really don't think clarifying the meaning of orders of magnitude is necessary. Most people reading the article will already be familiar with the concept, and the article on that topic is linked for further reading if they are not. Data spread between 0 and 300 covers 3 orders of magnitude, and so will fit Benford nicely. Data spread between 300 and 600 won't. Awoma (talk) 17:03, 9 November 2020 (UTC)
- So if the arbitrary numbers he chose were instead 200 to 800 votes for the loser and 600 to 2400 for the winner, would you still say that the loser has a broader range of orders of magnitude? Or do you think that the winner's numbers must have smaller variance? I don't follow your train of thought here. 89.239.30.60 (talk) 21:28, 9 November 2020 (UTC)
- If the losing data was spread in some nice fashion between 200 and 800 and the winning data similarly spread between 600 and 2400, then with no knowledge of this underlying distribution we would expect the winning data to fit Benford better than the losing data (which obviously would not). Awoma (talk) 21:39, 9 November 2020 (UTC)
- That is a total misunderstanding of how Benford's law works. Benford is scale invariant -- that's the whole point! If you have the same number of data points (each precinct gives us one number each for the winner and the loser), and they tend to be roughly proportional to each other as in an election, then the two data sets are essentially the same for purposes of Benford's law, and we would expect Benford to fit the winner and loser to exactly the same extent in (200,800) as in (600, 2400). 73.89.25.252 (talk) 05:52, 12 November 2020 (UTC)
- Benford is, in most cases, scale invariant yes. Any distribution over (600,2400) in the naturals, however, is not simply a multiple of some distribution over (200,800) in the naturals. With regards the winning and losing data being some scaling of one another, I don't see how this can possibly be true. Assuming precinct size distribution D, then call the winning distribution W and the losing distribution L, we have L=D-W and so L is not simply a scaling of W. From this equation we also see that, if D is some constant n, the mean of L is n minus the mean of W, and the standard deviation of L and W are identical. Thus, L will span more orders of magnitude than W and L will fit Benford's law while W may not. Awoma (talk) 08:04, 12 November 2020 (UTC)
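The constant-precinct-size case described in the comment above can be checked on a toy example. This is a sketch only: the precinct size `n` and the winner counts are hypothetical values chosen for illustration, and the constant-size assumption itself is contested elsewhere in this thread.

```python
import math
import statistics

n = 1000                              # hypothetical constant precinct size
winner = list(range(600, 951, 50))    # hypothetical winner counts: 600, 650, ..., 950
loser = [n - w for w in winner]       # loser gets the remainder: 400, 350, ..., 50

# Same spread in absolute terms...
sd_w = statistics.pstdev(winner)
sd_l = statistics.pstdev(loser)

# ...but a different spread on a log scale (log10 of max/min).
span_w = math.log10(max(winner) / min(winner))
span_l = math.log10(max(loser) / min(loser))

print(sd_w, sd_l, span_w, span_l)
```

In this toy data the two standard deviations coincide because L = n - W, while the loser's max/min ratio is 8 against roughly 1.6 for the winner. If precinct sizes vary and the margin tracks precinct size, as the following reply argues, the equality of standard deviations no longer needs to hold.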
- The variance is obviously smaller for the loser in situations like the Biden-Trump race in Democrat cities, e.g. because Var(W) - Var(L) equals Cov(W-L, W+L). The margin of victory, W-L, strongly correlates with precinct size in such conditions where it is driven by the number of people who show up to vote. For the relatively large precinct sizes that gave Biden vote numbers in the mid-hundreds, even on the unrealistic assumption of equal W and L std deviations, you need 3 or 4+ SD events to get single digit counts for Trump and more than that to get triple digits, so the Trump numbers are mostly forced into the 10-100 range (in the inner cities) plus a few outliers. It does not give us a span of multiple orders of magnitude just from higher SD relative to the mean. 73.89.25.252
- And the effect of the data being integers (for which there is no perfect Benford distribution) is small compared to other factors. 73.89.25.252 (talk) 11:13, 22 November 2020 (UTC)
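The identity used in the comment above follows from bilinearity and symmetry of covariance. A short derivation, with W and L denoting the winner's and loser's per-precinct counts as in the comment:

```latex
\begin{align*}
\operatorname{Cov}(W-L,\,W+L)
  &= \operatorname{Cov}(W,W) + \operatorname{Cov}(W,L)
     - \operatorname{Cov}(L,W) - \operatorname{Cov}(L,L) \\
  &= \operatorname{Var}(W) - \operatorname{Var}(L).
\end{align*}
```

So Var(W) exceeds Var(L) exactly when the margin W - L is positively correlated with the total W + L.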
- It's not really clear to me what you are trying to say here. If the variance of W and L are equal then L will span more orders of magnitude than W. If the variance of both is very small, as claimed, then neither will span sufficiently many orders of magnitude, and so Benford will not be expected to fit either candidate's data. There is no "effect of the data being integers." Benford applies in exactly the same way regardless of the data being integral. I'm not certain what you mean by "perfect Benford distribution" - there is only one distribution described by Benford. It is the same distribution for integers/rationals/reals. Awoma (talk) 13:00, 22 November 2020 (UTC)
- What I'm getting at is that you claim that "The losing candidate will always be better spread over orders of magnitude than winning candidate, because that's what smaller numbers do." However, I haven't heard of such a phenomenon. Maybe you can point me to a source for this claim? I'd also like to point out that while you think that the actual boundaries between orders of magnitude are somehow important, claiming that the range 600 to 2400 constitutes 2 orders of magnitude but 200 to 800 only 1, but if you inspect it closely, the first range will only contain datapoints starting with 6, 7, 8, 9, 1, and 2, while the second one will cover digits 2 through 8. In other words, both are comparably insufficient for Benford's law and what truly matters is the logarithm of their quotient as the other user pointed out. 89.239.30.60 (talk) 22:01, 9 November 2020 (UTC)
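The leading-digit claim in the comment above is easy to check by enumeration. A minimal sketch with illustrative helper names; it only lists which first digits can occur among the integers in each range and says nothing about how often they occur.

```python
def leading_digit(n):
    """First decimal digit of a positive integer."""
    while n >= 10:
        n //= 10
    return n

def leading_digits_in_range(low, high):
    """Sorted set of first digits among the integers low..high inclusive."""
    return sorted({leading_digit(n) for n in range(low, high + 1)})

print(leading_digits_in_range(600, 2400))  # [1, 2, 6, 7, 8, 9]
print(leading_digits_in_range(200, 800))   # [2, 3, 4, 5, 6, 7, 8]
```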
- Suppose you have n voters in an area. Then the range from 50%-100% (the winning candidate's share) spans fewer orders of magnitude than the range from 0%-50% (the losing candidate's share). This can indeed be seen from the log calculation. This is the reason why the winning candidate's vote tallies will fit Benford more poorly than the losing candidate's tallies. Even in your unusual example, where both fit poorly (as you point out), the losing candidate's data still fits better. Awoma (talk) 22:09, 9 November 2020 (UTC)
- However, we are not applying the Benford's law to percentages, are we? We simply have two sets of vote counts, votes for the winning candidate and votes for a losing candidate. They are both integer sets with some (presumably similar) distributions. The only difference between them is that the values in the first set are generally larger - and typically not by very much. So what mathematical law dictates that the set of smaller numbers must also have a broader range of orders of magnitude? 89.239.30.60 (talk) 22:20, 9 November 2020 (UTC)
- The log approximation for the winning candidate would be log(2), while the log approximation for the losing candidate would be log(n/2), where n is the number of voters. Log is strictly increasing, so the order of magnitude of the losing candidate's data will span more orders of magnitude than the winning candidate's data. Awoma (talk) 22:25, 9 November 2020 (UTC)
- Sorry but the first sentence doesn't make any sense to me and the second one is a non sequitur. Please just link a source that supports your claim (the one I've quoted), that's all. 89.239.30.60 (talk) 22:34, 9 November 2020 (UTC)
- Let n be the number of votes in an area (and assume two candidates for simplicity). The winning candidate gets between n/2 and n votes, spanning log(n/(n/2)) = log(2) orders of magnitude. The losing candidate gets between 1 and n/2 votes, spanning log((n/2)/1) = log(n/2) orders of magnitude. log(n/2) is greater than log(2) and so the losing candidate spans more orders of magnitude than the winning candidate. As n increases, this becomes especially pronounced, but notice that even as n increases, the winning candidate never has an appropriate spread over several orders of magnitude which would be required to fit Benford's law. I hope that's clearer. Happy to keep trying if it's not! Awoma (talk) 22:39, 9 November 2020 (UTC)
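The arithmetic in the comment above can be tabulated for a few values of n. This sketch only restates that comment's simplified model (winner anywhere in [n/2, n], loser anywhere in [1, n/2]); it describes the possible ranges and does not show how real precinct data fill them. The function name is illustrative.

```python
import math

def spans(n):
    """Log10 spans of the possible ranges under the simplified two-candidate model."""
    winner = math.log10(n / (n / 2))   # range [n/2, n], always log10(2)
    loser = math.log10((n / 2) / 1)    # range [1, n/2]
    return winner, loser

for n in (100, 1000, 10000):
    winner, loser = spans(n)
    print(f"n={n}: winner {winner:.3f}, loser {loser:.3f}")
```

Note that log10(n/2) exceeds log10(2) only when n is greater than 4; at n = 4 the two spans are equal.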
- Alright, I agree with you that IF the votes were distributed according to this rule you made up, that would be the correct conclusion. However, this assumption is absolutely incorrect. The winning candidate will not get a majority of the votes in all districts. We are talking about the overall winner here, not a different winner for each datapoint. Therefore, there can be districts where the loser gets 0 or 1 votes, but the same is true for the winner. There will be districts where the loser has a big majority. So, as I said, the actual data is much more like I characterized it a few comments earlier. 89.239.30.60 (talk) 22:52, 9 November 2020 (UTC)
- Of course they may not get a majority of the votes in all districts, but this happens sufficiently often that we do not expect the winning candidate's data to fit Benford well (whereas the losing candidate's would be more likely to fit Benford). Considering additional data points representing occasional deviation from the overall trend doesn't change this underlying reason, and makes the whole thing needlessly complicated. We could similarly discuss the fact that in reality precinct sizes are not all equal (I have assumed they are for the purposes of explanation). All these assumptions stand on the side - they do not change what's going on in the case of electoral data and Benford's law. The end result is still that the winner is not expected to match Benford while the loser frequently will. The reason for this is that the loser's data is nicely spread over more orders of magnitude than the winner's. Awoma (talk) 23:00, 9 November 2020 (UTC)
- But this is the root of the problem. You claim that there is an end result that "the winner is not expected to match Benford while the loser frequently will," but this is pure conjecture. You haven't made any arguments to support this whatsoever. Notice that I've never claimed that one candidate's data should or shouldn't follow Benford's law. All I'm claiming is that whether or not a candidate is a winner or a loser has absolutely no bearing on this property of the dataset. Why? Because the only difference is that the winner's total is higher. Other than that, both datasets are generated in the same way - by supporters of a given candidate voting for that candidate. If a certain random selection of the winner's voters decided to stay at home, making the other candidate the winner, would that suddenly change the properties of that candidate's data even though it stayed exactly the same? No. I believe you are simply wrong about this, but I would be gladly proven wrong - by an actual source supporting this hypothesis. 89.239.30.60 (talk) 23:19, 9 November 2020 (UTC)
- It's not really a hypothesis, or conjecture. It follows directly from the mathematics involved. In terms of sources, you may enjoy reading this, which might reassure you of this issue, but does not really detail it in the fundamental mathematical way I have attempted to demonstrate above. An attempt to clarify the situation you construct: suppose we have precincts of a certain distribution of voters (may as well assume them to be all equivalent size). Then, as I've explained, the winning candidate's data will not fit Benford's law, while the losing candidate's quite possibly will. But, you then ask, what happens if the winning candidate's voters stay home for some reason, making the losing candidate the winning candidate. We now have a winning candidate whose data "quite possibly" matches Benford. Surely this goes against the statement that the winning candidate's data does not match Benford? What's being missed here is that by assuming a suitably large reduction in votes for the (formerly) winning candidate, without any change in the votes for the (newly crowned) winning candidate, we have changed the distribution of the total voters across precincts. While at the start I assumed that voting population of each precinct was equivalent size, this no longer holds. What we now have is precinct data matching Benford, and a winning candidate who wasn't expected to match Benford, but thanks to the coincidental precinct data, actually does. If you have any other confusing situations you'd like clarified I'll do my best to help, but it does seem a bit like you are trying to treat this as an argument. Awoma (talk) 23:47, 9 November 2020 (UTC)
- Unfortunately, the linked document doesn't relate to the claim I'm contesting. It is true that I'm here on this page because I want to make sense of the situation but I do not have an opinion on whether election fraud took place or not. All I'm interested in is the claim that according to you, the winner's and loser's vote totals somehow have inherently different properties, which is something I currently don't agree with. You say that I treat this as an argument, but you treat this as if you were an authority on the matter even though you've already made several damning errors and attempted to downplay them. I myself have a computer science and applied mathematics degree, so I know a thing or two about statistics. Statements like "it follows directly from the mathematics involved" need to be substantiated. Your rebuttal of my last example is also wrong, because 1. in a close election such as this one, the vote totals wouldn't change in a significant way to have the effects you described, 2. the assumption that precincts were the same size was never made, and 3. the data can be altered in such a way that a constant number of voters abstain from each precinct, upholding the assumption anyway. There is no need to "clarify" or "help", I'm only interested in solid proof. 89.239.30.60 (talk) 00:30, 10 November 2020 (UTC)
- I am glad to hear of the statistics degree. More people should do this! So, with regards the properties of votes for the winning and losing candidates, let's assume that all precincts have the same number of voters, n. You accept that the median vote for the winner is greater than for the loser, but actually we can say a little more than this. The median for the winner is greater than n/2, and the median for the loser is n minus the median for the winner. Hopefully you can also see that actually, if we plot the distribution for the winner, the distribution for the loser is n minus this. We simply mirror the distribution over n/2. As a result, while the winning candidate's vote tallies are not nicely spread across multiple orders of magnitude (the majority of points span log(2) at best using approximation), the losing candidate's can indeed do this (easily spanning log(n/2) as shown previously). Thus, while the winning candidate's tallies are not expected to match Benford's law, the losing candidate's tallies may, because they can easily span more orders of magnitude. Moving on to the counterexample, you now ask what if the totals are close to each other, so that the precinct totals do not have to change much? This now assumes that there is a large amount of data around the n/2 mark (lots of precincts which can flip with ease), which in turn breaks our assumption that the losing candidate's data fits Benford (too many leading digits equal to the leading digit of n/2). We have assumed too much - it is not possible to assume equal precincts, and a losing candidate's data matching Benford, and very few votes needed to switch the winner, all at the same time. If instead we go the coincidence route - assume that there is not much data near n/2, but the median is near there, and the losing distribution matches Benford, then this does indeed result in a winner matching Benford. But nothing really has been demonstrated, as we've really just constructed the whole example.
I can equally construct a distribution where the winning data fits Benford but the losing data does not - this can happen, absolutely, but the point is that it is not what is expected. As a final remark, I can sadly see this shifting over to the question of "what if we don't assume equal sized precincts?" I have done this for convenience, but we can assume any number of distributions for the total voters in each precinct. The only thing which would not be allowed is having the distribution of total voters matching Benford - in such a case we would expect the winning candidate to fit also! Awoma (talk) 01:11, 10 November 2020 (UTC)
- Unfortunately, the linked document doesn't relate to the claim I'm contesting. It is true that I'm here on this page because I want to make sense of the situation but I do not have an opinion on whether election fraud took place or not. All I'm interested in is the claim that according to you, the winner's and loser's vote totals somehow have inherently different properties, which is something I currently don't agree with. You say that I treat this as an argument, but you treat this as if you were an authority on the matter even though you've already made several damning errors and attempted to downplay them. I myself have a computer science and applied mathematics degree, so I know a thing or two about statistics. Statements like "it follows directly from the mathematics involved" need to be substantiated. Your rebuttal of my last example is also wrong, because 1. in a close election such as this one, the vote totals wouldn't change in a significant way to have the effects you described, 2. the assumption that precincts were the same size was never made, and 3. the data can be altered in such a way that a constant number of voters abstain from each precinct, upholding the assumption anyway. There is no need to "clarify" or "help", I'm only interested in solid proof. 89.239.30.60 (talk) 00:30, 10 November 2020 (UTC)
- It's not really a hypothesis, or conjecture. It follows directly from the mathematics involved. In terms of sources, you may enjoy reading this, which might reassure you on this issue, but does not really detail it in the fundamental mathematical way I have attempted to demonstrate above. An attempt to clarify the situation you construct. Suppose we have precincts of a certain distribution of voters (may as well assume them to be all equivalent size). Then, as I've explained, the winning candidate's data will not fit Benford's law, while the losing candidate's quite possibly will. But, you then ask, what happens if the winning candidate's voters stay home for some reason, making the losing candidate the winning candidate. We now have a winning candidate whose data "quite possibly" matches Benford. Surely this goes against the statement that the winning candidate's data does not match Benford? What's being missed here is that by assuming a suitably large reduction in votes for the (formerly) winning candidate, without any change in the votes for the (newly crowned) winning candidate, we have changed the distribution of the total voters across precincts. While at the start I assumed that the voting population of each precinct was of equivalent size, this no longer holds. What we now have is precinct data matching Benford, and a winning candidate who wasn't expected to match Benford, but thanks to the coincidental precinct data, actually does. If you have any other confusing situations you'd like clarified I'll do my best to help, but it does seem a bit like you are trying to treat this as an argument. Awoma (talk) 23:47, 9 November 2020 (UTC)
- But this is the root of the problem. You claim that there is an end result that "the winner is not expected to match Benford while the loser frequently will," but this is pure conjecture. You haven't made any arguments to support this whatsoever. Notice that I've never claimed that one candidate's data should or shouldn't follow Benford's law. All I'm claiming is that whether or not a candidate is a winner or a loser has absolutely no bearing on this property of the dataset. Why? Because the only difference is that the winner's total is higher. Other than that, both datasets are generated in the same way - by supporters of a given candidate voting for that candidate. If a certain random selection of the winner's voters decided to stay at home, making the other candidate the winner, would that suddenly change the properties of that candidate's data even though it stayed exactly the same? No. I believe you are simply wrong about this, but I would gladly be proven wrong - by an actual source supporting this hypothesis. 89.239.30.60 (talk) 23:19, 9 November 2020 (UTC)
- Of course they may not get a majority of the votes in all districts, but this happens sufficiently often that we do not expect the winning candidate's data to fit Benford well (whereas the losing candidate's would be more likely to fit Benford). Considering additional data points representing occasional deviation from the overall trend doesn't change this underlying reason, and makes the whole thing needlessly complicated. We could similarly discuss the fact that in reality precinct sizes are not all equal (I have assumed they are for the purposes of explanation). All these assumptions stand on the side - they do not change what's going on in the case of electoral data and Benford's law. The end result is still that the winner is not expected to match Benford while the loser frequently will. The reason for this is that the loser's data is nicely spread over more orders of magnitude than the winner's. Awoma (talk) 23:00, 9 November 2020 (UTC)
- I don't think so. The winning candidate spans more orders of magnitude than the losing candidate. Let n be the number of votes in an area, and N be the maximum number of votes of each area. The spanning of the winning candidate should be and the spanning of the losing candidate should be . Suppose there are 3 areas with number of votes , the winning candidate can get from while the losing candidate only has . The number of votes obeys Benford's law.--Tttfffkkk (talk) 10:32, 19 January 2021 (UTC)
- None of this is anything new. There are many examples of the votes for Democratic candidates violating Benford's law. The votes for Republicans rarely do. Walter Mebane at the University of Michigan did a second digit 2BL test that showed that the 2009 Iranian election was a fraud. Somebody needs to do a similar analysis for the election in Michigan. I don't think that there is any rule about how many orders of magnitude the law applies to. Miller found that precincts were too small for the law to apply to. But he had to test it to find out. It's apparently not something that is intuitively obvious. See Stephen Miller's Benford's law: Theory and Applications, p. 215.
- This is the kind of misinformation I'm wanting to avoid working its way into the article. It is untrue that votes for Democratic candidates violate Benford's law, analysis on the Iranian election was done across the entire country, and yes there is a rule about orders of magnitude, which is currently explained nicely in the article. Awoma (talk) 08:55, 9 November 2020 (UTC)
- Wikipedia is not for partisan bickering. FAISSALOO(talk) 10:50, 9 November 2020 (UTC)
- I gave a reference. You might check it before making statements of this kind. On second look, I noticed that the relevant chapter is actually by Mebane and Miller is the book editor. They looked at seven U.S. elections held in 2006 and 2008. They found that the Republican candidate's vote followed Benford's law in every case. But the vote for the Democrat violated Benford in six of the seven elections. They suggest that this might have something to do with how the parties canvass for votes. 5440orSleep (talk) 10:52, 9 November 2020 (UTC)
- Apologies. You are correct. I had misinterpreted your statement as mirroring what the disinfo posts were saying about Biden's data not fitting but Trump's fitting in Michigan wards. That's my bad - you were very clear. Awoma (talk) 11:17, 9 November 2020 (UTC)
Mebane has published a response to the disinformation here which, if the theory ends up being reported on more widely, will be worth including in the article. Awoma (talk) 16:04, 9 November 2020 (UTC)
Matt Parker has published a video on the topic, it might have some useful reference. Why do Biden's votes not follow Benford's Law?--Salix alba (talk): 08:03, 13 November 2020 (UTC)
The winning Biden obeys Benford's law and the winning Biden doesn't obey Benford's law
I took the Milwaukee data from Nov. 4, 2020 and Nov. 5, 2020.
On Nov.4, 2020
Leading digit | Registered Voters - Total (count) | Registered Voters - Total (%) | Ballots Cast - Total (count) | Ballots Cast - Total (%) | Joseph R. Biden / Kamala D. Harris (count) | Joseph R. Biden / Kamala D. Harris (%) | Donald J. Trump / Michael R. Pence (count) | Donald J. Trump / Michael R. Pence (%) | In Benford's law (%) |
---|---|---|---|---|---|---|---|---|---|
1 | 209 | 44% | 144 | 30.3% | 208 | 50.4% | 142 | 34.3% | 30.1% |
2 | 34 | 7.2% | 142 | 29.9% | 78 | 18.9% | 71 | 17.1% | 17.6% |
3 | 23 | 4.8% | 60 | 12.6% | 19 | 4.6% | 51 | 12.3% | 12.5% |
4 | 20 | 4.2% | 26 | 5.5% | 13 | 3.1% | 24 | 5.8% | 9.7% |
5 | 19 | 4% | 9 | 1.9% | 7 | 1.7% | 31 | 7.5% | 7.9% |
6 | 35 | 7.4% | 9 | 1.9% | 15 | 3.6% | 24 | 5.8% | 6.7% |
7 | 48 | 10.1% | 4 | 0.8% | 17 | 4.1% | 34 | 8.2% | 5.8% |
8 | 50 | 10.5% | 7 | 1.5% | 24 | 5.8% | 18 | 4.3% | 5.1% |
9 | 37 | 7.8% | 13 | 2.7% | 32 | 7.7% | 19 | 4.6% | 4.6% |
Correlation | | 80.9% | | 91.8% | | 92.8% | | 98% | 100% |
On Nov.5, 2020
Leading digit | Registered Voters - Total (count) | Registered Voters - Total (%) | Ballots Cast - Total (count) | Ballots Cast - Total (%) | Joseph R. Biden / Kamala D. Harris (count) | Joseph R. Biden / Kamala D. Harris (%) | Donald J. Trump / Michael R. Pence (count) | Donald J. Trump / Michael R. Pence (%) | In Benford's law (%) |
---|---|---|---|---|---|---|---|---|---|
1 | 208 | 43.8% | 155 | 32.6% | 86 | 18.1% | 114 | 24% | 30.1% |
2 | 34 | 7.2% | 34 | 7.2% | 35 | 7.4% | 85 | 17.9% | 17.6% |
3 | 24 | 5.1% | 36 | 7.6% | 51 | 10.8% | 89 | 18.7% | 12.5% |
4 | 20 | 4.2% | 31 | 6.5% | 69 | 14.6% | 57 | 12% | 9.7% |
5 | 20 | 4.2% | 43 | 9.1% | 79 | 16.7% | 35 | 7.4% | 7.9% |
6 | 34 | 7.2% | 56 | 11.8% | 62 | 13.1% | 36 | 7.6% | 6.7% |
7 | 48 | 10.1% | 44 | 9.3% | 42 | 8.9% | 27 | 5.7% | 5.8% |
8 | 50 | 10.5% | 36 | 7.6% | 28 | 5.9% | 16 | 3.4% | 5.1% |
9 | 37 | 7.8% | 40 | 8.4% | 22 | 4.6% | 16 | 3.4% | 4.6% |
Correlation | | 81.1% | | 80.6% | | 51.5% | | 91.6% | 100% |
As you can see, I picked the registered-voter totals, the ballot totals, Biden's votes and Trump's votes, and compared each of them to Benford's law.
I'm a Hong Kong mathematician but new to Benford's law. Just wondering if this sort of analysis is good enough.--Tttfffkkk (talk) 23:59, 19 January 2021 (UTC)
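For readers who want to reproduce this kind of tabulation, here is a minimal sketch of a leading-digit count and its correlation with the Benford shares. It assumes the "Correlation" row is a Pearson correlation between the nine digit counts and the Benford shares log10(1 + 1/d); the function names are hypothetical, and the only data used are the Nov. 4 Trump counts from the table above.

```python
import math

# Benford's expected share of each leading digit d: log10(1 + 1/d).
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])

def digit_counts(values):
    """Count leading digits 1..9, skipping zeros (e.g. wards with no votes)."""
    counts = [0] * 9
    for v in values:
        if v > 0:
            counts[leading_digit(v) - 1] += 1
    return counts

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Leading-digit counts for the Trump / Pence column on Nov. 4 (from the table).
trump_nov4 = [142, 71, 51, 24, 31, 24, 34, 18, 19]
r = pearson(trump_nov4, BENFORD)
print(f"correlation with Benford: {r:.3f}")
```

Under that assumption the result should land close to the 98% shown in the table. A high correlation on nine points is a weak check on its own, which is one reason the replies below point to the orders-of-magnitude requirement instead.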
- First leading digit analysis is known to be irrelevant in analyzing voting. Also, Wikipedia doesn't publish original work.Constant314 (talk) 00:49, 20 January 2021 (UTC)
- You can say that. What I found is that people are interested in analyzing voting with Benford's law. I found the GitHub showing statistics with Fulton County, Miami-Dade, Milwaukee, Chicago, Allegheny and Spotted Toad analyzing votes with Benford's chi-squared tests and Lowess Smoothing. If they're all invalid, I hope we can show the reason to the public clearly.Tttfffkkk (talk) 01:44, 20 January 2021 (UTC)
- "First leading digit analysis is known to be irrelevant in analyzing voting." Not quite. The correct statement would be "First leading digit analysis is known to be irrelevant in analyzing voting data that does not span several orders of magnitude." It is the part about spanning several orders of magnitude that matters to Benford's law, not the source of the data. (And the fact that the data underlying the numbers in the table above do not span several orders of magnitude is in fact the reason why these do not follow Benford's law). Skepticalgiraffe (talk)
- Turns out that there are several analyses of the case you cite, which point out this flaw. Here, for example: www-personal.umich.edu/~wmebane/inapB.pdf. Skepticalgiraffe (talk) 02:18, 20 January 2021 (UTC)
- Thanks for pointing out the orders of magnitude. That's consistent with what Awoma has been talking about. And thanks for pointing me to a reliable resource. I will take a look at it.Tttfffkkk (talk) 02:27, 20 January 2021 (UTC)
- A very good treatment from Matt Parker is at youtube: Why do Biden's votes not follow Benford's Law?. Johnuniq (talk) 02:46, 20 January 2021 (UTC)
- Thanks. I see the Trump Tower. I would check the election data from Chicago later as well.Tttfffkkk (talk) 06:47, 20 January 2021 (UTC)
- Although Mebane stated that the first digits of precinct vote counts are not useful for trying to diagnose election frauds, I have a thought that the early voting results are more likely to obey Benford's law. The maximum number of registered voters is 4045. That means that once the number of votes in a ward exceeds 1000, it can never reach 10000. On Nov. 4, 2020, 4.12% and 0.72% of wards had more than 1000 votes for Biden and Trump respectively. On Nov. 5, 2020, 15.82% and 2.53% of wards had more than 1000 votes for Biden and Trump respectively. Maybe those 15.82% of wards make Biden's votes disobey Benford's law.
Date | Registered Voters - Total | Ballots Cast - Total | Joseph R. Biden / Kamala D. Harris | Donald J. Trump / Michael R. Pence |
---|---|---|---|---|
Nov.4, 2020 | 52% | 8.21% | 4.12% | 0.72% |
Nov.5, 2020 | 51.79% | 36.63% | 15.82% | 2.53% |
- However, I still have a feeling that Biden's votes are so weird on Nov.5, 2020. The correlation drops too much. Tttfffkkk (talk) 11:51, 20 January 2021 (UTC)
- Matt Parker's video is so informative. I notice that the standard deviations between wards in Milwaukee and Chicago are very different. Chicago's is around 200 while Milwaukee's is around 600, which lets the numbers vary more widely between 1 and 1000. Milwaukee's registered voters obey Benford's law while Chicago's registered voters follow a normal distribution. Eventually, it made Biden's votes follow a normal distribution.
- Regarding the last-two-digits results, I found that the other candidates also build a tower between 1 and 9, while Trump's tower is built between 10 and 99, because 64.98% of Trump's vote counts are below 100 while the other candidates' counts are almost all below 10. When a vote count is between 1 and 99, its last two digits are the same as its first two digits, which obey Benford's law. As a result, the uniform distribution of the last two digits is mixed with the distribution of the first two digits.Tttfffkkk (talk) 14:05, 20 January 2021 (UTC)
Assume all the election votes have a normal distribution N(μ, σ²). When the standard deviation σ is larger than the mean μ, the distribution can spread wider.
I notice that the coefficient of variation in statistics measures this.
Date | Variable | Registered Voters - Total | Ballots Cast - Total | Joseph R. Biden / Kamala D. Harris | Donald J. Trump / Michael R. Pence |
---|---|---|---|---|---|
Nov.4, 2020 | mean | 1021 | 234 | 144 | 60 |
Nov.4, 2020 | standard deviation | 649 | 459 | 280 | 202 |
Nov.4, 2020 | coefficient of variation | 0.636 | 1.962 | 1.944 | 3.367 |
Nov.5, 2020 | mean | 1019 | 829 | 576 | 192 |
Nov.5, 2020 | standard deviation | 649 | 605 | 378 | 300 |
Nov.5, 2020 | coefficient of variation | 0.637 | 0.73 | 0.656 | 1.563 |
Additionally, if I round the mean of votes to the nearest power of 10:
Date | Variable | Registered Voters - Total | Ballots Cast - Total | Joseph R. Biden / Kamala D. Harris | Donald J. Trump / Michael R. Pence |
---|---|---|---|---|---|
Nov.4, 2020 | adjusted mean | 1000 | 100 | 100 | 100 |
Nov.4, 2020 | standard deviation | 649 | 459 | 280 | 202 |
Nov.4, 2020 | adjusted coefficient of variation | 0.649 | 4.59 | 2.8 | 2.02 |
Nov.5, 2020 | adjusted mean | 1000 | 1000 | 1000 | 100 |
Nov.5, 2020 | standard deviation | 649 | 605 | 378 | 300 |
Nov.5, 2020 | adjusted coefficient of variation | 0.649 | 0.605 | 0.378 | 3 |
I think I've got what I want. No matter how smooth the normal distribution is, it would disobey Benford's law if the coefficient of variation is smaller than 1.Tttfffkkk (talk) 13:28, 21 January 2021 (UTC)
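The coefficients in the two tables above can be recomputed from the listed means and standard deviations. The sketch below assumes the "adjusted mean" is the mean rounded to the nearest power of ten on a log scale, which reproduces the listed values but is a guess at the rule used; the function names are hypothetical.

```python
import math

def coefficient_of_variation(mean, std):
    """Standard deviation divided by the mean."""
    return std / mean

def nearest_power_of_ten(x):
    """Round x to the nearest power of ten on a log scale (assumed rule)."""
    return 10 ** round(math.log10(x))

# (mean, standard deviation) pairs copied from the tables above.
columns = {
    "Nov.4 Registered": (1021, 649),
    "Nov.4 Ballots": (234, 459),
    "Nov.4 Biden": (144, 280),
    "Nov.4 Trump": (60, 202),
    "Nov.5 Registered": (1019, 649),
    "Nov.5 Ballots": (829, 605),
    "Nov.5 Biden": (576, 378),
    "Nov.5 Trump": (192, 300),
}

for name, (mean, std) in columns.items():
    cv = coefficient_of_variation(mean, std)
    adjusted_mean = nearest_power_of_ten(mean)
    adjusted_cv = coefficient_of_variation(adjusted_mean, std)
    print(f"{name}: CV={cv:.3f}, adjusted mean={adjusted_mean}, adjusted CV={adjusted_cv:.3f}")
```

This only checks the arithmetic. Whether a coefficient of variation below 1 rules out a Benford-like fit is the commenter's conjecture, and the sketch does not test it.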
Walter Mebane
Surely more than one single man's opinion can be presented in the "election fraud" section of this article? It's clear that there is a political bias. 124.169.150.131 (talk) 17:48, 13 February 2021 (UTC)
- The election fraud category presented Mebane as the chief proponent of using Benford's Law to detect election fraud as well as the contrary opinion that others found it did not work and that it was just as likely to detect fraud when there was none as it was to not detect fraud. Mebane was later cited as refuting them, but he never actually did, which I pointed out in the above page. Further, Mebane analyzed the improper implementation of his methodology onto the 2020 election fraud claim and stated that it doesn't pass muster, essentially. It's not political bias. You would be hard-pressed to find someone outside of Dr. Mebane who is an expert in such a thing, considering he's the pioneer of using Benford's Law for election fraud analysis. It's disingenuous to call the article or Dr. Mebane politically biased. X0n10ox (talk) 04:32, 14 February 2021 (UTC)
Preventing Benford's Law from being used in propaganda
Hi, I'm really sorry about this, but I added the following sentence to the introduction paragraph: "It should be noted, however, that Benford's Law cannot be used to critique a single data set (for example, if the results of a single opinion poll or election shows a larger difference, that in no way indicates a problem with that result)." Please let me explain why. At least one candidate in the California governor's recall election is actually claiming that the results were fraudulent significantly ahead of the votes being counted, as you can see from the following website: https://stopcafraud.com/ (I followed a link there from Larry Elder's campaign website, electelder.com). Of particular concern: "Statistical analyses used to detect fraud in elections held in 3rd-world nations (such as Russia, Venezuela, and Iran) have detected fraud in California resulting in Governor Gavin Newsom being reinstated as governor. The primary analytical tool used was Benford’s Law and can be readily reproduced." I'm not a mathematician - my Ph.D. is in politics - but I know enough about quantitative research to be confident that the sentence I appended is uncontroversial on the face of it (it's like saying 'the average college student is 20 years old, but you are 30 years old, so that proves that you can't be a college student,' right?). It may in fact seem axiomatic to anyone who understands math at an undergraduate level. However, I am certain that a lot of people are going to be Googling Benford's Law in the coming days, and I think it would be a valuable public service to head off election fraud disinformation. I say this regardless of political affiliation (and as a non-American myself) - if there is fraud, let it be proven in court rather than alleged without evidence, falsely asserted in the social media information space, and rooted in a lie about how statistics work. Thanks for your time. 
I wouldn't object if you deleted my addition, but could I ask you to leave it up for a week or so, please? Cheers, D
- I have protected the page from changes made by new or unconfirmed users for a month. - DavidWBrooks (talk) 16:40, 14 September 2021 (UTC)
- Checking the pageview chart, it looks like page views increased about tenfold for that election - roughly one-tenth as much as they increased after the 2020 presidential election. - DavidWBrooks (talk) 13:33, 15 September 2021 (UTC)
Repeated removal of sourced information
This has been repeatedly removed today despite being sourced, correct, relevant, and verifiable. In one reversion it was labeled as "misinformation":
- Similarly, application of Benford's law to the last pairs of digits in the same data shows an expected distribution for Joe Biden, but a massive spike for lower values for Donald Trump, which also seems to suggest fraud; however, Donald Trump received fewer than 60 votes in a significant portion of districts, skewing the results of such analysis by adding a standard normal distribution in this range, rather than indicating fraud.[1] John Moser (talk) 21:23, 7 August 2022 (UTC)
- YouTube is not a reliable source.
- There is no body of work concerned with a Benford analysis of the last pairs of digits.
- It is not notable. It is trivia. Constant314 (talk) 22:05, 7 August 2022 (UTC)
References
Statistical Question regarding the section "Multiplicative Fluctuations"
It is stated in the article under "Multiplicative Fluctuations": "More technically, the central limit theorem says that multiplying more and more random variables will create a log-normal distribution with larger and larger variance, so eventually it covers many orders of magnitude almost uniformly." However, the "central limit theorem" in its classical form does not refer to the multiplication of random variables. It refers to the sample average of a number of random variables that are mutually independent and identically distributed. To form the sample average, these variables are added and then divided by the number n (the total number of input variables), and the distribution of this new random output variable tends to follow a normal distribution (if n is sufficiently large). If a special form of the central limit theorem is used in the article on "Benford's law" (I am unaware of a multiplicative formulation of the central limit theorem) then the source of this formulation should be clearly noted. Aurelien101 (talk) 11:15, 6 July 2023 (UTC)
- You add the logarithms of the numbers. You apply the central limit theorem on the log scale. Constant314 (talk) 13:10, 6 July 2023 (UTC)
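The log-scale argument in the reply above can be illustrated with a small simulation. The sketch below multiplies independent uniform(0, 1] factors by summing their base-10 logarithms and compares the leading-digit shares of the products with Benford's law; the choice of uniform factors, the sample size, the seed and the function names are all assumptions made for illustration.

```python
import math
import random

# Benford's expected share of each leading digit d: log10(1 + 1/d).
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def leading_digit_shares(num_factors, samples, rng):
    """Leading-digit shares of products of `num_factors` uniform(0, 1] draws."""
    counts = [0] * 9
    for _ in range(samples):
        # Log scale: log10 of a product is the sum of the log10 of its factors.
        log_sum = sum(math.log10(1.0 - rng.random()) for _ in range(num_factors))
        # The fractional part of log10(x) determines the leading digit of x.
        mantissa = 10 ** (log_sum % 1.0)
        digit = min(9, int(mantissa))  # guard against rounding up to 10
        counts[digit - 1] += 1
    return [c / samples for c in counts]

def max_gap(shares):
    """Largest absolute difference from the Benford shares."""
    return max(abs(s - b) for s, b in zip(shares, BENFORD))

rng = random.Random(0)
for k in (1, 2, 5, 20):
    gap = max_gap(leading_digit_shares(k, 20000, rng))
    print(f"{k:2d} factors: max gap from Benford = {gap:.3f}")
```

A single uniform factor gives roughly equal digit shares, far from Benford. As more factors are multiplied, the sum of logarithms spreads over more orders of magnitude and the gap shrinks toward the sampling noise.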