Objectivism Online Forum

2020 Election Statistical Anomaly?


I'm curious what people think of this... I'd like to see a list of all the states this is happening in, and I'd like to know the probability that it could happen by chance.

I mean, regardless of what you may think of the source... these are actual numbers, and it should be possible to check them... right?


You would need a good statistical model that takes into account parameters like the expectation that mail-in voters would probably lean towards Biden, and how the widespread encouragement of mail-in voting can change the usual demographics. You'd also want some sort of Bayesian model, as opposed to the frequentist models that seem very common in political polling. This is better than asking the probability that this could happen by chance: that way, you could measure the confidence of belief rather than simply asking "could it happen by chance?"

This would have to be done compared to previous years, and there really isn't that much data for comparison (one new data set every 4 years is extremely small). But I do get the sense that political polling consistently samples the wrong people, so I find it hard to use the premise used in that link: "In fact, according to Pew Research, overwhelming shares of voters who are supporting Trump and Biden say they are also supporting the same-party candidate for Senate."

I'm not that familiar with statistics in political science though. 

Edited by Eiuol

The probability of the facts being the way they are due to “chance” is zero. But without a model of what causes voting results, everything is effectively “by chance”. He proposes (based on a suggestion by Pew) that people tend to vote party-line (actually, people tend to claim that they will vote party line), so one would want to investigate that claim (how strong or true is that effect??). The apparent anomaly is that in Michigan, Biden is doing proportionally much better than Trump, compared to the Senatorial race. That’s a miserable standard of comparison: what about Democrat vs. Republican in other statewide races in Michigan? The total number of votes for major-party presidential candidates is 76,224 more than the total number of major-party votes for Senator – how is that possible? Maybe the senatorial vote was boring and fewer voters cared. Or the Democrat senatorial candidate was more annoying; or maybe more people hate Trump. The simple fact is that 76,224 more people in Michigan voted for president than voted for senator. Same thing in Georgia, only nearly twice as big an effect: 132,915 more presidential voters.

The Pew claim is based on surveying potential voters, and is not based on what people actually do (for good reason). It would be interesting to see Pew do some self-validation surveys, where after the election, they ask people “Did you vote for Trump, or for Biden?”. I predict that the survey result will differ from the actual voting percentages, which would allow us to rescale the survey results.

I find the results to be totally n.s.


What the Zero Hedge article is alleging is that, when you group votes like this:

  • Biden with Democratic Senator
  • Biden with Republican Senator
  • Biden with no vote for Senator
  • Trump with Democratic Senator
  • Trump with Republican Senator
  • Trump with no vote for Senator

...there is an unusually high number of "Biden with no vote for Senator" votes, as compared with the "Trump with no vote for Senator" votes, but only in battleground states.

(Of course we can't really group the votes this way, because we only have the total counts for each candidate from each state, so we're looking at differences between the Presidents and the Senators and then allowing that there may be some differences like what showed up in the Pew Research studies.)

In non-battleground states, the number of "Biden with no vote for Senator" votes correlates well with the number of "Trump with no vote for Senator" votes.
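Since only statewide totals are available, the comparison described above comes down to subtraction per state. Here is a minimal sketch in Python. The state names and vote counts are made-up placeholders, not real results, and `top_of_ticket_gap` is a hypothetical helper name.

```python
from typing import NamedTuple


class StateTotals(NamedTuple):
    """Statewide totals for one state (all numbers used below are placeholders)."""
    pres_dem: int
    pres_rep: int
    sen_dem: int
    sen_rep: int


def top_of_ticket_gap(t: StateTotals) -> tuple[int, int]:
    """Return (Dem gap, Rep gap): presidential votes minus same-party Senate votes."""
    return t.pres_dem - t.sen_dem, t.pres_rep - t.sen_rep


# Hypothetical totals, for illustration only.
example = {
    "State A": StateTotals(pres_dem=2_500_000, pres_rep=2_450_000,
                           sen_dem=2_430_000, sen_rep=2_445_000),
    "State B": StateTotals(pres_dem=1_200_000, pres_rep=1_800_000,
                           sen_dem=1_190_000, sen_rep=1_795_000),
}

for name, totals in example.items():
    dem_gap, rep_gap = top_of_ticket_gap(totals)
    # The claim under discussion is about the asymmetry between the two gaps.
    print(f"{name}: Dem gap={dem_gap:,}  Rep gap={rep_gap:,}  asymmetry={dem_gap - rep_gap:,}")
```

Note that a gap computed this way mixes together ballots with no Senate vote and split-ticket ballots; totals alone cannot separate the two.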

If somebody is trying to manufacture Biden votes, it's easier and quicker to create a ballot with just a Biden vote and nothing else, than to create one that "simulates a real voter" and thus also has votes for other Democrats (or even sometimes Republicans) on the ballot. That's why this is suspicious.

It's possible that people really voted that way, but when I say "the probability that this could have happened by chance" what I mean is that, while most people's real votes are not the product of chance, people don't vote for the statistical properties of the whole vote. The behavior of "real voters" should give rise to certain statistical properties, but "fraudulent votes," if they are not done carefully, will not statistically look like real votes and will give rise to different statistical properties which can then be detected. When you're talking about statistical properties, there's always a chance that real votes could also look fraudulent, but this chance decreases as the number of votes increases.

You'd probably have to create a "weirdness score," compute a mean weirdness score for a bunch of elections, and a standard deviation, and then you could calculate how many standard deviations the "weirdness score" for these battleground states differs from the average. If we're talking about one standard deviation, then we could say that this is no big deal, but if it's five or six standard deviations, then it's unlikely to be a fluke, and you can compute exactly how unlikely it is.

(Although technically the distribution is not Gaussian so you might need to use something other than the standard deviation.)
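The "how many standard deviations" step can be sketched as follows. The weirdness scores here are invented placeholders (no such score has been defined in this thread), and `z_score` is a hypothetical helper name.

```python
import statistics


def z_score(value: float, baseline: list[float]) -> float:
    """How many sample standard deviations `value` lies from the baseline mean."""
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)  # sample standard deviation (n - 1 denominator)
    return (value - mean) / stdev


# Hypothetical weirdness scores for a set of comparison elections (not real data).
baseline_scores = [0.8, 1.1, 0.9, 1.3, 1.0, 0.7, 1.2]
observed = 2.5  # hypothetical score for the election being examined

print(f"z = {z_score(observed, baseline_scores):.2f}")
```

Turning the z value into a probability is only valid if the score is roughly Gaussian, which is the caveat in the parenthetical above.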

Edited by necrovore
2 hours ago, necrovore said:

You'd probably have to create a "weirdness score," compute a mean weirdness score for a bunch of elections, and a standard deviation, and then you could calculate how many standard deviations the "weirdness score" for these battleground states differs from the average. If we're talking about one standard deviation, then we could say that this is no big deal, but if it's five or six standard deviations, then it's unlikely to be a fluke, and you can compute exactly how unlikely it is.

This is actually interesting. I'd like to see what sigma this comes out to. 

I'm surprised Fox News isn't all over this yet.

Edited by EC
2 hours ago, necrovore said:

...there is an unusually high number of "Biden with no vote for Senator" votes, as compared with the "Trump with no vote for Senator" votes, but only in battleground states.

You can only conclude that it is an unusually high number after you create a statistical model. Otherwise, you would be begging the question. We want to find out if in fact the number is unusually high. And of course we want to specify exactly what counts as unusually high, not just some arbitrary big number. I don't doubt that you know all this, so I'm just making explicit what it is you left out. 

It's worth pointing out that for me, and my state, it was always a reasonable possibility that I would vote Biden but leave Senator blank. That's not what I ended up doing, but it was completely in the realm of reasonable possibility because of how I feel about Democrats and Republicans. I pick president by different standards.

2 hours ago, necrovore said:

That's why this is suspicious.

As a prior, this is something you could put into a model simulating what a fraudulent election would produce. You'd want to factor in prior assumptions of how likely fraud is, and you'd also need to simulate how likely this form of fraud is being done. You would compare that to a model where these likelihoods are near zero. But I think you're getting ahead of yourself if the reason you're doing the modeling is to validate your thinking. 
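As a rough illustration of how a prior enters such a comparison, here is Bayes' rule applied to a single yes/no observation ("the anomaly was seen"). Every number is an assumption picked for illustration, and `posterior_fraud` is a hypothetical helper name. A real model would be far more detailed than this.

```python
def posterior_fraud(prior: float, p_obs_given_fraud: float, p_obs_given_clean: float) -> float:
    """Bayes' rule: P(fraud | observation)."""
    evidence = prior * p_obs_given_fraud + (1.0 - prior) * p_obs_given_clean
    return prior * p_obs_given_fraud / evidence


# All inputs below are assumptions chosen for illustration only.
for prior in (0.001, 0.01, 0.1):
    post = posterior_fraud(prior, p_obs_given_fraud=0.9, p_obs_given_clean=0.2)
    print(f"prior={prior:<6} posterior={post:.3f}")
```

The point of the sketch is that the conclusion depends heavily on the prior and on how often a clean election would produce the same observation.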

2 hours ago, necrovore said:

When you're talking about statistical properties, there's always a chance that real votes could also look fraudulent, but this chance decreases as the number of votes increases.

What do you mean? Why would the chance that a real vote looks fraudulent go down as the number of votes increases? I am thinking that the chance a real vote looks fraudulent doesn't change one bit as the number of votes increases. If there is a 1% chance that a randomly selected ballot is fraudulent, then it will always be 1% no matter how many votes there are. Okay, maybe more will appear fraudulent, but the number is still 1%.
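One possible reading that reconciles the two positions: the per-ballot rate stays fixed, but the random fluctuation of the aggregate share around that rate shrinks as the count grows. Here is a small sketch, assuming independent ballots and a binomial model (both assumptions, as is the 1% rate):

```python
import math


def share_std_error(p: float, n: int) -> float:
    """Standard error of an observed share when each of n independent ballots has probability p."""
    return math.sqrt(p * (1.0 - p) / n)


p = 0.01  # assumed per-ballot rate; it is the same at every n
for n in (1_000, 100_000, 10_000_000):
    se = share_std_error(p, n)
    print(f"n={n:>10,}  expected share={p:.2%}  typical fluctuation=+/-{se:.4%}")
```

Under these assumptions, a statewide share that sits far from what the model expects becomes harder to attribute to chance as the number of ballots increases.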

By the way, when I say modeling, I mean very complex models, not the sort where you are doing linear modeling (with frequentist assumptions). I'm not sure a weirdness score would work for the model you would need. 

I find this all kind of beside the point for the current election. It's faster and more efficient to simply validate the votes as a precaution. But I think modeling like this is pretty damn interesting anyway.

Edited by Eiuol

I think the statistical guys that have found superusers and bots in online poker in the past could handle this type of data modeling. I'll bring it up on twoplustwo, I think, if they aren't already looking into it.


Theory: A large block of Never-Trump Republicans might support Biden but not down-ballot Dems.

That said, I've found Zero Hedge to be a decent source of information. And I'm against mail-in ballots. So even if the stats are accurate, I would question the legitimacy of the ballots.

On 11/6/2020 at 12:33 PM, Dupin said:

Tucker Carlson may cover this but it looks like Fox News in general has gone over to the dark side.

Yeah, they seem to be actively promoting the false premise that Biden has "won" based simply off of evil Leftist media propaganda now; while in actual reality, this is still an ongoing and contested race that is (and should stay) unconceded.

Edited by EC
On 11/6/2020 at 3:58 PM, MisterSwig said:

Theory: A large block of Never-Trump Republicans might support Biden but not down-ballot Dems.

I think they looked at that, and figured that if it were the case, these Never-Trump Republicans would still have voted for the Republican senators -- and then Trump would be behind the Republican senators, by about the same amount that Biden was ahead of the Democrat senators.

Also, why would these Never-Trump Republicans only appear in battleground states and not in all states? (If they were in all states, then even in places like Alaska where Trump won handily, the Republican senator would have even more votes than Trump, and the Democrat senator would have fewer votes than Biden.)

1 hour ago, EC said:

Yeah, they seem to be actively promoting the false premise that Biden has "won" based simply off of evil Leftist media propaganda now; while in actual reality, this is still an ongoing and contested race that is (and should stay) unconceded.

On the other hand if Trump wins due to the Supreme Court ruling that a lot of races had rampant fraud -- the Democrats will howl that the election was "stolen" from them.

Still, I think Trump should proceed with his challenge (and I hope he follows all the correct steps and makes no mistakes). I think it's better to fight for freedom than to give it up without a fight. But also, I want to know -- I want it to be properly investigated -- whether these allegations of fraud are actually true.

I don't like it that the allegations of fraud are being censored* and swept under the rug.

*(Ayn Rand said censorship can only be done by governments, so by her definition, this isn't actually censorship. However, I think it's suspicious that so many big companies delete content according to the same Leftist standards. It makes me wonder if government is responsible for this behind the scenes after all. Who's setting these standards? It's like the mainstream press, they often say the same thing word for word. Who's choosing these words? It would be very easy for some regulator to threaten endless inconclusive but expensive investigations against any company that steps out of line...)

Edited by necrovore
1 hour ago, necrovore said:

I think they looked at that, and figured that if it were the case, these Never-Trump Republicans would still have voted for the Republican senators -- and then Trump would be behind the Republican senators, by about the same amount that Biden was ahead of the Democrat senators.

Are you asking how to determine a statistical anomaly caused by voter fraud and what that statistical anomaly might look like, or are you assuming you are correct and asking for a statistical model that will prove it? Paragraphs like this make it seem like you are saying in your first post "lol guys look at these dumb dumbs, thinking we can't check these obviously fraudulent numbers". But when DO and I both talk about doing the stats without beginning with any presumption of fraud, you haven't gotten very much into statistical method. Just a lot of vague possibilities, which aren't wrong, but haven't been folded into thinking about how to create models to estimate how confident you should be in those possibilities. Not just a chance of those cases, but an estimate of how confident you should be in the belief itself!

I don't think you're wrong so much as you aren't being careful, and perhaps you're overconfident in your own statistical reasoning abilities. About this:

On 11/5/2020 at 9:42 PM, necrovore said:

You'd probably have to create a "weirdness score," compute a mean weirdness score for a bunch of elections, and a standard deviation, and then you could calculate how many standard deviations the "weirdness score" for these battleground states differs from the average.

Okay, but you just described pretty much every statistical test. The so-called weirdness score here is how much elections deviate from the expectation, usually being the distance from the mean. Every statistical test is some variation of this process. What I'm getting at is that although you are correct, all you did is describe what statistics tries to do. 

You said that technically the distribution is not Gaussian. What do you mean? I'm asking because this is about what you think of the numbers in a descriptive sense, which is a lot to do with how we then reason about the data. So we need to know exactly what you mean here (otherwise it is just idle speculation).

1 hour ago, necrovore said:

But also, I want to know -- I want it to be properly investigated -- whether these allegations of fraud are actually true.

Data and statistical analysis needs to come first. Then you make allegations. Then you investigate. That's the proper way to reason about it.

You don't make allegations, do data and statistical analysis, then investigate. You can't start by saying "the numbers look wrong, we need to investigate!" You have to create a model, then you can say the numbers look wrong. Until then, allegations of fraud are arbitrary claims. "I want to know whether these allegations about chemtrails are actually true!" is the same sort of claim.

Edited by Eiuol

My point in the first post was that this allegation isn't arbitrary; it's specific enough to be either true or false. It's worthy of investigation as opposed to being dismissed out of hand.

It is possible to have a hunch about the numbers, that "something is up." You could say that this hunch is based on an informal model, an expectation that the numbers should look a certain way. It would make sense to develop that informal model into a formal model; with the formal model it becomes possible to capture what the assumptions really are and whether and to what extent they are violated. (However, as the famous quote goes, one cannot proceed from the informal to the formal by formal means...)

I suppose all I'd need to do is define this "weirdness" function but I don't know how to define it. I might even have to try a few candidate "weirdness" functions until I find one that is both simple and well-behaved not only with the real data but also with some really unlikely test scenarios. I don't think this is arbitrary at all. The function has to be simple.

I don't know if I can do the full analysis myself. I almost can. I don't really have time. I don't have all the data (the vote tallies for all 50 states, and besides, these numbers are still changing in some states). But I know it's possible to do. Maybe even with an Excel spreadsheet.

As for the Gaussian thing -- I put that in parentheses because I didn't think it was essential to my argument. The weirdness value of an election is a random variable. As such, it would have a probability distribution. The probability distribution characterizes, in a single function, all expectations about the value of the variable, and how improbable they are. The probabilities must add up to 1. If you have the probability distribution of a random variable, and then you get an actual value, you can compute how improbable that value was. You can say things like, "there's only a 5% chance it would be that high," or whatever.

One of the most common probability distributions is the Gaussian or "normal" distribution. For Gaussian random variables, you can compute the improbability of a value by measuring how many standard deviations it is from the mean. There's a simple function for that, or you can use tables.

But if the variable has a different distribution, then that function does not apply. Suppose your random variable ("weirdness") can only range from -1 to 1. It is then not Gaussian because a Gaussian variable can range from minus infinity to infinity. So you can say that a value of 5 has such and such improbability for a Gaussian variable, because it is so-and-so many standard deviations away from the mean, but if your variable ranges from -1 to 1, then the probability that it has a value of 5 is exactly zero, even though 5 might still only be so-and-so many standard deviations from the mean. So you can't use "standard deviations from the mean" to get the improbability of a non-Gaussian variable. But there are other ways to get the improbability.
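To make the contrast concrete, here is a sketch of three ways to get an "improbability": the Gaussian tail (valid only if the variable really is Gaussian), Chebyshev's inequality (a loose bound that holds for any distribution with finite variance), and the empirical tail of an observed sample. The function names are illustrative choices, not anything proposed in the thread.

```python
from statistics import NormalDist


def gaussian_upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal variable."""
    return 1.0 - NormalDist().cdf(z)


def chebyshev_bound(k: float) -> float:
    """Upper bound on P(|X - mean| >= k * sd), valid for any finite-variance distribution."""
    return min(1.0, 1.0 / (k * k))


def empirical_upper_tail(value: float, sample: list[float]) -> float:
    """Fraction of observed sample values that are at least as large as `value`."""
    return sum(x >= value for x in sample) / len(sample)


for k in (1, 2, 5):
    print(f"{k} sd: Gaussian tail={gaussian_upper_tail(k):.2e}  Chebyshev bound={chebyshev_bound(k):.2f}")
```

The gap between the two columns at 5 standard deviations shows how much of the "exactly how unlikely" answer comes from the Gaussian assumption rather than from the data.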

1 hour ago, necrovore said:

But if the variable has a different distribution, then that function does not apply.

Well, more accurately, I should have said why do you think this distribution probably is not Gaussian? I guess you're right it isn't very important to your argument, but I really am pretty curious about your thoughts on what the model would look something like.


Just to add a few things I meant to get to:

11 hours ago, necrovore said:

My point in the first post was that this allegation isn't arbitrary; it's specific enough to be either true or false. It's worthy of investigation as opposed to being dismissed out of hand.

Arbitrary as in by whim, such that the claim isn't actually about true or false (even if the words spell out a proposition). Whether chemtrails exist and are used to control the US population is a true or false claim. And you can't disprove it by simply pointing at an obvious fact. The problem is they can feel true because of intuition. Saying there might be fraud isn't as egregiously arbitrary as claims about chemtrails, but they seem similar as far as their basis. So when you say things like this:

11 hours ago, necrovore said:

You could say that this hunch is based on an informal model, an expectation that the numbers should look a certain way. It would make sense to develop that informal model into a formal model

it sounds like you are saying that intuition is a valid basis for coming up with hypotheses about the world. Intuition is based on a model of information, an expectation that the world will look a certain way. It doesn't make sense to develop that intuition model into a formal model until we start piecing together the facts, just the straight facts. You can say informal in terms of the rigor of your claim, in the sense that you have a lot of work left to do, but I am essentially asking for what information you are starting with. "The numbers look really big!" doesn't tell me anything. You went on to talk about how down ballot voting was left blank, but didn't say much about why this has to be suspicious. You are reasoning from intuition. So far, the one basis to say that something is "wrong" with the ballots is the Pew research mentioned in your link. That premise is already weak (I said why in my first post).

 

 

58 minutes ago, Eiuol said:

Just to add a few things I meant to get to:

Arbitrary as in by whim, such that the claim isn't actually about true or false (even if the words spell out a proposition). Whether chemtrails exist and are used to control the US population is a true or false claim. And you can't disprove it by simply pointing at an obvious fact. The problem is they can feel true because of intuition. Saying there might be fraud isn't as egregiously arbitrary as claims about chemtrails, but they seem similar as far as their basis. So when you say things like this:

it sounds like you are saying that intuition is a valid basis for coming up with hypotheses about the world. Intuition is based on a model of information, an expectation that the world will look a certain way. It doesn't make sense to develop that intuition model into a formal model until we start piecing together the facts, just the straight facts. You can say informal in terms of the rigor of your claim, in the sense that you have a lot of work left to do, but I am essentially asking for what information you are starting with. "The numbers look really big!" doesn't tell me anything. You went on to talk about how down ballot voting was left blank, but didn't say much about why this has to be suspicious. You are reasoning from intuition. So far, the one basis to say that something is "wrong" with the ballots is the Pew research mentioned in your link. That premise is already weak (I said why in my first post).

The chemtrails thing is different because there's no evidence, and whatever can be asserted without evidence can be dismissed without evidence. By contrast, the Zero Hedge article presents a little evidence for the statistical anomaly in the election results, although I certainly wouldn't regard it as proved just from that. It's a small amount of evidence, but it is a small amount of evidence. That's why an analysis is necessary in the first place.

Intuition and hunches are not automatically correct, but they are not automatically wrong, either. There's nothing wrong with testing them out (by using sense perception and a process of reason) if you think they may be important. If you do not think they are important then you can dismiss them.

I think that almost all scientific discoveries started out as hunches which were then confirmed by a process of reason. I'm sure there were also a lot of hunches which, upon further analysis, turned out not to be correct, and so were forgotten.

I have to emphasize that an intuition or hunch doesn't prove anything. It's the exercise of reason that proves or disproves it. Reason is authoritative.

But, yes, I would say that intuition is a valid basis for coming up with hypotheses. Hypotheses do not prove anything but have to be investigated or tested through a process of reason, and then it is the process of reason, based on the facts, that reveals the truth.

I think it's also possible to have an intuition or hunch which is arbitrary. You will find this out when you try to test it with reason and you don't have enough evidence to connect the hunch to reality one way or another. If you think evidence might become available later, you can set the hunch aside until that evidence arrives, but if not, the hunch should be dismissed.


From the perspective of the Objectivist epistemology, this is where I find problems with the ___ (claim, idea, argument…) in Durden’s post. The main problem is that we don’t have a clearly identified proposition that can be evaluated. Here are some propositions:

  • Many people specifically hated/loved the Republican presidential candidate to the point that they voted only on one issue
  • Many people were bored by other issues so that they failed to vote on those other issues
  • Many people voted for the libertarian presidential candidate, but there was no lower-office libertarian candidate
  • Election officials falsified voting results in favor of the Democratic presidential candidate
  • There was a national underground movement spearheaded by the Democrats to register fake voters, then mail in votes for the Democratic presidential candidate
  • There was a foreign conspiracy involving Russia, China and France to undermine American democracy, and they did this

No proposition should be given a moment of consideration without evidence (OPAR 101). But none of these propositions is arbitrary, because there exists conceptual and observational evidence that makes each of these propositions possible. Still, “possible” especially “just barely so” isn’t good enough. How in the world can statistics ever constitute “evidence” that supports or refutes a claim? All that statistics can say is that the observed facts are “abnormal”. If I roll double sixes two consecutive times, that isn’t abnormal: if I do so 12 times in a row, that is abnormal. We need an empirical base for saying that this is abnormal – let’s not even raise the question of causality, we just care about correlation. Model 1 is that senatorial votes are the dependent variable, presidential votes are the independent variable: there’s also a constant. Model 2 is that senatorial votes are the dependent variable, the independent variables are presidential vote and swing state. Models 3-100 play around with the conceptually-reasonable independent variables (not the color of my shirt, definitely media recommendations; also, gubernatorial results, recent employment figures, stock market figures, number of registered firearms per resident, time from most recent terrorist attack, pandemic…). Some editing may be necessary, in case “swing state” is computed as the product of a few of those variables (that is, “swing state” is just an interaction between two or more variables).
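"Model 1" above (senatorial vote as a function of presidential vote plus a constant) can be sketched with ordinary least squares. The vote shares below are hypothetical placeholders and `fit_line` is an illustrative name. "Model 2" would add a swing-state indicator as a second independent variable, which this sketch does not attempt.

```python
def fit_line(x: list[float], y: list[float]) -> tuple[float, float]:
    """Ordinary least squares for y = intercept + slope * x; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical Democratic vote shares (%) per state: presidential vs. senatorial.
pres = [40.0, 45.0, 50.0, 55.0, 60.0]
sen = [39.0, 44.5, 49.0, 54.5, 59.0]

intercept, slope = fit_line(pres, sen)
# Residual = actual senatorial share minus what the fitted line predicts.
residuals = [s - (intercept + slope * p) for p, s in zip(pres, sen)]
print(f"senate = {intercept:.2f} + {slope:.3f} * president")
print("residuals:", [round(r, 2) for r in residuals])
```

In this framing, an "abnormal" state is one whose residual is large relative to the residuals of the other states, which still says nothing about cause.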

While one can theoretically engage in a fishing expedition to see what model best predicts voting outcomes, the validity of the model depends on the conceptual validity of the variables and the consistency of the computation (a spectacularly-obvious problem with most covid statistical reporting). “Swing state” is a particularly dubious concept – a self-fulfilling prophecy. Washington has been a long-term swing state, with only a slight tendency to vote Democrat more often than Republican. But the pattern is decades-variable though Democrat since the end of the Reagan era. Georgia is solidly Democrat except in the modern era, and Vermont is the opposite. In fact, I can’t find a definitive definition of “swing state”, which is a political meme invented less than 20 years ago.

I found a nice paper that discerns voting tendencies from mortality figures, where suicide is significantly correlated with voting for Trump (2016), also mortality from heart disease; also, Trump-voting correlates positively with motor-vehicular death but correlates negatively with Clinton-voting. Other variables tested are chlamydia, syphilis, MMR or DTaP immunization, rape, assault, property crimes.

IMO, the best model is the “honest answer” model: people will vote the way they say (in advance) that they will. The model has usually worked, but failed spectacularly in the previous election. The most useful statistical study would focus on the correlation between claimed voting preference, and actual results. I noticed that the level of opinion-poll based postulation was dramatically decreased this year, thank heavens. This website has accumulated polling results from various states, and they report a 50.5%-45.6% tendency in favor of Biden, whereas the unofficial final figure is Biden 49.7% vs 49%. Compare Washington state: predicted 59.4% Biden, 35.4% Trump and actual 58.4% vs 38.4%. Just because the honest-answer model is imperfect doesn’t mean that some other crazy theory is better. Instead, I would look at what causal principles underlie divergence from the predictions of the honest-answer model. Oddly, the opinion poll results for Washington closely match actual results, and despicable SurveyMonkey is the main source of Washington survey results.
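The poll-versus-result comparison can be worked through directly from the figures quoted in the post above. The only additions are the helper name `margin_error` and the choice to compare Biden-minus-Trump margins, which is one of several reasonable ways to score a poll.

```python
# Figures quoted above, in percent: (poll Biden, poll Trump, actual Biden, actual Trump).
quoted = {
    "first set of figures (state not named in the post)": (50.5, 45.6, 49.7, 49.0),
    "Washington": (59.4, 35.4, 58.4, 38.4),
}


def margin_error(poll_b: float, poll_t: float, actual_b: float, actual_t: float) -> float:
    """Polled Biden-minus-Trump margin minus the actual margin, in percentage points."""
    return round((poll_b - poll_t) - (actual_b - actual_t), 1)


for label, figures in quoted.items():
    print(f"{label}: polls overstated Biden's margin by {margin_error(*figures)} points")
```

Measured this way, the two sets of figures miss by a similar amount (about four points each), and in both cases most of the miss is on the Trump share rather than the Biden share.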

 

 

 


"Tyler Durden" is a pseudonym on Zero Hedge, and is actually the name of a main character in the movie (and book) Fight Club -- which I thought was a very silly movie, and I haven't read the book, but whatever. Also, on Zero Hedge, "Tyler Durden" typically merely introduces information that comes from someone else.

Quote

How in the world can statistics ever constitute “evidence” that supports or refutes a claim? All that statistics can say is that the observed facts are “abnormal”.

True, and yet, "abnormal" results are important in a close race like this, and I'm interested in knowing just how abnormal they are.

There are allegations of fraud in certain states, and "abnormalities" in the voting results would tend to support such allegations, although it would not be enough to prove the allegations on its own.

Quote

“Swing state” is a particularly dubious concept – a self-fulfilling prophecy.

I'd pick the swing states as input, not output. A "swing state" in general is a state that is nearly 50/50 and could go either way, as opposed to a state like Alaska or California where the outcome of the vote is more predictable. Which states are swing states can vary from one election to another, but it isn't a "prophecy" at all. It's an observation.

I suppose I'd pick out the states that seemed initially to have Trump leading, but then turned blue as the counting progressed. These states would be Michigan, Pennsylvania, Georgia, and North Carolina.

You need a baseline to determine what is "abnormal." For the baseline, I'd select the other 46 states.

It should be pretty easy to tell if the four swing states have abnormally high weirdness as compared to the other 46 states. If it's not easy to tell, then there's probably no fraud.

All these additional variables such as "number of firearms per resident" merely clutter things up and aren't important here. I would not include them in my model.
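One distribution-free way to compare a small flagged group against the rest is an exact permutation test: count how many equally sized groups of states would have a mean score at least as high as the flagged group's. The sketch below uses eight made-up states and scores so that it runs instantly. With 50 states and 4 flagged there are 230,300 groups, which is still feasible. The names and numbers are hypothetical.

```python
from itertools import combinations


def permutation_p_value(scores: dict[str, float], flagged: set[str]) -> float:
    """Share of equally sized state groups whose mean score is at least the flagged group's."""
    k = len(flagged)
    observed = sum(scores[s] for s in flagged) / k
    total = 0
    at_least = 0
    for group in combinations(scores.values(), k):
        total += 1
        # Small tolerance so floating-point rounding cannot exclude the flagged group itself.
        if sum(group) / k >= observed - 1e-12:
            at_least += 1
    return at_least / total


# Hypothetical weirdness scores per state (placeholders, not real data).
toy_scores = {"A": 0.1, "B": 0.2, "C": 0.15, "D": 0.05,
              "E": 0.3, "F": 0.25, "G": 0.9, "H": 0.8}

p_value = permutation_p_value(toy_scores, {"G", "H"})
print(f"p = {p_value:.4f}")
```

A small value only says the flagged group is unusual relative to the others under this score. The test also assumes the group was chosen independently of the scores being compared.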

7 hours ago, necrovore said:

I think that almost all scientific discoveries started out as hunches which were then confirmed by a process of reason.

I'm not sure how you can say that scientific discoveries almost all start out as hunches or intuition. If you really want to get into it, your intuitions are built up from your knowledge base and conscious judgments, just like emotions. The beginning would actually be engaging the facts, or stated another way, the data and facts upon which the intuition rests. Intuitions are a good way to get rapidfire associations and connecting ideas which you don't normally connect, but this is a process of inspiration, not really a reasoning process itself. It would be dangerous, epistemologically speaking, if we consider this a starting point, because then scientific reasoning and logical reasoning pretty much boil down to proving and disproving intuition.

Or more specifically to what David is saying, an intuition cannot provide a clearly identified proposition. Intuitions aren't propositions even, they are feelings. If by intuition you mean the rough initial thoughts after perceiving or looking at information, this is fine, but in any case, that would be a reference to a specific fact. 

 

Link to post
Share on other sites
4 minutes ago, Eiuol said:

The beginning would actually be engaging the facts, or stated another way, the data and facts upon which the intuition rests.

Yes, that's an important clarification. Thanks.

Link to post
Share on other sites

Browsing through some Slashdot comments, I found another statistical analysis, based on Benford's Law, here: https://threadreaderapp.com/thread/1324352213595181059.html

And another one for certain specific counties: https://joannenova.com.au/2020/11/biden-votes-pattern-fails-an-easy-first-test-for-tax-fraud/

Here's a link to the original comment: https://tech.slashdot.org/comments.pl?sid=17587176&cid=60702418

There's also a mentioned Reddit post that basically says that, when you look at the nation as a whole, the votes do seem to follow Benford's law. However, I would think that if there's fraud in only a few key states, it would be harder to spot when you add in all the other states.

Edit: I found another example via a comment on Hacker News (which usually leans left). https://github.com/cjph8914/2020_benfords

Edited by necrovore
Link to post
Share on other sites
Quote

I think that almost all scientific discoveries started out as hunches which were then confirmed by a process of reason.

I believe that that is distinctly not the actual case, although it is the way that certain philosophers of science purport that science works. Actually, they do not start with hunches, they start with prior observational knowledge and causal theories that explain the knowledge, plus some observations that cannot be integrated with the theories. “Hunch” is a slovenly expression referring to a process of reasoning that corresponds with Rand’s “check your premises”: you re-inspect the logic behind the original construction of the theory and discover viable alternatives. Some practitioners may do this at a more subconscious level: the confirmation part requires conscious and explicit argumentation. So the question is, a propos the putative “anomaly”, what are the actual observations, and causal principles, which allow one to say that there is an anomaly? In this case, you are dealing not with observations, but with very high level inferences which we hope are based on the axiomatic.

Before embracing Benford’s Law, you ought to be skeptical about the law, to see if it really tells you anything about voting patterns, and why it does. Let’s take the Wiki statement of the law that “in many naturally occurring collections of numbers, the leading digit is likely to be small”. The first question is whether voting results are “naturally occurring”, or whether they are man-made. The second is, is there a “collection” or is there a single number? Third, what is the deal with “many” versus “most” or “all”? Finally, how do you validate this law? This study argues that it is empirically invalid.
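For reference, the usual quantitative statement of Benford's Law is that leading digit d appears with probability log10(1 + 1/d). The sketch below computes those expected shares and the observed leading-digit shares for any list of positive integers. The function names are my own, and the powers-of-two check is only a standard textbook example of a sequence that does follow the law, not election data.

```python
import math

def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(abs(int(n)))[0])

def leading_digit_shares(numbers):
    """Observed share of each leading digit among the positive numbers given."""
    counts = {d: 0 for d in range(1, 10)}
    for n in numbers:
        if n > 0:
            counts[leading_digit(n)] += 1
    total = sum(counts.values())
    return {d: c / total for d, c in counts.items()}

# Powers of two span many orders of magnitude, which is the kind of
# collection the law is normally expected to describe.
powers_of_two = [2 ** k for k in range(1000)]
observed = leading_digit_shares(powers_of_two)
expected = benford_expected()
```

Comparing `observed` with `expected` for a given data set is the whole test. Whether a mismatch means anything depends on whether the data set is the kind the law describes in the first place.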

Here is an example of my reasoned skepticism about the law: inspection of my monthly paycheck over 35 years. The law says that most of those paychecks should start with a 1, but actually 1 is the least frequent leading digit in that set. Does that prove that I was defrauded? How should one interpret the pictures provided by JoNova? The law doesn’t say that all collections of numbers have to have this distribution, so is there any significance to the three graphs chosen to make the case? Since there are nearly 200,000 voting precincts in the US, what is the probability that one could find three cities with this distribution? That is, is this a random sample, or a cherry-picked sample?

A basic descriptive-statistic fact missing from the discussion is information on precinct-size distribution. US Congressional districts have approximately 700,000 voters – there is a (legal) law that explains that. How big are voting precincts in Illinois, or Pennsylvania? There are 2,069 precincts in Chicago and 1,570,127 registered voters, so about 760 voters per precinct. How many votes should Trump and Biden have gotten, under the predictions of Benford’s law? Using county-wide vote totals (invalid given the in-city vs. out-of-city political differences, but that’s what’s available to me), Trump’s “share” is about 33% and Biden’s is 66%. About 260+ votes for Trump per precinct, and 500 for Biden. There is a natural explanation for the graphs.
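The arithmetic behind that explanation can be sketched in a few lines. This uses only the Chicago figures quoted above and, as a simplifying assumption, treats every registered voter as voting and every precinct as the same size; the function name is my own.

```python
def typical_votes(registered, precincts, share):
    """Rough per-precinct vote count for a candidate with the given share,
    assuming equal-size precincts and full turnout (both simplifications)."""
    return registered / precincts * share

precincts = 2069
registered = 1_570_127

trump_typical = typical_votes(registered, precincts, 0.33)
biden_typical = typical_votes(registered, precincts, 0.66)

# Leading digit of each typical count.
trump_digit = int(str(int(trump_typical))[0])
biden_digit = int(str(int(biden_typical))[0])
```

If precinct totals cluster around a few hundred votes, their leading digits cluster around those two values instead of spreading out across orders of magnitude, so a non-Benford shape is what you would expect from precinct size alone.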

 

Link to post
Share on other sites
2 hours ago, DavidOdden said:

Actually, [scientists] do not start with hunches, they start with prior observational knowledge and causal theories that explain the knowledge, plus some observations that cannot be integrated with the theories.

Eiuol beat you to this observation and I already accepted that correction. Still, though, a factual observation preceding and giving rise to the "hunch" is only a step (albeit an essential one) inserted before all the steps that I described, so it doesn't really render anything I said invalid. You see something, then you have a hunch about it, then you gather more information and submit everything to the bar of reason.

What other alternative would you propose? It seems like you're trying to say that a hunch is only an emotion, and that therefore it's arbitrary by definition, and that therefore it is wrong to even investigate whether the hunch might be correct or not, and the result of such an investigation must also be wrong, and so, if scientists ever followed hunches we'd have to throw out all their results, so scientists can't be said to have ever followed hunches. I don't know if that's your actual argument, but it is a wrong argument, and it might be educational to consider why it is wrong even if it is not actually yours.

I'd rebut it by saying that a process of reason has inputs and it has outputs, and it also has a reason why you are carrying it out, but the reason why you are reasoning about something does not have to be one of the inputs to the process itself (and may even be invalid if used as one of the inputs). In much the same way, you might cook up a recipe because you are hungry, but your hunger is not one of the ingredients in the food. So if you have a hunch about something, it's perfectly valid to use reason in an attempt to figure out whether the hunch is correct or not, even though the hunch in and of itself doesn't qualify as evidence. It is also valid to reason about things just because you are curious and for no other reason (even though curiosity isn't evidence of anything either, and shouldn't be treated as such).

This is not emotionalism, either, because emotionalism consists of elevating emotion above reason, and you can't do that until after you have the result of the reasoning (which you would then elevate the emotion above). It is a contradiction to claim that you can engage in reason before you engage in reason, for the purpose of deciding whether to engage in reason. I guess you'd have to engage in reason before you did even that, and so on back.

(As a slight correction to the paragraph above, I must add: It is also emotionalism to use an emotion as a justification for refusing to engage in reasoning in the first place.)

In OPAR, Peikoff doesn't say that emotions are arbitrary. He says that they are things whose relationship to reality is unknown. Any particular emotion could be arbitrary, but it could also be true or false. He also says that if you have a conflict between reason and emotion, that you should submit both to the bar of reason. But I would add to Peikoff here and say that, if you have an emotion versus nothing, because you haven't reasoned about the issue yet, then you can submit the emotion to the bar of reason alone. You might find out your emotion has a very good reason for existing -- or that it doesn't. And when you do get the result of your reasoning, if you reason correctly, you should have an argument that doesn't depend on the emotion at all, but on evidence and logic. So, through reason, an emotion can be validated or invalidated.

--

As for Benford's Law, the Wikipedia article that I linked to does say that there's some dispute about whether Benford's Law can be applied to election results or not. It also says that the Law can only be applied to certain sets of numbers. (I would say it applies to "random" sets of numbers such as the masses of planets or what-not.) It doesn't apply to sequentially assigned numbers, and it doesn't apply to the same number over and over (such as the amounts of someone's paychecks).

My interpretation of the page from Github is that the first two counties in FL and GA are "controls" and do not show fraud because Trump and Biden are the same, but the last three do show it because Trump and Biden are completely different. Also, the last three are not being "cherry-picked" because they are key counties in the same battleground states that I've mentioned before. It would be an amazing coincidence if these discrepancies popped up in so many random counties that you could "cherry pick" the counties in the same battleground states where fraud has already been alleged.

If these people are lying, it should be pretty easy to prove it.

Edited by necrovore
Link to post
Share on other sites
3 hours ago, necrovore said:

You see something, then you have a hunch about it, then you gather more information and submit everything to the bar of reason.

I don't think you understood my point completely. Yes, we agree that information comes first, and that intuitions are based on information, but I went further to say that intuitions don't tell you anything to use for reasoning about what is yet to be discovered. Sure, they have a lot to do with inspiration, figuring out your research interests, or simply mixing ideas. But they certainly don't tell you about the truth of things. As you said, the fact you are hungry is not an ingredient for cooking. Yeah, hunger is the analogy with intuition, but you seem to miss that cooking is the analogy with reasoning about facts. Intuition is not an ingredient for reasoning about facts. Like hunger, or any emotion really, intuition is an important ingredient of goal-oriented action (purpose and self-esteem), and part of what it means to "do" science, but there isn't anything there that helps you figure out anything to do with propositions.

All that considered, your intuitions on voter fraud don't really count for much. Even if I don't convince you about intuition, I should be able to convince you about using Bayesian methods that make you quantify your belief. Despite all your intuitions, we can neutralize any concern of bias by saying that on the face of it, you have no information for or against this one idea.
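As a minimal sketch of what "quantify your belief" means, here is Bayes' rule for a single yes/no hypothesis. The neutral starting point is a prior of 0.5. The likelihood numbers in the example are placeholders I made up purely to show the mechanics; they are not estimates of anything about the election.

```python
def posterior(prior, p_evidence_if_true, p_evidence_if_false):
    """Bayes' rule for one hypothesis and one piece of evidence.

    prior               -- belief in the hypothesis before seeing the evidence
    p_evidence_if_true  -- probability of the evidence if the hypothesis holds
    p_evidence_if_false -- probability of the evidence if it does not hold
    """
    numerator = prior * p_evidence_if_true
    denominator = numerator + (1 - prior) * p_evidence_if_false
    return numerator / denominator

# Start neutral: no information for or against.
neutral_prior = 0.5

# Placeholder likelihoods, for illustration only.
# Evidence that is equally likely either way leaves the belief unchanged.
unchanged = posterior(neutral_prior, 0.8, 0.8)
# Evidence three times as likely under the hypothesis moves the belief up.
shifted = posterior(neutral_prior, 0.9, 0.3)
```

The useful part is the discipline it imposes: for each observation you have to say how likely it would be with fraud and how likely without it. If those two numbers are close, the observation barely moves the belief, however striking it looks.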

4 hours ago, necrovore said:

do not show fraud because Trump and Biden are the same, but the last three do show it because Trump and Biden are completely different.

I think all the law would say about numbers that don't match up with it is that they are caused by "something" other than mere noise. You could say fraud. You could also say that people really really really hate Trump so much that they violate the usual expectations. You could simply say that people who voted for Jorgensen follow the usual principles of human action, as did those who voted for Trump, but those who voted for Biden did not follow the usual principles. The law is good for taxes, because the "somethings" that can happen are almost all fraud. For elections, fraud is one thing, but there are tons of other possibilities.

If we go back to other elections, I actually wonder if violation of Benford's law is consistent with controversial elections, or for elections that the incumbent lost. Not because they are all fraudulent, but because voting out the incumbent is pretty uncommon anyway.

Even then, I don't see any confidence intervals or standard errors on the GitHub graphs...
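For what such error bars could look like, here is a rough sketch. It treats each leading digit as a binomial proportion and uses the normal approximation for a 95% band around the Benford share. That approximation, the function name, and the choice of z = 1.96 are my assumptions, not anything taken from the linked repository.

```python
import math

def benford_band(n, z=1.96):
    """For each digit 1-9, return (expected share, lower bound, upper bound)
    for a sample of n numbers, using a normal approximation to the binomial."""
    band = {}
    for d in range(1, 10):
        p = math.log10(1 + 1 / d)          # Benford share for digit d
        se = math.sqrt(p * (1 - p) / n)    # standard error of a proportion
        band[d] = (p, max(0.0, p - z * se), min(1.0, p + z * se))
    return band

# A county with only 100 precincts gets wide bands;
# one with 10,000 would get bands a tenth as wide.
small_sample = benford_band(100)
large_sample = benford_band(10_000)
```

With a small number of precincts the band around each digit is wide, so a bar chart that looks "off" by eye may still sit inside it. Without the bands drawn, the graphs can't distinguish a real deviation from sampling noise.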

Link to post
Share on other sites
