Objectivism Online Forum

Math Guy

Everything posted by Math Guy

  1. I agree with these concerns, broadly speaking, but I don't regard them as a conclusive argument against including the data. On the level of analyzing individual rulers, I won't use semi-legendary figures who live 400 years; but given some circumstantial evidence I will accept a dynasty whose precise sequence of rulers is unclear or disputed as nonetheless having governed for X years in total. I see your concern about pre-literate societies but then we have to ask, if they had no records and could not themselves determine who had ruled in the past, does the concept of a "dynasty" even apply? Part of the basic idea of a dynasty is that it establishes the legitimacy of successive rulers by demonstrating their connection to the past. At the point where records do not exist, any argument about the stability of multi-generation rule starts to become moot. I agree the early record may indeed be biased toward longer-lived dynasties, and that short-lived dynasties that did keep written records might have been overlooked. But to use this argument to exclude data items kind of begs the question. We're trying to discover what gives a society stability. If stability goes down as the size and complexity of a social system goes up, for example, then small kingdoms not in the historical record might actually have supported longer-lived dynasties. Obviously, we don't know, and I won't rule data in or out based on speculation. Yes and no. I considered throwing out everything after the start of the Industrial Revolution, for example, but I found it again begs the question. New dynasties formed after 1800 had lifespans averaging only a few decades. Existing dynasties made it through the century intact only to die in the Great War, or World War II. So under the pressure of republicanism, expanding literacy, steam power, capitalism, what have you, the early dynasties still behave differently from the late ones within that span of time. 
That's what we're testing for, so no data gets thrown out. Well, at this point you've thrown out roughly a third of my data, focusing on the extremes on either end, and yet there's still a downward trend. I agree that even after 1000 AD the list isn't complete, but again I'm reluctant to speculate about the statistical character of dynasties we know nothing about, and that may not even have existed. (Just for example, we now know that New Guinea supported a large population over the past several thousand years. There were enormous numbers of people living in the interior of the island that early 20th century explorers simply missed. But when first contacted, these previously unknown peoples didn't have large-scale political systems, or dynastic rule, or even written language. It seems unlikely that they had these things in the past, and lost them.) I agree that apart from the Old and New Worlds prior to 1492, dynasties were never absolutely independent of one another, but I think the idea that the trend can be explained by them competing for territory is kind of vague. I wouldn't say China was seriously competing with Spain at any point in their histories, for example. The Silk Road trade no doubt allowed East to influence West and vice versa, but it hardly seems like a determining factor in regime stability. And anyway, merely being able to come up with an alternative hypothesis for why there is a decline trend doesn't show mine is unsound. There are a bunch of remarks here that I've grouped together: I'm aware of the complexity of the story, and the problems with the quality of the data. However, the whole point of a maximum entropy curve is that it allows estimates to be made of the trend or the range of outcomes without knowing the details of how the system works. The data are sufficiently good to perceive that kind of simple trend (and sadly, as I observed, if kingship lists aren't good enough as data, then almost nothing from ancient history is). 
It's a startling claim, as even Jaynes observed, but it works in physics problems and there is no obvious reason why it would not work here. That is what my research was intended to determine. Toynbee makes a good point of comparison, as does Spengler. I don't think it's fair to say either man "cherry-picked" his data, just as I don't think that that complaint is true of me. If you want to accuse them of something, or me, I think a better charge would be overly broad generalization -- a theory that is excessively inclusive rather than exclusive. They intended to cover all the relevant cases and made a good-faith effort, but their arguments were too simple to explain all the facts. Mine starts with an apologia for why a very, very simple model might actually be sufficient. (And not to be snarky, but I think Toynbee and Spengler both stand up quite well as scholars and historians in comparison with Rand's "Attila and the Witch Doctor" essay.)
  2. First, let me say I'm excited to be talking with a serious historian (or history buff) about this, and thank you for commenting. If my thesis seems naive and presumptuous and annoying, well, perhaps it is all of those things. I started by saying it was preposterous, and I'm not going to flinch if people are skeptical. But let us see. The data for China that I used are (duration in years first, then name):

     628  Three August Ones and the Five Emperors
     470  Xia Dynasty
     554  Shang Dynasty
     275  Western Zhou Dynasty
     514  Eastern Zhou Dynasty
     246  Spring and Autumn Period*
     254  Warring States Period*
      15  Qin Dynasty
     215  Western Han Dynasty
      16  Xin Dynasty
     195  Eastern Han Dynasty
      45  Three Kingdoms
      52  Western Jin Dynasty
     103  Eastern Jin Dynasty
     161  Southern and Northern Dynasties
      37  Sui Dynasty
     289  Tang Dynasty
      53  Five Dynasties and Ten Kingdoms
     167  Northern Song Dynasty
     152  Southern Song Dynasty
     209  Liao Dynasty
     119  Jin Dynasty
      97  Yuan Dynasty
     276  Ming Dynasty
       1  Shun Dynasty
     267  Qing Dynasty
       4  Empire of China

     If I omit everything up to and including Shang, I still get a negative slope of -0.29 for the cumulative curve. (I have a reason for using the cumulative curve, which I will discuss below.) Wikipedia actually breaks down the Eastern Zhou into two partly overlapping sub-dynasties (shown with *), so if I use those instead of treating the Eastern Zhou as one, the slope becomes more shallow, and flattens noticeably at the end, but it's still -0.26. If I throw out the two rump dynasties (Shun in 1644 and Empire of China in 1912-16), there's a slight rise at the end which spoils the monotone shape of the curve, but it still doesn't get me back to the average of the first two accepted dynasties in the set. And at that point I would have thrown out 5 of my 22 data points. To get a positive curve I have to start with the Qin, which basically throws out everything in Chinese history prior to 221 BC. It would not take much for your knowledge of Chinese history to vastly exceed mine. 
But it seemed from my study of the Mandate of Heaven that the Shang Dynasty had to have some real historical foundation, since the theory of the Mandate was used by the Zhou to justify their overthrow of the Shang. Plus there seem to be ample primary sources cited for the Shang -- the "Bamboo Annals," inscriptions on bronze artifacts, archeological digs of palaces and tombs. Wikipedia lists several different sets of dates that have been proposed for the length of the Shang, but none put it any shorter than 500 years. Again, I'm not averse to doing the analysis independently of Wikipedia or some neutral source, assembling a custom data set. But coming from someone who is not a professional historian, I think the cries of "cherry picking" if I did it that way would be all the louder. First, about correlation coefficients, and confidence levels: In early drafts of my book, I actually computed all these, not just for the historical dynastic data but for all the various curves I presented. I took all that information out after about six months because 1) citing them all wreaked havoc with the flow of the text; 2) the coefficients and confidence intervals were typically good for large data sets but not as good for small ones; and 3) my particular method of presenting the data tended to force high correlation values in a way that a statistician would see as contrived. Out of these, 2) was probably the strongest argument. In effect, the specific figures didn't actually convey much in a given case, beyond whether I was working with a lot of data points, or a few. I could write a book that would have stupendously high correlation and confidence scores for every graph, simply by dropping 70-80 percent of my examples and sticking with large data sets. But it wouldn't convey the idea I'm working on. If I wanted to write about the application of the principle to a very broad range of items, then I would have to accept the data sets as they came. 
For example, if I have a dynasty that consists of 8 rulers, I can play with different ways of calculating the likelihood of the last 4 being less long-lived than the first 4 -- but all the methods I might use give the same weak confidence level, because N=8 just can't do much for you. I'm not expressing scorn for academic convention. It has its place. But this isn't a Popperian argument where I'm trying to falsify some proposition by showing the data lie outside an arbitrary confidence interval. Since the theme of the book is to illustrate the power of Jaynes' way of looking at probability, and since I'm arguing that these phenomena are invariably nonlinear, nonstationary, and don't obey the central limit theorem, it would be redundant to continually test, for each and every graph, the alternative thesis that they do obey the CLT and that the pattern is just random happenstance. Now, in regard to your first question, I use rank number versus cumulative years to get average duration, and plot the decline in average duration. Why do the calculation based on the cumulative curve? Because the theory says that the pattern depends on increasing set size, or rank. We can certainly do it other ways and still get a decline -- as you did when you fitted a linear estimate to my table and got the downward trend of about 1.5 years lost per additional dynasty. But if you think the cause is entropy, then you want above all to see what the pattern looks like in a cumulative plot. That allows me to compare apples to apples when I look at disparate data types. However, that also creates a strong auto-correlation effect. The scores for a cumulative plot are always good if the slope is even slightly downward. So if I were to cite correlation scores for the graphs the reader actually sees, it would be meaningless and misleading. I hope that's clear. I'll deal with the rest of your points tomorrow, when I have more time.
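The rank-versus-cumulative-years calculation described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the book. It assumes the Chinese durations listed earlier in this post, with the Eastern Zhou kept as a single entry and the two starred sub-periods left out, and the function name `running_average` is made up for the example.

```python
from itertools import accumulate

# Durations in years, in chronological order, copied from the list in this post.
# Assumption: Eastern Zhou is one entry; the two starred sub-periods are omitted.
DURATIONS = [
    628, 470, 554, 275, 514, 15, 215, 16, 195, 45,
    52, 103, 161, 37, 289, 53, 167, 152, 209, 119,
    97, 276, 1, 267, 4,
]


def running_average(durations):
    """Return the average duration of the first k entries, for k = 1..N."""
    totals = accumulate(durations)  # cumulative years after each rank
    return [total / rank for rank, total in enumerate(totals, start=1)]


if __name__ == "__main__":
    # Print rank alongside the average duration up to that rank.
    for rank, avg in enumerate(running_average(DURATIONS), start=1):
        print(f"{rank:2d}  {avg:7.1f}")
```

Each value in the output shares all of its earlier data with its neighbours, which is the source of the auto-correlation effect mentioned above: the cumulative series is smooth by construction, so a correlation score computed on it says little.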
  3. You generally have to be careful to compare within a system rather than between systems, to look at many races taking place together at one time, rather than just one. So it works well if you compare small Canadian Parliamentary ridings (electoral districts) to large ones, or Senate races in small U.S. states versus large ones, or small districts in the City of London to large ones, for the same series of elections. It would not work, or at any rate the law would be much less apparent, if we created a grab-bag of different voter populations -- like mixing Greek, Ukrainian, and Iraqi election results in one data set. The law isn't that universal. It also doesn't work quite as well for presidential elections, where there is only one nationwide ballot. For more on this see below. Yeah, you got me on that one. I was being a little sloppy in referring to the 50.5-49.5 split, and didn't spell out what I meant. First, there is a very noticeable difference between small states and large ones. Hawaii or Alaska will generally deliver bigger majorities than California, when the vote is held for governor or Senate seats. But there's also a trick that distinguishes the American system from most others. Most systems are genuinely multiparty, and that means it is possible to win with no more than a plurality. Multiparty systems thus tend to have a steeper curve. The strength of the consensus fades more noticeably as we go from 2,500 to 250,000 voters. The U.S. system so effectively excludes serious third parties that the winner nearly always has a genuine majority. The failure to achieve consensus is much less noticeable if there are two and only two choices, not 3 or 5 or what have you. So the majorities are noticeably stronger in U.S. elections than in Canadian or European ones, and the curve isn't as steep.
  4. Yes. If you mean, should the very first managers of a team tend to last longer in the job, and later managers cycle through more frequently, then yes, that is what I would expect, though I haven't studied that case. It's true, for example, of comic books. A long-lived comic book will tend to keep its original creative team for many years, a decade or more. But eventually, when the founders leave, their replacements are on average less loyal, or less popular with the readers, or viewed as mere temp workers by management. In any case, they tend to have shorter tenures. This same principle applies to customer loyalty in a lot of industries. Your first customers are your most enthusiastic and committed. The 100th or the 10,000th shows up later and leaves much sooner. I call this "loyalty fatigue" and it has a big impact on businesses dependent on charging annual support fees, or "upselling" established customers. I've found a few counter-examples so far. It isn't true of mayors, at least not in the major American cities dating back to Revolutionary times. But the reason appears to be a massive change in the job description. Early mayors were unpaid, and were not elected. They were chosen by a committee and assigned the job. It was not unknown for even prominent citizens to move away rather than accept the honor. So comparing the tenure of modern mayors to 18th-century Boston is perhaps inappropriate. It's apples versus oranges.
  5. Hmm, these are excellent questions. It may take me several posts to deal with all the implications, so bear with me. To start by getting terminology right, you identify "random" events as always being independent. I agree that that is the conventional way of looking at the problem, the classical perspective laid down by Newton and Pascal in the 17th century. But that is not how Jaynes approached it. We actually go back to a more basic level of argument in Jaynes' system. Random behavior, Jaynes would say, simply means the behavior of a system with a range of possible states about which one has limited knowledge. If I must guess which state the system will be in at a given moment, then by definition the system is varying randomly. Randomness is a relationship between the observer and the observed. The system itself "knows" what it will do, or is doing. It is determinate. It is my knowledge that is indeterminate. I have to specify a range of possibilities because I don't know which one will occur. You may recognize that this is a Bayesian approach to probability. However, Jaynes then added his own innovation. He viewed probability as a branch of logic. He introduced the principle of maximum entropy as a kind of virtually all-purpose limiting case -- a logical axiom, if you like. The principle of maximum entropy is a supremely powerful principle because it is nearly always going to be applicable. Systems that are not at maximum entropy are the exception. Systems that are at maximum entropy are the rule. In everyday problems like economics or history, the principle of maximum entropy is even more useful than, say, conservation laws. There's no guarantee that the number of people in an economy will remain constant, or that the quantity of money will be conserved -- but there is every reason to think that disorder will always be at a maximum, no matter what sort of quantities we are measuring, whether or not any of them are conserved. 
One consequence of this approach is that the idea of independence recedes in importance. We don't necessarily need to know if the elements of the system are independent, or dependent, and we don't build up a detailed model of the classical kind of what the system might be doing. We take a different route entirely. We know that dependent or independent, they are thoroughly disordered -- and the kind of disorder is itself meaningful to us. Jaynes didn't use the Objectivist terminology, but he certainly would have understood it if I said that randomness is an epistemological concept, not a metaphysical one. The system does what it does, and we do not know the details: instead of trying to model the details we apply a few broad logical rules about what it has to be doing and make estimates based on those. It is possible to make a strong prediction about the behavior of very complex systems - that is, to specify a distribution curve that the observable system variables will obey -- using a very limited number of parameters, provided that the system does obey the principle of maximum entropy. The details of the system then remain unknown. One does not model the mechanics. One would instead compute what range of outcomes (what curve) would maximize the entropy -- that is, maximize the uncertainty of individual outcomes. This usually turns out to be a power law. Jaynes limited himself to demonstrating this in chemistry and physics but he acknowledged, in a number of papers, that it could be applied to economics and to other phenomena as well. In the past fifty years, many people have since done so. The results are elegant and powerful, and sidestep entirely the dependent/independent problem. So for example, if we sat down to estimate how elite incomes (the top 10 percent say) would be divided up in a society, we would immediately become bogged down in an insane number of details. What laws apply? What are the tax rates? What are the cultural imperatives? 
What sort of wealth does the society hold? Is it agrarian, industrial, post-industrial? Most people would shrug and say they have no idea how incomes are distributed. It would seem like an impossible task even to start building the model. However, Vilfredo Pareto discovered more than a century ago that ALL societies, from the medieval era to the present, obey the same simple rule for the distribution of elite incomes. They always follow a power law. The arrangement is a geometric function: it might be 100,000 people with incomes of $50,000 to $99,000, and then 40,000 with incomes of $100,000 to $199,000, and then 16,000 with incomes from $200,000 to $399,000. You see the rule? For every doubling in income, in this country there are just 0.4 times as many people who earn the higher amount. In another country the fraction might be 0.35 or 0.27 or some other number. However, plotted on a log-log graph, these cohorts or bins closely approach a straight line -- always. The only variation between countries is the slope of the line. Now, I contend, backed up by Jaynes, that the reason that Pareto's income curve works so neatly, and gives us this elegant power law, is because of the logical constraint of entropy. The set of all elite income-earners is large enough, the economy complex enough, and enough time has gone by, that you can ignore the details of the society and just assume that income uncertainty will be maximized according to the principle of maximum entropy. The range of incomes is not just very wide -- it is as wide as it can be, given the circumstances. It is maximally random. This makes it possible to estimate the distribution of incomes using a very simple curve despite the daunting social complexities involved. That's a very, very compact summary of a complex idea requiring several chapters in my book. It's a very different approach to probability, and it requires serious thought before one can apply it confidently. 
When I say the cause of Pareto's elite incomes curve is a mathematical meta-cause, I mean this: Despite the economy's incredible complexity, and all the myriad possibilities of free will, the end result is always this very smooth geometric distribution. There are all sorts of things we don't know about the economy, but there is one thing we are confident we do know: its measurable variables must obey the principle of maximum entropy. So to answer your concluding question: Yes, I'm curve-fitting, but with a very powerful logical principle that says this shape of curve is privileged epistemologically.
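The doubling rule in the income example can be checked numerically. The sketch below is illustrative only: it uses the three made-up cohorts from the example above (100,000, 40,000 and 16,000 people), takes the lower edge of each income band as its x-value (an assumption made for the example), and the helper name `loglog_slope` is invented here. On a log-log plot, a constant ratio of 0.4 per doubling of income corresponds to a straight line with slope log(0.4)/log(2), roughly -1.32.

```python
import math

# (lower edge of income band, number of people), from the example above.
BINS = [(50_000, 100_000), (100_000, 40_000), (200_000, 16_000)]


def loglog_slope(bins):
    """Least-squares slope of log(count) against log(income)."""
    xs = [math.log(income) for income, _ in bins]
    ys = [math.log(count) for _, count in bins]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


if __name__ == "__main__":
    # Fitted slope for the three cohorts.
    print(round(loglog_slope(BINS), 3))
    # Slope implied directly by "0.4 times as many people per doubling".
    print(round(math.log(0.4) / math.log(2), 3))
```

With a different country's fraction (0.35 or 0.27, say) only the slope changes; the points still fall on a straight line in log-log coordinates as long as the ratio per doubling is constant.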
  6. Ultimately I think the challenge to free will posed by curves like this can be met. I am not arguing that free will doesn't exist. But I think these curves do complicate the picture. The issues are subtle and far-reaching. For governments to move systematically from lasting 600 years to lasting 8 years is good news overall for the cause of individual freedom. But if the erosion of regime stability is truly inexorable, if it follows a predictable mathematical rule, then the question arises: Where do philosophical ideas enter into the story? The decline in regime stability was evident several thousand years before Aristotle appeared on the scene. It continued in similar fashion after Aristotle, and after Augustine, and Aquinas, and Locke, and Kant. I cannot discern any acceleration or change in the pattern. The shape of the decline curve is the same throughout history. This contradicts (or at least appears to contradict) the Objectivist premise that philosophical ideas move history. Ideas might well move other aspects of history, but not regime stability. These curves appear to transcend (there's that word again) philosophy, religion, language, any influence other than numbers. I think the correct way to view these curves is as a manifestation of steadily increasing disorder. Early societies were extraordinarily uniform and predictable, at least as compared with the way we live now. Very little changed in any given year. There were no new words in the language, no new ideas, no economic improvements, and no political innovations. Thus regimes were very stable. However, each new nation-state that was founded necessarily had its own language and customs, its own economic foundation, and its own unique vulnerabilities. Innovation wasn't impossible, merely uncommon. The pace of change gradually accelerated, the nature of politics changed -- but the amount of change at any given point was governed by statistical rules unrelated to the content of the ideas. 
Knowing what philosophical ideas were in play in a given society is necessary to predicting its future course, but ideas alone are not sufficient to explain what happens. There is, so to speak, a speed limit for change. Societies will only absorb so much change at any one time -- not more, and not less either. Because these curves are so common (I have hundreds more to talk about), I think it is necessary to modify the Objectivist premise regarding the role of ideas. Ideas do drive history and social life, but only in a direction allowed by statistical laws, and only at the pace allowed by the size and complexity of social systems at that point.
  7. Well, yes, that's certainly true in 2009. But the whole list of dynasties covers every major power throughout history, since around 3500 BC. Apart from a few very brief experiments in democracy, hereditary monarchy was the standard form of government everywhere, right up to the 20th century. So a decline in stability in hereditary monarchy might not be big news today . . . it affects places like Nepal or Saudi Arabia, and not much else. But it has huge implications for our study of history.
  8. My copy of OPAR is in a box somewhere and I haven't read it in a while. But I infer this has to be a reference to Peikoff's doctrine of "the arbitrary". Right? So I did a little searching on the Web under "arbitrary," just to make sure of the argument, and here's what Peikoff says: An arbitrary claim is one for which there is no evidence, either perceptual or conceptual. It is a brazen assertion, based neither on direct observation nor on any attempted logical inference therefrom. For example, a man tells you that the soul survives the death of the body; or that your fate will be determined by your birth on the cusp of Capricorn and Aquarius; or that he has a sixth sense which surpasses your five; or that a convention of gremlins is studying Hegel’s Logic on the planet Venus. If you ask him “Why?” he offers no argument. “I can’t prove any of these statements,” he admits—“but you can’t disprove them either.” And what, according to Peikoff, is one to do with such a claim? In the absence of evidence, there is no way to consider any idea, on any subject. There is no way to reach a cognitive verdict, favorable or otherwise, about a statement to which logic, knowledge, and reality are irrelevant. There is nothing the mind can do to or with such a phenomenon except sweep it aside. Okay, assuming for the moment that this is what you are referring to, let's next clarify what claim(s) you regard as arbitrary. I have made two distinct claims. One could perhaps break them down further but there are at least two separate ideas here: 1) If you consult Wikipedia's list of dynasties, and do some quick sums on the dates for each one, you will find that later hereditary rulers in a given dynasty on average had shorter terms than earlier ones. You will also find that the lengths of the dynasties themselves form a similar pattern. 
2) The reason the terms of later rulers or the spans of later dynasties are shorter is because long series of hereditary rulers (and by extension, long series of party control in any system that transfers executive power) are governed by a power law, in conformity with the principle of maximum entropy. I am not at all surprised if anyone is skeptical about claim #2. I haven't come anywhere close to establishing it in a handful of posts. For a reader of this forum who does not have access to my book, it is merely an interesting hypothesis at this point. Not an arbitrary hypothesis in Peikoff's sense, let me stress, but not proven either. I am, however, surprised that you consign claim #1 to the status of "the arbitrary". Reading Peikoff's explanation of what that term means, I see no resemblance between it and what I am saying. It not only isn't arbitrary; I also have trouble even treating it as open to question. It is an observable fact. Responding to this has taken me some time. First, to understand what you were saying here I had to Google "taxonomic artifact". In 30 years of busy intellectual life I have never heard anyone (Objectivist or otherwise) use that term, and I did not think merely breaking it down into dictionary definitions of "taxonomic" and "artifact" was going to be sufficient. Google gave me 175 matches in total -- not a lot -- and I reviewed every single one. Of these, 94 were footnotes or endnotes referring to a 1987 essay in Nature entitled "Is the Periodicity of Extinctions a Taxonomic Artifact?" The overwhelming majority of the other 81 were references to paleontology or evolutionary biology, since 1987. In one case, a liberal blogger referred to the Republicans calling themselves "the party of morality" as a "taxonomic artifact". In one other case, Noam Chomsky made an analogy between the biological sense of the term and an obscure distinction between certain passive and active verbs. 
A book on museums also used the term as a sarcastic metaphor, referring to the Crystal Palace exhibition of 1851 as condensing the whole of human endeavor into one glittering and senseless category, roughly speaking, "things ruled over by Queen Victoria." That's all I could find. It is not a term in wide use even among paleontologists. So then I spent some time sorting out what sort of thing a "taxonomic artifact" actually is. It turns out that among paleontologists it has several meanings. It has been used to refer to a classification scheme that treats unrelated species as belonging to the same family. That seems straightforward, except that I have also found authors using it in the reverse sense -- a classification scheme that takes minor differences in the same species as constituting evidence of multiple species. Thus it does not seem to mean much more than "classification error," and might be dismissed as not a particularly useful term even in paleontology. However, the word "artifact" does have some significance here. It rescues the term from being redundant. Today's paleontologists quite often use statistical methods to assign classifications. They don't necessarily study the fossils by eye and form a reasoned judgment about them, finding such techniques both subjective and cumbersome. Instead they crunch numbers and let various scoring schemes determine what is a species and what isn't. So a "taxonomic artifact" arises specifically when you incorrectly classify fossils as a result of using a poorly chosen statistical procedure. Whew. Okay. So now I have some sense of this term's literal meaning, but I remain unable to apply it (even metaphorically, as I must assume you meant) to my own work. I didn't assign rulers to dynasties on the basis of a statistical procedure, and neither did the various scholarly authorities that Wikipedia references. The basic data don't rely on any kind of statistical reasoning. 
They are the most elementary sort of observation, e.g. in 1483 Richard III became protector on behalf of the 12-year-old son of his brother Edward IV. You would have grounds to worry about specific dynasties being artifacts if my calculations had any impact on how dynasties are identified and classified, but my calculations all come well after the fact, and nobody else involved in the process ever made any calculations. The metaphor of "taxonomic artifact" just doesn't apply here. It is a kind of error that cannot occur. I will deal with some of your other objections next. I'm not saying "dynasty" is primary and "nation" secondary. I'm saying "ruler" is primary in relation to "dynasty," since a dynasty is a collective noun relating to a number of rulers. A series of dynasties clearly is a derivative concept which requires both the idea of "ruler" and the idea of "dynasty" to make sense. It is still more derivative, hence my reference to it belonging on "the next level up". No epistemological error here. Yes, certainly the facts are primary. I'm not suggesting arguing from numbers unconnected to facts. But if I have 379 examples of a phenomenon to support one claim, and only 25 to support another, then all else being equal my claim with 379 examples is the stronger one. More measurements lead to a more precise abstraction about how the measurements are related. The larger the number of examples, the more convincing our estimate of, for example, the slope of a curve. I couldn't parse this sentence. You can certainly suggest countries at random for discussion, from the Wikipedia list. Here is the link: http://en.wikipedia.org/wiki/Dynasty I think saying I have abdicated responsibility for the relationship of my data to reality overstates your case by a long way. 
I was simply saying that I don't speak Portuguese, don't have access to the National Museum where the original source documents regarding the Portuguese monarchy are stored, and have not studied, for example, the long succession of treaties, surveys, tax censuses, and other documents that scholars use to define the ever-shifting borders of Portugal over the past 1,100 years. Much less do I speak Chinese or Russian or Swahili, or have access to the relevant national archives for their dynasties. If I must be responsible for proving all that firsthand, the project is impossible and secure knowledge of history is impossible. If there were time, or if I could find suitable helpers, I would do a more in-depth investigation of the sources underlying the Wikipedia list, and build a longer and more authoritative one. In fact I did do a search of many hours for kingship lists not on the Wikipedia page, and found several (such as Assyria). However, including or excluding these items did not alter my conclusions, and the items I found omitted tended to be disputed by scholars, or incomplete, much more often than the items Wikipedia accepted. So my effort to criticize the Wikipedia list for the most part reinforced its credibility. There is no absolutely secure position to take here. If I build a custom list of my own selections, I am open to accusations of cherry-picking my data. If I rely on Wikipedia, I accept responsibility for whatever flaws exist in their list. To maximize my credibility, I have chosen to cite Wikipedia but to do some independent work validating it. Yes. But this criticizes at least two distinct ideas while treating them as one. First there is the idea that later rulers have shorter terms. Never mind whether they obey a power law. Their terms are consistently shorter, in a very striking way. The question, "What fact of reality causes this distribution?" applies to that observation, that the later rulers stay in power for a shorter time. 
The power law is the start of an answer to that question. I maintain that the definition of "dynasty" behind the Wikipedia list is objective, that the data are reliable, and that the fact of a decline is firmly established. That is the correct starting point of the discussion: There is a decline, now how do we account for it? Historically, the idea of entropy was never strictly or solely about thermodynamics. That was its first and most spectacularly successful application, but it has many others. It began as a theory about enumerating the total number of possible states in a system. At that time (the late 19th century), the existence of atoms remained speculative. Details of their behavior were unavailable. So Boltzmann's arguments about entropy were on a very broad level. They referred, in effect, to the number of distinct moving parts (which might be atoms or molecules or something else altogether) and the number of states those parts could take on. Since 1948, science has recognized the distinct field of information entropy, introduced by Claude Shannon. It has nothing to do with thermodynamics, but it very definitely refers to the number of moving parts or possible states in a system. Since 1957, science has recognized the principle of maximum entropy introduced by Edwin Jaynes. Again, it has nothing specifically to do with heat loss or thermodynamics. There are textbooks applying it, for example, to economics. I cited one earlier. In writing about the principle of maximum entropy to an audience of lay readers, or even for scientists, it is best to be careful and start by assuming they know nothing about any other kind of entropy beyond the version governing thermodynamics. People are often surprised to learn that the principle even exists. But I thought that I already made that introduction. 
An appropriate challenge at this stage of the discussion might be that you haven't read or heard of Jaynes, and find it difficult to believe that the same mathematical rules can be applied to people (or viruses, or falling bombs, or other entities) that apply to atoms. That is legitimate and reasonable, and moreover I consider it my burden in the discussion to explain how that can work. Happy to do it, in fact. But I can't keep going back to the very beginning as if I hadn't spoken at all.
  9. That is a good way of putting it. I have a number of other stability measures that work the same way. For example, if you study election returns district by district, you will find a very regular power-law dependence of the result on the number of voters. Small districts tend to produce much more decisive majorities for the winner. Large districts produce weak majorities, or pluralities in a multi-party system. Once again the pattern appears to depend solely on numbers, not on culture or language or economics. In effect, when you try to achieve consensus on a particular candidate, information entropy (uncertainty) increases with the number of voters. A group of 2,500 voters will give the winner 75 percent of their votes, and distribute the remaining 25 percent among the other candidates. It can keep that kind of consensus up for election after election. A group of 25,000 voters will almost never do that well. They might manage a 60-40 split on a regular basis. A group of 250,000 voters will be lucky to manage 55-45. And so on. You can see the implications almost immediately. Large states or districts can't form strong consensus in favor of any one leader, and so they are more unstable. Small states or districts can stay with a policy or platform for several cycles with far less difficulty. The perpetual 50.5-49.5 split between presidential candidates in the modern U.S. system is not, at root, a problem with ideology or culture. It is not the fault of the Republicans and Democrats that America is perpetually locked in red-versus-blue warfare. It's a consequence of the principle of maximum entropy.
  10. DavidOdden put the argument slightly differently, but I think you are both concerned about the same thing, namely the legitimacy of my calling the principle of maximum entropy either a cause, or a law -- and a "transcendent" cause or law at that! Okay. First, can we agree that science does speak in terms of "laws" of entropy, that is, the laws of thermodynamics? In particular, the second law of thermodynamics says that for a closed system, the amount of entropy must either stay the same or increase. That is a law that is as firmly established as anything in science. I can cite prominent physicists saying they would doubt almost anything before they would doubt the truth of the second law. Now, that doesn't mean that all scientists would accept this application of the principle of maximum entropy as legitimate. If they are not familiar with the work of Jaynes, they might be quite surprised to see the idea extended to cover something other than heat energy and temperature. But from a philosophical point of view, my use of the principle doesn't raise any problems that weren't already implicit in the classic, universally accepted second law. I'm using the same mathematical expressions, I'm just counting different entities with them. What I'm saying is, the second law as it is used every day in physics now is a "transcendent" law or cause in the same way as the decline effect I am talking about. It is no less of a mathematical abstraction. If one is nonsense, so is the other. If one cannot qualify as a law, neither can the other. Of course, one occasionally sees harsh criticism from Objectivists of concepts from mainstream science as being incoherent or philosophically unsound. I've cited Rand myself as saying just that -- that science is riddled with theories that are "floating," that don't have proper referents. I believe Harry Binswanger is particularly noted for having carried on this line of attack since Rand's death. 
So possibly it is already part of the Objectivist canon that the second law of thermodynamics is flawed conceptually, and I just failed to see that particular claim in print. The short version is: Plasmatic, DavidOdden, I'm not rejecting your arguments as such. I just want to establish, before we get too far along, what the scope of your respective claims really is. So far they both seem to me to be very sweeping. You are indicting a huge range of established science along with my argument. I think before we can get any further we need to confirm that that is what you meant.
  11. Well, to take your second objection first, I certainly understand the need not to go picking data to fit one's thesis, but I really don't think I've done that. The primary level of the phenomenon, the level where it is most evident, is within any one particular dynasty. That is where examples exist in greatest number, and there is the least ambiguity about what we mean. I would say the idea of a dynasty is one of the oldest, and most firmly established, in historical thought. Indeed, one can make a good argument that the origin of history as a subject of study is profoundly intertwined with kingship lists. Dynasties were one of the first, if not the very first, objects of analysis by historians. There are hundreds of dynasties and the downward trend is quite unambiguous. Your point applies to the next level up, where we string together successive dynasties ruling the same country. There is admittedly some danger at that next level of putting together unrelated items in an arbitrary fashion, of picking one dynasty as being the 'first' to rule the country and forcing the data to fit the hypothesis. Plus there just aren't as many cases to work with, which given that this is a statistical argument is a handicap. My rule with regard to data is that I don't get to pick and choose. I identify a source that is widely recognized and as far as possible, impartial, and then I take all the cases presented by it. In the case of dynasties, I actually started out years ago by using encyclopedias and assembling the data manually. But not long ago Wikipedia assembled an article on dynasties that listed large numbers of them in convenient form. Since Wikipedia itself relies on sources like encyclopedias, and since it is far, far more convenient for readers to go there than to locate a printed Britannica, I now use Wikipedia as my source. 
So I can concede that there is some uncertainty about the precise origin and boundaries of the entity "Portugal," and whether Portugal was an hereditary monarchy throughout the years I've cited. But I'm not manipulating the data. It's up to other people, more expert than me, whether to list a given dynasty as ruling "Portugal". Your first point relates to something Plasmatic said, so I'll address it below, along with his point.
  12. Now, let's review what my argument regarding epidemics requires. It's really simple. All we need do is imagine that we were subtly wrong about the way that infectious organisms are distributed among many hosts. Historically we assumed that propagation from one host to another was strictly independent, that the mean amount of infectious material transmitted was stationary over time. It was X number of germs on the first occasion of transmission, and on the 100th, and on the 10,000th, and so on. We couldn't prove this notion, however, without actually counting infectious organisms for each and every patient. We couldn't do that in the 19th century, or the 20th century. We're barely able to do it now, in a few special cases. It was an inference, a plausible hypothesis, that was very, very convenient because it simplified the math. It was not an empirically observed fact. If the average amount were to vary, especially if it were to drop with increasing epidemic size, then all bets would be off. The standard equations wouldn't work. An increasing proportion of patients would get a sub-clinical dose of organisms, and their immune system would defeat the invaders, and they would not even become symptomatic, much less capable of infecting other people. A decreasing proportion of people would get a sufficiently high dose of organisms to wind up dying of them. So transmission would effectively fall, and so would mortality. Assume this one thing, and the rest of the scientific framework remains intact. The dose-mortality relationship is well established. It's the foundation of vaccination, for example. All we need is proof that over time, the distribution changes in this nonlinear, nonstationary way. We have one disease for which individual counts of organisms have actually been made, for essentially every patient -- AIDS. The dose-mortality relationship in AIDS has been documented. 
If your very first visit to the doctor shows you with very high "viral load," you will likely die in a matter of months. If it shows a sufficiently low "viral load," then technically you have HIV disease, but you will likely never develop full-blown AIDS. The range of possible initial doses has been demonstrated to vary by many orders of magnitude. One patient can easily have 1,000 times the viral load of another. The distribution is highly skewed, with a lot of people having small viral loads, and a few having high. All of this is totally consistent with my model. The curves look like curves for totally unrelated things, like the distribution of casualties from bombing, or the distribution of hits on webpages. The only question we cannot answer with certainty is what the distribution was 20 years ago, at the start of the epidemic. Viral load testing was unavailable then. If there was a higher proportion of people with high viral load 20 years ago, then the mortality rate would have been higher, and transmission rates would have been higher as well. And here we come to one of the great ironies of AIDS, because we know that they were. They have come down on their own, to a large extent, well before the drug cocktails were invented. This will sound odd simply because I'm saying it, and you didn't hear it from a CDC medical spokesperson. But (1) the official government numbers clearly show it, and (2) if the older folks among you think back, you will clearly remember it. At the start of the AIDS epidemic, mortality was frighteningly high. Not only did everyone who had AIDS die, they died in weeks or months. The number of people with the disease was doubling every few months. There were projections (by Oprah Winfrey, among others) that 50 million Americans would have AIDS by 1990. Every year, the latent period for AIDS was re-estimated as being longer. 
The death rate as a percentage of those infected went down, from 40 percent every six months to 11 percent every six months. There were also numerous controlled studies that showed transmission in 1995 could not possibly be as efficient as it had been in 1985. And this was all before medical science had anything that would slow the disease. It predated the "drug cocktails" and the present era of managing AIDS as a chronic disease. I'll say it again: Most of the mortality decline in AIDS happened before the protease inhibitors became available. My thesis is that AIDS, despite being viewed as unique among diseases, tells us something fundamental about the nature of all diseases. We are able to see, for the first time, how the way in which organisms are distributed among individual hosts influences the clinical presentation of the disease. It is viewed at one point as a ferocious killer tearing through entire communities; then, when the caseload has grown by a factor of 10 or 100, it is viewed as being only half or a quarter as dangerous. The main reason, the universal reason, is not mutation or natural selection of different strains (although this may be present). The reason is that that is how randomness actually works. This is a summary of something that takes 30-odd pages and numerous graphs in my book to explain. It is informal, without footnotes. I hope it won't get me labeled as an AIDS kook. (If it gets me labeled as a probability kook, so be it.) EDIT: fixed a BBCode problem
  13. Yes, although the discussion of why this particular curve should show up so often still remains. "It is what it is" doesn't satisfy everyone, hence the difficult and nuanced discussion of what qualifies as a cause, or a transcendent cause, or a law. Well, I'm glad you find it intuitive but that is emphatically not what they teach in medical textbooks. Here's a brief summary of what I found with regard to epidemic diseases. In the mid-19th century, before the germ theory of disease was firmly established, a man named William Farr, in charge of public health statistics in England, began putting out studies of mortality and morbidity (death rates and transmission rates). These were not the first systematic studies ever done, but they were much more authoritative and insightful than anything previous. Farr made several key observations that (so far as I can discover) were unique to him and the first of their kind. First, Farr observed that the number of people infected in epidemics tends to rise exponentially, then abruptly break, and drop faster than it went up. He suggested, as a rough approximation, that the data would fit a quadratic equation. In effect, this implies that every person who is infected will pass the disease on with the same efficiency as the last one. Transmission numbers rise at a constant rate up to the point where the supply of people to infect starts running out. If adopted without further examination, this leads directly to the modern S-curve taught in epidemiology texts. But there is reason to doubt whether Farr ever actually thought that transmission was uniformly efficient, or whether he was just grabbing a handy function as a first approximation. For one thing, the S-curve is symmetrical, and Farr said epidemics come down faster than they go up. The figures Farr actually had at the time, and those gathered over the next half-century, did not fit an S-curve or normal curve very well at all. 
As late as the 1940's one could find people marveling at how poorly the data fit the model. What they actually suggest is that long before the disease runs out of victims, it is already slowing down. Transmission starts becoming less efficient almost immediately. However, once normal curves became the fashion everywhere in science, they were seized upon in epidemiology as well. "Farr's Law" is generally expressed as an S-curve in textbooks now, without even a footnote regarding what Farr himself knew or thought. Farr's second important discovery had to do with how one tracks mortality. He believed that mortality rates changed over the course of an epidemic. He had curves for cholera and other diseases that clearly implied it. One reason why Farr believed this was because his knowledge of disease predated the germ theory. He believed in miasmas and other environmental causes, which of course people thought of as having less and less effect if they were diluted. It mattered a great deal how much of the toxic substance was present in a given neighborhood, how much had been washed away by the rain, and so on. If divided up among more victims, miasma would again be less dangerous to any one person. So people coughing up miasma that they got in one district could be expected to "infect" their secondary victims, but with ever-increasing inefficiency. Once the germ theory came in, this idea of diminishing effect became obsolete. It now seemed clear that a very small quantity of infectious organisms would multiply many times over in each patient. The idea of diminishing mortality across the course of an epidemic seemed forlorn, since each patient basically started over at the beginning. Science became focused on antibody protection and the strength of the immune system in attacking the invading organism. Farr's idea was forgotten -- literally, for over a century nobody pursued it. 
The idea of the later victims of an epidemic automatically being less sick, simply because they come later, isn't up for discussion at this point in medical circles. You'll notice that epidemiologists refer to "the" mortality rate for a disease. Ebola (hemorrhagic fever) has a mortality rate of 80 percent. Smallpox (when it still existed) had two mortality rates, one for the "major" strain of 30 percent, the other for the "minor" strain at 1 percent. And so on. Now, I am not a doctor, and I don't mean to sound like an arrogant ass. However, for more than a decade I have tried and tried, and I cannot find any infectious diseases that erupt into epidemics that do not decline in both transmission rate and mortality. For the older diseases, such as plague in Europe or smallpox in the New World, the evidence is sketchy and anecdotal, but witnesses clearly report the disease moving more slowly and killing fewer people in the late stages of the epidemic. For modern diseases, we have actual World Health Organization data to look at, and the result is the same. One gets the impression from media reports (and Tom Clancy novels) of Ebola being fantastically lethal, basically not only killing an entire village in days, but killing the hospital staff who try to treat them, and then burning out before it can spread further. But in fact there have been less well-publicized outbreaks of hemorrhagic fever in Asia that involved thousands or tens of thousands of people (such as during the Korean War), and in those places the majority of patients survived, even though we have absolutely no drugs that work on any of the hemorrhagic fever varieties. In the case of H1N1, I've had the opportunity to test my model in real time. The initial outbreak in and around Mexico City raised tremendous alarm because the mortality rate was 5 percent or more among the first thousand patients. 
The authorities did not at that time have a lab test for H1N1 and so there was some doubt about the real mortality rate. But while lab testing eliminated some cases as not being H1N1, the mortality rate for the remaining cases was still 5 percent in late April. From there, as it spread, the official WHO mortality rate kept going down and down. At the end of the first week of May it was 2,400 cases and 1.9 percent. By mid-May, 7,500 cases and 0.9 percent. By the third week in May, 11,000 cases and 0.8 percent. At month end, 22,000 cases and 0.6 percent. Eventually the WHO reporting system broke down. The real patient count was in the millions but most national authorities simply could not identify or test more than a tiny fraction of the total. Now the only estimates were based on sampling, and reports from doctors' offices of increased requests for flu shots, and so on. The CDC reported on November 12th that as of October 18, 22 million Americans had caught H1N1, and 3,900 had died of it. That's a rate of 0.02 percent, or an incredible 1/250th of the mortality rate for the first hundred patients in Mexico City. Using my standard decline curve, I would have predicted a less dramatic drop to about 0.08 percent. Either way, the idea of there being one fixed mortality rate for H1N1 seems not just doubtful, but kind of ridiculous. It has done nothing but decline the entire time. The WHO figures for transmission (up to the point where they stop making sense) give the same impression. Neither transmission nor mortality were constant in this global pandemic. Of course, medicine has a backup answer for why epidemics behave this way, and it has enjoyed great plausibility for generations now. The argument is that mutation, followed by rapid natural selection, leads the later part of the epidemic, or later epidemic waves of the same disease, to be less lethal. 
Basically, a random mutation shows up that kills fewer patients, and because its victims live longer, it is transmitted to more people. It gains a competitive advantage over its more immediately lethal cousin. I'm not ridiculing this theory. It is plausible, there is certainly lots of mutation going on, properties of diseases have been proven to change. But it is used surprisingly often, one might say compulsively. It comes up for virtually every major disease. That's a lot of very convenient and rapid natural selection! Plus it is used not only for diseases like smallpox that kill large numbers, but also for diseases like H1N1. It is hard to see how a drop in mortality from 2 percent to 0.02 percent could give a disease much of an advantage. There are variations of the argument. Maybe the mutation makes the disease easier to transmit as well. Maybe different strains attack different risk groups. There is more than a century of thought behind the idea, and I am sure there is much in the theory that is true. But the whole thing is a bit of a Rube Goldberg machine. The mutation shows up again and again, just when it is needed to explain inconvenient statistics. So in 2007, a team studying smallpox set out to confirm or deny the model, comparing different samples from historical smallpox victims to see just how different the strains were, and to correlate the spread of particular strains with changes in the mortality rate. Unfortunately they found that the "major" strain associated with a 30 percent mortality rate sometimes showed up in regions with very low mortality, and the "minor" strain showed up where mortality was high. Differences in the strains could not explain the huge differences in mortality that were observed. I'll conclude this in the next post as this one is getting long.
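For anyone who wants to check the H1N1 arithmetic quoted in this post, here is a minimal Python sketch. It takes the four WHO (cases, mortality) pairs given for May and fits a straight line to them on log-log axes, then recomputes the CDC rate from 3,900 deaths among 22 million cases. The function name `log_log_slope` and the choice of an ordinary least-squares fit are assumptions made for illustration; this is not the "standard decline curve" referred to in the post, and the 5 percent Mexico City figure is left out because the post does not give an exact case count for it.

```python
import math

# (reported cases, reported mortality in percent), as quoted in the post.
who_reports = [(2_400, 1.9), (7_500, 0.9), (11_000, 0.8), (22_000, 0.6)]


def log_log_slope(points):
    """Least-squares slope of ln(rate) against ln(cases).

    A slope of 0 would mean a constant mortality rate; a negative slope
    means the rate falls as the case count grows.
    """
    xs = [math.log(cases) for cases, _ in points]
    ys = [math.log(rate) for _, rate in points]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


who_slope = log_log_slope(who_reports)

# CDC estimate quoted in the post: 22 million cases, 3,900 deaths.
cdc_cases, cdc_deaths = 22_000_000, 3_900
cdc_rate_percent = 100 * cdc_deaths / cdc_cases

print(f"log-log slope of the WHO figures: {who_slope:.2f}")
print(f"CDC mortality rate: {cdc_rate_percent:.4f} percent")
```

Four points are far too few to pin down a curve, so the slope is only a rough indication that the reported rate fell steadily as the case count rose. The CDC rate comes out slightly under the rounded 0.02 percent, so ratios computed from the rounded figure are approximate.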
  14. I read them all as a teenager (devoured them would be closer to the truth) and I was bitterly disappointed at the way Asimov resolved the story. I felt that he was saying something profoundly interesting and serious when he proposed the idea of psychohistory, and that the later business of telepathy and mind control, while it made for an interesting story, pretty much abandoned the promising argument he started with for a much more familiar sort of space opera. I've looked at the sequels that appeared later on (both Asimov's own and those written after his death for his estate). Some are interesting, but my feeling has always been that the science lost out to the fiction. What would really have made his work epic in intellectual history was if he had given more substance to the idea of psychohistory, made it more like a real science. But of course, real science is hard, hard work, and Asimov wrote Foundation when he was a very young man and desperately trying to make a living. He couldn't have afforded to spend more time on it than he did. When I started seeing the implications of my work, 10-15 years ago, I seriously considered adopting the pen name Hari Seldon, as homage.
  15. None of my examples so far, no. I agree, one can make the argument that an office-holder's behavior is decisively shaped by the imperatives of the office, the social network as you call it. But I have some examples where this argument is not available, where we are talking about the behavior of individuals, which I think are very interesting. More on that presently. Well, it might seem hasty to you given that I got to it after a dozen posts, but it actually took me 20 years. I'm familiar with the Objectivist argument against reification, and I don't think that's what I'm doing. I agree I'm being a little provocative in using the word "transcendent," because that is a word that gets abused quite a bit. But here I am using it in the correct sense: there are endless local, anecdotal causes for these curves. We can see that the factors governing traffic accidents are different from the factors governing participation in online forums, and different again from transmission or mortality rates in epidemics. Yet they all yield the same shape of curve, and close to the same slope. I say that what causes them to be similar is something that transcends the particulars of each case. You want to say that the systems are similar, and their similarities are what make the curves the same. I think the hangup here is more a matter of terminology than real content. I am not arguing for some kind of ineffable World Spirit of the Hegelian kind, working behind the scenes. The transcendent cause I have in mind is mathematical, not mystical. If you find it more satisfactory to say that the cause is mathematical similarity, that is fine. Given the vastness of the range of cases, I like a word that emphasizes scope. Given that I haven't demonstrated that huge scope as yet, you quite reasonably prefer more conservative language. Yes. Well said. But there is more in play here than I have been able to say so far. 
The insights that Edwin Jaynes had into probability and inference are truly awesome, and we have barely touched on them as yet. The first problem is that these curves not only aren't normal, they're not lognormal either. They don't obey the central limit theorem, at all. The mean value changes as the set grows larger. That's one of the truly frustrating aspects common to all these obscure ad hoc laws. They can't be dealt with using standard statistical methods. If you take a look at Nassim Nicholas Taleb's books, such as Fooled by Randomness or The Black Swan, you'll see example after example of what I mean. They are orphan curves precisely because they don't fit the standard model. They have been left to languish in dark corners because in dealing with them one cannot assign a z-score, or arrive at a stable mean, or do any of the things that universities train us to do. Jaynes came up with arguments that make such curves more tractable. The principle of maximum entropy reframes the whole question of independence, and the applicability of the central limit theorem. It's very subtle stuff. I don't want to go off in all directions here, so I will simply say that for now. Independence in the sense used by classical probability remains on the table as an issue. Incidentally, Jaynes found the quantum-mechanical notion of reverse time traveling coordination bogus. He was a lifelong skeptic regarding the Copenhagen Interpretation and the obscurantism associated with it. I think Objectivists would find his complaints in that area very welcome and insightful. Ultimately he believed that the principle of maximum entropy could be used to reconcile quantum phenomena with everyday phenomena, to treat them all according to a single set of rules. That is surely an exciting prospect, and I have been strongly encouraged by my own work to believe Jaynes was right. 
To put it very simply, for those readers who aren't necessarily familiar with statistics, or some of these philosophy of science issues: I think these curves seem spooky because we have the wrong idea about how probability works. I'm not arguing against free will. I'm arguing for a more robust view of probability that will help us understand why these curves are "normal". Along the way, we will also develop a more subtle understanding of just what it is we are free to do.
  16. Well, I have given you a "why" as well as a "what" here. I've said that the common cause is that all these processes conform to the principle of maximum entropy, even the ones that don't result in this particular power law. That's actually pretty uncontroversial among mathematicians familiar with the principle. It all depends on what a system is capable of doing. I gave the example of a six-sided die. If you throw it many thousands of times, the behavior (at least on the level of whether it comes up '1' or '6') doesn't resemble a power law because there are exactly six possible outcomes and the design of the die makes them all alike. Viewed on that very basic level, the die isn't capable of exhibiting the sort of behavior I am interested in, the decline of rare outcomes over time. It's too simple an object, with too few possible states. Nonetheless Edwin Jaynes said specifically, and other mathematicians back him up, that the behavior of the die does conform to the principle of maximum entropy -- just in its own way. I can lay out the math behind this, but there isn't anything novel or controversial about that aspect of the story, and it's kind of boring to laymen. I'm taking that particular aspect of the problem for granted. I suppose you can call it "cherry picking" that I focus on all these orphan laws that have no explanation, and try to establish that the explanation is entropy. But it's a virtuous kind of cherry picking. I'm going after low-hanging fruit that is delicious and nutritious, and there's nothing wrong with that. Explaining these obscure curves from diverse fields as all arising from one cause is something Rand urged scientists to do more often.
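For readers who would like to see the die example in concrete form, here is a small Python sketch. With no information beyond the fact that there are six faces, the maximum-entropy distribution is the uniform one. If you add a single constraint on the long-run average, the maximum-entropy distribution takes the exponential form p_i proportional to exp(lambda * i), and lambda can be found numerically. The target average of 4.5 is not from this thread; it is borrowed as a standard textbook illustration, and the function names and the bisection search are assumptions of this sketch, not a reproduction of Jaynes's own working.

```python
import math

FACES = range(1, 7)


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(q * math.log(q) for q in p if q > 0)


def mean_of(p):
    """Expected face value under distribution p."""
    return sum(face * q for face, q in zip(FACES, p))


def maxent_die(target_mean, lo=-20.0, hi=20.0, tol=1e-12):
    """Maximum-entropy distribution on faces 1..6 with a fixed mean.

    The solution has the form p_i ~ exp(lam * i). The mean increases
    monotonically with lam, so lam is located by bisection.
    """
    def dist(lam):
        weights = [math.exp(lam * face) for face in FACES]
        z = sum(weights)
        return [w / z for w in weights]

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_of(dist(mid)) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist((lo + hi) / 2)


fair = maxent_die(3.5)    # mean of an unbiased die: uniform result
loaded = maxent_die(4.5)  # hypothetical constraint, for illustration only

print("mean 3.5:", [round(q, 4) for q in fair])
print("mean 4.5:", [round(q, 4) for q in loaded])
print("entropies:", round(entropy(fair), 4), round(entropy(loaded), 4))
```

The point of the sketch is only that the same rule covers both cases: an unconstrained die comes out uniform, and a constrained one comes out tilted, with lower entropy than the uniform case.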
  17. Well, I don't feel that I'm "baiting the suckers" exactly, more like acknowledging concerns that I have heard before and that I anticipate hearing here. But okay. Your point about vocabulary is certainly on point. We have to agree on the meaning of terms. Now, for example, I refer to this curve as a "cause," or more precisely as a meta-cause. It is not the only cause operating in a given case, but it is among the causes for the outcome we observe, and it is very broad in application, hence my use of the term "meta-cause". You prefer to say it is not a cause, but a law. You give gravity as an example of a law. I admit to some concern about calling maximum entropy a cause, but I don't think there is much to be gained by calling it a law instead. For example, if I fall out of a window and break my leg, the cause of my falling is gravity -- yes? I fell because there is an attractive force between two bodies that is proportional to their masses and inversely proportional to the square of the distance between them. The word "cause" in this context is appropriate. If I said, I fell and broke my leg in accordance with the law of gravity, that avoids using the taboo word "cause" but I don't see that it really changes very much else. I'm prepared to say that human beings make choices in accordance with this power law, and to refrain from calling the law a cause, if that will help. But I'm really not sure that it will. Perhaps you can say more about why you think this distinction matters. Yes, okay, two questions here, both good ones. The first is "What is this a law of?" It is a law pertaining to randomness, to the behavior of large, complex systems with many seemingly independent parts capable of taking on many different values or measurements. The law says that the observed behavior of such systems will evolve in a consistent way, predictable using simple mathematical rules. Short version: It is a law of randomness. 
Second question: Prove the claim, at the level of data. Okay, but just to save everyone paging through endless posts consisting of nothing but figures, I'll do some sample calculations and refer you to my sources for the rest. First some examples of decline on the dynastic level:

Dynasties ruling Portugal, 868-present

Vimara Peres to 1072, 204 years, so for N=1, mean length 204
2nd County to 1139, 67 years, so for N=2, cumulative mean 135.5
Burgundy to 1385, 246 years, so for N=3, cumulative mean 172.33
Aviz to 1495, 110 years, so for N=4, cumulative mean 156.75
Aviz-Beja to 1581, 86 years, so for N=5, cumulative mean 142.6
Hapsburg to 1640, 59 years, so for N=6, cumulative mean 128.67
Braganza to 1853, 213 years, so for N=7, cumulative mean 140.71
Saxe-Coburg-Gotha to 1910, 57 years, so for N=8, cumulative mean 130.25
Braganza to 2007, 97 years, so for N=9, cumulative mean 126.6

If you plot these on a spreadsheet and fit a power-law curve to them, you get an exponent of -0.178. My universal model predicts -0.30576. So this is a slightly shallower decline than expected, but then the exponents cluster around -0.30576; they don't precisely match it every time. You can test the data set several ways, such as looking at how many items are above the median value at any given point. They all show a robust decline. For a data set this small, it is pretty clear.

Dynasties ruling France, 843-1870

There are 14 dynasties in the series, from the Carolingians to the last of the Bonapartes. I'll give the length of each in years, and the cumulative mean:

144 / 144
341 / 242.5
170 / 218.3
17 / 168
74 / 149.2
203 / 158.2
12 / 137.3
10 / 121.4
1 / 108
0.5 / 97.2
15 / 89.7
18 / 83.8
4 / 77.6
18 / 73.4

Stability in France declined a good deal more steeply than in Portugal. The best fit curve has an exponent of -0.385, a little steeper than my universal model predicts. Again, you can play with these numbers in various ways but I think the conclusion is always going to be a power law decline.
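For anyone who wants to reproduce the spreadsheet step, here is a minimal Python sketch. It assumes the fit is an ordinary least-squares line through log(cumulative mean) versus log(N), which is how spreadsheet power trendlines are usually computed; the function names are illustrative and not from any library. The first Braganza span is entered as 213 years, the figure implied by the dates 1640 to 1853 and by the quoted cumulative mean of 140.71.

```python
import math

def cumulative_means(lengths):
    """Running mean of the first N values, for N = 1..len(lengths)."""
    total = 0.0
    means = []
    for n, value in enumerate(lengths, start=1):
        total += value
        means.append(total / n)
    return means

def power_law_exponent(series):
    """Least-squares slope of log(series[N-1]) against log(N).

    Assumption: this mirrors a spreadsheet 'power' trendline.
    """
    xs = [math.log(n) for n in range(1, len(series) + 1)]
    ys = [math.log(v) for v in series]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Dynasty lengths in years, in chronological order, as listed in the post.
portugal = [204, 67, 246, 110, 86, 59, 213, 57, 97]
france = [144, 341, 170, 17, 74, 203, 12, 10, 1, 0.5, 15, 18, 4, 18]

portugal_means = cumulative_means(portugal)
france_means = cumulative_means(france)

portugal_exponent = power_law_exponent(portugal_means)
france_exponent = power_law_exponent(france_means)

print("Portugal cumulative means:", [round(m, 2) for m in portugal_means])
print("Portugal exponent:", round(portugal_exponent, 3))
print("France exponent:  ", round(france_exponent, 3))
```

Under these assumptions the fitted exponents should land close to the -0.178 and -0.385 quoted above; a different fitting method (for example, nonlinear least squares on the raw values) would give somewhat different numbers.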
I'll post examples of decline within a dynasty next.
  18. The former, emphatically. The name "Math Guy" works great in most contexts, and I've used it for years, but it conveys a slightly rationalistic, abstracted perspective to Objectivists that isn't at all the way I actually work. The reality of most of these ad hoc laws isn't seriously disputed. They've been tested and retested for decades. So they are facts. The trouble is that they're disconnected from the rest of our knowledge. I'm trying to establish the common connection between them, the power of the principle of maximum entropy, as also being a fact -- that is, as much more than a convenient hypothetical model. I'm not just doing curve-fitting here. I see this as a truth, an important epistemological and metaphysical truth about the way the world works. Entropy always increases, and that truth influences religion and economics and disease and war and so on. But if it was easy to show that, it might very well have been done decades ago. So in conversation I refer to it as a hypothesis, or as a mathematical model, and urge people to make up their own minds. I distinguish between my own conviction about the connection between these laws, and what my readers are able to see at any given point.
  19. It's great the way people respond to my examples with just the right vocabulary. In fact, that terminology -- "early adopters" and "late adopters" -- comes from the work of Everett Rogers, and his curve is one of the many ad hoc curves I want to replace with my universal one. Rogers made an assumption that is very, very common in science history. He assumed that the bell curve dominated the process of innovation adoption. He divided the population into groups according to their willingness to adopt a new technology or idea, basically taking slices of a bell curve. This worked okay because most people took his categories as loose, descriptive metaphors, not as precise mathematical definitions. So relatively few people even know what percentage of the population is supposed to be "early adopters" and what percentage "late adopters". It doesn't matter. Most of the time we just wave our hands and use the terms without specifying any numbers. But if you actually dig down into Rogers' scheme, it becomes clear that his assumption was totally arbitrary, and unnecessary. There's no bell curve evident in the process. Data from real historical adoption processes -- like the spread of the Model T car in America -- show that the decline in customer commitment follows a power law, not a bell curve. If you have 10,000 customers, they are generally willing to pay X dollars for a product. In order to secure 100,000 customers, the price typically needs to come down to X/2. The later customers simply aren't as keen, can't use the product in as many ways or for as many hours of the day. Their lower valuation is rational given their context. And so it goes, for the first million and the first ten million and so on. The steepness of the demand curve varies from case to case. It isn't always perfectly in accord with my model. But it's very visibly a power law, not a bell curve or modified Bass S-curve (which is sometimes used instead of Rogers' original version). 
So yes, I think you're following the argument perfectly well to this point.
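As a rough worked example of the pricing claim in this post: if one assumes the demand curve is a pure power law, price = k * customers^a, then "the price halves for every tenfold increase in customers" pins down the exponent a. The function name below is illustrative only.

```python
import math

def implied_exponent(quantity_ratio, value_ratio):
    """Exponent a such that value scales as quantity ** a."""
    return math.log(value_ratio) / math.log(quantity_ratio)

# From the post: 10x the customers (10,000 -> 100,000) needs the price to drop to X/2.
demand_exponent = implied_exponent(10, 0.5)

# Under the same assumption, price relative to X at 1 million customers (100x the base).
price_at_100x = 100 ** demand_exponent

print("implied exponent:", round(demand_exponent, 4))
print("price at 100x customers, as a fraction of X:", round(price_at_100x, 3))
```

The implied exponent is about -0.301, which sits near the -0.30576 of the universal model; at a hundred times the original customer base the same rule gives X/4.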
  20. Well, a little hasty, yes, but I was expecting more or less exactly this objection. This is where my Objectivist friends and acquaintances have parted company with me in the past, often with this exact phrasing: Anything that occurs in history will be based entirely upon what people do or think. Eiuol, I don't think it's fair to drag astrology into this, as I've given you no cause to do that. But the comparison with numerology is not unreasonable. As an aside to everyone else: Please don't recoil in horror if you don't get this right away. I'm not into numerology. The problem is that the principle of maximum entropy deals with the number of elements in a set, and the number of different states each element can take on. It is explicitly, and by design, a method of predicting behavior using nothing but rank in the set, i.e. a pure number. The Nth element behaves the way it does because it is the Nth element. So yes, it does sound rather like numerology. But there are plenty of precedents in science for attacking problems this way, which I think is Grames' point. It is no more unreasonable to make arguments based on rank order, than arguments based on geometry. Consider, for example, those experiments in quantum physics where a particle emits two photons in opposite directions, and measuring the first photon somehow influences the state of the second photon. That too is a theory predicting behavior based strictly on rank order. Reverse the order, and the 'second' photon now calls the tune, and the 'first' photon dances to it. (Also, as it happens, the golden ratio phi actually comes into the story eventually, in the details of scale invariance, so the story is not simply about rank, but also about geometry to some degree.) Here's the basic problem, which I think Eiuol has sensed correctly. If you all are kind enough to give me sufficient time and attention, I'm going to present a series of these curves. My next example is going to be from epidemiology. 
I have a bunch of interesting research I've done on the spread of H1N1, HIV-AIDS, and so on. Then I want to talk about participation in websites, which is a nice, concrete, tangible sort of case to deal with. For each curve we can have a vigorous discussion and dream up plenty of local, anecdotal causes -- call them incentives, if that makes more sense -- that would cause people to choose to behave a certain way. We might not be able to prove that these were the incentives that people acted upon, but we can at least imagine them as being present. For example, the growth of agricultural output might make people richer, and thus ironically make kingdoms less stable, because they reduce the incentive to be loyal to the king. However, as we consider one curve and then another, we are going to be left without a good argument for why they are all so similar. A local cause simply will not do. There has to be a transcendent cause, a meta-cause, something that operates everywhere regardless of context. And that, as Eiuol observed, raises uncomfortable questions about free will. This common shape of curve cannot really be a cause, can it? It can only be coincidence. If the institution of monarchy was predestined to shrink to a pitiful vestige of itself after 1,000 dynasties . . . if EVERY dynasty was doomed to shrivel in similar fashion over time . . . and if participation in websites or in churches or on battlefields all follow this same curve . . . at some point we have to ask, isn't this determinism? Doesn't this presume that free will is being overridden by the imperative of the common curve shape? It's a good question. One might say it is THE question, certainly for Objectivists. These curves are hugely useful in predicting customer behavior, or battlefield outcomes, or the spread of a disease. They are much better tools than what science is using now. But to adopt them without a thorough discussion of the philosophical implications would be risky. 
The challenge here is that we have to talk coherently about large numbers of human choices forming a distribution curve, and yet still being free. If we can do that, then we can embrace all the practical insights that these curves provide, and not have to fear that we are undermining the idea of man as a rational, sovereign, autonomous being. This was the challenge that I frequently failed to meet in the early years of my research. We'll have to see if I've learned anything from those early setbacks. More on this presently, after I take care of some minor points.
  21. The key is that the power law applies both to sets and subsets. Suppose that I have a country like China with 20-30 dynasties, and each dynasty having some number of rulers. Each dynasty constitutes a distinct subset and has its local decline, according to the power law. Then the set of dynasties has a similar decline, with the individual rulers in each dynasty now subsumed and dynasties being treated as individual elements. Because the curves on both scales tend toward the same slope, the overall result is fractal. It looks the same as we "zoom" in or out. A given ruler is subject to several overlapping constraints. His term, on average, has to be X amount lower than the previous ruler in the dynasty. It also has to be low enough that his dynasty length is shorter than the previous dynasty. And his term is also affected, in a distant way, by the diminishing stability of monarchy across all of history. Think of each set as imposing one constraint, with the result being a series of simultaneous equations. No, quite correct. I'm not saying every aspect of nature follows a power law. What I'm saying (and Jaynes is saying) is more subtle. Entropy always increases. In many situations the increase in entropy results in a unique distribution of measurements that follows from the special conditions prevailing at the time. But in a surprisingly large range of situations it results in these power law curves. For example, consider a 6-sided die that is uniform and unweighted. Throw it a few thousand times and it will settle into a uniform distribution with equal likelihood of '1' through '6' coming up. The range of outcomes does not contain rarer and less rare items, and so no decline effect is possible. In the long run, you get equilibrium. Jaynes observed that this actually IS the maximum entropy distribution, it maximizes entropy given the range of possible outcomes. 
We don't think of it as a maximum entropy distribution, we just think of it as being dictated by classical statistics. But it happens that the two methods of forecasting the distribution of outcomes agree in this case. On the other hand, consider traffic accidents. Sixty years ago, a researcher named Smeed observed that countries with relatively small numbers of cars had higher per-capita fatal accident rates. Plotting the data for dozens of countries revealed a power law curve. If you doubled the number of cars on the road, the number of fatal accidents only rose by about 62 percent. Traffic accidents by any standard of measurement are rare events. Increasing entropy means that they get rarer. Thus if you have a larger pool of cars, more and more of them in percentage terms manage to avoid fatal collisions. Despite sixty years of theorizing by traffic analysts, no one knows why Smeed's Law works. My proposal is that it falls in this large category of things that all behave the same way, due to entropy. I think I get it. I would put it this way. Entropy applies to everything -- so says Jaynes, at least. However, entropy doesn't dictate a single uniform distribution for everything. It operates contextually, as we saw with the six-sided die. These particular power laws apply to a very large range of phenomena, much larger than anyone presently suspects, but not to everything. A formal definition of the conditions under which a power law is sure to apply is going to be a while in coming. At this point I prefer to work by giving a large range of examples rather than trying to establish a formal definition. 
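The Smeed's Law figure above can be turned into an exponent with a line or two of arithmetic. This is a minimal sketch that assumes a pure power law (fatal accidents proportional to cars^b) and uses only the "62 percent" figure from the post; the variable names are illustrative.

```python
import math

# From the post: doubling the number of cars raises fatal accidents by about 62 percent.
accident_growth_on_doubling = 1.62

# Exponent for total accidents versus number of cars.
total_exponent = math.log(accident_growth_on_doubling) / math.log(2)

# Exponent for accidents per car: dividing by the number of cars subtracts 1.
per_car_exponent = total_exponent - 1

# Going the other way: what growth on doubling would the -0.30576 model imply?
MODEL_EXPONENT = -0.30576
model_growth_on_doubling = 2 ** (1 + MODEL_EXPONENT)

print("per-car exponent implied by 62%:", round(per_car_exponent, 4))
print("growth on doubling implied by the model:", round(model_growth_on_doubling, 4))
```

On these assumptions the per-car exponent comes out near -0.304, and the model's own exponent corresponds to a rise of roughly 61.8 percent per doubling, so the two figures are consistent to within rounding.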
I have shown that the same power law applies to:
-- cost reduction curves in manufacturing (Wright-Henderson law)
-- output reduction curves in macroeconomics (Cobb-Douglas factor elasticity)
-- diminishing customer loyalty as the market for a product grows
-- diminishing per-capita participation as membership in a web forum grows (more and more people join but then lurk rather than post)
-- diminishing per-capita audience participation for web-based media (comments on YouTube videos)
-- lower per-capita casualty rates for battles involving larger armies
-- lower per-capita casualty rates for bombing raids involving larger numbers of bombs
-- lower per-capita attendance in churches as they grow
-- lower per-capita tithing and donation in churches as they grow also
-- lower metabolic output per unit mass in single-celled organisms, insects, reptiles, and mammals (Kleiber's Law)
-- slower mass gains in percentage terms for developing human fetus as it approaches maturity
-- lower per-capita mortality for infectious epidemic diseases as total caseload grows (H1N1, cholera, plague, scarlet fever, Ebola, HIV-AIDS, basically all of them)
-- lower transmission rates for the same diseases, again as total caseload grows
-- voter turnout in large versus small jurisdictions
and so on, and so on. I can list huge numbers of examples, but I am reluctant to try and draw a boundary around them and say definitively, 'Here are the precise conditions required for this behavior to occur.' All the examples above involve large numbers of discrete entities (people, or manufactured goods, or cells in a body) engaged in some probabilistic activity (whether to participate, get hit by a bullet, donate, metabolize, become infected). As the set of items subject to the same probability range grows, the mean for that range shifts. The rare items get rarer. 
If you think of the whole universe of activities in which rare items could get rarer, you will see why I resist trying to frame permanent, exhaustive conditions for the decline effect to occur. I've barely started cataloging all the possibilities.
  22. I've read Fischer, he did a very thorough and scholarly job. But I have to disagree that the cyclic aspect is lacking. What you are seeing in this "fractal" decline pattern is cycles within cycles. Decline on multiple levels creates a cyclical, sawtooth-type decline pattern. Say a monarch establishes a new dynasty, and rules for 30 years. His son then rules for 20, and the fortunes of the kingdom go up and down for several generations before finally trailing off in the 4-year reign of the infant great-great-grandson, and conquest by an upstart baron. The rise of the rebel baron to the throne represents the start of a new cycle. The Chinese were familiar with this kind of cycle, they called it "the Mandate of Heaven". At the beginning of a new dynasty there is not only the promise of reform, but quite often actual reform. The barbarians and bandits are driven off for a time, new roads and bridges are built, commerce revives, the arts and sciences receive fresh patronage. Then, gradually, the new court becomes as corrupt, cynical, and ineffectual as the old one was. Successive monarchs make promises they do not keep. The bandits and barbarians creep in. Eventually the kingdom is in total disarray, perhaps broken into pieces, and the dynasty is totally discredited. The Mandate then passes to whoever can keep order and restore confidence. The first reign of the new dynasty is typically much longer than the last reign of the old one. It sounds rather like Obamamania, doesn't it? No accident. The decline pattern for monarchies has continued into the modern era, for democracies. If we plot party control of the various U.S. state governments, from when they were colonies up to the present, there is a fairly smooth continuity between the colonial era and the modern democratic era. The English crown was increasingly unstable during the 17th century, and so regimes lasted an average of 50-60 years -- typical of dynasties at that point. 
In the late 18th century, parties held power over a given state for an average of 10-11 years. In the first half of the 20th century, they averaged 8 years, and in the late 20th century, just 7 years. Control of the presidency has also declined in stability. During the first century or so the average was 20 years, now it too is down to 8 years. The different levels of decline make it difficult to adopt one single explanation. We can explain the long sweep of declining stability using such fundamental factors as agriculture, but then how do those same factors reverse themselves when a new dynasty is established? I have considered the argument that the change in stability in monarchies was due to systematic changes in agriculture, population growth, and so on. It is hard to refute because it requires proving a negative. How can we know there wasn't a poorly documented or misunderstood trend taking place down through the years that made monarchy more and more fragile? Perhaps increasing literacy? Or greater wealth, the growth of the middle class? Certainly those things were happening, and populations with greater wealth, knowledge, and numbers tend to demand more say in how they are governed. Given so many plausible possibilities and so little hard data, we might argue perpetually and not reach a final conclusion. But then as I said, we would still be stuck with the puzzle of why the next dynasty starts out more stable. The shape of the curve is a separate puzzle. Throughout my book, I make a distinction between local contributing causes and what Aristotle might have called the "formal" cause. A local, contributing cause is a reason for the trend to be downward: Monarchies become more unstable because the population becomes richer, smarter, better fed. The "formal" cause is the thing that gives the trend its specific form. So why does the curve have this shape, so similar to Smeed's Law and the Wright-Henderson law and all those others that I listed? 
Because of the principle of maximum entropy. This shape maximizes the spread of possible results, the uncertainty at any given time about what will happen. This kind of "formal" cause can also be called a meta-cause. It transcends the specific content of the curve, and imposes a universal shape on all sorts of different raw material. So in effect, I agree with you. Changes in fundamental economy, the accumulation of wealth and the rise of markets, were definitely a factor. This discovery doesn't reverse or overthrow the narrative of history that we all know about, from the bread and circuses of the Roman Empire to Magna Carta to the English Parliament and the French Revolution. What this curve does is reveal a spooky underlying order that has nothing to do with economy as such. To adequately explain the "flow" of history we need to consider both the local, anecdotal causes like agrarian economics, and this mathematical metacause. I hope that makes sense. I will tackle TuringAI's post when I have time later today.
  23. All right. Now we come to the weirdest aspect of the hereditary monarchies data set. We have observed two kinds of decline so far:
-- Within a given dynasty, later rulers have shorter terms, and the cumulative average for all rulers in the set declines according to a power law with exponent roughly -0.30576.
-- Within the set of dynasties for a particular country, later dynasties are also shorter, and the cumulative average again declines according to the same power law.
This is, if you will, a fractal sort of relationship. We examine history on a "micro" level, looking at one family ruling one kingdom, and where we might expect to find bland uniformity, we find a very emphatic downward trend. Then we examine history on a more "macro" level and we see the same trend. We zoom out to a wider scale but find the same shape of curve. What will happen when we zoom out yet again? If we simply take all dynasties, regardless of what country they pertain to, and assemble them in chronological order, what will we see? Common sense cries out that what we should see here is a resumption of the indifference principle. Very well, within a country, there is decline. That is unexpected but not alarming. It suggests there is something at work on a local level that causes the later items to be influenced by the earlier ones. But now we are relating dynasties that have no historical, geographical, or logical relationship to one another. What does the Tang dynasty in 7th century China have to do with the House of Savoy in 1705? Why on Earth would measurements of one have any sort of specific relationship to measurements of the other? Yet they do. If we assemble all the dynasties known to history, in order, we see a very clear progression in their values. Later dynasties (wherever they are on the surface of the Earth) are shorter, with the cumulative average once again approximating a power law based on their rank order. 
By the time we get to the i-th dynasty, wherever it might be, the cumulative average has fallen in proportion to approximately i^-0.30576. Here is a breakdown of the data. It shows how many dynasties I found for each period in history, and their average durations in years.

Period            #     AVG
3050-2500 BCE     6     637
2499-2000 BCE     10    507
1999-1500 BCE     14    304
1499-1000 BCE     8     216
999-500 BCE       16    137
499-1 BCE         26    189
CE 1-500          27    297
CE 501-1000       61    184
CE 1001-1500      109   132
CE 1501-1800      73    107
CE 1801-2008      38    52

You can see that on a detail level, the data are not absolutely uniform. The trend plunges downward steeply in the years approaching 500 BCE, then rises to a local peak in the early centuries CE, before plunging down again. But taking the cumulative average, we see a curve very closely approximating the previous two. We have zoomed out to an even more colossal scale, and the shape of the curve is the same yet again. History . . . is fractal! There are plenty of issues that can be raised here about the quality of data. Do we know if the early kings referred to in documents from 4,000 years ago actually existed? Shall we trust the inferences of archeologists about the Pharaohs? Perhaps in those distant times there are empires completely unknown to us, that lived much shorter lives and died without records. But even if we discarded the data prior to 500 BCE as being untrustworthy, and took only those dynasties that were established since Aristotle, or since Christ, we'd still have the same basic result. Decline pervades the data set. It cannot be purged by removing a few unusual items, or postulating a few missing ones. What this suggests is that long spans of history are tied together by a mathematical rule. If staying in power represents success, then throughout human history, hereditary dynasties have been inexorably growing less and less successful, and more and more unstable. 
The presence of the rule on the other levels suggests that on this level as well, the trend has real predictive and causal meaning. We don't know yet WHY this rule operates, but it appears to operate pervasively: within dynasties, between dynasties, and across continents, oceans, and millennia of time. Now consider the implications for our present system of democracy. This very simple rule effectively dictates that hereditary monarchy HAD to collapse as a social institution, and moreover that it wound up being replaced at the precise time that it did, because by the 1800s, all the new dynasties that were being established were short-lived, lasting on average only 52 years, or barely one human lifetime. Once the older dynasties collapsed in WW I and WW II, they were not replaced, and dynastic rule came to an end. The trend in the global data set implies that hereditary monarchy, as an institution, had built-in obsolescence. Each dynasty grew increasingly frail. Each country ruled by dynasties also grew more unstable. And the entire global system of dynastic rule was bound to hit bottom one day. It is, eerily, a forecast for the future of humanity. Had there been a sharp mathematician in Aristotle's time, or in the early Roman Empire, who had access to lists of kings, he could have worked out roughly when and how the entire system of hereditary rule would cease to operate -- literally thousands of years in advance. The rule operates, at least as far as we can see so far, in complete indifference to the ideas that the rulers had, or their subjects had. It is an impersonal rule, an inhuman rule, that takes no special notice of philosophy or language or religion. Now, I don't want to force an interpretation on the reader. I don't expect you to go running screaming into the street, as if I have overturned all your beliefs about free will and the role of ideas in history in one fell swoop. At this early stage I simply want to ask: What does this make you think of? 
It makes me think of Hegel. More about that tomorrow.
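The period breakdown in this post can be turned into a rough cumulative average with a few lines of Python. This is only a sketch: the table gives per-period counts and averages, not individual dynasties, so the running figure below is a count-weighted approximation of the true cumulative average, and the function name is illustrative.

```python
# (period, number of dynasties, average duration in years), copied from the table in the post.
periods = [
    ("3050-2500 BCE", 6, 637),
    ("2499-2000 BCE", 10, 507),
    ("1999-1500 BCE", 14, 304),
    ("1499-1000 BCE", 8, 216),
    ("999-500 BCE", 16, 137),
    ("499-1 BCE", 26, 189),
    ("CE 1-500", 27, 297),
    ("CE 501-1000", 61, 184),
    ("CE 1001-1500", 109, 132),
    ("CE 1501-1800", 73, 107),
    ("CE 1801-2008", 38, 52),
]

def running_weighted_average(rows):
    """Count-weighted cumulative average duration after each period."""
    total_years = 0.0
    total_count = 0
    result = []
    for label, count, average in rows:
        total_years += count * average
        total_count += count
        result.append((label, total_count, total_years / total_count))
    return result

running = running_weighted_average(periods)
for label, dynasties_so_far, average_so_far in running:
    print(f"{label:<15} {dynasties_so_far:>4} dynasties, cumulative average {average_so_far:6.1f} years")
```

The running average starts at 637 years and ends well below that, with a small bump around CE 1-500 that matches the local peak described in the post. Fitting an exponent to these binned figures would need the individual dynasty lengths, which the table does not provide.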
  24. Yes, agreed to all of this. Not trying to pull a fast one. I like to use this example as a quick and easy introduction, to lay out the basic idea without dwelling too much on technical issues. Once I've established the basic idea -- given you a suitably prepared concrete to contemplate -- then I can go back and do some of this other work. The point you raise about splitting the list into two is good. Let me express that premise a little more precisely. Take a set of n ordered measurements M1 through Mn. The cumulative average of M1 through Mi for any i < n will tend to fall in proportion to i ^ a, where a is an exponent between 0 and -1. Over many different sets of M, the exponent a clusters around -0.30576. Thus if i=1, 1^-0.30576 = 1, and if i=2, 2^-0.30576 = 0.809. If i=10, 10^-0.30576 = 0.495. The fact that the second half of the set is smaller is a derived property. Putting the relationship that way makes it easy for non-mathematicians to follow, and splitting the set exactly in half illustrates the indifference principle quite neatly. But the underlying power-law relationship is better expressed by the formula above. What is a "large, complex system"? For the moment, I want to leave this as an implicit, partly ostensive definition. If I had to break it down at this point, I would say it is a system composed of discrete elements that can each take on a range of possible values in random fashion. But I don't think that really conveys the idea fully. In the book I give hundreds of examples of such systems, so the reader can get a good range of concretes fixed in his mind before trying to understand what systems will or won't obey the decline law. This is an inductive argument, and so I'd like a little leeway in laying out the evidence before attempting the actual induction. How is the concept of entropy applicable outside of thermodynamics? To be clear, in this area I claim relatively little originality. 
I lean heavily on proofs established by Jaynes himself, starting 50 years ago. I also cite a huge range of applied work that has been done, particularly in the last decade. Maximum entropy methods have been used already in a variety of fields, outside of thermodynamics -- for example, distribution of wave heights in a disturbed body of water. There's an economics textbook, Maximum Entropy Econometrics: Robust Estimation with Limited Data, and work in species abundance in biology, and a variety of other niche applications. So the necessary arguments and mathematical formalisms are out there for use of entropy in diverse fields. It's not Boltzmann entropy, it's information entropy, it relates to what we are able to know about the behavior of a given system. Unfortunately the Wikipedia article on Jaynes is terribly short and uninformative, his essays are hard to find, and his textbook (published posthumously) is kind of daunting to laymen. There aren't good comprehensive sources for laymen on the power and applicability of the principle of maximum entropy. That's a major symptom of the problem. No one thinks it is important to explain in a broad, accessible way. I address the history of the concept of entropy, and Jaynes' key innovations, in several chapters in the book. I will delve into those issues here as well, once I have accomplished some other things. What I am doing is putting all those applications, and some more of my own invention, in one volume, written for the layman and/or scientists who want to understand how the principle is applied outside their own fields. I'm trying very hard NOT to bury the reader in equations, but to tell the story with words and graphs and simple, tangible examples. The equations do exist, but an over-emphasis on mathematical formalisms too early in the discussion tends to leave most people behind. Yes, let's focus on that for the moment. Next weird development coming up in my next post. 
EDIT: I originally wrote the cumulative average would "rise" instead of "fall". The cumulative total rises, the average falls.
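The worked values in this post (i = 1, 2, 10) are easy to check, and the same formula shows why "the second half of the set is smaller" follows as a derived property. A minimal sketch, assuming the cumulative average follows i^a exactly with a = -0.30576; the function names are illustrative.

```python
MODEL_EXPONENT = -0.30576

def relative_cumulative_average(i, a=MODEL_EXPONENT):
    """Cumulative average of the first i items, relative to the first item."""
    return i ** a

def relative_cumulative_total(i, a=MODEL_EXPONENT):
    """Cumulative total of the first i items: i times the cumulative average."""
    return i * relative_cumulative_average(i, a)

for i in (1, 2, 10):
    print(i, round(relative_cumulative_average(i), 3))

# Derived property: split a set of n items in half and compare the two totals.
n = 1000  # any even n gives the same ratio under an exact power law
first_half = relative_cumulative_total(n // 2)
second_half = relative_cumulative_total(n) - first_half
half_ratio = second_half / first_half

print("second half / first half:", round(half_ratio, 3))
```

With this exponent the second half of any such set totals roughly 0.618 of the first half, whatever the size of the set, while the cumulative total itself keeps rising.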
  25. Yes, but a new episode of House was about to come on. Even science has its shrines and rituals.