Friday, June 13, 2025

Does the Arc of History Bend Toward Justice? Outline of an Empirical Test

Overall, on average, do societies improve morally over time? If maybe not in actual behavior, at least in expressed attitudes about right versus wrong?

There's some reason to think so. In many cultures, aggressive warfare was once widely celebrated. Think of all the children named after Alexander the Great. What was he great at? Aggressive warfare is now widely condemned, if still sometimes practiced.

Similarly, slavery is more universally condemned now than in earlier eras. Genocide and mass killing -- apparently celebrated in the historical books of the Bible and considered only a minor blemish on Julius Caesar's record -- are now generally regarded as among the worst of crimes. Women's rights, gay rights, workers' rights, children's rights, civil rights across ethnic and racial lines, the value of self-governance... none are universally practiced, but recognition of their value is more widespread across a variety of world cultures than at many earlier points in history.

An optimistic perspective holds that with increasing education, cross-cultural communication, and a long record of philosophical, ethical, religious, social, and political thought that tests ideas and builds over time, societies slowly bend toward moral truth.

A skeptic might reply: Of course if you accept the mainstream moral views of the current cultural moment, you will tend to regard the mainstream moral views of the current cultural moment as closer to correct than alternative earlier views. That's pretty close to just being an analytic truth. Had you grown up in another time and place, and had you accepted that culture's dominant values, you'd think it's our culture that's off the mark -- whether you embrace ancient Spartan warrior values, the ethos of some particular African hunter-gatherer tribe, Confucian ideals in ancient China, or plantation values in antebellum Virginia. (This is complicated, however, by the perennial human tendency to lament that "kids these days" fall short of some imagined past ideal.)

With this in mind, consider the Random Walk Theory of value change.

For simplicity, imagine that there are twenty-six parameters on which a culture's values can vary, A to Z, each ranging from -1 to +1. For example, one society might value racial egalitarianism at +.8, treating it as a great ethical good, while another might value it at -.3, believing that one ethically ought to favor one's own race. One society might value sexual purity at +.4, considering it important to avoid "impure" practices, while another might treat purity norms as morally neutral aesthetic preferences, 0.

According to Random Walk Theory, these values shift randomly over time. There is no real moral progress. We simply endorse the values that we happen to endorse after so many random steps. Naturally, we will tend to see other value systems as inferior, but that reflects only conformity to currently prevailing trends.

In contrast, the Arc of History Theory holds that on average -- imperfectly and slowly, over long periods of time -- cultural values tend to change for the better. If the objectively best value set is A = .8, B = -.2, C = 0, etc., over time there will be a general tendency to converge toward those values.

Each view comes with empirical commitments that could in principle be tested.

On the Arc of History Theory, suppose that the objectively morally correct value for parameter A is +.8. Cultures starting near +.8 should tend to remain nearby; if they stray, it should be temporary. Cultures starting far away -- say at -.6 -- should tend to move toward +.8, probably not all in one leap but slowly over time, with some hiccups and regressions, for example -.6 to -.4 to -.1 to -.2 to +.2.... In general, we should observe magnetic values and directional trends.

In contrast, if the Random Walk Theory is correct, we should see neither magnetic values nor directional trends. No values should be hard to leave; and any trends should be transient and bidirectional, at least between cultures -- and with sufficient time, probably also within cultures. (Within cultures, trends might have some temporary inertia over decades or centuries.)
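The contrast between the two theories can be made vivid with a toy simulation. Here is a minimal sketch, purely illustrative and not from the original post: a single value parameter bounded in [-1, +1], where the Arc of History case adds a small drift toward a stipulated "objectively best" value (+0.8, by assumption), while the Random Walk case has noise only. The `pull` and `noise` magnitudes are arbitrary choices for illustration.

```python
import random

def simulate(steps, attractor=None, pull=0.05, noise=0.1, start=-0.6, seed=0):
    """Simulate one cultural value parameter bounded in [-1, +1].

    attractor=None  -> Random Walk Theory: pure noise, no moral progress.
    attractor=0.8   -> Arc of History Theory: noisy drift toward the
                       (stipulated) objectively best value.
    """
    rng = random.Random(seed)
    v = start
    path = [v]
    for _ in range(steps):
        # Drift toward the attractor, if any, plus random cultural noise.
        drift = pull * (attractor - v) if attractor is not None else 0.0
        v += drift + rng.uniform(-noise, noise)
        v = max(-1.0, min(1.0, v))  # values stay in bounds
        path.append(v)
    return path

random_walk = simulate(500)            # wanders indefinitely
arc = simulate(500, attractor=0.8)     # hiccups and regressions, but converges
```

On the arc version, trajectories starting at -0.6 show exactly the predicted signature: directional trend toward +0.8 with temporary regressions, then hovering near the attractor ("magnetism"). The random-walk version shows neither.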

It would be difficult to do well, but in principle one could attempt a systematic survey of moral values across a wide variety of cultures and long historical spans -- ideally, multiple centuries or millennia. We could then check for magnetism and directionality.

Do sexual purity norms ebb and flow, or has there been a general cross-cultural trend toward relaxation? Once a society values democratic representation, does that value tend to persist, or are democratic norms not sticky in that way? Once a society rejects the worst kinds of racism, is there a ratcheting effect, with further progress and minimal backsliding?

The optimist in me hopes something like the Arc of History is true. The pessimist in me worries that any such hope is merely the naive self-congratulation we should expect from a random walk.

ETA, 9:53 pm: As Francois Kammerer points out in a social media reply, these aren't exhaustive options. For example, another theory might be Capitalist Dominance, which suggests an arc but not a moral one.

[image of Martin Luther King, Jr., adapted from source; the arc of the moral universe is long but it bends toward justice]

Friday, June 06, 2025

Types and Degrees of Turing Indistinguishability; Thinking and Consciousness

Types and Degrees of Indistinguishability

The Turing test (introduced by Alan Turing in a 1950 article) treats linguistic indistinguishability from a human as sufficient grounds to attribute thought (alternatively, consciousness) to a machine. Indistinguishability, of course, comes in degrees.

In the original setup, a human and a machine, through a text-only interface, each try to convince a human judge that they are human. The machine passes if the judge cannot tell which is which. More broadly, we might say that a machine "passes the Turing test" if its textual responses strike users as sufficiently humanlike to make the distinction difficult.

[Alan Turing in 1952; image source]

Turing tests can be set with a relatively low or high bar. Consider a low-bar test:

* The judges are ordinary users, with no special expertise.
* The interaction is relatively brief -- maybe five minutes.
* The standard of indistinguishability is relaxed -- maybe if 20% of users guess wrong, that suffices.

Contrast that with a high-bar test:

* The judges are experts in distinguishing humans from machines.
* The interaction is relatively long -- an hour or more.
* The standard of indistinguishability is stringent -- if even 55% of judges guess correctly, the machine fails.

The best current language models already pass a low-bar test. But it will be a long time before language models pass this high-bar test, if they ever do. So let's not talk about whether machines do or do not pass "the" Turing test. There is no one Turing test.
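How stringent a given bar really is depends partly on how many judging trials stand behind it. Here is a small illustrative calculation (the judge counts are my own hypothetical numbers, not from the post): an exact one-sided binomial p-value for whether judges' accuracy exceeds chance guessing, showing that the "even 55% correct" standard is remarkably hard to distinguish from coin-flipping without many trials.

```python
from math import comb

def binom_p_value(correct, trials, p=0.5):
    """One-sided p-value: the probability of at least `correct` right
    identifications if judges were merely guessing at chance rate p."""
    return sum(comb(trials, k) * p**k * (1 - p)**(trials - k)
               for k in range(correct, trials + 1))

# With 20 judges, 55% accuracy (11 correct) is statistically
# indistinguishable from chance:
print(binom_p_value(11, 20))    # ~0.41
# Even with 200 judges, 55% accuracy (110 correct) only weakly
# suggests the judges can tell machine from human:
print(binom_p_value(110, 200))  # ~0.09
```

This is one reason the 55% criterion makes the high-bar test so demanding: a machine fails as soon as judges show even a statistically fragile edge over guessing.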

The better question is: What type and degree of Turing-indistinguishability does a machine possess? Indistinguishability to experts or non-experts? Over five minutes or five hours? With what level of reliability?

We might also consider topic-based or tool-relative Turing indistinguishability. A machine might be Turing indistinguishable (to some judges, for some duration, to some standard) when discussing sports and fashion, but not when discussing consciousness, or vice versa. It might fool unaided judges but fail when judges employ AI detection tools.

Turing himself seems to have envisioned a relatively low bar:

I believe that in about fifty years' time it will be possible to programme computers... to make them play the imitation game so well that an **average interrogator** will not have more than **70 per cent** chance of making the right identification after **five minutes** of questioning (Turing 1950, p. 442).

I've bolded Turing's implied standards of judge expertise, indistinguishability threshold, and duration.

What bar should we adopt? That depends on why we care about Turing indistinguishability. For a customer service bot, indistinguishability by ordinary people across a limited topic range for brief interaction might suffice. For an "AI girlfriend", hours of interaction might be expected, with occasional lapses tolerated or even welcomed.

Turing Tests for Real Thinking and Consciousness?

But maybe you're interested in the metaphysics, as I am. Does the machine really think? Is it really conscious? What kind and degree of Turing indistinguishability would establish that?

For thinking, I propose that when it becomes practically unavoidable to treat the machine as if it has a particular set of beliefs and desires that are stable over time, responsive to its environment, and idiosyncratic to its individual state, then we might as well say that it does have beliefs and desires, and that it thinks. (My own theory of belief requires consciousness for full and true belief, but in such a case I don't think it will be practical to insist on this.)

Current language models aren't quite there. Their attitudes lack sufficient stability and idiosyncrasy. But a language model integrated into a functional robot that tracks its environment and has specific goals would be a thinker in this sense. For example: Nursing Bot A thinks the pills are in Drawer 1, but Nursing Bot B, who saw them moved, knows that they're in Drawer 2. Nursing Bot A would rather take the long, safe route than the short, riskier route. We will want to attribute sometimes true, sometimes false environment-tracking beliefs and different stable goal weightings. Belief, desire, and thought attribution will be too useful to avoid.

For consciousness, however, I think we should abandon a Turing test standard.

Note first that it's not realistic to expect any machine ever to pass the very highest bar Turing test. No machine will reliably fool experts who specialize in catching them out, armed with unlimited time and tools, needing to exceed 50% accuracy by only the slimmest margin. To insist on such a high standard is to guarantee that no machine could ever prove itself conscious, contrary to the original spirit of the Turing test.

On the other hand, given enough training and computational power, machines have proven to be amazing mimics of the superficial features of human textual outputs, even without the type of underlying architecture likely to support a meaningful degree of consciousness. So too low a bar is equally unhelpful.

Is there reason to think that we could choose just the right mid-level bar -- high enough to rule out superficial mimicry, low enough not to be a ridiculously unfair standard?

I see no reason to think there must be some "right" level of Turing indistinguishability that reliably tests for consciousness. The past five years of language-model achievements suggest that with clever engineering and ample computational power, superficial fakery might bring a nonconscious machine past any reasonable Turing-like standard.

Turing never suggested that his test was a test of consciousness. Nor should we. Turing indistinguishability has potential applications, as described above. But for assessing consciousness, we'll want to look beyond outward linguistic behavior -- for example, to interior architecture and design history.