Transhuman Goodness is Roko Mijic's virtual soapbox; on these pages you'll find posts about emerging technologies, values, ethics and philosophy, the humanity plus movement, artificial intelligence, and a whole assortment of futurist and humanist topics.

 

Universal Instrumental Values

Yesterday I claimed that there is an objective notion of what is good in life, which I call Universal Instrumental Value. This is a bold claim, and I am going to try to justify it carefully, because my previous posts on this subject have been more of a process of exploration for me than of explanation and justification to my readers.

Any agent who acts in the world to achieve certain goals has to contend with two fundamental facts about the nature of the interaction of an agent with the real world. The first fact is that my desire to achieve some goal ("I want to be in Los Angeles") does not make that desired state happen. In order to impose our goals on the world, we have to manipulate the world, and those manipulations follow a set of rules, called the laws of physics.

This seems trivial, but as far as finding an objective system of ethics is concerned it is very important, and in fact it is a good thing. If it were the case that as soon as I desired some state, the world instantly transformed itself into that state with no side-effects, then there would be no mathematical structure to the set of goal states that an agent could have. In the case of a set of possible goal states with no mathematical structure, i.e. such that there are no objective relations between those goals, there is clearly no objectively best goal. Like elements of an abstract set, goals without relations between them cannot be superior to one another.

But our world is not like this! Goals do have relations between them. Steve Omohundro wrote two papers about the relations between various goals that an agent can have.

The most important relation that goals can have is the following: Goal A is instrumental to Goal B. That is to say, if we first achieve Goal A, then it will be easier to achieve Goal B.

If there were one single goal state that was instrumental to every other goal state, then we would live in a world with a distinguished state, which I might call the good state. From the good state, you can easily get anywhere.
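
A toy way to make this concrete, as a minimal sketch rather than anything taken from Omohundro's papers: treat states as nodes in a weighted directed graph, where each edge weight is the assumed effort of moving from one state to another. The state names, the costs, and the helpers `cheapest_costs` and `find_good_states` are all hypothetical. A state counts as "good" here if every other state can be reached from it within a fixed effort budget.

```python
import heapq

# Hypothetical world: edge weights are the assumed effort to move between states.
WORLD = {
    "broke": {"employed": 5},
    "employed": {"wealthy": 5, "broke": 1},
    "wealthy": {"in_los_angeles": 1, "playing_tennis": 1, "employed": 1, "broke": 1},
    "in_los_angeles": {"broke": 2},
    "playing_tennis": {"broke": 2},
}

def cheapest_costs(world, start):
    """Dijkstra's algorithm: least total effort from `start` to each reachable state."""
    costs = {start: 0}
    frontier = [(0, start)]
    while frontier:
        cost, state = heapq.heappop(frontier)
        if cost > costs.get(state, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for nxt, step in world.get(state, {}).items():
            new_cost = cost + step
            if new_cost < costs.get(nxt, float("inf")):
                costs[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt))
    return costs

def find_good_states(world, budget):
    """States from which every state in the world is reachable within `budget` effort."""
    good = []
    for state in world:
        costs = cheapest_costs(world, state)
        if all(costs.get(other, float("inf")) <= budget for other in world):
            good.append(state)
    return good

print(find_good_states(WORLD, budget=2))
```

In this made-up graph only one state passes the test at a budget of 2. Whether the real world contains such a state is exactly the open question; the sketch only shows what the claim would mean.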

There is a slightly more complex possibility, namely that there is no one good state, but a sequence of better states B1, B2, B3, ... . Each better state is strictly more instrumentally useful than the one before it, and for any other state X, there exists a better state - for example B16 - from which we can easily reach X.
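
One reading of this can be written down precisely. The notation is assumed for illustration and appears in none of the sources: let $S$ be the set of world states, let $c(s, x) \ge 0$ be the effort needed to reach state $x$ from state $s$, and let $\varepsilon$ be the threshold below which reaching a state counts as "easy".

```latex
% Assumed notation: S = set of states, c(s, x) = effort to reach x from s,
% \varepsilon = threshold below which reaching a state counts as "easy".
\begin{align*}
&\text{(improvement)} && \forall n \;\; \forall x \in S:\; c(B_{n+1}, x) \le c(B_n, x),
  \quad \text{with strict inequality for at least one } x, \\
&\text{(coverage)} && \forall X \in S \;\; \exists n:\; c(B_n, X) \le \varepsilon .
\end{align*}
```

The single good state described earlier is the special case in which one state $G$ satisfies $c(G, X) \le \varepsilon$ for every $X$.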

If the world is like this, then, for a very large set of agents, each considered in isolation, the notion of the "right" thing to do will end up being the same.

Richard Hollerith put it well:

"If the goal specifies no time discount rate ... then the initial behavior of a paperclip-maximizing SI will be exactly the same as the initial behavior of any other SI with a goal without a time discount rate -- even if the goal is human happiness or the happiness of sentients."

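Hollerith's point can be illustrated with a small sketch. Everything in it is hypothetical: the state names, the effort values, and the helpers `cheapest_path` and `initial_behavior`. The toy world is built so that every terminal goal lies behind the same early steps, so the convergence is there by construction; the code illustrates the claim rather than supporting it, and it does not model time discounting at all.

```python
import heapq

# Hypothetical world: resources and capability must be acquired before any
# terminal goal becomes reachable. Edge weights are assumed effort.
WORLD = {
    "start": {"gather_resources": 1, "idle": 1},
    "gather_resources": {"build_capability": 1},
    "build_capability": {"paperclips": 3, "human_happiness": 3, "sentient_happiness": 3},
    "idle": {},
}

def cheapest_path(world, start, goal):
    """Dijkstra's algorithm returning the least-effort path, or None if unreachable."""
    frontier = [(0, [start])]
    settled = set()
    while frontier:
        cost, path = heapq.heappop(frontier)
        state = path[-1]
        if state == goal:
            return path
        if state in settled:
            continue
        settled.add(state)
        for nxt, step in world.get(state, {}).items():
            if nxt not in settled:
                heapq.heappush(frontier, (cost + step, path + [nxt]))
    return None

def initial_behavior(world, start, goal, steps=2):
    """The first `steps` moves of the least-effort plan toward `goal`."""
    path = cheapest_path(world, start, goal)
    if path is None:
        return None
    return path[1:1 + steps]

for goal in ("paperclips", "human_happiness", "sentient_happiness"):
    print(goal, initial_behavior(WORLD, "start", goal))
```

All three agents begin with the same two moves and diverge only at the final step, where their terminal goals differ.
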
Almost all arguments advanced in favor of transhumanism are essentially of this form. There are many different things that people want to do, like not get ill, not die, have lots of friends, visit interesting places, play tennis, be happy, etc. There is, in fact, a "good state" for society, from which one can easily get to states where one is engaging in the above activities, and this "good state" is a society with a very high level of technology and a reasonable level of personal freedom for its citizens. Technology contributes very strongly towards getting to more universally useful states.

In fact the above picture is grossly oversimplified, but it conveys the essence of the solution to the moral grounding problem, and it shows quite neatly why transhumanism comes very close to embodying the one distinguished, objectively valuable way of acting.

6 comments:

Carl said...

Roko,

I've worked with Omohundro on those two papers, and you seem to be interpreting them in a very odd way. The desirability of having more material resources, or of survival, etc, is quite contingent on the system's terminal values and circumstances. I see very little advantage for this proposal relative to saying that we should all adopt a utility function that assigns the multiverse infinite or maximum utility automatically, no matter what happens.

"There is a slightly more complex possibility, namely that there is no one good state, but a sequence of better states B1, B2, B3, ... . Each better state is strictly more instrumentally useful than the one before it, and for any other state X, there exists a better state - for example B16 - from which we can easily reach X.

If the world is like this, then a very large collection of agents will end up agreeing on what the "right" thing to do is."

No, because the different agents will have different terminal aims. If Agent X wants to maximize the amount of suffering over pleasure, while Agent Y wants to maximize the amount of pleasure over pain, then X wants agents with X-type terminal values to acquire the capabilities Omohundro discusses while Agent Y wants Y-type agents to do the same. They will prefer that the total capabilities of all agents be less if this better leads to the achievement of their particular ends.

Roko said...

Me: If the world is like this, then a very large collection of agents will end up agreeing on what the "right" thing to do is.

Carl: No, because the different agents will have different terminal aims. If Agent X wants to maximize the amount of suffering over pleasure, while Agent Y wants to maximize the amount of pleasure over pain, then X wants agents with X-type terminal values to acquire the capabilities Omohundro discusses while Agent Y wants Y-type agents to do the same. They will prefer that the total capabilities of all agents be less if this better leads to the achievement of their particular ends.

- ah, it seems that I have introduced an ambiguity into my writing. What I meant was:

If the world is like this, then, for a very large set of agents, each considered in isolation, the notion of the "right" thing to do will end up being the same.

Roko said...

Carl: "The desirability of having more material resources, or of survival, etc, is quite contingent on the system's terminal values and circumstances"

- contingent upon, but still independent of.

Consider the function f(x) = 1.

The output of this function is contingent upon it having an input, but in the end, the actual output is independent of what that input is.

What I'm saying is that you have to have a goal (of a certain consequentialist kind), but once you've settled upon some goal, you will end up doing roughly the same thing as if you'd picked a different goal.

Your example of an AI whose top-level goal is to maximize suffering of humans vs. an AI whose top-level goal is to maximize pleasure of humans is, I believe, of this form.

Coathangrrr said...

"If there were one single goal state that was instrumental to every other goal state, then we would live in a world with a distinguished state, which I might call the good state. From the good state, you can easily get anywhere."

I lost you here. What is a "goal state"?

Again, you seem to be verging on the naturalistic fallacy, but you shy away from it at the last minute. You seem not to make moral or ethical claims, but instead make claims of utility. Such and such is useful and such and such is useful and therefore there is some most useful thing. I don't see that as working logically, and if it does I don't see it being a move towards an objective ethic.

Craig Ewert said...

It is implausible that the same set of B1, B2, ... serves as instrumental for all possible terminal goals X, Y, Z, ...

In order to demonstrate their universality, you'll have to make some demonstration of the Bs that serve some diverse goals.

Roko said...

@ craig: those universally useful goals include: free energy, intelligence, space, matter. See Steve Omohundro's papers on basic AI drives.