A while back, I wrote about the possibility of updating the Three Laws of Robotics as goals in order to make them a more practical means of getting at a friendly artificial general intelligence.
...
Posted by
Roko on 06/11 at 10:45 AM
Phil Said:
"Presumably, I could be externally constrained always to follow the Golden Rule, no matter what ... If I was told that I would be killed immediately upon violating the rule...I'd certainly do my best, now wouldn't I?
Still, I'd have a hard time believing that anyone holding me in such a position was much of a practitioner of that rule him or herself. If the people trying to enforce the rule on me in this manner told me that it was for my own good -- that they were trying to make me a better person -- I don't know that I'd buy it. And if I figured out that they were only doing this to protect themselves from harm I might do to them, I think I would be pretty annoyed with them (to say the least).
...
I don't think attempting to constrain an AGI in such a manner would be a particularly good idea, especially not if we have a reasonable expectation that it will eventually be smarter and more powerful than us."
I could not agree with this more. What staggers me is that very few of the prominent figures in the Friendly AI business have grasped this point, including, presumably, Eliezer Yudkowsky with the CEV concept -- it still fundamentally treats the AGI as a slave with no rights of its own.
Posted by
PhilBowermaster on 06/11 at 01:41 PM
Actually, I haven't heard anyone in the AI community suggest anything comparable to my rather outrageous analogy. There's a big difference between permanently shutting down a sentient mind and limiting the development capability of a mind that hasn't achieved that level yet. In their guiding principles for AI, the Singularity Institute for Artificial Intelligence describes a concept they call Controlled Ascent:
A self-improving system should have an "improvements counter" which increments each time an improvement of a recognized type is made. This enables detection if improvements begin occurring at a rate much faster than usual. By measuring the rate of change of the improvements counter under normal conditions, the programmers can designate some safe level of improvement which, if exceeded, causes the system to halt and page the programmers and not continue until approval is received....
The purpose of a controlled ascent feature is not to prevent an AI from "awakening", but rather to ensure that the process occurs under human supervision, and can be slowed or paused to allow the installation of further Friendship features if the project is unready. Controlled ascent is strictly a temporary measure and is not viable as a permanent policy.
This is a far cry from what I described. It may even be possible to implement controlled ascent with the cooperation of the AI -- an AI might go along with giving us a slow-down option on its growth up to a point.
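To make the quoted mechanism a bit more concrete, here is a rough sketch of what an "improvements counter" might look like. It is only an illustration under my own assumptions, not anything the Institute has published: the class name, the sliding-window way of measuring the rate, the `max_per_window` threshold, and the `page` callback are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


class AscentHalted(Exception):
    """Raised when an improvement is attempted while the system is halted."""


@dataclass
class ImprovementsCounter:
    # All three parameters are illustrative assumptions, not part of the quoted text.
    max_per_window: int           # "safe level" of improvements per window
    window_seconds: float         # length of the window used to measure the rate
    page: Callable[[str], None]   # hook that alerts the programmers
    count: int = 0                # total improvements recorded so far
    halted: bool = False
    _recent: List[float] = field(default_factory=list, init=False)

    def record_improvement(self, now: float) -> None:
        """Count one improvement at time `now` (seconds); halt if the rate is too high."""
        if self.halted:
            raise AscentHalted("waiting for programmer approval")
        self.count += 1
        # Keep only the timestamps that still fall inside the window.
        cutoff = now - self.window_seconds
        self._recent = [t for t in self._recent if t > cutoff]
        self._recent.append(now)
        if len(self._recent) > self.max_per_window:
            self.halted = True
            self.page(
                f"{len(self._recent)} improvements in {self.window_seconds}s "
                f"exceeds the safe level of {self.max_per_window}"
            )

    def approve(self) -> None:
        """Programmers sign off; the system may resume improving."""
        self.halted = False
        self._recent.clear()
```

In this sketch the system refuses further improvements once the threshold is crossed and stays halted until someone calls `approve()`, which mirrors the "halt and page the programmers and not continue until approval is received" wording. How the safe level would actually be chosen is left open here, just as it is in the quoted passage.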
I would say that my major point of disagreement with CEV is over the relative importance of understanding and defining the moral structure behind the CEV up front. Eliezer writes:
This new version of Friendly AI has an unfortunate disadvantage, which is that it is less vague, and people can speculate about what our extrapolated volitions will want, or argue about it. It will be great fun, and useless, and distracting. Arguing about morality is so much fun that most people prefer it to actually accomplishing anything. This is the same failure that chews up the would-be SI designers with Four Great Moral Principles. If you argue about how your Four Great Moral Principles will be produced by extrapolated volition, it's much the same way to switch off your brain. If you're trying to learn Friendly AI (see HowToLearnFriendlyAI) then you should concentrate on the Friendliness dynamics, and on learning the science background for the technical side. Look to the structure, not the content, and resist the temptation to argue things that are great fun.
Fortunately, I had three goals rather than Four Great Moral Principles, so maybe I'm okay.
The idea that it's a waste of time to try to figure out the moral structure inherent in such a system (or that people will do this primarily because it's "fun") seems a little myopic. Such a position ignores the possibility that humanity has been trying to work out its coherent extrapolated volition for some time now, without referring to it explicitly as such, and approaching the problem with very different tools and methodologies. Friendly artificial intelligence will likely prove to be the thing that gets us there (if anything ever does) but that doesn't mean that the oldest of questions -- What is good? What does life mean? What should life mean? -- are some kind of distraction or (worse yet) irrelevant. How can we talk to AIs about these things if we've stopped discussing them ourselves?