You interact with the world, and you want certain results to come from those interactions. I think I've figured out a big piece of this that I have been systematically getting wrong.
I don’t reward agents.
You can think about the world as one big chessboard where different components independently interact with one another. In the chess of life, there are many more squares and pieces than in regular chess. The pieces do different things, the rules for moving them are more complex and change over time, and the squares themselves vary in number and size. Just like chess, only much more complex. In the chess of life, one important thing you really want to do is affect the behavior of intelligent (and not-so-intelligent) agents.
And the psychological literature is very clear on how you do this: you reward good behavior and you punish bad behavior. All life on Earth adapts to rewards and punishments.
Punishing is very tricky. Punishing humans is even trickier. Things can go very wrong, and punishing can easily be morally wrong. I don’t really do punishing. But I figured out that I don’t do rewarding either. I just… interact with people, and since we’re all equals (at least in my conception of the relationship in question), I don’t shape them: I neither punish bad behavior nor reward good behavior.
And this is stupid for a bunch of reasons. One, I have wishes or goals (utilities), but I systematically block my own path to achieving them. Two, I probably do both reward and punish, but only reactively, never as a planned thing. And three, there is no reason not to shape people, even if you’re equals.
I always look for ways of achieving my goals on my own, as if I took no part in the real world of humans. This has the advantage that I learn to act independently, but systematically ignoring a big aspect of human existence is bad: it throws away a lot of good tools for no good reason. Shaping agents through reward is not bad, as long as you don’t do it for bad purposes. It’s not manipulative, in the standard meaning of the word. It’s no more bad or manipulative than bringing up a child is. Shaping agents through punishment isn’t necessarily bad either, but it can go wrong much more easily than rewarding them. (Punishment is great at the level of populations – evolution doesn’t really reward populations, it just “prunes” them. But such pruning doesn’t do much good for the individual.)
This is a conceptual switch for me. Saying “good job”, getting someone some food, patting someone on the back – I guess I would sometimes do these things, either as learned behavior in specific scenarios, or just instinctively. But I never fully generalized this into an entire life philosophy: watch the agents around you, figure out which rewards they would like, and when they do the thing you want them to do, reward them. Do this a thousand times and it compounds like crazy. Suddenly everything you want in your life moves much faster and better, because you have motivated agents putting effort into things. It’s not just you anymore.
This is just one of many conceptual switches I’ve had in the last couple of years. One of the more recent ones was presenting choice architectures with default choices (from the book “Nudge”), instead of either giving someone absolute freedom to decide or deciding for them. That one goes hand in hand with this one, making the full sequence look like this: watch the agents around you, figure out which rewards they would like, present them with choice architectures that default to the choices you want them to make, and when they do something good, reward them.
Is saying “good job” manipulative or inappropriate in a relationship of equals? It isn’t, if you have good intentions. But something seems to bother people when that “good job” is deliberate – even when it is honest. The idea that another agent is deliberately shaping me feels shady: who are you to say “good job” to me, if it’s all part of your master plan to get me doing more of the things you like?
But it’s not malevolent, it’s just deliberate, and it doesn’t take away the other person’s agency… So I think that’s definitely in the realm of OK. (I’d be okay with someone who has good intentions shaping me towards a better version of myself.)
I practiced positive reinforcement in dog training for quite some time, but it still didn’t occur to me that it could be generalized to life as a whole. What other things, like rewarding agents, do I already do or know, but only in specific, closed-off domains? What else could become a more general tool?