What are Punishment and Reinforcement?

This blog is part of a series on training methods and training advice. The other blogs in this series can be found in the links at the bottom.

Most conscientious dog owners have heard a lot about the positive reinforcement revolution. With this blog, I hope to help inform readers about some of the more pernicious but popular misconceptions about training, training advice and professional trainers.

Before I can talk about punishment and reinforcement, I need a definition. For the purposes of this blog, I'm going to define punishment in quite a strict sense. Any feedback applied in an effort to reduce the frequency or magnitude of a particular behaviour will be considered punishment.

For example, if Fido is barking at a guest and you say "No Fido!", according to our definition, that would be punishment because you are providing feedback and trying to reduce the frequency and magnitude of Fido's barking.

In other words, anything you do or withhold to try to lessen or stop a behaviour is punishment. It's a broad definition, but it works, and there's more here to explore. In the training world we have "four quadrants" of reinforcement learning...

The four quadrants of reinforcement learning (source).

As you can see, the language here isn't what we're used to in everyday life. Here, the word "positive" means the dog's handler has introduced something, as shown above the top left square. The word "negative" means something has been taken away. The word "punishment" describes anything meant to decrease the frequency or magnitude of a behaviour (usually something the dog doesn't like), and the word "reinforcement" describes anything meant to increase the frequency or magnitude of a behaviour (usually something the dog likes). Each of these can be applied in response to things the dog does.

The idea is that when good things happen, the dog tries to repeat whatever it was doing just before the good thing happened in order to get that good thing again. The opposite also applies: when something bad happens, the dog tries to avoid doing whatever it was doing right before the bad thing happened. In theory, and if applied consistently, these four tools should let us shape the dog's behaviour in any way we please (there are a lot of hidden caveats and wherefores here, but we'll save those for later).

So let's look at some examples. Please note these are NOT training recommendations. They are only thought experiments illustrating different ways to think about training. Pooch Perfect DOES NOT recommend any of the following as training advice.

Barking at someone:

If you don't like your dog barking at people and you want the barking to stop, you might decide to punish the dog for barking. There are two ways to punish. One is to introduce something new, like reprimanding the dog; the other is to take something away: "No supper until you behave." A reprimand, since you're introducing something new, would be considered positive punishment -- you add something to the situation to decrease the behaviour. Withholding supper takes something away from the situation to decrease the behaviour and is therefore negative punishment. Punishment is usually unpleasant for the dog either way. By distinguishing between positive and negative punishment, we're just specifying how the punishment is being applied.

Walking at heel:

Walking at heel is a behaviour desirable to most dog owners, so how can we train this behaviour? If the dog is doing well and walks at heel, we can introduce something pleasurable into the situation (making this positive), like a treat, to increase the likelihood of the behaviour in the future. This would be positive reinforcement. On the other hand, if the dog is pulling on the lead, that brings a lot of unpleasant tension into the lead and collar. If the dog begins to pull less, the dog is rewarded because the unpleasant tension stops. Something has been removed, that removal is pleasurable and it increases the likelihood of the dog pulling less -- negative reinforcement. All kinds of reinforcement are pleasurable and therefore increase the likelihood of the behaviours occurring again. The only distinction between positive and negative reinforcement is whether something pleasurable has been introduced or something unpleasant has been removed.
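For readers who like to see things laid out as a lookup, the four examples above can be sketched in a few lines of Python. This is only an illustration of the naming scheme, not a training tool: the names Change, Goal and quadrant are made up for this sketch, and the labels on each example are simply the ones used in the text above.

```python
from enum import Enum


class Change(Enum):
    """What the handler does to the situation (hypothetical names)."""
    ADDED = "positive"      # something is introduced
    REMOVED = "negative"    # something is taken away


class Goal(Enum):
    """What the handler wants to happen to the behaviour."""
    INCREASE = "reinforcement"  # more of the behaviour
    DECREASE = "punishment"     # less of the behaviour


def quadrant(change: Change, goal: Goal) -> str:
    """Name the quadrant for a given change and goal."""
    return f"{change.value} {goal.value}"


# The four thought experiments from this post (NOT training advice).
examples = {
    "reprimand for barking": (Change.ADDED, Goal.DECREASE),
    "withholding supper": (Change.REMOVED, Goal.DECREASE),
    "treat for walking at heel": (Change.ADDED, Goal.INCREASE),
    "lead tension stops when pulling eases": (Change.REMOVED, Goal.INCREASE),
}

for description, (change, goal) in examples.items():
    print(f"{description}: {quadrant(change, goal)}")
```

The point of the sketch is that "positive" and "negative" only describe whether something was added or removed, while "reinforcement" and "punishment" only describe the intended effect on the behaviour.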

Seeing these examples, it becomes clear there are many possible ways for conscientious dog owners to address behaviours of all kinds. It always seems a shame to me when pro trainers box themselves into one or another of the quadrants by declaring, "I'm a positive reinforcement trainer!" Doing so unnecessarily limits the training opportunities available, to such an extent that I'm almost certain that in real life there can be no such thing.

From time to time, everyone says "no" to their dog (positive punishment), takes something unpleasant away from their dog (negative reinforcement), or withholds attention, however briefly (negative punishment) -- so-called positive reinforcement trainers included. The only problem is that when they claim an exclusive use of positive reinforcement, these trainers are being less than truthful to their clients ...

But it's not only their public relations that suffer. These trainers are most often unaware they are using all four quadrants and, as a result, can pass on their misunderstandings to their clients. Any decent trainer needs to understand all aspects of training in order to provide good client service!

Pooch Perfect often advocates for positive reinforcement, but when we talk to our clients we always make sure we also identify what other aspects of the four quadrants they're using and how these affect the dog/owner relationship (whether intentionally or unintentionally).

In a future blog, I'll explore more of the reasons punishment training so often becomes problematic.

Let us know if you have any questions, comments or additions -- Contact us!