Read On: Oct 2016
Reading Time: 7 hours
Rating: 8/10

Summary

In this book, Karen Pryor clearly explains the underlying principles of behavioral training and, through numerous fascinating examples, reveals how this art can be applied to virtually any common situation. And best of all, she tells how to do it without yelling, threats, force, punishment, guilt trips, or shooting the dog.

Notes

A reinforcer is anything that, occurring in conjunction with an act, tends to increase the probability that the act will occur again. You can’t reinforce behavior that is not occurring.
- It doesn’t have to be something that the learner wants. Avoiding something you dislike can be reinforcing too. Negative reinforcement, however, is not the same as punishment. Punishment occurs after the behavior it was supposed to modify, and therefore it can have no effect on that behavior.
Reinforcing too late or too early is ineffective.
If you are using food as reinforcement, it should be as small as you can get away with.
One extremely useful technique with food or any other reinforcement, for animals or people, is the jackpot. The jackpot is a reward that is much bigger, maybe ten times bigger, than the normal reinforcer and one that comes as a surprise to the subject.
A conditioned reinforcer is some initially meaningless signal - a sound, a light, a motion - that is deliberately presented before or during the delivery of a reinforcer. Practical animal training that uses positive reinforcement should almost always begin with the establishment of a conditioned reinforcer. You pair it with food, petting or other real reinforcers. You can tell when the animal has come to recognize your symbol when it visibly startles on perceiving the conditioned reinforcer and begins seeking the real reinforcer.
The click in clicker training constitutes an event marker. It identifies for the trainee exactly what behavior is being reinforced. It also puts control in the hands of the learner. It is also a termination signal.
When training behavior by positive reinforcement, constant reinforcement is needed just in the learning stages. To maintain an already learned behavior, switch to using reinforcement only occasionally, and on a random or unpredictable basis.
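The switch from constant to occasional reinforcement can be sketched in a few lines of Python. This is only an illustration of the idea above, not something from the book: the function name `should_reinforce`, the 20-trial learning stage, and the 30% maintenance probability are all made-up assumptions.

```python
import random

def should_reinforce(trial, learning_trials=20, maintain_prob=0.3, rng=random):
    """Decide whether a correct response on this (0-based) trial earns a reinforcer.

    learning_trials and maintain_prob are illustrative assumptions,
    not values taken from the book.
    """
    if trial < learning_trials:
        # Learning stage: reinforce every correct response.
        return True
    # Maintenance stage: reinforce only occasionally and unpredictably.
    return rng.random() < maintain_prob

# Example: build a 40-trial schedule with a seeded generator.
rng = random.Random(0)
schedule = [should_reinforce(t, rng=rng) for t in range(40)]
print(schedule)
```

The first 20 entries are always `True`; the rest depend on the random draw, which is the point of an unpredictable schedule.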
Once a behavior has been at least partially trained, introduce variations in all the circumstances that do not matter to you to make sure that no accidental conditioning develops.
Shaping consists of taking a very small tendency in the right direction and shifting it, one small step at a time, towards an ultimate goal. Our success or failure in shaping a behavior ultimately depends not upon our shaping expertise but upon our persistence.
There are two aspects to shaping: the methods, that is, the behaviors that are to be developed and the sequence of steps used to develop them; and the principles, or rules governing how, when and why those behaviors are reinforced. Most trainers are concerned almost entirely with method. Principles are generally left to chance or intuition, but their application makes the difference between an adequate teacher and a great one.
Ten laws of shaping:
- Raise criteria in increments small enough that the subject always has a realistic chance of reinforcement. Every time you raise a criterion, you are changing the rules. The subject has to be given the opportunity to discover that though the rules have changed, reinforcers can easily continue to be earned by an increase in exertion.
- Train one aspect of any particular behavior at a time; don’t try to shape for two criteria simultaneously. Practice and repetition are not shaping. If a behavior has more than one attribute, break it down and work on different criteria separately.
- During shaping, put the current level of response onto a variable schedule of reinforcement before adding or raising the criteria.
  - When we train with aversives, we try to correct every mistake or misbehavior. When errors are not corrected (e.g., if we are absent from the scene), the behavior breaks down.
- When introducing a new criterion or aspect of the behavioral skill, temporarily relax the old one. What is once learned is not forgotten, but under the pressure of assimilating new skill levels, old well-learned behaviors sometimes fall apart temporarily.
- Stay ahead of your subject. Plan your shaping program so that if your subject makes a sudden leap forward, you will know what to reinforce next.
- Don’t change trainers in midstream. Everyone’s standards, reaction times, and expectations of progress are slightly different and the net effect for the subject is to lose reinforcers until those differences can be accommodated.
- If one shaping procedure is not eliciting progress, try another.
- Don’t interrupt a training lesson gratuitously; that constitutes a punishment.
- If a learned behavior deteriorates, review the shaping. If a well-trained behavior breaks down, recall the original shaping procedure and go all the way through it very rapidly, reinforcing under the new circumstances and reinforcing just once or twice at each level.
- Quit when you’re ahead. Stop on a good response.
Targeting: You shape the animal to touch his nose to a target - a knob at the end of a pole, or your closed fist. Then by moving the target around and getting the animal to merely go and touch it, you can elicit all kinds of behavior.
Mimicry: Most dogs are not good at learning by observation. A major part of the shaping of behavior of our children takes place through mimicry.
Modeling: Pushing the subject manually through the action we want the subject to learn. The combination of modeling and shaping can often be an effective training behavior.
The difficulty with shaping yourself is that you have to reinforce yourself, and the event is never a surprise.
The single most useful device in self-improvement is record keeping. Perfection might be a long way off, but the curve or the sloping line of the graph in the right direction provides enough motivation to keep going.
When shaping in informal situations in real life, you can do the shaping, but not talk about it. Talking about it ruins it and might escalate the problem, or cause the other person to rebel. If you achieve success, you cannot brag about it later either.
Anything that causes some kind of behavioral response is called a stimulus.
When a behavior is under good stimulus control, it gets done immediately on command.
Conventional trainers start with the cue, before they begin training. Such cues are conditioned negative reinforcers. In operant conditioning, we shape the behavior first. Once the behavior is secure, we shape the offering of the behavior during or right after some particular stimulus.
To introduce the cue, produce the cue just as the behavior is starting, reinforce the completion of the behavior, and then repeat the sequence, at different times, and in different locations, gradually backing up the cue in time, until the cue comes before the behavior starts.
When your dog is ready to learn a new cue, ‘click’ as soon as the ‘sit’ movement starts. This speeds up the movement.
Bringing behavior under stimulus control is not accomplished until the behavior is also extinguished in the absence of the conditioned stimulus. Complete, perfect stimulus control is defined by four conditions:
- The behavior always occurs immediately upon presentation of the conditioned stimulus.
- The behavior never occurs in the absence of the stimulus.
- The behavior never occurs in response to some other stimulus.
- No other behavior occurs in response to the stimulus.
Establishing a second cue for a learned behavior is called transferring the stimulus control. To make a transfer, you present the new stimulus - a voice command, perhaps - first and then the old one - a hand signal, say - and reinforce the response; then you make the old stimulus less and less obvious.
“Fading” the stimulus: Once a stimulus has been learned, it is possible to not only transfer it but also to make it smaller and smaller, until it is barely perceptible.
A physical target can be a very useful type of discriminative stimulus. You can use a target stick to teach a dog to walk nicely in heel position. You can stick the target stick in the ground and teach the dog to go away on cue.
A discriminative stimulus that is a cue for avoiding an aversive event can not only reduce any need for physical control or intervention, it can even suppress behavior in the trainer’s absence. For example, a scented spray.
A very useful technique for getting a prompt response to a discriminative stimulus is the limited hold. You start by estimating the normal interval in which the behavior usually occurs, then you only reinforce behavior that occurs during that interval.
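As a rough sketch of the limited hold, here is one way to express it in Python. The notes only say to estimate the normal interval; using the median of observed response times as that estimate is my own assumption, and the names `estimate_hold` and `reinforce` and the sample latencies are hypothetical.

```python
from statistics import median

def estimate_hold(latencies):
    """Estimate the normal response interval (seconds) from past observations.

    Using the median is an assumption; the book does not prescribe a formula.
    """
    return median(latencies)

def reinforce(latency, hold):
    """Reinforce only responses that arrive within the hold interval."""
    return latency <= hold

# Hypothetical seconds between the cue and the behavior on five past trials.
observed = [4.0, 6.0, 5.0, 9.0, 5.5]
hold = estimate_hold(observed)

print(hold)                  # 5.5
print(reinforce(5.0, hold))  # True: prompt enough, reinforce
print(reinforce(7.0, hold))  # False: too slow, no reinforcer
```

Shortening `hold` a little at a time would be the shaping step: the subject keeps earning reinforcers only by responding more promptly.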
To handle overeager subjects who try to anticipate before the cue happens, use timeouts and do nothing for one full minute.
A discriminative stimulus signals the opportunity for reinforcement, so it becomes a desirable event and a reinforcer in itself. This is the strategy behind behavior chains. The pattern of the sequence is not essential to the nature of a chain. What is essential is that the behaviors in the chain follow each other without a time gap, that they are governed by cues, either from the trainer or from the environment, and that the primary reinforcer occurs at the end of the chain.
- Behavior chains should be trained backward. Start with the last behavior in a chain, make sure it has been learned and that the signal to begin it is recognized, then train the next-to-last one, and so on.
- Behavior chain example: teach frisbee. Dog chases frisbee -> Dog catches frisbee -> Dog brings frisbee back for another throw. Train in reverse order.
- When training begins, the subject might exhibit a prelearning dip and/or tantrum. This might be because the subject is gaining new awareness about what it is doing and realizes that some of its past assumptions are false. It’s a strong indicator that real learning is finally about to take place.
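The backward order for training a chain can be written out as a small Python sketch, using the frisbee example from the list above. The function name `backward_training_plan` is hypothetical, and practicing each newly added step together with the already-learned steps that follow it is my reading of the notes, not a quote from the book.

```python
def backward_training_plan(chain):
    """Return the training stages for a behavior chain, last behavior first.

    Each stage starts at the newly added behavior and runs to the end of
    the chain, so every repetition ends at the final, reinforced behavior.
    (Rehearsing the learned tail at each stage is an assumption.)
    """
    plan = []
    for start in range(len(chain) - 1, -1, -1):
        plan.append(chain[start:])
    return plan

frisbee = ["chase frisbee", "catch frisbee", "bring frisbee back"]

for number, stage in enumerate(backward_training_plan(frisbee), start=1):
    print(f"Stage {number}: " + " -> ".join(stage))
# Stage 1: bring frisbee back
# Stage 2: catch frisbee -> bring frisbee back
# Stage 3: chase frisbee -> catch frisbee -> bring frisbee back
```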
People who have a disciplined understanding of stimulus control avoid giving needless instructions, unreasonable or incomprehensible commands, or orders that can’t be obeyed. Good stimulus control is nothing more than true communication - honest, fair communication. It is the most complex, difficult and elegant aspect of training with positive reinforcement.
Training methods to get rid of unwanted behaviors
1. Shoot the Animal: This is pretty severe, but always works, and is appropriate only when the offense is too major to endure and seems unlikely to be easily modified. This method teaches the subject nothing.
2. Punishment: This is a very common approach but rarely works. The punishment doesn’t usually work because it does not coincide with the undesirable behavior; it occurs afterward. The subject therefore may not connect the punishment to his or her previous deeds. Also the subject learns nothing good. The subject might only learn to not get caught. Repeated punishment leads to resentment in the punished and the punisher. These mental states are not conducive to learning. If you are going to use punishment, you may want to arrange things so that the subject sees the aversive as a consequence of its own acts, and not as something associated with you. Punishment often constitutes revenge, or a tendency to establish and maintain dominance. If you want the subject to alter behavior, it’s a training problem, and you need to be aware of the weaknesses of punishment as a training device.
3. Negative Reinforcement: A negative reinforcer is any unpleasant event or stimulus that can be halted or avoided by changing one’s behavior. Negative reinforcement is an inappropriate teaching mechanism for babies. Babies are born to please, not to obey.
4. Extinction: Behavior that produces no results will probably extinguish. We often accidentally reinforce the behavior we wish would extinguish. For example, parents who give in to whining children. By ignoring the behavior without ignoring the person, you can arrange for many disagreeable displays to extinguish by themselves, because there is no result, good or bad. The behavior has become unproductive. Hostility requires a huge amount of energy, and if it doesn’t work it is usually quickly abandoned.
5. Train an incompatible behavior: Train the subject to perform another behavior physically incompatible with the one you don’t want. For example, training a dog to lie down in the doorway to prevent it from begging for food during meal time. Training an incompatible behavior is quite useful in modifying your own behavior, especially when dealing with emotional states such as grief, anxiety and loneliness.
6. Put the Behavior on cue: Bring the unwanted behavior under the control of a cue, and then never give the cue.
7. Shape the Absence of the Behavior: This is useful when you want the subject to stop doing something.
8. Change the Motivation: Eliminating the motivation for a behavior is often the kindest and most effective method of all. The trick in any circumstance is to identify the motivation, rather than just jump to conclusions. One way to do that is to note what actually helps change the behavior and what doesn’t.

Thoughts

We got a puppy this year (first time dog owners), and have no experience with training dogs. This book was highly recommended to learn the basics for animal (as well as human, to some extent) training, and does a good job at that. Applying the knowledge, however, is the hard part, and is something I have not been very successful at yet. But that’s my shortcoming.

DotNetSurfers

Latish Sehgal's Blog

Book Notes: Don't Shoot the Dog
