This is part 2 in a sequence… part 1, the intro, is found on Feb 8.
The subject is: different ways to solve the puzzle of “What do I do when my animal makes a mistake?”, taking into consideration what behavior my animal should do then and how I in turn reinforce that behavior. Using the LRS is one of the possible solutions.
Least Reinforcing Stimulus
The LRS, as described by Steve Aibel, Sea World San Antonio, at the ORCA conference this weekend, is a beautiful example of this type of consequence. Thoughts about the LRS are what spurred this blog post, which is why I’ll discuss it first 🙂
In the LRS, the consequence is that nothing happens in the environment. The mistake has no effect on the environment. The animal notices that the expected click/whistle isn’t occurring, goes to the trainer (or stays with the trainer), and waits calmly. A brief description can be found by scrolling to the end of Sea World’s web page: http://www.seaworld.org/animal-info/info-books/training/application-of-philosophy.htm
A training plan for the LRS might look like this:
First, train a “default behavior”. For the dolphin or whale out in the pool, that default behavior is “come to the trainer and wait calmly in front of him for a few seconds”. If the animal is already close to the trainer, the behavior will be just “wait calmly”. This default behavior is of course built in small approximations.
Next, put the “default behavior” on cue. When should the animal do the “default behavior”? Whenever he doesn’t know what to do! That is, whenever he doesn’t get another cue (like a click/whistle, which is a cue for “come get your reward”), or some other cue telling him something else to do to earn reinforcement. So the trainer does nothing (alters nothing), the animal comes and waits calmly, and the trainer rewards that.
Now this can be put to use in a situation where the animal makes a mistake. Say the trainer gives a cue, the animal does something else, and the trainer doesn’t click/whistle. Now what? The animal comes and waits calmly! And that behavior, in turn, is reinforced – maybe with a cue to another behavior, maybe with a direct reward.
A little aside on how to reinforce the default behavior when it occurs after a mistake: as I understand from Ken Ramirez’ teaching, he either gives a cue for a simple behavior (one he is sure the animal will do correctly) and whistles/clicks + rewards for that, or sometimes just gives a reward for waiting. (As I understand it, Ken doesn’t click/whistle that waiting behavior; he just gives the reward in that situation, reasoning that there’s no specific muscle movement he wishes to mark with a click/whistle.)
On the consequence side, the idea of the LRS is that it’s less reinforcing than the click/whistle + immediate reward – both because the marker is absent, and because the reward is a little bit delayed.
Might the LRS still be reinforcing the “mistake” behavior? It might. That’s just it – it is the LEAST reinforcing, yet still keeps the animal on track and engaged in the game. If it were less reinforcing – if it weren’t a positively taught cue, signaling available rewards for the default response of returning and waiting calmly – various problematic behaviors (frustration, aggression, displacement, scanning the environment for other possible reinforcers) might creep in…
On the antecedent side, since the LRS cues a behavior (come and wait) with a strong reinforcement history and with continuous, varied reinforcement (cue to another behavior, or direct reward), the flow of behavior will go in a nice direction. After the mistake, the animal immediately gets back on track with desired behavior earning reinforcement. That’s the REINFORCING part of the LRS.
In what type of situation can the LRS be an appropriate consequence to a mistake?
In a given context, is there a “default behavior” that I want my animal to go to if I do not reinforce? Have I trained that default behavior, so that my animal will do it if I don’t reinforce? Then an LRS can be appropriate in that context.
For me, in my own dog training, I specifically use the LRS after mistakes following distinct cues, in situations where the default behavior is given by contextual cues. My default behavior is different in different contexts. Not as clear as in dolphin training, I know, but these are a couple of my applications…
Example 1: Heelwork with some tricks.
My dog is heeling nicely, my body language cues “default behavior = heel”.
I say “spin” (A) but my dog just turns her head (B – mistake). I don’t click/reward, and my body language still says “default = heeling” (LRS; C for the mistake, A for default behavior = heeling), so my dog comes back to heeling (B). I reinforce that – maybe after 1 second, maybe a bit later – either by giving a new cue that I’m pretty sure my dog will perform correctly, or by directly rewarding the heeling behavior.
(A little later I might decide to try cueing “spin” again – or, I might decide I need to make adjustments to my “spin on cue from heelwork” training…)
Example 2: In front of me, following cues.
I’ve cued “sit” and my dog is now sitting in front of me, waiting for the next cue (my “rule” is: If I’ve given you a cue and reinforced you for doing the behavior, the default behavior is to hold the position you’re in).
I say “down” (A) but my dog stands up (B). Since I don’t click/reward for that and since the context still says “default = hold the position you’re in” (LRS: C for the mistake, A for default behavior = hold the position you’re in), my dog looks at me and waits. After a second I reinforce the wait by giving a new cue that I’m pretty sure my dog will perform correctly (such as a hand touch).
(A little later I might decide to try cueing “down” again – or, I might decide I need to make adjustments to my “down on cue from sit” training…)
… and when is it not an appropriate tool to use?
In contexts where I don’t have a trained default behavior, or where I do not want my animal to revert to a default behavior in case the expected click/whistle is withheld, the LRS will not be an appropriate tool to use (oh, I just hear Ken Ramirez’ voice talking about tools 🙂 ). Just to give one example: in a “free shaping session” I want my animal to just keep trying until I reinforce or until I give some other cue – so in that setting, the LRS is just the wrong tool to use.
More about some other tools some other day…