Thursday, February 23, 2012

We're all clicker trainers?

A committed clicker trainer friend posted this (not her content, just a link she liked) on Facebook this morning.
An Open Letter to Buck Brannaman

I really encourage yall to go read it if you have time. It's long, but it's really good - I especially like the point she makes about using clicker-type training to teach riders how to ride. In her open letter to Buck, she suggests that he break down the rider's movements and mark when they're doing it right.

For example (I'm making this one up): tell the rider that to steer the horse to the left, she should turn her head left, pick up contact on the left rein, support the horse with the right rein, and use her legs to ask the horse to walk. Then have the rider just concentrate on the head movement, and say "there" whenever she remembers to turn her head. Then add in the inside rein, then the outside rein, etc. "There" is the marker sound to reinforce the behavior, and by breaking it down into component body movements it should be easier for the rider to piece it together.

Sounds familiar, yes? It's similar to how we teach horses. We ask for some component and keep asking til we get it, then we ask for more. Trot in a circle. Ok, good, now trot in a circle with a little actual bend. Good! Now slow that trot down (or extend it, or collect it - whatever you're looking for in your discipline).

Now, the two points I think Gretchen, the blog author, got totally wrong:

One.
The practice [of clicker training] is relatively simple in broad outline, but in detail as complex as the teacher’s knowledge and creativity can make it. Mark and reward what you want, block/ignore/wait out what you don’t. Repeat. Repeat. Repeat again. If you’ve ever shaped another creature’s behavior using those simple steps, you’re a clicker trainer, whether or not you’ve ever touched a clicker. Whether or not you’ve ever given an animal a food treat. Guess what, Buck. You’re a clicker trainer, insulting as that might be for you to hear.

When you artfully channel your green filly’s longing for peace, when you dole it out to her in tiny sips with every well-timed release, would you call that exploitive? I wouldn’t.

Clicker training, by every definition I've ever read, uses positive reinforcement. (Click that link - I go back to it several times.) Positive reinforcement means that the trainer gives something to the subject in order to increase the frequency of a behavior. You give the dog a bit of kibble when she sits. You give your husband a kiss when he loads the dishwasher. You give the horse a treat when she walks calmly past the scary trash can. That's a great tool, and it's extremely effective in all kinds of situations if you use it right, but it's not what most horse owners do most of the time when we interact with our animals.

I might get this wrong - and if so, I hope the clicker nerds who read this will correct me - but I think most human/horse interaction is negative reinforcement ("the taking away of an aversive stimulus to increase certain behavior or response.") An aversive stimulus doesn't have to be harsh to be effective - it just means pressure and release. We all know to release the pressure as soon as the horse does what we want, right? I think that's pretty fundamental to every effective non-clicker form of horsemanship. If you're tapping your horse's butt with a lunge whip to encourage her to load in a trailer, you stop tapping as soon as she starts moving toward the trailer. By taking away that aversive whip stimulus, you've shown her that yes, that's what I want.

Now, look back up at that quote. "When you artfully channel your green filly’s longing for peace, when you dole it out to her in tiny sips with every well-timed release..." Release is not a reward. Rewards are positive reinforcement; releases are negative reinforcement. I'm a big fan of negative reinforcement - but it's not clicker training.

Two. (Emphasis in original.)
The moment in your brief anti-clicker tirade when your ignorance was most glaringly exposed was when you scoffed that a clicker trainer “couldn’t click fast enough” if you put her in a dangerous situation. It would make as little sense to say of one of your students that she couldn’t yank on the bit often or hard enough to survive such a test. The problem wouldn’t lie with the bit, it would lie with the unprepared rider and horse. A clicker trainer uses the clicker to nurture a feel and to establish, refine, and then occasionally maintain specific cued behaviors. If a trainer hadn’t worked hard and long (possibly with the help of a clicker) to get the feel of her horse and to get the behaviors she would need in such a situation solidly on cue, if she hadn’t already established that she could bet her life on her horse responding as he needed to in order to keep them both safe, she would be a suicidal idiot to get the two of them willingly into such a fix. As would any student using your methods.

Um, I actually can't ask for a behavior that's incompatible with the horse trompling me (such as "head down" or "feet still") and click to reward it faster than the horse can run over me (or buck me off, or spin and bolt). Especially if I've only used clicker training - meaning, I've only used positive reinforcement of the behaviors I want to see. I haven't yet met a horse that was exclusively trained using positive reinforcement - and to be honest, I don't ever want to be on the same side of a fence as that horse.

You know what I can do? Scream and wave my fists, if I'm about to get trompled on the ground. Yank the horse's nose to my knee with my instruments of oppression (reins). It might not save me - horses are dangerous! - but it's more likely to save me than asking for a behavior that's incompatible with splattering me.

Let's go back to operant conditioning terms for a second. My panic reactions to my horse trying, inadvertently, to kill me are all positive punishment methods: the adding of an aversive stimulus to decrease a certain behavior or response. If my horse walks right into my bubble like she's forgotten I exist, I will absolutely add an aversive stimulus (yelling, thrashing about with the lead rope, flailing with my arms) to decrease that behavior.

In contrast, if I'm trying to train her to stand in a slightly different spot, I might use pressure/release negative reinforcement: jiggle the lead rope til she backs up with her head at my shoulder. Or I might use clicker training positive reinforcement: when she's standing exactly where I want, make a marker sound that she associates with something pleasant, to indicate that that's where I want her to stand. Both of those methods take repetition, and they'll both get you a horse that's interested in leading and standing in exactly the way you've trained - but I just don't know if that horse wouldn't leap into you to get away from a plastic bag blowing across the yard. That's what positive punishment is for.
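
For the nerds who think better in code, here's a rough sketch of the labels I've been throwing around. It's only a toy: the `quadrant` function and the two little Enum types are names I made up for this example, not anything from a real library. The fourth combination (taking something away to decrease a behavior, usually called negative punishment) isn't one I talked about above; it's just the standard leftover box.

```python
from enum import Enum


class Stimulus(Enum):
    ADDED = "positive"      # something is given or applied
    REMOVED = "negative"    # something is taken away or released


class Effect(Enum):
    INCREASE = "reinforcement"  # the behavior should happen more often
    DECREASE = "punishment"     # the behavior should happen less often


def quadrant(stimulus: Stimulus, effect: Effect) -> str:
    """Name the operant conditioning quadrant (hypothetical helper)."""
    return f"{stimulus.value} {effect.value}"


# Situations taken from this post, labeled the way I labeled them above.
examples = {
    "treat for walking calmly past the trash can": (Stimulus.ADDED, Effect.INCREASE),
    "stop tapping with the whip when she steps toward the trailer": (Stimulus.REMOVED, Effect.INCREASE),
    "yell and flail when she walks into my bubble": (Stimulus.ADDED, Effect.DECREASE),
}

for situation, (stimulus, effect) in examples.items():
    print(f"{situation}: {quadrant(stimulus, effect)}")
```

The point of writing it this way is that "positive" and "negative" only describe whether something is added or removed, not whether it's nice or nasty.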

Additionally: you only become prepared for your horse by doing shit with your horse. I do not know how to get behaviors on cue 100% of the time without exposure to different scenarios, and no one else does either. That's why endurance riders are prepared to eat dirt at the first start line. It's why your barrel horse runs differently away from home. It's why dressage people start out at lower levels than what they train at home. It's why eventers practice jumping so many types of jump - it takes exposure to get the cues right, whether you're using positive or negative reinforcement. Yes, I'm sure a committed and gifted clicker trainer could get the cues right to perform well in any of those scenarios - but positive punishment is there to save you from a world of hurt if it all goes sideways.

What do you think, clicker and traditional people?

15 comments:

  1. You are one of the few people that got the part about negative reinforcement right--most non-psych-trained folks think "punishment". You've done your homework and make several valid points. It's late--I'll have to read the BB post tomorrow.

  2. Didn't read the "original" article yet, but "I" don't believe in clicker training.

    I mean, I believe it works for those people that do it, yet it would never be the way I choose to train a mammal.

    Operant conditioning removes intelligence. Yes, the animal is thinking about "What did I do that made the click happen? I want the click (and treat) again." The animal is focused though, on reward, not on the activity. It is not using its reason (at whatever level it is capable), and honestly half the fun is teaching a horse new things and watching it learn, grow, think for itself and STILL want to be your partner.

    Um, I want a horse that likes jumping because he LIKES JUMPING. That picks up his up-to-now-untrained six year old feet because it has LEARNED that I am trustworthy and safe, and never hurt him. I am not using a frickin' pole to tap and desensitize, or a rope. I am putting my unhelmeted head near his wild-ass feet; I am taking as many days (or weeks) and incremental steps that it takes.

    THAT horse will give his all and do things other than the "reinforced" behaviors because he is a thinking and willing partner, concentrating on what you want and not on a "reward".

  3. Loved the letter, and love your response even more! Your mental cogs were whirling to catch those points! I thought similar to you, only it came out more like "But..but..that's not what we always do and..but.." Thanks for organizing my thoughts for me ;) Good insights.

  4. Really excellent post! Great job of verbalizing the difference between the two types of training. I'm with you all the way...I would not want to be the one interacting with a 100% positive reinforcement trained horse. I ride a pony, and ponies can be little stinkers that need some of the negative reinforcement to make their lives a bit more uncomfortable and get through to them.

  5. Yes, removing pressure is negative reinforcement, and most horse training is just that. Negative = removing. Positive = adding.

    I have a younger horse that I have been doing clicker training with since the day I could convince him that treats were good. I have an older, insulin resistant horse that I have had for many years--he can't have treats. They are both great horses, but guess which one works with joy and enthusiasm during ring work and which one prefers to skip it altogether?

    Sure, Cole learned a few tricks that we do because he has fun with them, but the practical things he learned are priceless. Best of all of them, when I say "whoa," I get an instant response, and he will stand in place until told otherwise. How awesome is that when we are walking down the street and a big truck comes barreling towards us?

    I can go all day about the benefits of clicker training. I taught a horse in pain to allow me to do medical treatments, it kept me from sending the rescue dog we adopted to live with the coyotes and I have a cat that I can direct with a target toy and will jump through a hoop. Anything that works with cats has got to be good!

    I wish more people used clicker training, and it is disappointing that an influential person like Buck should not support a training method that can work so well. I hope he learns about it and reconsiders.

    Judi
    Author of "Trail Training for the Horse and Rider" and "Trail Horse Adventures and Advice"

  6. You are exactly right on the definition of positive and negative reinforcement- I like to tell people to think of math instead of good/bad, it's easier to wrap your head around that way. The aids that we all know and love are negative reinforcement, the trick is to be as tactful as possible when applying them. (I have to admit, I'm only human and I don't apply them as gently as I'd like all the time).

    I haven't used clicker training as well as I should for my horses, I keep coming up against holes and then doing a "DOH" when I realize I could have done x, y, or z way better if I'd taken my time and clicker trained it.

    I hadn't heard about Buck's anti-clicker training rant. I'll have to look into that now.

  7. Well, I am a traditional horseman, and I agree with you 100%. I also admire the logic in your argument. I have never seen a horse trained solely with positive reinforcement, and the horses I have seen that were trained mainly with such methods were very ill-broke by my standards. In my eyes they were objectionably pushy and downright dangerous, under a sort of thug-like friendly veneer. I have never known a truly well-mannered, reliable horse, capable of competing at the upper levels in the sports I competed at (cutting, cowhorse and roping) that was not trained in a more or less traditional fashion (and yes, some folks do use positive AND negative reinforcement, and that seems to work). But I am not a fan of clicker training, for a wide variety of reasons--still, there is more than one way to train a horse and if folks are happy with the results they get that way, then I guess it works for them.

  8. Great post Funder. Haven't checked the Brannaman link yet - sounds like it will be disappointing...

    I believe well trained horses get that way with both positive and negative reinforcement.

    My horse is food motivated and quickly learns whatever action will get him another treat. When too many cookies lead to a pushy horse, he'll get backed up a ways with a stern voice and gesture.

    The "release is the reward" model works very well in the riding context - immediately stop giving the aid when the horse gives a proper response, or an honest try, is how I was taught. I also give pats and verbal praise whenever my horse gives me what I was asking for under saddle.

    Seems to me, with both methods, timing is critical, and fairness. And when under saddle, confidence that you've applied aids correctly in the first place - that you know what you've asked for.

    Overall, isn't a horse who respects our leadership and works harmoniously with us of their own volition, the outcome we want from training? Surely that is the measure of a successful training method...

  9. I use a variation on clicker-training to teach obscure behaviors that won't occur often naturally, such as "stretch your front leg wa-a-a-a-ay out and stay that way until I mount up." I don't actually carry a clicker with me, I just use the word "good." But the whole behavior-shaping thing does work, especially if you are trying to teach something weird and obscure.

    I find that teaching the more immediate stuff, such as "get out of my space" demands direct behavior on my part, rather than just shaping a behavior gradually. If the horse is intent on trompling me (as Fiddle was in early days), I jump up-and-down like a troll. That works--it communicates the message clearly and quickly.

    Behavior shaping tools like clicker-training are terrific when you've got time and focus for teaching a specific skill. In the dressage arena, when Fee brings her back up and stretches down for the bit, I say "good"--the same positive reinforcement that she gets while I'm shaping the behavior to stretch her leg way out. She doesn't get a cookie every time. Intermittent reinforcement is the strongest kind.

    I have trouble with extreme training gimmicks. Clicker training is a tool for training new skills in a controlled place. Natural horsemanship gimmicks include a few excellent tools. But I don't rely on those exclusively, ever--they are far too limiting.

  10. Hi Funder, I read the post on Facebook. I have to admit, I am ignorant re this "clicker" training. But I gather we all do it? Perhaps the misnomer is the word "Clicker". Anyway, I got lost reading it. I am not in any way a technical horse person, nor do I profess to be. I appreciate those that do, it must be great to understand these things. However, I learn what I can, try and understand what I can, I do read Buck's stuff, and have never read anything about clicker training. But that doesn't mean to say either is right or wrong, both may be good. I liked your post tremendously. I thought you had it about right.

  11. Here's a version of the opposite training method - our friend Adwyn says that riding instructors should have remote control zappers for students. When the student does something wrong, they get a zap! Riders would learn so much faster, and instructors wouldn't have to repeat themselves 300 times in one lesson! lol!

  12. Very well written and thoughtful post. I like it! However, I would like to point out a few things, if I may. I will freely admit that I know nothing about Buck Brannaman and I only know a little about clicker training as it is practiced outside of the laboratory. (I studied neuroethology and worked as a behaviorist for a University before "retiring" to start a family and a farm, so I am not just any old quack commenter. I'm a quack commenter who knows lots of big words... Ha!)

    I don't really like the article on positive reinforcement that you linked. It relies far too heavily on B.F. Skinner's original 1950 work. We have made great progress in the study of behavior since then, particularly in identifying the neurologic pathways involved in operant conditioning. There is little evidence that negative and positive reinforcement are any different from a neurologic perspective. Many behaviorists are actually pushing to drop the words "negative" and "positive" completely and simply go with "reinforcement". Both negative and positive reinforcement are reward based. The only difference in the two is the application of the stimulus: Negative = stimulus removed to gain desired behavior, positive = stimulus applied to gain desired behavior. Realistically, arguing negative vs. positive reinforcement is arguing semantics.

    Both negative and positive reinforcement are part of operant conditioning. Operant conditioning is a method by which a behavior is "shaped" (trained) to a stimulus which is irrelevant to the behavior. So, a rat running on a wheel when he hears a bell and a horse knowing to stop when he hears the word "whoa" are both examples of operant conditioning and have been trained by reinforcement, either positive or negative. Applying pressure via yelling, arm flailing or a good smacking to get a horse out of your space is not operant conditioning, as the horse's response is innate: he is biologically programmed to get out of the way of a flailing, yelling, smacking predator. It's the same with the most basic of the riding aids: the horse moves away from pressure. That's an innate response, and is not training. It's not until the advanced work comes in that we begin the operant conditioning (training) with negative and positive reinforcement.

    Clicker training is the use of a "click" as a stimulus. Whether or not it is positive or negative reinforcement is a matter of semantics. In the end, the "click" is no different from the "aids" that conventional trainers use. We are all using a stimulus and a reward to shape a desired behavior.

    In a true emergency, I don't think anyone relies on "training". We go to the innate conditioning of the horse: moving away from pressure, or making the behavior physically impossible. I think that's the biggest problem I have with Brannaman's statement that you "couldn't click fast enough". Of course not! You have to go to the innate behavioral responses in an emergency situation. But, simply by interacting with your horse and establishing a relationship in which he trusts and respects you, by whatever means you favor, you are significantly increasing the likelihood that you can get your horse under control more quickly in an emergency.

  13. These are some high-quality comments, yall. I've been enjoying reading them while I wailed upon installing the hardwood (almost halfway done YAY!). I'm pretty tired so this won't be the most coherent or inclusive comment I've ever made ;)

    Bif - C/T totally has a place - any hands-off training, IMO, like dog agility, is perfect for clicker work. I don't really understand what you're getting at with "the horse is focused on the reward" - maybe with more heavily c/t'd horses, but the way I use it, it's just another tool in my box. Like Aarene, I use c/t to teach "weird" behaviors, or something that Dixie's just not getting with negative reinforcement. Eventually, I fade the treat totally out - I taught her to stand at a mounting block with c/t, but she doesn't even want the treat most of the time now. It's just a behavior she does because I ask her to, same as anything else.

    Shannon - GREAT comment. I think that's what the original blog post I linked was talking about - it's the only thing that makes sense, if you view + or - reinforcement as being pretty much the same thing. I still don't really buy it, that the click is the same as a tap-tap-tap type aid, but I'm definitely still thinking it through.

    I might have another post in me tonight - I've been thinking about this today :)

  14. Hi Funder,

    Thanks again for the link and the thoughtful, thorough response. I've finally had a chance to write further on the R+/R- divide in a new post. It's in line with Shannon's comment, and definitely on the technical side. I don't imagine it'll be of interest to anyone whose oppositional reflex gets excited by all reference to clickers ;-), but I hope you'll find it interesting, and look forward to hearing your thoughts.

    very best,
    Gretchen
