Within The Angel of Darkness we meet one of Stevie Taggert’s friends, the endearing orphan and petty criminal Hickie the Hun, who allows Stevie to borrow one of his many animals to assist the team during their investigation. Hickie had originally trained the animal, a ferret named Mike, to locate specific scents in order to help him commit his robberies. When Hickie drops Mike off at Dr. Kreizler’s house, the Doctor is impressed by Hickie’s “homegrown methods of animal training” and jokingly suggests that the famous Russian physiologist, Ivan Pavlov, would benefit from talking to Hickie about his training methods. In Part One of the Hickie the Hun’s Homespun Behaviorism series, we examined how Pavlov’s research into animal learning in the 1890s related to Hickie’s training methods, and established that although Pavlov would indeed have been fascinated to learn about Hickie’s methods just as Dr. Kreizler suggested, the learning Hickie was employing was ultimately of a different form to what Pavlov investigated with his conditioning research. As a result, the second half of this two-part blog series will feature another researcher, this time a young American psychologist, who would go on to become famous just one year following the events of The Angel of Darkness when he published the first formal research into the same type of learning Hickie had been employing. However, before we go into more detail, let’s take another look back at Stevie’s description of Hickie’s training methodology:
The Angel of Darkness, 210:
Would Mike be able to detect if the person was in fact in the house, and find the right room? Indeed he would, Hickie said; in fact, it would be a breeze, compared to some of the jobs Mike’d handled in the past. Then I asked about the training, and was surprised to learn how simple it would be: all I’d need would be a piece of clothing from the person I was looking for, the more intimate the better, as it would be that much more steeped in the person’s scent. Mike was already so well trained that when he began to connect a particular object or smell with his feeding, he quickly got the idea that he was supposed to find something that looked or smelled the same; only a couple of days would be needed to get him ready.
As we discussed in Part One, we can see from this extract that Hickie was using meat as a means of rewarding Mike for performing a desired behaviour, a technique that would come to be known as positive reinforcement from the 1930s onwards when another renowned American psychologist, Burrhus F. Skinner, established operant conditioning as the other half of ‘behaviourism’, a field of psychology John B. Watson had popularised in the 1910s on the basis of Pavlov’s classical (or Pavlovian) conditioning. However, three decades prior to Skinner, another American psychologist, Edward Lee Thorndike, was conducting the research that would form the foundation of Skinner’s work. Thanks to Thorndike’s innovative methods, his research would take the world of psychology by storm when it was first published in 1898, and within the year he would have publications in the prestigious generalist journal Science and the equally prestigious specialist journal Psychological Review, and would be invited to present his work at both the New York Academy of Sciences and the annual meeting of the American Psychological Association. Little did Hickie the Hun know that he had anticipated what would become one of the most important and revolutionary ideas in animal and human learning — no wonder Dr. Kreizler was impressed!
Who was Edward Lee Thorndike?
As with Pavlov, Edward Lee Thorndike, born on August 31, 1874, was the son of a clergyman. He grew up in Massachusetts in an unusually gifted academic family. His two brothers would go on to professorships at Columbia University, and his sister would become a high-school teacher. Edward, similarly bright, consistently topped all his high school courses, and went on to receive his Bachelor’s degree from Wesleyan University in 1895, with the distinction of achieving the highest average grade at the university in fifty years. Having read and enjoyed William James’ Principles of Psychology during his undergraduate years (a text readers of The Alienist should recognise as the primary source Dr. Kreizler required the team to read at the start of the Beecham investigation), Thorndike decided to enroll at Harvard in order to work with James for his graduate study.
Although Thorndike was not destined to become the household name that Pavlov became, he had an extremely successful and productive career as a research psychologist, with an output that included over 500 books, monographs, and journal articles in a variety of fields ranging from educational psychology (his primary area of interest) to child development, language acquisition, and social psychology. However, within today’s post we will be focusing primarily on the graduate work he carried out at Harvard and Columbia on animal learning. This work began shortly after Thorndike arrived at Harvard. He was awarded a second Bachelor’s degree the same year he arrived, and still working with James, he chose to research “the instinctive and intelligent behaviour of chickens” for his Master’s degree. Later in his career, Thorndike would claim that he had no particular interest in animals, and that his motivation for this work had been “chiefly to satisfy requirements for courses and degrees”. Nonetheless, his research into how chickens learn laid the foundation for the doctoral work that would go on to have a profound impact on psychology for years to come.
For his chicken research, Thorndike obtained a batch of chicks that he initially housed in his private rooms due to a lack of laboratory space on campus, but then moved to the cellar in James’ home when his landlady discovered their presence. In James’ cellar, he built a rudimentary maze out of books that contained four alleys: one of the alleys led to an enclosure containing food, water, and the companionship of other chicks, while the other three led to dead ends. The experiment consisted of individual chicks being placed in the maze on multiple trials and their reactions being recorded. He found that when the chicks were initially placed in the maze, they would blunder around blindly until, through trial and error, they eventually found the alley that led to the enclosure where food and the other chicks were present. When the same chick was placed in the maze on subsequent occasions, it would slowly get better at finding its way to the exit, but he observed that there was never an ‘ah ha!’ moment after which the chick knew the correct route and found it on every subsequent trial. This observation led Thorndike to conclude that it was not logical reasoning or insight at work in the chicks, but a simpler process:
The chick, when confronted by loneliness and confining walls, responds by those acts which in similar situations in nature would be likely to free him. Some one of these acts leads him to the successful act, and the resulting pleasure stamps it in. Absence of pleasure stamps all others out.
Thorndike was describing a principle he would eventually go on to call the Law of Effect (which he would also describe as the “law of acquired brain connections”). That is, when a situation (S) produces a variety of responses (R), and one response is followed by pleasure or satisfaction, the pleasure or satisfaction “stamps in” (reinforces) a connection or bond between the situation and the response (S-R bond). On the other hand, unpleasant or dissatisfying consequences weaken a connection. Thus, when the same situation reoccurs and the response associated with pleasure is obtained again, the connection or bond is strengthened and the pleasurable response is more likely to be selected in the future. In keeping with the resurgent Darwinism of his time, it is worth noting that the Law of Effect was essentially a Darwinian concept: an argument that adaptive behaviour is selected on the basis of its satisfying consequences.
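To make the “stamping in” idea concrete, here is a toy simulation of a chick in a four-alley maze. It is only a rough sketch under stated assumptions, not Thorndike’s own model: the numeric bond strengths, the `stamp_in` and `stamp_out` parameters, and the rule that an alley is chosen with probability proportional to its bond strength are all invented for illustration.

```python
import random


def run_trials(n_trials=40, n_alleys=4, correct=0,
               stamp_in=0.5, stamp_out=0.9, seed=0):
    """Toy Law of Effect simulation (all parameters are hypothetical).

    Returns the final S-R bond strengths and the number of wrong
    alleys entered on each trial.
    """
    rng = random.Random(seed)
    # Every response starts with the same bond strength to the situation.
    strengths = [1.0] * n_alleys
    errors_per_trial = []

    for _ in range(n_trials):
        errors = 0
        while True:
            # Assumption: stronger bonds are proportionally more likely
            # to be acted on.
            choice = rng.choices(range(n_alleys), weights=strengths)[0]
            if choice == correct:
                # Satisfaction "stamps in" the successful response.
                strengths[choice] += stamp_in
                break
            # Absence of satisfaction weakens the unsuccessful response.
            strengths[choice] *= stamp_out
            errors += 1
        errors_per_trial.append(errors)

    return strengths, errors_per_trial


if __name__ == "__main__":
    strengths, errors = run_trials()
    print("Final bond strengths:", strengths)
    print("Wrong turns per trial:", errors)
```

Because the rewarded bond grows a little on every trial while the others can only shrink, the correct alley becomes steadily more likely to be chosen without any single trial on which the route is suddenly “known”. The per-trial error counts depend on the random seed, so they are best read as one possible learning curve rather than a prediction.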
The reason this principle was considered revolutionary can be traced primarily to methodology. Thorndike’s maze study represented the first deliberate attempt at examining animal learning using a carefully controlled experimental procedure. Like Pavlov, Thorndike had little patience with the introspective methods that had been the mainstay of psychological investigation throughout the nineteenth century, instead believing that psychological theory should be driven by objective observations obtained in laboratory settings rather than abstract philosophy. In animal learning, for example, most biologists and psychologists prior to Thorndike had relied on methods such as “introspection by analogy” whereby the investigators would ask themselves what they would do if they were the animal under observation in a variety of circumstances. Thorndike’s maze study, on the other hand, provided a means to obtain precise recordings of the length of time it took chicks to successfully navigate the maze on each trial, thereby providing objective evidence for his finding that learning takes place slowly over multiple trials, while also providing tight control of extraneous factors such as the stimuli chicks were exposed to in the maze.
Following his initial maze study, Thorndike transferred to Columbia University to complete his PhD with James McKeen Cattell after a personal setback (having a proposal of marriage rejected) made him desirous to leave the Massachusetts region. Once at Columbia, Thorndike threw himself into his work and was awarded his PhD within the year. To follow up his research with chickens, Thorndike devised his most famous animal learning experiments in which he created “puzzle boxes” that cats and dogs had to learn to escape from in order to obtain a food reward. From fruit and vegetable crates, he created a total of fifteen puzzle boxes for cats and nine puzzle boxes for dogs. Each box was only just large enough to fit the animal, and each had a different method of escape. In one box, for example, a cat could open a door by pulling on a wire loop. In another, the door could be opened by pressing a lever. In yet another, the door could be opened by pressing a lever, pulling on a string, and pushing a bar up and down. In other boxes, manipulating the box itself was inadequate to elicit escape; instead, the animal would need to perform some deliberate behaviour such as licking itself. As with the chicks in the maze, Thorndike recorded the time it took animals to escape from the box, which he graphed as “time curves”, and he made detailed notes on how the animals’ behaviour changed over repeated trials in the same box as well as how effectively learned responses transferred to new boxes.
Regardless of species, Thorndike found that an animal’s behaviour upon being placed in a box for the first time would start out chaotic and random as it tried to escape (some of his notes include cases of cats trying to squeeze out between the slats in the box, and so on) until the animal eventually came across a solution by accident. Although he once again failed to observe any ‘ah ha!’ moments from the animals that would suggest logical reasoning or insight, he observed that if an animal was placed in the same box multiple times, its behaviour slowly became more efficient and deliberate. That is, the frequency of effective behaviours increased while the frequency of ineffective behaviours started to drop off. Consequently, as the number of trials in the same box increased, the time it took the animal to escape decreased. In addition to finding support for the observations he made during his earlier research with chicks, he also observed that animals could not learn to escape from a box by watching an experienced animal escape from the box, or by being manipulated by Thorndike to perform the correct response (e.g., by having Thorndike place an animal’s paw in the loop that opened the door). The animals did, however, demonstrate generalisation: if an animal had learned to successfully open the door to one box by pressing a lever, it would try to perform similar behaviours upon being placed in a different box.
On the basis of his work in animal learning, Thorndike formulated his theory of connectionism that comprised the Law of Effect and the Law of Exercise. Related to the Law of Effect, the Law of Exercise stated that “a response will be more strongly connected to a stimulus in proportion to the number of times it has been connected with that situation and to the average vigour and duration of the connection”. Although he had conducted his initial research on animals, it should be noted that Thorndike did not limit his theory of connectionism to non-human animals. Indeed, in his 1905 publication, Elements of Psychology, he stated:
Complex as human life is, it is at bottom explainable by a few principles … it has been shown that in great measure the intellects and characters of men are explainable by a single law [the law of effect].
Thorndike ultimately believed that while some S-R bonds, or neurobiological connections, are established prior to birth in healthy animals (including humans), most are acquired through experiences after birth. When Thorndike’s doctoral thesis, Animal Intelligence: An Experimental Study of Associative Processes in Animals, was published in 1898, his findings skyrocketed him to the top of his profession. As described in Hunt’s The Story of Psychology, the thesis:
…lent new, research-based meaning to old philosophic notions of associationism; it provided convincing support for C. Lloyd Morgan’s dictum against assuming higher mental functions if lower ones could explain behaviour; and it established animal experimentation as the pattern for most learning research for the next half century.
Reward vs. Punishment
Much later in his career, in the early 1930s, Thorndike made a brief return to animal learning after having spent several decades examining questions related to human learning in fields such as educational psychology. He made the return to animal learning in order to study the processes involved in strengthening and weakening the S-R bonds proposed in his theory of connectionism. Specifically, he wanted to determine if unpleasant or dissatisfying consequences (punishment) produced a weakening of S-R bonds that was symmetrical in magnitude to the strengthening produced by pleasurable or satisfying consequences (reward). In other words, Thorndike was examining the relative benefits and costs of reinforcement and punishment in what B. F. Skinner would also study and come to call operant conditioning only a few years later.
In 1932, Thorndike ran a second study with chicks in which he constructed yet another maze, this time with three alleys. As with his first maze study, one of the alleys led to food, water, and the companionship of other chicks for a duration of one minute: the reward. The other two alleys, on the other hand, led to confinement in isolation for a duration of thirty seconds: the punishment. As a result of this study, Thorndike concluded that reward strengthened an S-R bond while punishment had relatively little effect on the bond. He then followed this study up with a number of human experiments in which “right” and “wrong” pronouncements (reward and punishment, respectively) were provided to participants after they responded with a word or a number to long lists of words. Once again, participants were likely to repeat a response if they had previously been told the response was “right”. On the other hand, being told “wrong” did not appear to decrease the likelihood of making that response.
Although there were a number of problems with the design of these experiments relating to preliminary training provided to the subjects and issues regarding whether the rewards and punishments employed were of equivalent magnitude (e.g., was one minute of companionship and food for chicks as a reward equivalent to thirty seconds of isolation as punishment?), Thorndike concluded that reward produced reliable strengthening effects on the S-R bond while punishment had minimal effect. Skinner’s early work in the 1930s would also support these conclusions, although in later years he would observe that if the severity of punishment was adequate, it could be just as effective in weakening a response as reward was in reinforcing a response. Nevertheless, both Thorndike and Skinner advised that reward, or reinforcement, was the most effective means of producing a learned change in behaviour.
How Does Thorndike’s Work Relate To Hickie’s Training Methods?
Returning to Hickie the Hun’s training methods, we should now be able to see plainly how Hickie’s method of pairing a scent with something pleasurable (meat) corresponds to a form of learning that relates directly to Thorndike’s original Law of Effect. That is, when Hickie would place Mike the ferret in a situation in which he was required to locate a desired scent (a successful response), the ferret would receive a pleasurable stimulus or reward (meat) intended to “stamp in”, or reinforce, the bond between the situation and the response. Hickie may have been ahead of his time in developing an effective form of training using reward-based learning, or positive reinforcement, without having any formal background in animal behaviour, but he was not alone in seeing the benefits of employing such methods in animal training.
From the time operant conditioning was popularised by Skinner in the 1940s and ’50s, it has been applied to both wild and domestic animal training through to the present day. Skinner even proposed a “Project Pigeon” during World War II in which he trained pigeons to guide missiles to enemy targets by pecking screens located in the nose of a missile. Although Skinner did successfully train his pigeons to complete the task, and the National Defense Research Committee provided $25,000 to fund the research, the military ultimately didn’t adopt the approach due to a fundamental lack of trust in the pigeons’ abilities no matter how reliable they proved to be during training. It was a frustrating outcome for Skinner, but a fortunate escape for the pigeons involved!
- Burnham, J. C. (1972). Thorndike’s puzzle boxes. Journal of the History of the Behavioral Sciences, 8(2), 159-167.
- Catania, A. C. (1999). Thorndike’s legacy: Learning, selection, and the law of effect. Journal of the Experimental Analysis of Behavior, 72(3), 425-428.
- Chance, P. (1999). Thorndike’s puzzle boxes and the origins of the experimental analysis of behavior. Journal of the Experimental Analysis of Behavior, 72(3), 433-440.
- Donahoe, J. W. (1999). Edward L. Thorndike: The selectionist connectionist. Journal of the Experimental Analysis of Behavior, 72(3), 451-454.
- Goodenough, F. L. (1950). Edward Lee Thorndike: 1874-1949. The American Journal of Psychology, 63(2), 291-301.
- Hearst, E. (1999). After the puzzle boxes: Thorndike in the 20th century. Journal of the Experimental Analysis of Behavior, 72(3), 441-446.
- Hunt, M. (2007). The Story of Psychology. New York, USA: Anchor Books.
- Lilienfeld, S. O., et al. (2012). Psychology: From Inquiry to Understanding. New South Wales, Australia: Pearson Australia.
- Nevin, J. A. (1999). Analyzing Thorndike’s law of effect: The question of stimulus-response bonds. Journal of the Experimental Analysis of Behavior, 72(3), 447-450.