Saturday, April 12, 2008

Individual Behaviour Handout # 8

Learning and Reinforcement

What is learning?
Webster’s Dictionary defines learning as “the act or experience of one that learns; knowledge or skill acquired by instruction or study; modification of a behavioral tendency by experience.”

Learning is often defined as a change in behavior, which is demonstrated by people implementing knowledge, skills, or practices derived from education.

Learning is also defined as ‘the cognitive and physical activity giving rise to a relatively permanent change in knowledge, skills or attitude’.

A fairly standard consensual definition of learning is that "it is a relatively permanent change in behavior that results from practice." However, a few scholars believe that learning implies changes in "capability" or even simple "knowledge" or "understanding", even if these are not manifest in behaviour.

Broadly speaking, learning is:
natural, and life-long
fundamentally personal, yet also social
active and interactive
greatly influenced by organizational factors, including leadership, culture and structures.

Why is it important to understand the learning process?
Theories of learning are important for a number of reasons. First, training designed with an awareness of how people learn is clearly more likely to be effective. Secondly, if learning theories can explain how people initially acquire competence they might also help explain what differentiates excellent from merely competent individuals. Finally, learning theories can also help when considering how work is described. Trainers need valid methods for describing what people at work do in order to identify what skills individuals need to do a job and what opportunities the job provides for development of skills in an individual.

Theories of Learning
Classical conditioning
This theory was propounded by Ivan Pavlov, a Russian physiologist, in the early 20th century. Classical conditioning is a process by which individuals learn to link the information from a neutral stimulus to a stimulus that causes a response. This response may not be under the control of the individual. In the classical conditioning process, an unconditioned stimulus (environmental event) brings out a natural response. Then a neutral environmental event, called a conditioned stimulus, is paired with the unconditioned stimulus that brings out the behaviour. Eventually the conditioned stimulus alone brings out the behaviour, which is then called a conditioned response.

Pavlov demonstrated this with his famous experiment on a dog. He noticed that the dog salivated (unconditioned response) whenever meat powder (unconditioned stimulus) was presented to it. He then paired the meat powder (unconditioned stimulus) with a bell (conditioned stimulus), and the dog salivated (unconditioned response). After this pairing was repeated several times, the dog salivated (conditioned response) whenever the bell (conditioned stimulus) rang, even without the meat powder (unconditioned stimulus).

Based on Stimulus-Response psychology, this theory suggests that learning/conditioning takes place when a Stimulus-Response connection is established.
Unconditioned Stimulus—Unconditioned Response
Conditioned Stimulus—Conditioned Response
Classical conditioning is not widely used in work settings.
Operant Conditioning
B. F. Skinner propounded the ‘Operant Conditioning’ theory. This refers to a process by which individuals learn voluntary behaviours from the consequences of their previous actions. Managers are interested in operant (voluntary) behaviour because they can influence the results of such behaviours. For example, the frequency of a particular behaviour may be increased or decreased by changing its consequences.

Based on Response-Stimulus psychology, there is a strong association between the consequence and the response to a particular stimulus. Learning takes place somewhat like the flow given below:

Stimulus—Response—Consequences—Future Response on the basis of consequence
Social Cognition:
Propounded by Albert Bandura, this theory suggests that learning takes place through the mental processing of information. While individuals learn by being a part of society, they use thought processes to make decisions. People actively process information when they learn. By watching others perform a task, people develop mental pictures of how to perform the task. Observers often learn faster than those who do not observe the behaviour of others because they do not need to unlearn behaviour and can avoid needless and costly errors.
Social cognition has five tools:
Symbolizing: An individual associates a symbol to his future responses.
Forethought: An individual anticipates the consequences and accordingly makes a choice of responses.
Observational: An individual observes others before choosing his/her own responses.
Self-regulatory: An individual controls his/her actions by setting internal standards (aspired levels of performance) and by evaluating the discrepancy between the standard and the performance.
Self-reflective: An individual reflects back on his/her actions and perceptually determines the causes of success or failure and possible measures to improve the quality of responses.

Reinforcement Theory
The one theory of influence almost everyone knows about is reinforcement. It works in a variety of situations, it can be simply applied, and it has just a few basic ideas. In fact, reinforcement theory boils down to a Main Point: Consequences influence behavior.
Think about that for a moment. Consequences influence behavior. It means that people do things because they know other things will follow. Thus, depending upon the type of consequence that follows, people will produce some behaviors and avoid others. Pretty simple. Pretty realistic, too. Reinforcement theory (consequences influence behavior) makes sense.
Principles of Reinforcement
There are three basic principles of this theory. These are the Rules of Consequences. The three Rules describe the logical outcomes, which typically occur after consequences.
Consequences which give Rewards increase a behavior.
Consequences which give Punishments decrease a behavior.
Consequences which give neither Rewards nor Punishments extinguish a behavior.
These Rules provide an excellent blueprint for influence. If you want to increase a behavior (make it more frequent, more intense, more likely), then when the behavior is shown, provide a Consequence of Reward. If you want to decrease a behavior (make it less frequent, less intense, less likely), then when the behavior is shown, provide a Consequence of Punishment. Finally, if you want a behavior to extinguish (disappear, fall out of the behavioral repertoire), then when the behavior is shown, provide no Consequence (ignore the behavior).
Now, the Big Question becomes, "What is a reward?" or "What is a punishment?" The answer is easy.
What is a reward? Anything that increases the behavior.
What is a punishment? Anything that decreases the behavior.
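The functional definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_consequence` and the idea of comparing two behavior counts are assumptions made for the example, not part of the theory itself.

```python
def classify_consequence(count_before, count_after):
    """Label a consequence by its observed effect on a behavior.

    count_before and count_after are hypothetical counts of how often
    the behavior occurred in comparable periods before and after the
    consequence was introduced.
    """
    if count_after > count_before:
        return "reward"        # the behavior increased
    if count_after < count_before:
        return "punishment"    # the behavior decreased
    return "neither"           # no observed change in the behavior


print(classify_consequence(3, 7))
print(classify_consequence(7, 3))
```

The point of the sketch is that the label depends only on what happens to the behavior afterwards, not on what the consequence looks like.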
The Process of Reinforcement
The Rules of Consequence are used in a three-step sequence that defines the process of reinforcement. We can call these steps, When-Do-Get.
Step 1: When in some situation,
Step 2: Do some behavior,
Step 3: Get some consequence.
According to Reinforcement Theory, people learn several things during the process of reinforcement. First, they learn that certain behaviors (Step 2: Do) lead to consequences (Step 3: Get). This is the most obvious application of the Rules of Consequence. A student realizes that if she does well on an assignment (Do), then she will get a Rewarding Consequence of a pretty sticker (Get). Another student discovers that if he speaks out inappropriately (Do), then he will receive the Punishing Consequence of reduced recess time (Get).
But second, and as important, people learn that the Do-Get only works in certain situations (Step 1: When). For example, a child may discover that when she is with her parents (When) and she throws a temper tantrum (Do), she embarrasses them and they give her Rewards such as attention, toys, candy, or whatever (Get). Now when this child hits school and tries this trick, she is cruelly disappointed when the teacher provides a Punishing Consequence rather than a Rewarding Consequence. She soon learns that Tantrum ---> Reward only works When she is with Mom and Dad.
This is simple. When in some situation-Do some behavior-Get a consequence. And there are only three consequences, Rewarding, Punishing, and Ignoring. Let's look at some examples in action.
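The When-Do-Get sequence can be sketched as a small Python simulation of the tantrum example. Everything numeric here is an assumption made for illustration: the step size, the 0.0 to 1.0 "tendency" scale, and the choice to let ignored behavior fade at half the speed of punished behavior are not taken from the theory.

```python
# Hypothetical step size: how much one consequence shifts a tendency.
STEP = 0.2


def update(tendency, consequence):
    """Return the new tendency (0.0 to 1.0) after one When-Do-Get cycle."""
    if consequence == "reward":
        tendency += STEP          # Rewards increase the behavior
    elif consequence == "punish":
        tendency -= STEP          # Punishments decrease the behavior
    elif consequence == "ignore":
        tendency -= STEP / 2      # assumed: ignored behavior fades slowly
    else:
        raise ValueError(f"unknown consequence: {consequence}")
    return min(1.0, max(0.0, tendency))


# Tendencies are stored per (When, Do) pair, so learning is tied to the situation.
tendencies = {("home", "tantrum"): 0.5, ("school", "tantrum"): 0.5}

for _ in range(3):
    tendencies[("home", "tantrum")] = update(tendencies[("home", "tantrum")], "reward")
    tendencies[("school", "tantrum")] = update(tendencies[("school", "tantrum")], "punish")

print(tendencies)
```

After three cycles the same behavior has reached the top of the scale at home and the bottom at school, which is the situation-specific learning described above.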

Limitations of Reinforcement
While Reinforcement Theory is a powerful influence tool, it does have several serious limitations. To use it effectively, you must be aware of these difficulties in application.
1. It is difficult to identify rewards and punishments. As noted earlier in this chapter, reinforcers are identified by their function. Thus, there is no cookbook list of Rewards and Punishments. Candy increases student cooperation, but has no value as payment to a factory worker. Thus, you have to observe your students very carefully to discover the things they find most rewarding or punishing. (See the coach example above.)
And once you do find things that function effectively, you can be seriously disappointed to discover that they lose their value over time. As the students become accustomed to receiving some Reward (say candy or stickers), they may grow bored over time. This is perhaps the greatest challenge for any teacher. Finding good Rewards and Punishments requires a great deal of experience and insight.
2. You must control all sources of reinforcement. Teachers often must compete with the student's peer group. Peers provide an extremely important source of reinforcement, sometimes greater than any Reward or Punishment a teacher can give. The child's parents and family are another source of reinforcement. Teachers sometimes think their reinforcement applications are failing because the teacher is not using the "right" Reward or Punishment. Instead the problem may be that the student wants or needs the reinforcers the peer group offers more than the ones the teacher gives.
3. Internal changes can be difficult to create. One side effect of reinforcement theory is that children learn to perform behaviors we want them to show only when the Get is available. If the Reward is not present, then the child will not show cooperation or good effort or attention or friendliness. The child becomes little more than a well-trained monkey who does a trick, then holds out a hand waiting for the banana. The child has not internalized the behavior but instead requires the full process (When-Do-Get). This means that the teacher must always be running around providing the correct consequences for the desired behaviors at the right time. In such an instance one wonders who is being trained, the teacher or the student.
You should also realize that reinforcement works best with the heuristic thinker ("If I get a Reward, then the thing is good. If I get a Punishment, then the thing is bad."). It does not require systematic thinking. As we discovered in the Dual Process chapter, influence with heuristic thinkers is often short lived and usually situation dependent. The influence lasts only as long as the cue (in this case the Reward or the Punishment) is available. This simply means you need to maintain a steady diet of reinforcement cues to maintain the actions you desire.
4. Punishing is difficult to do well. Punishment is an extremely powerful consequence for all living things. Whether it is a monkey, a pigeon, a child, or an adult, punishing consequences can produce extremely rapid, strong, and memorable changes. The problem is that effective punishment demands certain requirements. The research clearly shows that effective punishment must be: 1) immediate (right now!), 2) intense (the biggest possible stick), 3) unavoidable (there is no escape), and 4) consistent (every time). If you cannot deliver punishment under these conditions, then the punishment is likely to fail.
Thus, the best punishment would be something like this. A kid does the Bad Thing, then: the kid is instantly placed in a dark room filled with snakes and bugs and jungly vines while weird and frightening voices shriek, "Don't do the Bad Thing, Don't do the Bad Thing." And as soon as the kid stops doing the Bad Thing, bang, the kid is back in class, safe and sound.
While this example is an exaggeration, you get the point. We know that most principals, almost all school boards, and all parents would be against this kind of punishment. Therefore, one of the most powerful aspects of reinforcement is effectively taken away from the teacher. Yet, some teachers persist in using weakened forms of punishment, often with unsuccessful and frustrating effects.
5. Students may come to hate teachers who use punishment. Punishment is, by definition, an aversive, painful consequence. People experience very negative emotional states when they get punished. And, as we learned in the Classical Conditioning chapter, it is very easy to condition emotions. Thus, when a teacher uses punishment, the students will probably feel angry or fearful or hopeless and they will then connect or associate these negative feelings with the source of the punishment, the teacher.
This is not a good state of affairs. As a teacher you want to use influence tools to accomplish important learning goals. If the influence tool produces negative affect for the teacher, the teacher is essentially shooting herself in the foot. Sure, the punishment helps accomplish one goal, but at the same time the punishment is making other goals more difficult to achieve.
6. It is easy to reinforce one pigeon, but a whole flock? Reinforcement theory has been most strongly tested with animals, particularly pigeons. And that research with pigeons has yielded outstanding results. The problem for teachers is this: The research used reinforcement principles on one pigeon at a time. Teachers teach a whole flock. The sheer size of a classroom brings a very difficult dimension into the proper application of reinforcement theory.

Reinforcement at the workplace
In operant conditioning, reinforcement is an increase in the strength of a response following the change in environment immediately following that response. Response strength can be assessed by measures such as the frequency with which the response is made (for example, a pigeon may peck a key more times in the session), or the speed with which it is made (for example, a rat may run a maze faster). The environment change contingent upon the response is called a reinforcer. Reinforcement can only be confirmed retrospectively, as objects, items, food or other potential 'reinforcers' can only be called such by demonstrating increases in behavior after their administration. It is the strength of the response that is reinforced, not the organism.
Types of reinforcement
B.F. Skinner, the researcher who articulated the major theoretical constructs of reinforcement and behaviorism, refused to specify causal origins of reinforcers. Skinner argued that reinforcers are defined by a change in response strength (that is, functionally rather than causally), and that what is a reinforcer to one person may not be to another. Accordingly, activities, foods or items which are generally considered pleasant or enjoyable may not necessarily be reinforcing; they can only be considered so if the behavior that immediately precedes the potential reinforcer increases in similar future situations. If a child receives a cookie when he or she asks for one, and the frequency of 'cookie-requesting behavior' increases, the cookie can be seen as reinforcing 'cookie-requesting behavior'. If, however, cookie-requesting behavior does not increase, the cookie cannot be considered reinforcing. The sole criterion which can determine if an item, activity or food is reinforcing is the change in the probability of a behavior after the administration of a potential reinforcer. Other theories may focus on additional factors such as whether the person expected the strategy to work at some point, but a behavioral theory of reinforcement would focus specifically upon the probability of the behavior.
The study of reinforcement has produced an enormous body of reproducible experimental results. Reinforcement is the central concept and procedure in the experimental analysis of behavior and much of quantitative analysis of behavior.
Positive reinforcement is an increase in the future frequency of a behavior due to the addition of a stimulus immediately following a response. Giving (or adding) food to a dog contingent on its sitting is an example of positive reinforcement (if this results in an increase in the future behavior of the dog sitting).
Negative reinforcement is an increase in the future frequency of a behavior when the consequence is the removal of an aversive stimulus. Turning off (or removing) an annoying song when a child asks their parent is an example of negative reinforcement (if this results in an increase in asking behavior of the child in the future).
Avoidance conditioning is a form of negative reinforcement that occurs when a behavior prevents an aversive stimulus from starting or being applied.
Skinner argued that, although it may appear so, punishment is not the opposite of reinforcement. Rather, it has some other effects in addition to decreasing undesired behavior.
              decreases likelihood of behavior    increases likelihood of behavior
presented     positive punishment                 positive reinforcement
taken away    negative punishment                 negative reinforcement
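The four terms can be read as a two-by-two lookup: whether a stimulus is added or removed after the response, and whether the behavior becomes more or less likely. A minimal Python sketch follows; the function name `name_consequence` and its string arguments are chosen for this illustration only.

```python
def name_consequence(stimulus_change, behavior_change):
    """Name the operant term for one combination.

    stimulus_change: "added" or "removed" (what happens to the stimulus
        after the response)
    behavior_change: "increases" or "decreases" (future likelihood of
        the behavior)
    """
    polarity = {"added": "positive", "removed": "negative"}[stimulus_change]
    kind = {"increases": "reinforcement", "decreases": "punishment"}[behavior_change]
    return f"{polarity} {kind}"


print(name_consequence("added", "increases"))
print(name_consequence("removed", "increases"))
```

Note that "positive" and "negative" describe only whether something is added or removed, not whether the consequence is pleasant.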
Distinguishing "positive" from "negative" can be difficult, and the necessity of the distinction is often debated. For example, in a very warm room, a current of external air serves as positive reinforcement because it is pleasantly cool or negative reinforcement because it removes uncomfortably hot air. Some reinforcement can be simultaneously positive and negative, such as a drug addict taking drugs for the added euphoria and eliminating withdrawal symptoms. Many behavioral psychologists simply refer to reinforcement or punishment—without polarity—to cover all consequent environmental changes.
Primary reinforcers
A primary reinforcer, sometimes called an unconditioned reinforcer, is a stimulus that does not require pairing to function as a reinforcer and most likely has obtained this function through evolution and its role in the species' survival. Examples of primary reinforcers include sleep, food, air, water, and sex. Other primary reinforcers, such as certain drugs, may mimic the effects of other primary reinforcers. While these primary reinforcers are fairly stable through life and across individuals, the reinforcing value of different primary reinforcers varies due to multiple factors (e.g., genetics, experience). Thus, one person may prefer one type of food while another abhors it. Or one person may eat lots of food while another eats very little. So even though food is a primary reinforcer for both individuals, the value of food as a reinforcer differs between them.
Often primary reinforcers shift their reinforcing value temporarily through satiation and deprivation. Food, for example, may cease to be effective as a reinforcer after a certain amount of it has been consumed (satiation). After a period during which the organism does not receive any of the primary reinforcer (deprivation), however, the primary reinforcer may once again regain its effectiveness in increasing response strength.
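As a rough illustration of satiation and deprivation, the toy Python model below lets a reinforcer's value fall as more of it is consumed and recover as time passes without it. The linear form and the parameters `satiation_point` and `recovery_hours` are assumptions invented for this sketch; they are not measured quantities.

```python
def reinforcer_value(units_consumed, hours_deprived,
                     satiation_point=5, recovery_hours=10):
    """Toy estimate (0.0 to 1.0) of how effective a primary reinforcer is.

    satiation_point: hypothetical number of units after which the
        reinforcer has lost all of its value.
    recovery_hours: hypothetical hours of deprivation needed to fully
        restore that value.
    """
    satiation = min(1.0, units_consumed / satiation_point)
    recovery = min(1.0, hours_deprived / recovery_hours)
    # Value drops with satiation and is restored by deprivation.
    return 1.0 - satiation * (1.0 - recovery)


print(reinforcer_value(0, 0))    # nothing consumed yet
print(reinforcer_value(5, 0))    # fully satiated
print(reinforcer_value(5, 10))   # satiated, then fully deprived again
```

Under these assumptions the value starts at its maximum, drops to zero at the satiation point, and returns to its maximum after the full deprivation period.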
Secondary reinforcers
A secondary reinforcer, sometimes called a conditioned reinforcer, is a stimulus or situation that has acquired its function as a reinforcer after pairing with a stimulus which functions as a reinforcer. This stimulus may be a primary reinforcer or another conditioned reinforcer (such as money). An example of a secondary reinforcer would be the sound from a clicker, as used in clicker training. The sound of the clicker has been associated with praise or treats, and subsequently, the sound of the clicker may function as a reinforcer. As with primary reinforcers, an organism can experience satiation and deprivation with secondary reinforcers.
Other reinforcement terms
A generalized reinforcer is a conditioned reinforcer that has obtained the reinforcing function by pairing with many other reinforcers (such as money, a secondary generalized reinforcer).
In reinforcer sampling a potentially reinforcing but unfamiliar stimulus is presented to an organism without regard to any prior behavior. The stimulus may then later be used more effectively in reinforcement.
Socially mediated reinforcement (direct reinforcement) involves the delivery of reinforcement which requires the behavior of another organism.
Reinforcement hierarchy is a list of actions, rank-ordering the most desirable to least desirable consequences that may serve as a reinforcer. A reinforcement hierarchy can be used to determine the relative frequency and desirability of different activities, and is often employed when applying the Premack principle.
Contingent outcomes are more likely to reinforce behavior than non-contingent responses. Contingent outcomes are those directly linked to a causal behavior, such as a light turning on being contingent on flipping a switch. Note that contingent outcomes are not necessary to demonstrate reinforcement, but perceived contingency may increase learning.
Contiguous stimuli are stimuli closely associated by time and space with specific behaviors. They reduce the amount of time needed to learn a behavior while increasing its resistance to extinction. Giving a dog a piece of food immediately after sitting is more contiguous with (and therefore more likely to reinforce) the behavior than a several minute delay in food delivery following the behavior.
Noncontingent reinforcement refers to response-independent delivery of stimuli identified as serving as reinforcers for some behaviors of that organism. However, this typically entails time-based delivery of stimuli identified as maintaining aberrant behavior, which serves to decrease the rate of the target behavior. As no measured behavior is identified as being strengthened, there is controversy surrounding the use of the term noncontingent "reinforcement".
Natural and artificial reinforcement
