The use of tokens as rewards and tools by chimpanzees (Pan troglodytes)
Claudia Sousa, Tetsuro Matsuzawa
This paper explores the effectiveness of token rewards in keeping chimpanzees (Pan troglodytes) working at intellectually costly tasks, and studies the "saving" behavior of the subjects, investigating the factors that can condition it. Two experiments were run. Tokens were introduced as rewards in a matching-to-sample task and used as exchange tools for food by three adult female chimpanzees. Subjects' performances were maintained at constant high levels of accuracy, suggesting that the tokens were almost equivalent to direct food rewards. The results also showed the emergence of saving behavior. The subjects spontaneously saved the tokens during the matching-to-sample task before exchanging them for food. The chimpanzees also learned a new symbolic discrimination task, with tokens as the reward. During this learning process a rarely reported phenomenon emerged: one of the subjects showed symmetry, a form of stimulus equivalence.
Tokens, Tools, Matching-to-sample, Chimpanzees, Stimulus equivalence
In the past and for many years, tokens have been described as secondary rewards (Wolf 1936) that can be exchanged for a primary reward or food reward. They have mainly been used in studies of operant behavior, and with the purpose of studying their effectiveness in comparison to food rewards.
In the classical experiments of Wolf (1936), six immature chimpanzees had to lift weights in order to receive food, an exchangeable poker chip, or a non-exchangeable poker chip. By comparing performance achieved by each individual with the three different types of rewards, Wolf concluded that valuable-token rewards were almost as effective as food rewards, but that non-valuable ones altered the subject's motivation.
Later, Cowles (1937) ran a series of experiments with small catalin (plastic) disks as tokens. This author showed also that the subjects spontaneously responded to food-tokens and non-food-tokens in different ways. In comparison to food rewards, token rewards caused a motivation decrement. Beyond that, he also investigated the effectiveness of tokens in a learning task. The chimpanzees succeeded in learning new tasks with food-tokens as rewards, reaching above-chance accuracy. From his results, Cowles (1937) concluded that chimpanzees can work for tokens, and, after training, they can even accumulate several tokens (bouts) before exchanging them. However, he recognized the importance of individual differences in determining the lengths of these bouts.
Kelleher (1956, 1957a, 1957b, 1957c, 1958) trained chimpanzees to obtain tokens (i.e., poker chips) by pressing a telephone key. He studied the characteristics of the subjects' performance, by varying the number of responses (presses on the telephone key) required to get one poker chip (called fixed ratio, FR, schedule of reinforcement for tokens), and the number of poker chips received before it was possible to exchange them for food. The results were comparable to those obtained with food reinforcement, that is, subjects showed highly stable rates of response. The only difference was the occurrence of prolonged pauses at the beginning of sessions with higher FR values for tokens. In general, in previous studies the task that the subjects had to perform to receive tokens consisted only of responses on keys or levers, which require mainly physical cost (Suzuki and Matsuzawa 1997). The only exception was the study of Cowles (1937), in which the tasks used also required some degree of intellectual cost, in that the subjects had to solve some simple discrimination tasks. So, although we know that chimpanzees can work for exchangeable tokens, an empirical question remains to be answered: can chimpanzees perform an intellectually costly task reinforced by tokens?
The previous studies focused only on the use of tokens as rewards. However, there is another neglected aspect of “tokens”: they can also function as tools to reach a goal. There are several definitions of tools or tool-use. A tool can be defined as “a detached object that is used in some way to arrive at an apparent goal” (Matsuzawa 1999, p. 650). Our definition of tool-use excludes the use of individual body parts to reach a goal, such as the use of the tail by the rhesus monkey (Macaca mulatta) to acquire objects (Erwin 1974). It also does not include the use of any object permanently connected to the substrate, like the use of fixed lianas by aye-ayes (Daubentonia sp.) to gain access to food (Sterling and Povinelli 1999). The term tool-use is applied to the use of a detached, easy-to-handle, and transportable object, used to achieve a well identified goal. In this sense, a token can also be a tool. A token is a detached object, which is easy to handle and transport, and which can be used to reach a specific goal.
A token is a very special tool which, like money, has several interesting properties. It can be exchanged for different kinds of items (exchangeability). It is easy to handle and transport (portability). Its value can remain unchanged for extended periods so that it can be accumulated (saving). Finally, a token can also be used within a hierarchical system (hierarchy), that is, tokens of different values can exist, and these can be inter-converted.
This paper explores two important aspects of tokens: their value as rewards and function as tools. It seeks to clarify whether chimpanzees can use tokens as tools to get a food reward. It also analyzes the effectiveness of tokens in maintaining the chimpanzees' motivation in performing a discrimination task. Although four properties were referred to in describing tokens as tools, here we start by analyzing only two of them, saving and portability. We aim also to investigate the learning capacities of chimpanzees in a novel discrimination task with token reward instead of direct food reward.
Experiment 1: maintenance of discrimination performance with token reward
From previous studies, we know that tokens are effective in maintaining chimpanzees' performance in tasks requiring very little or no intellectual cost such as pressing levers or opening boxes. Moreover, in previous experiments subjects were always reinforced with a token. Even in cases where the task consisted of simple positional habits, or visual size or color-pattern discrimination, the subjects were always given a second chance in case of error (Cowles 1937). No experiment so far has focused on the introduction of a token reward in the context of a discrimination task using a computer setting, which is more objective and complex, and with differential token reinforcement. This first experiment tested whether or not tokens are efficient enough as rewards to maintain subject performance in a discrimination task that had previously been acquired by direct food reward. In this case the tokens were differentially awarded. The study of saving behavior associated with the use of token rewards was another aim of this experiment.
The subjects were three captive adult female chimpanzees (Pan troglodytes), Ai, Pendesa, and Popo, living in seminatural conditions (Ochiai and Matsuzawa 1997). They had extensive experience of matching-to-sample tasks (Biro and Matsuzawa 2001b; Matsuzawa 2001) and had already received coins as a reward in a previous study (S. Suzuki and T. Matsuzawa, unpublished work). They were free to eat leaves and grasses whenever they wanted in the outdoor enclosure, and were also fed fruit and vegetables at least five times a day, including experimental sessions.
Subjects were tested in an experimental booth (approximately 180 × 180 × 180 cm) with acrylic panels as walls. A touch-sensitive panel (Micro Touch SMT2) was installed on one wall (Fig. 1). It was connected to a computer (NEC PC-9821Xn) that controlled a discrimination task using tokens as reward. A second touch-sensitive panel (Hyper Touch CT-1000) was installed on another wall. It was connected to two sets of computers (NEC PC-9821Xa9 and NEC PC-9821Xn) that controlled the second part of the experiment, the use of tokens to get food rewards. A vending machine (CZX CONLUX ZD-160-A) for tokens was also located on the wall adjacent to the touch panel. Above each monitor was a universal feeder (Biomedica universal feeder BFU-310) that delivered tokens or food rewards into a food tray. Japanese 100-yen coins were used as tokens. Food rewards consisted of pieces of apple (about 1 cm cube, 1.2 g a piece on average) for Pendesa, and pieces of apple and blueberries (about 1.5 g each on average) for Ai and Popo.
Fig. 1 The experimental booth used in experiment 1, showing the position of the two monitors with touch-sensitive panels. Ai is performing a matching-to-sample task on the monitor on the left. There is another monitor for token use located to the right
Fig. 2 The 10 colors (presented as colored squares), 10 lexigrams, 10 kanji, and 3 Arabic numerals used as stimuli. The size of the stimuli was 6 cm × 6 cm (see electronic supplementary material, ESM, S1 for a color version of this figure)
The discrimination tasks used the following sets of stimuli (Fig. 2) (see also electronic supplementary material, ESM, S1): ten colored (red, orange, yellow, green, blue, purple, pink, brown, white, gray) squares; ten visual symbols called lexigrams that corresponded to the ten colors; and ten kanji, Chinese characters, which corresponded to the ten colors and the ten lexigrams (Biro and Matsuzawa 2001a; Suzuki and Matsuzawa 1997).
The subjects were invited to come into the experimental booth from the residential outdoor enclosure. For that purpose, we simply called the name of each individual. They spontaneously walked into the booth. They were allowed to engage in any kind of activity in the booth. Each trial was initiated by the chimpanzee's spontaneous touch to an empty white circle displayed at the bottom of the touch monitor. Usually, each daily experiment consisted of three sessions, depending on the subject's willingness to participate. Each session comprised a fixed number of trials. There was no interaction between the experimenter and the subject during the sessions. In between sessions, interactions such as social play or chasing between the experimenter and the subject were allowed, and the subjects also received additional food. At the end of the experiment, the subjects were invited to return to their residential enclosure.
The experiment involved two phases, a matching phase and an exchange phase. The matching phase used an identity matching-to-sample task (IMTS) with two alternatives and token reinforcement. In the exchange phase, the subjects were required to insert the token in the vending machine to receive a food reward. A daily experiment usually consisted of three sessions of 80 trials each. All sessions ended after the 80th coin had been used. All three subjects received the same three types of IMTS tasks: color-color, lexigram-lexigram, and kanji-kanji. They participated in 12 sessions for each different task, producing a total of 36 sessions for each individual.
Matching-to-sample procedure. Each trial began with the presentation of a white circle against a black background near the bottom of the touch monitor. After the chimpanzee touched this starting stimulus, the circle disappeared and a sample stimulus appeared on the monitor. To proceed to the next step, the subject was required to touch this sample stimulus. The touch resulted in the appearance of two choice stimuli while the sample stimulus remained on screen. The subject was then required to choose and touch one of the two alternatives, which physically matched the sample stimulus (Fig. 3). A correct response was followed by a chime sound and the delivery of a coin. A feeder connected to the computer automatically delivered the coins. The starting stimulus then appeared again to mark the start of a new trial. An incorrect response was not rewarded and was followed by a beep sound and a “time out" (3 or 4 s). The subject was allowed to complete as many trials as she wished, before proceeding to the exchange phase, resulting in the accumulation of tokens by the subject.
Fig. 3 Schematic diagram illustrating a trial in the identity matching-to-sample task. Time Out for 3–4 s means that the subject had to wait for 3–4 s after an incorrect response before proceeding to the next trial
Fig. 4 Schematic view of the exchange of tokens. The filled circles represent “marks”
Procedure for the exchange of tokens. To obtain a food reward, the subject had to insert the tokens into the vending machine through a slot in the acrylic panel to the right of the touch monitor. A solid white circle (henceforth referred to as a “mark”) appeared on the touch monitor after a coin had been inserted (Fig. 4). The subject had to touch the solid white circle, or mark, to receive the food reward. The mark signaled to the chimpanzee that the vending machine had accepted the coin and was now operational. In a sense, the mark provided visual feedback for the token-use.
The subject was allowed to insert consecutively as many coins as she wanted into the vending machine before exchanging them for food reward (succession strategy). This would result in the accumulation of tokens in the vending machine, signaled by the appearance of multiple marks on the monitor. They could also insert one token at a time, exchanging it immediately by touching the mark on the monitor (sequential strategy). The procedure assured that the number of token rewards was constant (80) across sessions.
Accuracy, measured as the percentage of correct responses in IMTS, showed little variance as sessions progressed (Table 1). The subjects' accuracy was significantly above chance level (i.e., 50%), for a chi-square test applied to the total of responses in each task for each subject (for color-to-color, Ai: χ2=828.82, df=11, P<0.01, Pendesa: χ2=940.10, df=11, P<0.01, Popo: χ2=912.60, df=11, P<0.01; for lexigram-to-lexigram, Ai: χ2=881.67, df=11, P<0.01, Pendesa: χ2=889.35, df=11, P<0.01, Popo: χ2=728.02, df=11, P<0.01; for kanji-to-kanji, Ai: χ2=693.60, df=11, P<0.01, Pendesa: χ2=735.00, df=11, P<0.01, Popo: χ2=663.34, df=11, P<0.01), and very close to 100% in some cases. The two-way ANOVA showed some significant differences between individuals, F2,99=59.28; P<0.01, and between tasks, F2,99=11.27; P<0.01. There were also some significant differences resulting from the interaction of individuals and tasks, F2,99=6.69; P<0.01.
Table 1 Average percentage of correct responses with the standard deviation for a total of 12 sessions per task, for each individual chimpanzee
How many coins were saved before they were exchanged for food? Data from all sessions for each individual were combined in order to calculate the saving pattern. We computed a “saving index” in order to accurately reflect the saving behavior. This index is based on the weighted probability per opportunity and was computed as I_x = (P_{x−1} × P_x) × 100. P_x is the probability of collecting the token at rank position x in a saving bout, and is given by the number of tokens saved at that position divided by the number of opportunities to save them. P_0 is defined as 1. According to this procedure the saving index for the first token, I_1, is 100%. The saving index I_2 is based on P_2, the probability of continuing to collect the second token. The saving index I_3 is given by (P_2 × P_3) × 100. In this index we distinguish two types of situations: (1) the subject stops collecting coins of her own will, and (2) the subject stops collecting coins because it is the end of the session and there is no further opportunity to save. The former situation is the research target of the present study. All bouts from all sessions for each subject were combined for this analysis.
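The saving index described above can be sketched in a few lines of Python. This is only an illustrative reading of the definition, not the authors' analysis code: the function name `saving_index`, the representation of a bout as a `(length, voluntary_stop)` pair, and the rule that a bout cut off by the end of the session does not count as an opportunity for the next token are all assumptions introduced here.

```python
def saving_index(bouts):
    """Compute I_x = P_{x-1} * P_x * 100 for each rank position x.

    bouts: list of (length, voluntary_stop) pairs (hypothetical format).
      length         -- number of tokens collected in the bout (>= 1)
      voluntary_stop -- True if the subject stopped of her own will,
                        False if the session ended (no further opportunity)
    """
    max_len = max(length for length, _ in bouts)
    p = {0: 1.0}  # P_0 is defined as 1
    for x in range(1, max_len + 1):
        # Assumption: a bout offers an opportunity for the x-th token if it
        # reached x tokens, or stopped voluntarily at exactly x - 1 tokens.
        opportunities = sum(
            1 for length, voluntary in bouts
            if length >= x or (length == x - 1 and voluntary)
        )
        saved = sum(1 for length, _ in bouts if length >= x)
        p[x] = saved / opportunities if opportunities else 0.0
    return {x: p[x - 1] * p[x] * 100 for x in range(1, max_len + 1)}


# Made-up bouts, for illustration only (not data from the study).
example_bouts = [(1, True), (2, True), (3, True), (3, False)]
example_index = saving_index(example_bouts)
```

With these made-up bouts, every bout contains a first token, so the index for the first position is 100%, in line with the definition in the text.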
Fig. 5 The probability of continuing to collect coins during experiment 1 for Ai, Pendesa (Pen), and Popo. Each curve represents the probability of continuing to collect coins (y axis) for each opportunity in a bout (x axis), depending on the number of coins already collected
Figure 5 shows the “saving index" for the three subjects. The probability of continuing to collect coins drops very quickly after the first coin for Ai, and after the second coin for Popo, while it declines very slowly for Pendesa. For Pendesa, the probability of collecting more than 15 coins per bout is 70%. A 50% threshold was also calculated for each subject (3.13 coins for Ai, 4 for Popo and 25 for Pendesa). This 50% threshold value represents the number of tokens saved on average, and is not based on the arithmetic mean but on the weighted probability per opportunity. In other words, based on the data, Ai continued to save up to 3 coins and Popo saved up to 4 coins in half of the bouts. In contrast, Pendesa showed a 50% probability of continuing to save up to 25 coins. These values illustrate the similarity between the saving patterns of Ai and Popo, which were very different from that of Pendesa. Following the same idea of analyzing the pattern of saving tokens, we also analyzed the pattern of saving marks on the monitor. We calculated the probability of continuing to insert coins without an immediate exchange for food, in other words, saving marks. For Ai and Pendesa the probability of inserting the next coin, without exchanging the previous one for a food item drops from 100% to 0% after the insertion of the first coin (sequential strategy). This means that the two chimpanzees never saved marks. For Popo the probability is maintained constant at 100% (succession strategy). This means that Popo continued to insert all the coins she had, and stopped only when she did not have any more coins to insert.
Experiment 1 confirmed that tokens were effective in maintaining the subjects' work in a discrimination task that had already been learned. All three subjects showed that the token reward was sufficient to maintain their discrimination performance with high accuracy at a stable level.
The present experiment also revealed an important aspect of the use of tokens as rewards, i.e., saving. The subjects had never saved food rewards in the same discrimination tasks in the traditional food-reward experiments. However, they spontaneously saved token rewards in this experiment. This is a unique feature of token rewards in contrast to direct food reward. From the description of the saving pattern we can also conclude that the amount of tokens already saved influenced the saving pattern.
There were also individual differences in saving patterns, that is, the number of coins saved in the discrimination task and the number of marks saved on the monitor for the vending machine. One subject, Pendesa, saved multiple tokens during the matching phase, in contrast to Ai and Popo who saved only a few tokens. All the subjects saved and transported the tokens either in the hand or in the mouth. Only one of the subjects, Popo, saved tokens in the vending machine (succession strategy), in contrast to the other two who showed a sequential strategy.
Experiment 2: learning of a new discrimination task with token reward
We have shown that tokens were sufficient to maintain the subjects' performance in a discrimination task previously maintained by direct food rewards. But it could still be argued that the intellectual cost of this task was relatively low, because the subjects were used to performing it. The introduction of a completely new task would resolve this problem and would test whether or not tokens are also effective in shaping a new discrimination task in the subjects. Experiment 2 analyzed the effect of a token reward, instead of a food reward, in shaping a new discrimination task. The introduction of a new task will cause the subjects to make many errors. Because the chimpanzees are subjected to a waiting period after each error, this creates a suitable context in which to test the influence of such waiting time on the saving behavior. From experiment 1 we know that the number of tokens already saved influenced the saving pattern. So, experiment 2 also aims to identify another factor that determines the saving behavior.
Two of the chimpanzees in experiment 1, Ai and Pendesa, served as subjects.
Chimpanzees were tested in an experimental booth slightly different from the one previously described. In this booth the touch panels were installed on two opposite walls. The equipment was similar to the one used in the previous booth except that there was only one computer (NEC, PC-9821Xn) controlling the second part of the experiment for token use. For the exchange of tokens for food there were two universal feeders connected to the same food tray. Food rewards were of ten different types (pieces of apple, banana, blueberry, carrot, chow, grape, peanut, pistachio nut, potato, and/or raisin).
The stimuli used for Pendesa were three colored squares (red, yellow and green) and the three corresponding lexigrams. Ai had already learned this symbolic correspondence between color and lexigram (Matsuzawa 1985a). She had also learned to use Arabic numerals (Matsuzawa 1985b). Therefore, a new correspondence was created for Ai. For the same colors, red, yellow, and green, she had to match the numerals 0, 2, and 3, respectively (Fig. 2; see also ESM S1 for this figure in color).
The procedure was the same as in experiment 1, except that a symbolic matching-to-sample (SMTS), rather than an IMTS, task was used. It also involved an initial matching phase where subjects received tokens, and an exchange phase. Ai performed 42 sessions (48 trials each) for the numeral-to-color matching and then switched to 42 sessions of color-to-numeral matching. Pendesa performed 42 sessions (48 trials each) for lexigram-to-color matching and then switched to 24 sessions of color-to-lexigram matching. During both phases, a daily experiment consisted of a block of three sessions, for both subjects. Each session ended after the subject inserted her last coin into the vending machine to get a food reward. In order to make the decision to switch from the pre-reversal phase to the post-reversal phase a learning criterion was defined. The individual was considered to have learned the task if the average accuracy in six consecutive sessions was equal to or above 75%.
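The learning criterion stated above (average accuracy of at least 75% over six consecutive sessions) can be expressed as a sliding-window check. The sketch below is a minimal illustration; the function name and the example accuracies are hypothetical and are not taken from the study.

```python
def first_session_meeting_criterion(accuracies, window=6, threshold=75.0):
    """Return the 1-based number of the session that completes the first
    run of `window` consecutive sessions whose mean accuracy (in %) is at
    least `threshold`, or None if the criterion is never met."""
    for end in range(window, len(accuracies) + 1):
        block = accuracies[end - window:end]
        if sum(block) / window >= threshold:
            return end
    return None


# Made-up per-session accuracies (%), for illustration only.
example = [60, 65, 70, 75, 80, 80, 80, 85]
criterion_session = first_session_meeting_criterion(example)  # 7
```

In this made-up series, sessions 1 to 6 average below 75%, while sessions 2 to 7 average exactly 75%, so the criterion is first met at session 7.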
Matching-to-sample procedure. The procedure of this SMTS task was the same as that of the previous one, except for the difference in the sample-choice relationship. The sample stimulus and the choice stimuli were taken from different categories, such as color and shape. For example, when a color was presented as sample stimulus, the choice stimuli were two numerals. The sample color corresponded to one of the two alternative numerals. The symbolic relationship between the sample and the choice stimuli was completely new for both subjects. The subjects had thus to learn to associate each color to the corresponding number (Ai) or lexigram (Pendesa). As in experiment 1, only correct responses were rewarded by a token automatically delivered by the feeder.
Procedure for the exchange of tokens. The exchange procedure was essentially the same as in experiment 1 (Fig. 4), but after touching each solid white circle, or “mark", on the monitor, two pictures from among the 45 possible combinations of ten different food items appeared on the monitor. The subject was required to choose one of the two by touching the picture and then received the corresponding food item. The ten food items shown were pieces of apple, grape, banana, raisin, peanut, blueberry, carrot, potato, chow, and pistachio nut. This procedure was used in order to maintain the high motivation of the individuals.
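The figure of 45 possible combinations follows from choosing 2 items out of 10 without regard to order. The short sketch below enumerates these pairs. The random draw in `draw_food_pair` is an assumption made for illustration, since the text does not state how the pair shown on each exchange was selected.

```python
from itertools import combinations
import random

# The ten food items listed in the text.
FOODS = [
    "apple", "grape", "banana", "raisin", "peanut",
    "blueberry", "carrot", "potato", "chow", "pistachio nut",
]

# All unordered pairs of distinct food items: C(10, 2) = 45.
FOOD_PAIRS = list(combinations(FOODS, 2))


def draw_food_pair(rng):
    """Pick one of the 45 pairs uniformly at random (assumed selection rule)."""
    return rng.choice(FOOD_PAIRS)


pair = draw_food_pair(random.Random(0))
```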
Fig. 6 Ai's learning curve in new symbolic matching-to-sample (SMTS) tasks shaped by token reward
Ai performed 68.8% correct in the first session (Fig. 6), which was significantly above chance level (i.e., 50%), χ2=6.75, df=11, P<0.01. This means that Ai acquired the new SMTS task in the first session, although the performance was not far above chance level. After training, performance reached 79.2% in the 37th session. In contrast, Pendesa performed 60.4% correct in the first session (Fig. 7), which was not significantly different from chance level, χ2=2.08, df=11, P<0.15. After training, accuracy reached 83.3% in the 41st session. When the results of the first nine sessions and the last nine sessions of matching were compared for the two subjects, the differences were statistically significant both for Ai, χ2=20.37, df=11, P<0.01, and for Pendesa, χ2=25.89, df=11, P<0.01. This means that learning had taken place in the newly introduced SMTS task. The chimpanzees did succeed in establishing a new discrimination when rewarded with tokens.
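As a worked check of the first-session comparisons against chance, the sketch below computes a chi-square goodness-of-fit statistic for correct versus incorrect responses. It assumes that 68.8% and 60.4% of a 48-trial session correspond to 33 and 29 correct responses, respectively. Under that assumption the statistic comes out at 6.75 and about 2.08, the values reported above, although the text does not spell out the exact computation that was used.

```python
def chi_square_vs_chance(correct, trials, chance=0.5):
    """Chi-square goodness-of-fit statistic comparing observed correct and
    incorrect counts with the counts expected at chance level."""
    incorrect = trials - correct
    expected_correct = trials * chance
    expected_incorrect = trials * (1 - chance)
    return (
        (correct - expected_correct) ** 2 / expected_correct
        + (incorrect - expected_incorrect) ** 2 / expected_incorrect
    )


# Assumed counts: 33/48 = 68.8% (Ai), 29/48 = 60.4% (Pendesa).
ai_first = chi_square_vs_chance(33, 48)        # 6.75
pendesa_first = chi_square_vs_chance(29, 48)   # about 2.08
```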
Fig. 7 Pendesa's learning curve in new SMTS tasks shaped by token reward
The introduction of the reverse symbolic matching task of color-to-numeral for Ai and of color-to-lexigram for Pendesa produced different results. Ai's accuracy dropped to the chance level of 50% in the first session, χ2=0.00, df=11, P<1.00, showing a lack of transfer from the numeral-to-color to the color-to-numeral task, χ2=126.89, df=11, P<0.01. The performance quickly recovered and reached 93.9% in the 31st session, which is above chance level, χ2=36.75, df=11, P<0.01. When the results of the first nine sessions and the last nine sessions of this second task were compared, the differences were statistically significant, χ2=176.39, df=11, P<0.01, which means that Ai also learned the reversed symmetrical discrimination task. In contrast, Pendesa's accuracy was already at 77.1% in the first session of the reversed task, which is above chance level, χ2=14.08, df=11, P<0.01. Accuracy increased to 83.3% by the seventh session. The comparison of the results of the first nine sessions and the last nine sessions of this second task showed no differences, χ2=0.05, df=11, P<0.81, which means that, in Pendesa's case, no new learning of the reverse discrimination task occurred. Pendesa did not show a decrement of accuracy when the reverse task was introduced, χ2=0.13, df=11, P<0.72, which means that she transferred from the lexigram-to-color to the color-to-lexigram task.
Saving bouts were analyzed following the same procedure as in experiment 1. Ai's saving pattern was very similar in both phases of the experiment. As in experiment 1, Ai's probability of continuing to collect coins dropped very quickly after receiving the first coin (Fig. 8). In other words, Ai's saving pattern was kept constant between phases and between experiments. Pendesa's probability of continuing to collect coins remained at 100% up to 6–7 coins (the 7th coin during lexigram-to-color matching, and the 6th coin during color-to-lexigram matching). Although there were some differences between the saving curves of the two phases of the experiment, both showed a pronounced drop after the 18th coin (Fig. 8).
Fig. 8 The probability of continuing to collect coins during both phases of experiment 2 for Ai and Pendesa. Each curve represents, for one subject in one phase of the experiment, the probability of continuing to collect coins (y axis) for each opportunity in a bout (x axis), depending on the number of coins already collected
Table 2 Frequency of stopping to collect coins, after a correct or incorrect response
The 50% threshold indicates the average number of tokens saved per bout. Ai saved about 2 coins on average (2.13 and 2.35 in numeral-to-color and color-to-numeral tasks, respectively). Pendesa saved, on average, about 30 (mean=30.21) coins for the lexigram-to-color task and 22 (mean=21.68) coins for the color-to-lexigram task. These results demonstrate a drop in the saving pattern with the transition to the reverse discrimination task. We also analyzed the influence of the waiting period associated with the error trials in the SMTS task on the saving pattern. The data from the first ten test sessions of the first phase were combined for each subject to determine how often they stopped a bout after an incorrect response (Table 2). Ai stopped collecting coins more frequently than not after an incorrect response (χ2=45.99, df=13, P<0.01). However, Pendesa never stopped collecting coins after an incorrect response, χ2=383.48, df=13, P<0.01. These data show that saving behavior was influenced by the waiting period after an incorrect response in Ai, but not in Pendesa.
This experiment strongly suggests that tokens are a powerful reward in shaping a new discrimination task. Both subjects learned a new symbolic relationship between stimuli, reaching high accuracy with token rewards. The results also showed the existence of individual differences in the learning process. Ai showed no transfer of the acquired skill to the reversed symmetrical SMTS task. However, Pendesa successfully transferred the acquired skill to the reversed symmetrical SMTS task.
“Stimulus equivalence” (Sidman et al. 1982) contains three kinds of relationships: reflexivity, transitivity, and symmetry. Although in humans equivalence relations are easily generated, this does not always apply to non-human animals. Actually, the spontaneous generation of such equivalent relations is very rare among non-human animals (Tomonaga et al. 1991). Monkeys sometimes show reflexivity (Fujita 1983) and transitivity (D'Amato et al. 1985). However, they fail in showing symmetrical relations (Sidman et al. 1982). Chimpanzees in general show emergence of reflexivity (Oden et al. 1988) and transitivity (Yamamoto and Asano 1991), but only language-trained chimpanzees were thought to be able to show symmetric relations after prolonged training (see Kojima 1984; Premack 1976; Yamamoto and Asano 1995). Tomonaga et al. (1991) reported the first and only case of a non-language-trained chimpanzee that showed symmetry. From these studies, it is clear that symmetrical relations represent the most difficult aspect of equivalence relations and have a key role in the establishment of “stimulus equivalence”. Experiment 2 provided the second example of a non-language-trained chimpanzee (Pendesa) showing the emergence of symmetry, proving that chimpanzees can generate equivalence relations much like humans. But why did one particular subject show this phenomenon and not the other (Ai, the language-trained chimpanzee)? What are the factors determining the process of transfer? Further studies will be necessary to answer these questions.
The results of this experiment clearly pinpointed the determinants of saving behavior. They confirmed that the amount of tokens already saved influenced the saving pattern. For Ai only 1–3 coins saved were enough to stop working for more coins. She was now ready to exchange them for food. However, Pendesa continued to work for more coins and stopped only once she had already collected a large number of coins (20–30 or more). In the case of Ai the time-out after an incorrect response also appeared to have an important role in stopping saving. She probably preferred to exchange the few coins without waiting for the inserted time-out interval (in other words, the delay interval) to expire before the opportunity to get more coins. Further analysis will be necessary to clarify the factors determining saving behavior.
This study proved that tokens are almost equal to food rewards in their capacity to maintain and shape a discrimination behavior. Experiment 1 showed that tokens were effective in maintaining the subjects working in a discrimination task already acquired, with high accuracy at a stable level. Tokens were as effective as direct food rewards and were seen as an item exchangeable for food by the chimpanzees. Experiment 2 further suggested that tokens were also effective in the learning of a novel discrimination problem. Both subjects, Ai and Pendesa, showed learning as sessions progressed, reaching high levels of accuracy with tokens as rewards. The acquisition of the new task was also confirmed by the introduction of the reversal procedure in the SMTS.
The present study provided evidence of a unique behavior, saving, and clarified some details of token use. Experiment 1 clearly showed that the chimpanzees spontaneously saved tokens, in contrast to food rewards. Cowles (1937) stated that chimpanzees, after some training, could save 10–30 tokens, exchangeable as a group after obtaining them singly. The present results showed that saving can occur without any training. All three chimpanzees saved coins, and one even spontaneously saved “marks” on the monitor, another stage of saving tokens. This spontaneous behavior, which the subjects developed during the experiment, also indicates the existence of inter-individual differences. The results showed three saving strategies in chimpanzees: (1) saving a few tokens (Ai) and saving no marks, exchanging them one by one for food, (2) saving many tokens (Pendesa) but saving no marks, exchanging them one by one for food, and (3) saving a few tokens and also saving multiple marks that were exchanged for food all at once (Popo).
The details of the saving behavior also lead us to conclude that the subjects were adjusting their behavior in terms of a balance of costs and benefits. The chimpanzees had to move from the place where they obtained the tokens to the place where they could exchange them. In this sense, moving from one place to the other has a cost in terms of energy and time spent. The food that is received represents the benefit. So, the strategy of collecting just one coin and going to exchange it for food before collecting another is very costly, which explains the emergence of the saving behavior. The chimpanzees were not only saving tokens, but they were also saving energy and time.
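The cost-benefit argument above can be illustrated with a minimal sketch. It assumes that every trip between the task site and the exchange site carries one fixed cost; the function name, the parameter names, and the numbers used are hypothetical and are not measurements from the experiments. Under that assumption, the travel cost per token falls as more tokens are saved per trip.

```python
import math


def travel_cost_per_token(total_tokens: int, bout_size: int, trip_cost: float) -> float:
    """Average travel cost per token when tokens are exchanged in bouts.

    trip_cost is a hypothetical fixed cost (for example, seconds) for one
    round trip between the task site and the exchange site. It is an
    illustrative value, not one measured in the study.
    """
    if total_tokens <= 0 or bout_size <= 0:
        raise ValueError("total_tokens and bout_size must be positive")
    # One trip per bout; the last bout may hold fewer tokens than bout_size.
    trips = math.ceil(total_tokens / bout_size)
    return trips * trip_cost / total_tokens


if __name__ == "__main__":
    # Compare exchanging after every token, after 3 tokens, and after 30 tokens.
    for bout_size in (1, 3, 30):
        cost = travel_cost_per_token(total_tokens=30, bout_size=bout_size, trip_cost=10.0)
        print(f"bout size {bout_size:2d}: {cost:.2f} cost units per token")
```

The sketch deliberately ignores the other side of the balance, namely that saving longer postpones the food; it therefore shows only why exchanging one token at a time is the most costly strategy, not where a subject should stop saving.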
This unique feature of spontaneous saving by chimpanzees inevitably leads to the following questions. What are the determinants of saving? Why do chimpanzees spontaneously stop saving at some points and not at others? Although there were clear individual differences in saving pattern, we propose two plausible factors to answer these questions. One factor is the number of tokens already saved by the chimpanzee within a bout. The other is the waiting period imposed after each error the subject made, a time delay serving as a penalty. The results of both experiments showed that the number of tokens already saved influenced the decision as to whether the chimpanzee continued working for another token. The waiting period was also shown to have an important role in determining the saving pattern of one of the subjects. This subject seemed unwilling to wait for the end of the time-delay penalty, preferring instead to insert the tokens that she had already received. However, further analyses will be needed to determine the degree of importance of error trials more precisely.
The chimpanzees saved tokens, but always with the final objective of exchanging them for food. They were never observed to perform only the MTS task without exchanging the tokens for food. This means that they were planning a future action, even if it lay only in the near future.
At this point we can return to our original question of what a token is. We started by mentioning that a token could serve as a reward, but was it only a reward for the chimpanzees? As demonstrated, the chimpanzees performed a task to collect tokens with the objective of exchanging them for food. They moved from one place to another, transporting the tokens in order to use them to get food. According to our definition of tools, a token is a tool. The present study showed that the chimpanzees were able to use the unique tool called a token to attain a specific goal.
This study was conducted at the Primate Research Institute of Kyoto University and was financially supported by the Ministry of Education, Science and Culture of Japan (Grant-in-Aid for Scientific Research Nos. 07102010 and 12002009 to Tetsuro Matsuzawa). This study is part of the Master's thesis of the first author, which was also financially supported by Fundação para a Ciência e a Tecnologia of the Ministry of Science and Technology of Portugal (Praxis XXI/BM/17736/98 to Cláudia Sousa). We would like to express our thanks to Sumiharu Nagumo of Kyoto University for his technical assistance, to Joël Fagot of the Center for Research in Cognitive Neurosciences, France, for his critical reading of the article, and to Paulo Mota of Coimbra University, Portugal, for his support and suggestions. Special thanks are also due to Nobuyuki Kawai of Kyoto University and Dora Biro of Oxford University for their suggestions and help.
- Biro D, Matsuzawa T (2001a) Chimpanzee numerical competence: cardinal and ordinal skills. In: Matsuzawa T (ed) Primate origins of human cognition and behavior. Springer, Berlin Heidelberg New York, pp 199–225
- Biro D, Matsuzawa T (2001b) Use of numerical symbols by the chimpanzee (Pan troglodytes): cardinals, ordinals and the introduction of zero. Anim Cogn 4 (in press) DOI 10.1007/s100710100086
- Cowles JT (1937) Food-tokens as incentives for learning by chimpanzees. Comp Psychol Monogr 23:1–96
- D'Amato MR, Salmon DP, Loukas E, Tomie A (1985) Symmetry and transitivity of conditional relations in monkeys (Cebus apella) and pigeons (Columba livia). J Exp Anal Behav 44:35–47
- Erwin J (1974) Laboratory-reared rhesus monkeys can use their tails as tools. Percept Motor Skills 39:129–130
- Fujita K (1983) Formation of the sameness-difference concept by Japanese monkeys from a small number of color stimuli. J Exp Anal Behav 40:289–300
- Kelleher RT (1956) Intermittent conditioned reinforcement in chimpanzees. Science 124:279–280
- Kelleher RT (1957a) A comparison of conditioned and food reinforcement with chimpanzees. Psychol Newsl 8:88–93
- Kelleher RT (1957b) A multiple schedule of conditioned reinforcement with chimpanzees. Psychol Rep 3:485–491
- Kelleher RT (1957c) Conditioned reinforcement in chimpanzees. J Comp Physiol Psychol 50:571–575
- Kelleher RT (1958) Fixed-ratio schedules of conditioned reinforcement with chimpanzees. J Exp Anal Behav 1:281–289
- Kojima T (1984) Generalization between productive use and receptive discrimination of names in an artificial visual language by a chimpanzee. Int J Primatol 5:161–182
- Matsuzawa T (1985a) Color naming and classification in a chimpanzee (Pan troglodytes). J Hum Evol 14:283–291
- Matsuzawa T (1985b) Use of numbers by a chimpanzee. Nature 315:57–59
- Matsuzawa T (1999) Communication and tool use in chimpanzees: cultural and social contexts. In: Hauser M, Konishi M (eds) Neural mechanisms of communication. MIT Press, Cambridge, pp 645–671
- Matsuzawa T (2001) Primate origins of human cognition and behavior. Springer, Berlin Heidelberg New York
- Ochiai T, Matsuzawa T (1997) Planting trees in an outdoor compound of chimpanzees for an enriched environment. In: Hare V (ed) Proceedings of the third international conference on environmental enrichment. The Shape of Enrichment, San Diego, pp 355–364
- Oden DL, Thompson RKR, Premack D (1988) Spontaneous transfer of matching by infant chimpanzees. J Exp Psychol Anim Behav Proc 14:140–145
- Premack D (1976) Intelligence in ape and man. Erlbaum, Hillsdale
- Sidman M, Rauzin R, Lazar R, Cunningham S, Tailby W, Carrigan P (1982) A search for symmetry in the conditional discriminations of rhesus monkeys, baboons, and children. J Exp Anal Behav 37:23–44
- Sterling EJ, Povinelli DJ (1999) Tool use, aye-ayes, and sensorimotor intelligence. Folia Primatol 70:8–16
- Suzuki S, Matsuzawa T (1997) Choice between two discrimination tasks in chimpanzees (Pan troglodytes). Jpn Psychol Res 39:226–235
- Tomonaga M, Matsuzawa T, Fujita K, Yamamoto J (1991) Emergence of symmetry in a visual conditional discrimination by chimpanzees (Pan troglodytes). Psychol Rep 68:51–60
- Wolf JB (1936) Effectiveness of token-rewards for chimpanzees. Comp Psychol Monogr 12:1–72
- Yamamoto J, Asano T (1991) Formation of stimulus equivalences in a chimpanzee. In: Ehara A, Kimura T, Takenaka O, Iwamoto M (eds) Primatology today. Elsevier, Amsterdam, pp 321–324
- Yamamoto J, Asano T (1995) Stimulus equivalence in a chimpanzee (Pan troglodytes). Psychol Rec 45:3–21