Similar Articles
20 similar articles found (search time: 15 ms)
1.
The brain's most difficult computation in decision-making learning is searching for essential information related to rewards among vast multimodal inputs and then integrating it into beneficial behaviors. Contextual cues consisting of limbic, cognitive, visual, auditory, somatosensory, and motor signals need to be associated with both rewards and actions by utilizing an internal representation such as reward prediction and reward prediction error. Previous studies have suggested that a suitable brain structure for such integration is the neural circuitry associated with multiple cortico-striatal loops. However, how the information in and around these multiple closed loops is shared and transferred remains computationally unexplored. Here, we propose a "heterarchical reinforcement learning" model, where reward prediction made by more limbic and cognitive loops is propagated to motor loops by spiral projections between the striatum and substantia nigra, assisted by cortical projections to the pedunculopontine tegmental nucleus, which sends excitatory input to the substantia nigra. The model makes several fMRI-testable predictions of brain activity during stimulus-action-reward association learning. The caudate nucleus and the cognitive cortical areas are correlated with reward prediction error, while the putamen and motor-related areas are correlated with stimulus-action-dependent reward prediction. Furthermore, a heterogeneous activity pattern within the striatum is predicted depending on learning difficulty, i.e., the anterior medial caudate nucleus will be correlated more with reward prediction error when learning becomes difficult, while the posterior putamen will be correlated more with stimulus-action-dependent reward prediction in easy learning. Our fMRI results revealed that different cortico-striatal loops are operating, as suggested by the proposed model.

2.
Learning to make choices that yield rewarding outcomes requires the computation of three distinct signals: stimulus values that are used to guide choices at the time of decision making, experienced utility signals that are used to evaluate the outcomes of those decisions and prediction errors that are used to update the values assigned to stimuli during reward learning. Here we investigated whether monetary and social rewards involve overlapping neural substrates during these computations. Subjects engaged in two probabilistic reward learning tasks that were identical except that rewards were either social (pictures of smiling or angry people) or monetary (gaining or losing money). We found substantial overlap between the two types of rewards for all components of the learning process: a common area of ventromedial prefrontal cortex (vmPFC) correlated with stimulus value at the time of choice, another common area of vmPFC correlated with reward magnitude, and common areas in the striatum correlated with prediction errors. Taken together, the findings support the hypothesis that shared anatomical substrates are involved in the computation of both monetary and social rewards.

3.
Reward prediction errors consist of the differences between received and predicted rewards. They are crucial for basic forms of learning about rewards and make us strive for more rewards—an evolutionarily beneficial trait. Most dopamine neurons in the midbrain of humans, monkeys, and rodents signal a reward prediction error; they are activated by more reward than predicted (positive prediction error), remain at baseline activity for fully predicted rewards, and show depressed activity with less reward than predicted (negative prediction error). The dopamine signal increases nonlinearly with reward value and codes formal economic utility. Drugs of addiction generate, hijack, and amplify the dopamine reward signal and induce exaggerated, uncontrolled dopamine effects on neuronal plasticity. The striatum, amygdala, and frontal cortex also show reward prediction error coding, but only in subpopulations of neurons. Thus, the important concept of reward prediction errors is implemented in neuronal hardware.
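The three-valued response pattern this abstract describes (activation, baseline, depression) reduces to a one-line computation. A minimal sketch, assuming nothing beyond the abstract's definition; the function name and reward values are illustrative:

```python
def reward_prediction_error(received, predicted):
    """delta = received reward - predicted reward."""
    return received - predicted

# More reward than predicted -> positive error (dopamine activation)
assert reward_prediction_error(1.0, 0.5) > 0
# Fully predicted reward -> zero error (baseline activity)
assert reward_prediction_error(0.5, 0.5) == 0
# Less reward than predicted -> negative error (depressed activity)
assert reward_prediction_error(0.0, 0.5) < 0
```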

4.
Learning is proposed to occur when there is a discrepancy between reward prediction and reward receipt. At least two separate systems are thought to exist: one in which predictions are proposed to be based on model-free or cached values; and another in which predictions are model-based. A basic neural circuit for model-free reinforcement learning has already been described. In the model-free circuit the ventral striatum (VS) is thought to supply a common-currency reward prediction to midbrain dopamine neurons that compute prediction errors and drive learning. In a model-based system, predictions can include more information about an expected reward, such as its sensory attributes or current, unique value. This detailed prediction allows for both behavioral flexibility and learning driven by changes in sensory features of rewards alone. Recent evidence from animal learning and human imaging suggests that, in addition to model-free information, the VS also signals model-based information. Further, there is evidence that the orbitofrontal cortex (OFC) signals model-based information. Here we review these data and suggest that the OFC provides model-based information to this traditional model-free circuitry and offer possibilities as to how this interaction might occur.
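The cached-value versus detailed-prediction distinction drawn above can be made concrete in a few lines. A hypothetical sketch (cue names, outcome identities, and values are invented for illustration): the model-based prediction carries the outcome's identity and current value, so devaluing an outcome changes the prediction immediately, while the cached model-free value only changes through new experience.

```python
# Model-free: a single cached "common-currency" scalar per cue.
model_free_values = {"light": 0.8, "tone": 0.2}

def model_free_prediction(cue):
    return model_free_values[cue]  # magnitude only, no outcome identity

# Model-based: the prediction retains what the outcome is and what it is
# currently worth, so changes in the reward itself can drive new learning.
outcome_model = {"light": {"outcome": "banana", "value": 0.8},
                 "tone":  {"outcome": "grape",  "value": 0.2}}

def model_based_prediction(cue):
    return outcome_model[cue]

# Devaluation: the banana's current value drops (e.g. the animal is sated).
outcome_model["light"]["value"] = 0.1

# The model-based prediction reflects the new value at once...
assert model_based_prediction("light")["value"] == 0.1
# ...while the cached model-free value is unchanged until re-experienced.
assert model_free_prediction("light") == 0.8
```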

5.
Humans and animals are exquisitely, though idiosyncratically, sensitive to risk or variance in the outcomes of their actions. Economic, psychological, and neural aspects of this are well studied when information about risk is provided explicitly. However, we must normally learn about outcomes from experience, through trial and error. Traditional models of such reinforcement learning focus on learning about the mean reward value of cues and ignore higher order moments such as variance. We used fMRI to test whether the neural correlates of human reinforcement learning are sensitive to experienced risk. Our analysis focused on anatomically delineated regions of a priori interest in the nucleus accumbens, where blood oxygenation level-dependent (BOLD) signals have been suggested as correlating with quantities derived from reinforcement learning. We first provide unbiased evidence that the raw BOLD signal in these regions corresponds closely to a reward prediction error. We then derive from this signal the learned values of cues that predict rewards of equal mean but different variance and show that these values are indeed modulated by experienced risk. Moreover, a close neurometric-psychometric coupling exists between the fluctuations of the experience-based evaluations of risky options that we measured neurally and the fluctuations in behavioral risk aversion. This suggests that risk sensitivity is integral to human learning, illuminating economic models of choice, neuroscientific models of affective learning, and the workings of the underlying neural mechanisms.
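One standard way to let a trial-and-error learner track variance as well as mean, in the spirit of the "equal mean but different variance" cues above, is to run a second delta rule on the squared prediction error. This is a generic sketch of that idea, not the study's actual model; the learning rate and reward schedules are illustrative:

```python
import random

def learn_mean_and_variance(rewards, alpha=0.1):
    """Delta-rule estimates of a cue's mean reward and reward variance.
    The squared prediction error is itself predicted by a second learner,
    yielding an experience-based risk estimate alongside the value estimate."""
    mean_est, var_est = 0.0, 0.0
    for r in rewards:
        delta = r - mean_est                      # reward prediction error
        mean_est += alpha * delta                 # standard value update
        var_est += alpha * (delta ** 2 - var_est)  # risk (variance) update
    return mean_est, var_est

random.seed(0)
safe = [1.0] * 200                                        # mean 1, zero variance
risky = [random.choice([0.0, 2.0]) for _ in range(200)]   # mean ~1, high variance

m_safe, v_safe = learn_mean_and_variance(safe)
m_risky, v_risky = learn_mean_and_variance(risky)

# Both cues converge to similar learned values (means are matched),
# but only the risky cue acquires a substantial learned risk estimate.
assert v_risky > 10 * v_safe
```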

6.
Surprise drives learning. Various neural “prediction error” signals are believed to underpin surprise‐based reinforcement learning. Here, we report a surprise signal that reflects reinforcement learning but is neither un/signed reward prediction error (RPE) nor un/signed state prediction error (SPE). To exclude these alternatives, we measured surprise responses in the absence of RPE and accounted for a host of potential SPE confounds. This new surprise signal was evident in ventral striatum, primary sensory cortex, frontal poles, and amygdala. We interpret these findings via a normative model of surprise. Hum Brain Mapp 35:4805–4814, 2014. © 2014 The Authors. Human Brain Mapping Published by Wiley Periodicals, Inc.

7.
To behave adaptively, we must learn from the consequences of our actions. Studies using event-related potentials (ERPs) have been informative with respect to the question of how such learning occurs. These studies have revealed a frontocentral negativity termed the feedback-related negativity (FRN) that appears after negative feedback. According to one prominent theory, the FRN tracks the difference between the values of actual and expected outcomes, or reward prediction errors. As such, the FRN provides a tool for studying reward valuation and decision making. We begin this review by examining the neural significance of the FRN. We then examine its functional significance. To understand the cognitive processes that occur when the FRN is generated, we explore variables that influence its appearance and amplitude. Specifically, we evaluate four hypotheses: (1) the FRN encodes a quantitative reward prediction error; (2) the FRN is evoked by outcomes and by stimuli that predict outcomes; (3) the FRN and behavior change with experience; and (4) the system that produces the FRN is maximally engaged by volitional actions.

8.
Contemporary learning theories suggest that conditioning is heavily dependent on the processing of prediction errors, which signal a discrepancy between expected and observed outcomes. This line of research provides a framework through which classical theories of placebo effects, expectations and conditioning, can be reconciled. Brain regions related to prediction error processing [anterior cingulate cortex (ACC), orbitofrontal cortex or the nucleus accumbens] overlap with those involved in placebo effects. Here we examined the possibility that the magnitude of objective neurochemical responses to placebo administration would depend on individual expectation-effectiveness comparisons. We show that such comparisons and not expectations per se predict behavioral placebo responses and placebo-induced activation of µ-opioid receptor-mediated neurotransmission in regions relevant to error detection (e.g. ACC). Expectations, on the other hand, were associated with greater µ-opioid system activation in the dorsolateral prefrontal cortex but not with greater behavioral placebo responses. These results help elucidate the molecular and neural mechanisms underlying the relationship between expectation-effectiveness associations and the formation of placebo responses, shedding light on individual differences in learning and decision making. Expectation-outcome comparisons emerge as a cognitive mechanism that, beyond reward associations, appears to facilitate the formation and sustainability of placebo responses.

9.
Learning theory suggests that animals attend to pertinent environmental cues when reward contingencies unexpectedly change so that learning can occur. We have previously shown that activity in basolateral nucleus of amygdala (ABL) responds to unexpected changes in reward value, consistent with unsigned prediction error signals theorized by Pearce and Hall. However, changes in activity were present only at the time of unexpected reward delivery, not during the time when the animal needed to attend to conditioned stimuli that would come to predict the reward. This suggested that a different brain area must be signaling the need for attention necessary for learning. One likely candidate to fulfill this role is the anterior cingulate cortex (ACC). To test this hypothesis, we recorded from single neurons in ACC as rats performed the same behavioral task that we have used to dissociate signed from unsigned prediction errors in dopamine and ABL neurons. In this task, rats chose between two fluid wells that produced varying magnitudes of and delays to reward. Consistent with previous work, we found that ACC detected errors of commission and reward prediction errors. We also found that activity during cue sampling encoded reward size, but not expected delay to reward. Finally, activity in ACC was elevated during trials in which attention was increased following unexpected upshifts and downshifts in value. We conclude that ACC not only signals errors in reward prediction, as previously reported, but also signals the need for enhanced neural resources during learning on trials subsequent to those errors.

10.
Human reward pursuit is often assumed to involve conscious processing of reward information. However, recent research revealed that reward cues enhance cognitive performance even when perceived without awareness. Building on this discovery, the present functional MRI study tested two hypotheses using a rewarded mental‐rotation task. First, we examined whether subliminal rewards engage the ventral striatum (VS), an area implicated in reward anticipation. Second, we examined differences in neural responses to supraliminal versus subliminal rewards. Results indicated that supraliminal, but not subliminal, high‐value reward cues engaged brain areas involved in reward processing (VS) and task performance (supplementary motor area, motor cortex, and superior temporal gyrus). This pattern of findings is striking given that subliminal rewards improved performance to the same extent as supraliminal rewards. So, the neural substrates of conscious versus unconscious reward pursuit are vastly different—but despite their differences, conscious and unconscious reward pursuit may still produce the same behavioral outcomes. Hum Brain Mapp 35:5578–5586, 2014. © 2014 Wiley Periodicals, Inc.

11.
The novelty exploration bonus and its attentional modulation
We hypothesized that novel stimuli represent salient learning signals that can motivate ‘exploration’ in search for potential rewards. In computational theories of reinforcement learning, this is referred to as the novelty ‘exploration bonus’ for rewards. If true, stimulus novelty should enhance the reward anticipation signals in brain areas that are part of dopaminergic circuitry and thereby reduce responses to reward outcomes. We investigated this hypothesis in two fMRI experiments. Images of complex natural scenes predicted monetary reward or a neutral outcome by virtue of depicting either indoor or outdoor scenes. Half of the reward-predicting and neutral images had been familiarized the day before, the other half were novel. In experiment 1, subjects indicated whether images were novel or familiar, whereas in experiment 2, they explicitly decided whether or not images predicted reward by depicting indoor or outdoor scenes. Novelty led to the hypothesized enhancement of mesolimbic reward prediction responses and concomitant reduction of mesolimbic responses to reward outcomes. However, this effect was strongly task-dependent and occurred only in experiment 2, when the reward-predicting property of each image was attended. Recognition memory for the novel and familiar stimuli (after 24 h) was enhanced by reward anticipation in both tasks. These findings are compatible with the proposition that novelty can act as a bonus for rewards under conditions when rewards are explicitly attended, thus biasing the organism towards reward anticipation and providing a motivational signal for exploration.
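The hypothesized mechanism above, where a novelty bonus raises the anticipatory prediction and thereby shrinks the outcome response, follows directly from the prediction-error arithmetic. A minimal sketch under that assumption; the bonus size and values are invented, not the study's parameters:

```python
def anticipation_and_outcome_error(learned_value, reward, novelty_bonus=0.0):
    """Exploration bonus: novel stimuli get an optimistic boost to their
    predicted value. Anticipation rises, and because the outcome response
    tracks reward minus prediction, the outcome response falls."""
    prediction = learned_value + novelty_bonus
    outcome_error = reward - prediction
    return prediction, outcome_error

# Familiar reward-predicting image: no bonus.
p_fam, e_fam = anticipation_and_outcome_error(0.5, reward=1.0)
# Novel reward-predicting image: bonus boosts anticipation.
p_nov, e_nov = anticipation_and_outcome_error(0.5, reward=1.0, novelty_bonus=0.3)

assert p_nov > p_fam  # enhanced reward-anticipation signal
assert e_nov < e_fam  # reduced response to the reward outcome
```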

12.
Neuroeconomics is providing insights into the neural bases of decision-making in normal and pathological conditions. In the neuropsychiatric domain, this discipline investigates how abnormal functioning of neural systems associated with reward processing and cognitive control promotes different disorders, and whether such evidence may inform treatments. This endeavor is crucial when studying different types of addiction, which share a core promoting mechanism in the imbalance between impulsive subcortical neural signals associated with immediate pleasurable outcomes and inhibitory signals mediated by a prefrontal reflective system. The resulting impairment in behavioral control represents a hallmark of alcohol use disorders (AUDs), a chronic relapsing disorder characterized by excessive alcohol consumption despite devastating consequences. This review aims to summarize available magnetic resonance imaging (MRI) evidence on reward-related decision-making alterations in AUDs, and to envision possible future research directions. We review functional MRI (fMRI) studies using tasks involving monetary rewards, as well as MRI studies relating decision-making parameters to neurostructural gray- or white-matter metrics. The available data suggest that excessive alcohol exposure affects neural signaling within brain networks underlying adaptive behavioral learning via the implementation of prediction errors. Namely, weaker ventromedial prefrontal cortex activity and altered connectivity between ventral striatum and dorsolateral prefrontal cortex likely underpin a shift from goal-directed to habitual actions which, in turn, might underpin compulsive alcohol consumption and relapsing episodes despite adverse consequences. Overall, these data highlight abnormal fronto-striatal connectivity as a candidate neurobiological marker of impaired choice in AUDs. Further studies are needed, however, to unveil its implications in the multiple facets of decision-making.

13.
The past several years have seen a resurgence of interest in understanding the psychological and neural bases of what are often referred to as “negative symptoms” in schizophrenia. These aspects of schizophrenia include constructs such as asociality, avolition (a reduction in the motivation to initiate or persist in goal-directed behavior), and anhedonia (a reduction in the ability to experience pleasure). We believe that these dimensions of impairment in individuals with schizophrenia reflect difficulties using internal representations of emotional experiences, previous rewards, and motivational goals to drive current and future behavior in a way that would allow them to obtain desired outcomes, a deficit that has major clinical significance in terms of functional capacity. In this article, we review the major components of the systems that link experienced and anticipated rewards with motivated behavior that could potentially be impaired in schizophrenia. We conclude that the existing evidence suggests relatively intact hedonics in schizophrenia, but impairments in some aspects of reinforcement learning, reward prediction, and prediction error processing, consistent with an impairment in “wanting.” As of yet, there is only indirect evidence of impairment in anterior cingulate and orbital frontal function that may support value and effort computations. However, there are intriguing hints that individuals with schizophrenia may not be able to use reward information to modulate cognitive control and dorsolateral prefrontal cortex function, suggesting a potentially important role for cortical–striatal interactions in mediating impairment in motivated and goal-directed behavior in schizophrenia.

14.
Reward detection, surprise detection and prediction-error signaling have all been proposed as roles for the ventral striatum (vStr). Previous neuroimaging studies of striatal function in schizophrenia have found attenuated neural responses to reward-related prediction errors; however, as prediction errors represent a discrepancy in mesolimbic neural activity between expected and actual events, it is critical to examine responses to both expected and unexpected rewards (URs) in conjunction with expected and unexpected reward omissions in order to clarify the nature of ventral striatal dysfunction in schizophrenia. In the present study, healthy adults and people with schizophrenia were tested with a reward-related prediction-error task during functional magnetic resonance imaging to determine whether schizophrenia is associated with altered neural responses in the vStr to rewards, to surprise, to prediction errors, or to all three factors. In healthy adults, we found neural responses in the vStr were correlated more specifically with prediction errors than with surprising events or reward stimuli alone. People with schizophrenia did not display the normal differential activation between expected and unexpected rewards, which was partially due to exaggerated ventral striatal responses to expected rewards (right vStr) but also included blunted responses to unexpected outcomes (left vStr). This finding shows that neural responses typically elicited by surprise can also occur to well-predicted events in schizophrenia, and identifies aberrant activity in the vStr as a key node of dysfunction in the neural circuitry used to differentiate expected and unexpected feedback in schizophrenia.

15.

Purpose of Review

Surprises are important sources of learning. Cognitive scientists often refer to surprises as “reward prediction errors,” a parameter that captures discrepancies between expectations and actual outcomes. Here, we integrate neurophysiological and functional magnetic resonance imaging (fMRI) results addressing the processing of reward prediction errors and how they might be altered in drug addiction and Parkinson’s disease.

Recent Findings

By increasing phasic dopamine responses, drugs might accentuate prediction error signals, causing increases in fMRI activity in mesolimbic areas in response to drugs. Chronic substance dependence, by contrast, has been linked with compromised dopaminergic function, which might be associated with blunted fMRI responses to pleasant non-drug stimuli in mesocorticolimbic areas. In Parkinson’s disease, dopamine replacement therapies seem to induce impairments in learning from negative outcomes.

Summary

The present review provides a holistic overview of reward prediction errors across different pathologies and might inform future clinical strategies targeting impulsive/compulsive disorders.

16.
Increased striatal dopamine synthesis capacity has consistently been reported in patients with schizophrenia. However, the mechanism translating this into behavior and symptoms remains unclear. It has been proposed that heightened striatal dopamine may blunt dopaminergic reward prediction error signaling during reinforcement learning. In this study, we investigated striatal dopamine synthesis capacity, reward prediction errors, and their association in unmedicated schizophrenia patients (n = 19) and healthy controls (n = 23). They took part in FDOPA-PET and underwent functional magnetic resonance imaging (fMRI) scanning, where they performed a reversal-learning paradigm. The groups were compared regarding dopamine synthesis capacity (Kicer), fMRI neural prediction error signals, and the correlation of both. Patients did not differ from controls with respect to striatal Kicer. Taking comorbid alcohol abuse into account revealed that patients without such abuse showed elevated Kicer in the associative striatum, while those with abuse did not differ from controls. Comparing all patients to controls, patients performed worse during reversal learning and displayed reduced prediction error signaling in the ventral striatum. In controls, Kicer in the limbic striatum correlated with higher reward prediction error signaling, while there was no significant association in patients. Kicer in the associative striatum correlated with higher positive symptoms, and blunted reward prediction error signaling was associated with negative symptoms. Our results suggest a dissociation between striatal subregions and symptom domains, with elevated dopamine synthesis capacity in the associative striatum contributing to positive symptoms while blunted prediction error signaling in the ventral striatum related to negative symptoms.

17.
Drugs of abuse elicit dopamine release in the ventral striatum, possibly biasing dopamine‐driven reinforcement learning towards drug‐related reward at the expense of non‐drug‐related reward. Indeed, in alcohol‐dependent patients, reactivity in dopaminergic target areas is shifted from non‐drug‐related stimuli towards drug‐related stimuli. Such ‘hijacked’ dopamine signals may impair flexible learning from non‐drug‐related rewards, and thus promote craving for the drug of abuse. Here, we used functional magnetic resonance imaging to measure ventral striatal activation by reward prediction errors (RPEs) during a probabilistic reversal learning task in recently detoxified alcohol‐dependent patients and healthy controls (N = 27). All participants also underwent 6‐[18F]fluoro‐DOPA positron emission tomography to assess ventral striatal dopamine synthesis capacity. Neither ventral striatal activation by RPEs nor striatal dopamine synthesis capacity differed between groups. However, ventral striatal coding of RPEs correlated inversely with craving in patients. Furthermore, we found a negative correlation between ventral striatal coding of RPEs and dopamine synthesis capacity in healthy controls, but not in alcohol‐dependent patients. Moderator analyses showed that the magnitude of the association between dopamine synthesis capacity and RPE coding depended on the amount of chronic, habitual alcohol intake. Despite the relatively small sample size, a power analysis supports the reported results. Using a multimodal imaging approach, this study suggests that dopaminergic modulation of neural learning signals is disrupted in alcohol dependence in proportion to long‐term alcohol intake of patients. Alcohol intake may perpetuate itself by interfering with dopaminergic modulation of neural learning signals in the ventral striatum, thus increasing craving for habitual drug intake.

18.
Behavioral and neurophysiological evidence suggest that attention-deficit hyperactivity disorder (ADHD) is characterized by the impact of abnormal reward prediction error signals carried by the midbrain dopamine system on frontal brain areas that implement cognitive control. To investigate this issue, we recorded the event-related brain potential (ERP) from typical children and children with ADHD as they navigated a "virtual maze" to find monetary rewards, and physically gave them their accumulated rewards halfway through the task and at the end of the experiment. We found that the amplitude of a reward-related ERP component decreased somewhat for typical children after they received their first payment, but increased for children with ADHD following the payment. This result indicates that children with ADHD are unusually sensitive to the salience of reward and suggests that such sensitivity may be mediated in part by the midbrain dopamine system.

19.
Abler B, Walter H, Erk S. Neuroreport 2005;16(7):669-672
Psychological considerations suggest that the omission of rewards in humans comprises two effects: first, an allocentric effect triggering learning and behavioural changes potentially processed by dopaminergic neurons according to the prediction error theory; second, an egocentric effect representing the individual's emotional reaction, commonly called frustration. We investigated this second effect in the context of omission of monetary reward with functional magnetic resonance imaging. As expected, the contrast omission relative to receipt of reward led to a decrease in ventral striatal activation consistent with prediction error theory. Increased activation for this contrast was found in areas previously related to emotional pain: the right anterior insula and the right ventral prefrontal cortex. We interpreted this as a neural correlate of the egocentric effect.

20.
A prominent theory in neuroscience suggests reward learning is driven by the discrepancy between a subject's expectation of an outcome and the actual outcome itself. Furthermore, it is postulated that midbrain dopamine neurons relay this mismatch to target regions including the ventral striatum. Using functional MRI (fMRI), we tested striatal responses to prediction errors for probabilistic classification learning with purely cognitive feedback. We used a version of the Rescorla-Wagner model to generate prediction errors for each subject and then entered these in a parametric analysis of fMRI activity. Activation in ventral striatum/nucleus accumbens (NAcc) increased parametrically with prediction error for negative feedback. This result extends recent neuroimaging findings in reward learning by showing that learning with cognitive feedback also depends on the same circuitry and dopaminergic signaling mechanisms.
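The abstract names the Rescorla-Wagner model as the generator of trial-by-trial prediction errors for the parametric fMRI analysis. A minimal sketch of that generator; the learning rate, initial value, and feedback sequence are illustrative placeholders, not the study's fitted parameters:

```python
def rescorla_wagner(outcomes, alpha=0.2, v0=0.5):
    """Trial-by-trial Rescorla-Wagner prediction errors.
    outcomes: 1 for positive feedback, 0 for negative feedback.
    On each trial, delta_t = o_t - V_t, then V is updated by
    V_{t+1} = V_t + alpha * delta_t. Returns the list of delta_t."""
    v = v0
    errors = []
    for o in outcomes:
        delta = o - v
        errors.append(delta)
        v += alpha * delta
    return errors

# A mostly-rewarded cue: the negative feedback on trial 4 produces a large
# negative prediction error, the kind of value entered as a parametric
# regressor against the fMRI signal.
errs = rescorla_wagner([1, 1, 1, 0, 1])
assert errs[0] > 0   # better than expected
assert errs[3] < 0   # worse than expected (negative feedback)
```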
