Pubfacts - Scientific Publication Data

Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning.

Authors:
Chenjia Bai, Peng Liu, Kaiyu Liu, Lingxiao Wang, Yingnan Zhao, Lei Han, Zhaoran Wang

IEEE Trans Neural Netw Learn Syst 2021 Dec 1;PP. Epub 2021 Dec 1.

Efficient exploration remains a challenging problem in reinforcement learning, especially for tasks where extrinsic rewards from environments are sparse or even totally disregarded. Significant advances based on intrinsic motivation show promising results in simple environments but often get stuck in environments with multimodal and stochastic dynamics. In this work, we propose a variational dynamic model based on the conditional variational inference to model the multimodality and stochasticity. We consider the environmental state-action transition as a conditional generative process by generating the next-state prediction under the condition of the current state, action, and latent variable, which provides a better understanding of the dynamics and leads to a better performance in exploration. We derive an upper bound of the negative log likelihood of the environmental transition and use such an upper bound as the intrinsic reward for exploration, which allows the agent to learn skills by self-supervised exploration without observing extrinsic rewards. We evaluate the proposed method on several image-based simulation tasks and a real robotic manipulating task. Our method outperforms several state-of-the-art environment model-based exploration approaches.
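The bound described in the abstract can be illustrated with the standard conditional variational inequality, -log p(s'|s,a) <= E_q[-log p(s'|s,a,z)] + KL(q(z|s,a,s') || p(z|s,a)). The sketch below is not the authors' implementation. It assumes diagonal Gaussian posterior and prior distributions, a unit-variance Gaussian decoder, and a single latent sample that has already been decoded into a next-state prediction. All function and argument names are hypothetical.

```python
import math


def gaussian_nll(x, mean, var=1.0):
    """Negative log density of x under an isotropic Gaussian N(mean, var * I)."""
    return sum(
        0.5 * (math.log(2.0 * math.pi * var) + (xi - mi) ** 2 / var)
        for xi, mi in zip(x, mean)
    )


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between diagonal Gaussians given means and log-variances."""
    return sum(
        0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
        for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p)
    )


def intrinsic_reward(next_state, predicted_next_state,
                     mu_q, logvar_q, mu_p, logvar_p):
    """Single-sample estimate of the negative ELBO.

    Assumption: predicted_next_state was decoded from one latent sample z
    drawn from the posterior q(z | s, a, s'). In expectation over z, the
    returned value upper-bounds -log p(s' | s, a).
    """
    reconstruction = gaussian_nll(next_state, predicted_next_state)
    divergence = gaussian_kl(mu_q, logvar_q, mu_p, logvar_p)
    return reconstruction + divergence


# Toy usage: a poorly predicted transition earns a larger reward.
well_predicted = intrinsic_reward([1.0, 2.0], [1.0, 2.0], [0.0], [0.0], [0.0], [0.0])
poorly_predicted = intrinsic_reward([1.0, 2.0], [0.0, 0.0], [0.0], [0.0], [0.0], [0.0])
print(well_predicted < poorly_predicted)
```

Under these assumptions, transitions that the model reconstructs poorly, or whose posterior departs from the prior, receive a larger reward, which is what steers the agent toward dynamics it has not yet learned.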

Source
http://dx.doi.org/10.1109/TNNLS.2021.3129160
December 2021

Publication Analysis

Top Keywords (keyword: occurrences)

extrinsic rewards: 8
variational dynamic: 8
upper bound: 8
self-supervised exploration: 8
reinforcement learning: 8
exploration: 6
multimodality stochasticity: 4
model multimodality: 4
bound negative: 4
inference model: 4
variational inference: 4
negative log: 4
conditional variational: 4
based conditional: 4
log likelihood: 4
exploration approaches: 4
model based: 4
derive upper: 4
likelihood environmental: 4
environmental transition: 4

Similar Publications

Learning Attentional and Gated Communication via Curiosity.

Authors:
Chuxiong Sun, Kaijie Zhou, Cong Cong, Kai Li, Rui Wang, Xiaohui Hu

Comput Intell Neurosci 2022 26;2022:2951193. Epub 2022 Apr 26.

Institute of Software, Chinese Academy of Sciences, Haidian, Beijing 100190, China.

Due to the partial observability in decentralized multi-agent systems, communication is critical for cooperation. Furthermore, the ability to decide when and whom to communicate is important to achieve efficient communication. However, the existing methods are typically driven by extrinsic rewards.

May 2022

Protection motivation theory and smoking quitting intention: findings based on structural equation modelling and mediation analysis.

Authors:
Haoxiang Lin, Meijun Chen, Qingping Yun, Lanchao Zhang, Chun Chang

BMC Public Health 2022 Apr 27;22(1):838. Epub 2022 Apr 27.

Department of Social Medicine and Health Education, School of Public Health, Peking University Health Science Center, 38 Xueyuan Rd, Haidian District, Beijing, China.

Objective: Although many smoking cessation strategies have been implemented, only a few strategies at the population level are grounded in theory. Even in those interventions based on specific theories, most studies have focused only on the outcome. The main objective of this study was to demonstrate the utility of protection motivation theory (PMT) in explaining smoking quitting behaviour among adults, with the goal of providing valuable evidence for further intervention strategies.

April 2022

Contributions of expected learning progress and perceptual novelty to curiosity-driven exploration.

Authors:
Francesco Poli, Marlene Meyer, Rogier B Mars, Sabine Hunnius

Cognition 2022 Apr 12;225:105119. Epub 2022 Apr 12.

Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, the Netherlands.

Exploration is curiosity-driven when it relies on the intrinsic motivation to know rather than on extrinsic rewards. Recent evidence shows that artificial agents perform better on a variety of tasks when their learning is curiosity-driven, and humans often engage in curiosity-driven learning when sampling information from the environment. However, which mechanisms underlie curiosity is still unclear.

April 2022

Credentialed veterinary technician intrinsic and extrinsic rewards: a narrative review.

Authors:
David C Driscoll

J Am Vet Med Assoc 2022 Apr 13:1-7. Epub 2022 Apr 13.

The economic literature on veterinary technicians is limited, and the AVMA Task Force on Veterinary Technician Utilization has recommended increasing veterinary technician economic research in several areas. The aim of this review was to provide an economic overview of the veterinary technician profession based on intrinsic and extrinsic rewards. Data sources for this paper include articles and texts from the veterinary, human medical, and service industries concerning veterinary technicians and from economic and psychology literature.

April 2022

Text Messages and Financial Incentives to Increase Physical Activity in Adolescents With Prediabetes and Type 2 Diabetes: Web-Based Group Interviews to Inform Intervention Design.

Authors:
Mary Ellen Vajravelu, Talia Alyssa Hitt, NaDea Mak, Aliya Edwards, Jonathan Mitchell, Lisa Schwartz, Andrea Kelly, Sandra Amaral

JMIR Diabetes 2022 Apr 6;7(2):e33082. Epub 2022 Apr 6.

Division of Nephrology, The Children's Hospital of Philadelphia, Philadelphia, PA, United States.

Background: Physical activity is a major component of treatment for adolescents with obesity and prediabetes or type 2 diabetes; however, sedentary behavior remains pervasive. An SMS text message-based intervention paired with financial incentives may be an effective way to promote physical activity in this population.

Objective: This study aims to obtain end-user feedback on SMS text message content and assess the acceptability of a planned SMS text messaging intervention with financial incentives to motivate youth with prediabetes or type 2 diabetes to increase physical activity.

April 2022
© 2022 PubFacts.