Research Papers
The Most Trusted Name in Research
Quanser systems offer a highly efficient platform for bridging the gap between advanced theoretical and algorithmic frameworks and real-world implementation. Browse our growing collection of research papers that demonstrate how Quanser systems help researchers around the globe validate their concepts.
Advising reinforcement learning toward scaling agents in continuous control environments with sparse rewards
BibTeX
@article{ren_2_2020,
title = {Advising reinforcement learning toward scaling agents in continuous control environments with sparse rewards},
author = {Ren, H.; Ben-Tzvi, P.},
journal = {Engineering Applications of Artificial Intelligence},
year = {2020},
month = {04},
volume = {90},
institution = {Virginia Tech, USA},
keywords = {Reinforcement learning, Advising framework, Continuous control, Sparse reward, Multi-agent},
language = {English},
publisher = {Elsevier Ltd.}
}
Abstract
This paper adapts the success of the teacher–student framework for reinforcement learning to a continuous control environment with sparse rewards. Furthermore, the proposed advising framework is designed for the scaling agents problem, wherein the student policy is trained to control multiple agents while the teacher policy is well trained for a single agent. Existing research on teacher–student frameworks has focused on the discrete control domain. Moreover, it relies on similar target and source environments and as such does not allow for scaling the agents. In this work, by contrast, the agents face a scaling agents problem where the value functions of the source and target tasks converge at different rates. Existing concepts from the teacher–student framework are adapted to meet new challenges, including early advising, importance of advising, and mistake correction, but a modified heuristic was used to decide when to teach. The performance of the proposed algorithm was evaluated using case studies of pushing, and of picking and placing, objects with a dual-arm manipulation system. The teacher policy was trained using a simulated scenario consisting of a single arm. The student policy was trained to handle the dual-arm manipulation system in simulation under the advice of the teacher agent. The trained student policy was then validated using two Quanser Mico arms for experimental demonstration. The effects of varying parameters on the student's performance in the advising framework were also analyzed and discussed. The results showed that the proposed advising framework expedited the training process and achieved the desired scaling within a limited advising budget.
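To make the advising scheme concrete, the following minimal Python sketch shows budgeted action advising in a continuous-control task with a sparse reward. The toy point-mass environment, the linear student policy, and the deviation-based importance heuristic are illustrative assumptions, not the paper's actual architecture or decision rule.

# Minimal sketch of budgeted action advising under sparse rewards.
# The environment, policies, and heuristic are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)

GOAL, TOL = 1.0, 0.05          # sparse reward: success only within TOL of GOAL
ADVICE_BUDGET = 200            # total teacher interventions allowed
IMPORTANCE_THRESHOLD = 0.3     # advise when the student deviates this much

def teacher_policy(state):
    """Well-trained single-agent policy: move proportionally toward the goal."""
    return np.clip(GOAL - state, -0.2, 0.2)

class StudentPolicy:
    """Linear policy fitted to the (state, action) pairs the teacher advises."""
    def __init__(self):
        self.w, self.b = rng.normal(scale=0.1), 0.0

    def act(self, state):
        return np.clip(self.w * state + self.b, -0.2, 0.2)

    def update(self, state, target_action, lr=0.05):
        # One gradient step on squared error toward the advised action
        # (the action clipping is ignored here for simplicity).
        err = self.act(state) - target_action
        self.w -= lr * err * state
        self.b -= lr * err

def run_episode(student, budget, horizon=50):
    state, used = rng.uniform(-1.0, 0.5), 0
    for _ in range(horizon):
        a_student = student.act(state)
        a_teacher = teacher_policy(state)
        # Importance heuristic (an assumption): intervene only while budget
        # remains and the student's action deviates strongly from the teacher's.
        if used < budget and abs(a_student - a_teacher) > IMPORTANCE_THRESHOLD:
            student.update(state, a_teacher)
            action, used = a_teacher, used + 1
        else:
            action = a_student
        state += action + rng.normal(scale=0.01)
        if abs(state - GOAL) < TOL:
            return 1.0, used           # sparse success reward
    return 0.0, used

student, budget = StudentPolicy(), ADVICE_BUDGET
for episode in range(300):
    _, used = run_episode(student, budget)
    budget -= used
print("advice remaining:", budget, "eval return:", run_episode(student, 0)[0])

The budget caps how often the teacher may intervene, so the heuristic must spend advice where the student's mistakes matter most; the modified heuristic in the paper addresses exactly this when-to-teach trade-off.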
Learning inverse kinematics and dynamics of a robotic manipulator using generative adversarial networks
Product(s):
Joint Control Robot – 4 DOF
BibTeX
@article{ren_2020,
title = {Learning inverse kinematics and dynamics of a robotic manipulator using generative adversarial networks},
author = {Ren, H.; Ben-Tzvi, P.},
journal = {Robotics and Autonomous Systems},
year = {2020},
month = {02},
volume = {124},
institution = {Virginia Tech, USA},
keywords = {Inverse kinematics, Inverse dynamics, Generative adversarial networks},
language = {English},
publisher = {Elsevier B.V.}
}
Abstract
Obtaining the inverse kinematics and dynamics of a robotic manipulator is often crucial for robot control. Analytical models are typically used to approximate real robot systems, and various controllers have been designed on top of the analytical model to compensate for the approximation error. Recently, machine learning techniques have been developed for error compensation, resulting in better performance. Unfortunately, combining a learned compensator with an analytical model makes the designed controller redundant and computationally expensive. In addition, general machine learning techniques require large amounts of data for training and approximation, especially in high-dimensional problems. As a result, state-of-the-art machine learning applications are either expensive in terms of computation and data collection, or limited to a local approximation for a specific task or routine. To address the high dimensionality of learning inverse kinematics and dynamics, and to make the training process more data efficient, this paper presents a novel approach using a series of modified Generative Adversarial Networks (GANs): Conditional GANs (CGANs), Least Squares GANs (LSGANs), Bidirectional GANs (BiGANs), and Dual GANs (DualGANs). We trained and tested the proposed methods using real-world data collected from two types of robotic manipulators, a MICO robotic manipulator and a Fetch robotic manipulator. The data input to the GANs was obtained by applying a sampling method to the real data. The proposed approach enables approximating the real model using limited data without compromising performance and accuracy. The proposed methods were tested in real-world experiments using unseen trajectories to validate the "learned" approximate inverse kinematics and inverse dynamics, as well as to demonstrate the capability and effectiveness of the proposed algorithm over existing analytical models.
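As an illustration of the general technique, the sketch below trains a conditional GAN (CGAN) in PyTorch to learn the inverse kinematics of a toy two-link planar arm. The arm geometry, network sizes, and training loop are assumptions for demonstration only; the paper's experiments instead used real data from MICO and Fetch manipulators and four GAN variants.

# Illustrative CGAN sketch for learning the inverse kinematics of a toy
# 2-link planar arm. All sizes and hyperparameters are assumptions, not
# the paper's actual setup for the MICO or Fetch manipulators.
import torch
import torch.nn as nn

torch.manual_seed(0)
L1, L2 = 1.0, 0.8  # link lengths of the toy arm

def forward_kinematics(q):
    """End-effector (x, y) for joint angles q; used only to sample data."""
    x = L1 * torch.cos(q[:, 0]) + L2 * torch.cos(q[:, 0] + q[:, 1])
    y = L1 * torch.sin(q[:, 0]) + L2 * torch.sin(q[:, 0] + q[:, 1])
    return torch.stack([x, y], dim=1)

def sample_batch(n=256):
    q = torch.rand(n, 2) * torch.pi          # random reachable joint configs
    return forward_kinematics(q), q          # (condition, real sample)

# Generator: pose + noise -> joint angles. Discriminator: (pose, joints) -> logit.
G = nn.Sequential(nn.Linear(2 + 4, 64), nn.ReLU(),
                  nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 2))
D = nn.Sequential(nn.Linear(2 + 2, 64), nn.ReLU(),
                  nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

for step in range(2000):
    pose, q_real = sample_batch()
    z = torch.randn(pose.shape[0], 4)
    q_fake = G(torch.cat([pose, z], dim=1))

    # Discriminator step: real (pose, q) pairs vs. generated ones.
    d_loss = bce(D(torch.cat([pose, q_real], dim=1)), torch.ones(pose.shape[0], 1)) + \
             bce(D(torch.cat([pose, q_fake.detach()], dim=1)), torch.zeros(pose.shape[0], 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator step: fool the discriminator on the same conditions.
    g_loss = bce(D(torch.cat([pose, q_fake], dim=1)), torch.ones(pose.shape[0], 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

# Query the learned inverse kinematics for a target pose and check the result.
target = torch.tensor([[1.2, 0.9]])
q_hat = G(torch.cat([target, torch.randn(1, 4)], dim=1))
print("reached:", forward_kinematics(q_hat).detach(), "target:", target)

Conditioning the generator on the target pose while feeding it noise lets a single network represent the many joint configurations that reach the same pose, which is what makes adversarial training a natural fit for the one-to-many inverse kinematics mapping.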