Design of a Stair-Climbing Robot
Liaoning University of Science and Technology — Undergraduate Graduation Design

Learning Control of Robot Manipulators

ROBERTO HOROWITZ
Department of Mechanical Engineering
University of California at Berkeley
Berkeley, CA 94720, U.S.A.
Phone: (510) 642-4675
e-mail: horowitz@canaima.berkeley.edu

Abstract

Learning control encompasses a class of control algorithms for programmable machines, such as robots, which attain, through an iterative process, the motor dexterity that enables the machine to execute complex tasks. In this paper we discuss the use of function identification and adaptive control algorithms in learning controllers for robot manipulators. In particular, we discuss the similarities and differences between betterment learning schemes, repetitive controllers and adaptive learning schemes based on integral transforms. The stability and convergence properties of adaptive learning algorithms based on integral transforms are highlighted, and experimental results illustrating some of these properties are presented.

Key words: Learning control, adaptive control, repetitive control, robotics

Introduction

The emulation of human learning has long been among the most sought after and elusive goals in robotics and artificial intelligence. Many aspects of human learning are still not well understood. However, much progress has been achieved in robot motion control toward emulating how humans develop the necessary motor skills to execute complex motions. In this paper we will refer to learning controllers as the class of control systems that generate a control action in an iterative manner, using a function adaptation algorithm, in order to execute a prescribed action.
In typical learning control applications, the machine under control repeatedly attempts to execute a prescribed task, while the adaptation algorithm successively improves the control system's performance from one trial to the next by updating the control input based on the error signals from previous trials. The term learning control in the robot motion control context was perhaps first used by Arimoto and his colleagues (c.f. (Arimoto et al., 1984; Arimoto et al., 1988)). Arimoto defined learning control as the class of control algorithms that achieve asymptotically zero tracking error by an iterative betterment process, which Arimoto called learning. In this process a single finite-horizon tracking task is repeatedly performed by the robot, always starting from the same initial condition. The control action at each trial is equal to the control action of the previous trial plus terms proportional to the tracking error and its time derivative, respectively.

Parallel to the development of the learning and betterment control schemes, a significant amount of research has been directed toward the application of repetitive control algorithms to robot trajectory tracking and other motion control problems (c.f. (Hara et al., 1988; Tomizuka et al., 1989; Tomizuka, 1992)). The basic objective in repetitive control is to cancel an unknown periodic disturbance or to track an unknown periodic reference trajectory. In its simplest form, the periodic signal generator of many repetitive control algorithms closely resembles the betterment learning laws in (Arimoto et al., 1984; Arimoto et al., 1988). However, while the betterment learning controller acts during a finite time horizon, the repetitive controller acts continuously as a regulator. Moreover, in the betterment learning approach it is assumed that, at every learning trial, the robot starts executing the task from the same initial condition.
This is not the case in the repetitive control approach.

My interest in learning and repetitive control arose in 1987, as a consequence of studying the stability of a class of adaptive and repetitive controllers for robot manipulators with my former student and colleague Nader Sadegh. My colleague and friend Masayoshi Tomizuka had been working very actively in the area of repetitive control, and he introduced me to this problem. At that time there was much activity in the robotics and control communities toward finding adaptive control algorithms for robot manipulators that could be rigorously proven to be asymptotically stable. The problem had recently been solved using passivity by Slotine and Li (1986), Sadegh and Horowitz (1987) and Wen and Bayard (1988). In contrast, most of the stability results in learning and repetitive control of that period relied on several unrealistic assumptions: either the dynamics of the robot were assumed to be linear, or it was assumed that they could be at least partially linearized with feedback control. Moreover, it was assumed in most works that the actual response of the robot was periodic or repeatable, even during the learning transient, and that joint accelerations could be directly measured. Nader and I had recently overcome some of these problems in our adaptive control research, and concluded that learning controllers could be synthesized and analyzed using a similar approach. For us, the main appeal of learning and repetitive controllers lay in their simplicity. In most of the other approaches to robot trajectory control, including parametric adaptive control, it is necessary to compute the so-called inverse dynamic equations of the robot. In many of these schemes these equations have to be computed in real time. In contrast, in the betterment learning and repetitive control schemes, the control action is generated by relatively simple functional adaptation algorithms.
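The betterment-style update described above — next trial's input equals the previous trial's input plus a term proportional to the tracking error — can be sketched in a few lines. The first-order plant, gains and desired trajectory below are illustrative assumptions for the sketch, not the robot model or laws from the paper.

```python
import numpy as np

# Illustrative discrete-time plant (a stand-in for the robot dynamics,
# NOT the nonlinear model in the paper): x[t+1] = a*x[t] + b*u[t]
a, b = 0.3, 0.5
T = 50                                     # finite horizon, as in betterment learning
r = np.sin(np.linspace(0.0, np.pi, T + 1))  # desired trajectory (assumed)

def run_trial(u):
    """Execute one trial, always from the same initial condition."""
    x = np.zeros(T + 1)                    # x[0] = 0 every trial (betterment assumption)
    for t in range(T):
        x[t + 1] = a * x[t] + b * u[t]
    return x

u = np.zeros(T)                            # no a-priori knowledge of the plant
errors = []
for trial in range(30):
    x = run_trial(u)
    e = r - x                              # tracking error for this trial
    errors.append(np.linalg.norm(e))
    # Betterment-style update: next trial's input = this trial's input
    # plus a gain times the (time-shifted) tracking error.
    u = u + 0.8 * e[1:]

print(errors[0], errors[-1])               # error norm shrinks from trial to trial
```

The time shift `e[1:]` pairs each input `u[t]` with the error it first influences, `e[t+1]`; the gain is chosen so the trial-to-trial error map is a contraction for this plant.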
Moreover, since most robot applications in industry involve the repeated execution of the same task, the idea of implementing a control algorithm which would “learn” through practice, without requiring any a priori knowledge of the structure of the robot equations of motion, was also very appealing. Our approach to the synthesis of learning controllers relied on the following insights:

i) In learning control, motor dexterity is not gained through the use of feedback. Feedback was the approach used in most adaptive controllers at the time (c.f. (Slotine and Li, 1987; Sadegh and Horowitz, 1987; Ortega and Spong, 1989)). In these adaptive schemes both the nonlinear control law and the parameter adaptation algorithm regressor are functions of the actual robot joint coordinates and velocities.

ii) In contrast, in learning control algorithms, motor dexterity is gained through the use of a feedforward control action which is stored in memory and is retrieved as the task is executed. The learning process involves the functional adaptation of the feedforward action.

iii) Feedback plays a fundamental role in stabilizing the system and in guaranteeing that the map between the feedforward function error and the tracking errors is strictly passive.

It therefore became apparent to us that, in order to use passivity-based adaptive control results in the synthesis and analysis of learning and repetitive algorithms, it was necessary to formulate, and to prove the stability of, an adaptive control law which accomplishes the linearization of the robot dynamics by feedforward rather than feedback control. We presented these results in (Sadegh and Horowitz, 1990) with the introduction of the so-called Desired Compensation Adaptive Law (DCAL). In this adaptive scheme both the nonlinear control law and the parameter adaptation algorithm regressor are functions of the desired trajectories and velocities.
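The passivity-based structure that the DCAL builds on — a feedforward term equal to a regressor times a parameter estimate, a stabilizing feedback term, and gradient parameter adaptation — can be caricatured for a single rigid joint. The sketch below uses the closely related Slotine–Li-style law (its regressor keeps one measured velocity term, whereas the DCAL proper evaluates the regressor purely along the desired trajectory); the inertia value, gains, trajectory and step size are all assumed for illustration.

```python
import numpy as np

# Single rigid joint: m*qdd = u, with unknown inertia m (illustrative value).
m_true = 2.0
dt, T = 1e-3, 15.0
lam, K, gamma = 2.0, 20.0, 10.0   # feedback and adaptation gains (assumed)

q, qd = 0.0, 0.0                  # joint position and velocity
theta = 0.0                       # inertia estimate: no a-priori knowledge
errs, t = [], 0.0
for _ in range(int(T / dt)):
    r, rd, rdd = np.sin(t), np.cos(t), -np.sin(t)  # desired trajectory
    e, ed = r - q, rd - qd        # tracking errors
    s = ed + lam * e              # composite error (passivity-based design)
    v = rdd + lam * ed            # reference acceleration = scalar regressor
    u = theta * v + K * s         # feedforward (regressor * estimate) + feedback
    theta += gamma * v * s * dt   # gradient parameter adaptation
    qdd = u / m_true              # true plant responds
    qd += qdd * dt
    q += qd * dt
    t += dt
    errs.append(abs(e))

print(max(errs[:1000]), max(errs[-1000:]), theta)
```

A Lyapunov function V = (1/2) m s^2 + (m - theta)^2 / (2 gamma) decreases along trajectories of this law, which is why the tracking error shrinks and the inertia estimate drifts toward the true value under the persistently exciting sinusoidal reference.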
Subsequently we were able to synthesize repetitive controllers for robot arms by replacing the adaptive law in the DCAL with a repetitive control law (Sadegh et al., 1990). Unfortunately, as discussed in (Hara et al., 1988; Tomizuka et al., 1989; Sadegh et al., 1990), the asymptotic convergence of the basic repetitive control system can only be guaranteed under restrictive conditions on the plant dynamics or restrictions on the nature of the disturbance signals. These conditions are generally not satisfied in robot control applications. Most often, modifications of the update schemes are introduced, such as the so-called Q-filter modification (Hara et al., 1988; Tomizuka et al., 1989), that enhance the robustness of the repetitive controller at the expense of limiting its tracking performance. Likewise, the convergence of betterment learning schemes is proven by appealing to strict assumptions regarding the initial condition of the robot at the beginning of each learning trial.

Another shortcoming of the betterment learning and repetitive control schemes discussed so far is that these algorithms were developed for the iterative learning of a single task. None of the research works in these areas provided a mechanism for extending the learning process so that a family of tasks can be simultaneously learned by the machine, or provided a systematic mechanism for using the dexterity gained in learning a particular task to subsequently perform a somewhat different task of a similar nature. After Nader left Berkeley to become a faculty member at the Georgia Institute of Technology, I began working on these problems with Bill Messner. Our research has revealed that the robustness limitations of the basic betterment and repetitive control laws, and the inability of these algorithms to learn multiple tasks, in part stem from the fact that all these schemes use point-to-point function adaptation algorithms.
These algorithms only update the value of the control input at the current instant of time and do not provide a mechanism for updating the control input at neighboring points. However, in most applications the control function that must be identified is usually at least piecewise continuous. Thus, the value of the control at a given point will be almost the same as those at nearby points. Point-to-point function update laws do not take advantage of this situation. This issue has implications for more general learning problems and for content-addressable memories.

Let us consider as an example the case of multi-task learning control algorithms for robot manipulators. In this application a function of several variables must be identified, namely the robot inverse dynamics. The trajectory used for training in betterment control cannot visit every point (or vector) in the domain of the function in a finite amount of time. Thus, the perfect identification of a control input function for one task using a point-to-point update law will not provide any information for generating the control input for other similar tasks unless the trajectories intersect, or some sort of interpolation is used. Similarly, in content-addressable memories, it is desirable that the learning algorithm have an “interpolating” property, so that input vectors which are similar to previously learned input vectors, but are novel to the system, produce output vectors that are similar to previously learned output vectors.

One solution to the interpolation problem in robot learning control was presented in (Miller, 1987) with the use of the so-called “cerebellar model arithmetic computer” (CMAC). In this algorithm an input vector is mapped to several locations in an intermediate memory, and the output vector is computed by summing over the values stored in all the locations to which the input vector was mapped.
The mapping of input vectors has the property that inputs near to each other map to overlapping regions in the intermediate memory. This causes interpolation to be performed automatically.

In (Messner et al., 1991) we introduced a class of function identification algorithms for learning control systems based on integral transforms, in order to address the robustness and interpolation problems of the point-to-point repetitive and betterment learning controllers mentioned above. In these adaptive learning algorithms unknown functions are defined in terms of integral equations of the first kind, which consist of known kernels and unknown influence functions. The learning process involves the indirect estimation of the unknown functions by estimating the influence functions. The entire influence function is modified in proportion to the value of the kernel at each point. Thus, the use of the kernel both in the update of the influence functions and in the generation of the function estimate provides these algorithms with desirable interpolation and smoothing properties, and overcomes many of the limitations of prior point-to-point betterment and repetitive control schemes concerning the estimation of multivariable functions. Moreover, the use of integral transforms makes it possible to demonstrate strong stability and convergence results for these learning algorithms. In the remainder of the paper we discuss the use of learning control in the robot tracking control context, and stress the similarities and differences between betterment learning schemes, repetitive control schemes and learning schemes based on integral transforms.
Conclusions and reflections on some of the outstanding problems in this area are included in the last section.
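The integral-transform idea just outlined can be caricatured in a few lines: the unknown function is represented as a first-kind integral of a known kernel against an unknown influence function, and the entire influence function is updated in proportion to the kernel evaluated at the current training point. Everything below (the Gaussian kernel, its width, the gain, the target function) is an illustrative assumption, not the construction in (Messner et al., 1991).

```python
import numpy as np

# Unknown function to identify (stands in for the feedforward control input).
f = lambda x: np.sin(2 * np.pi * x)

# Influence function c(xi) discretized on a grid; Gaussian kernel k (assumed).
xi = np.linspace(0.0, 1.0, 200)
dxi = xi[1] - xi[0]
c = np.zeros_like(xi)
k = lambda x, grid: np.exp(-0.5 * ((x - grid) / 0.05) ** 2)

def f_hat(x):
    # First-kind integral transform: f_hat(x) = ∫ k(x, xi) c(xi) dxi
    return np.sum(k(x, xi) * c) * dxi

rng = np.random.default_rng(0)
for _ in range(5000):
    x = rng.uniform(0.0, 1.0)      # point visited at this instant of "training"
    e = f(x) - f_hat(x)            # function estimation error at that point
    # The WHOLE influence function moves in proportion to the kernel, so
    # nearby points are corrected too: built-in interpolation and smoothing.
    c += 2.0 * e * k(x, xi)

# Evaluate at points never used for training: the kernel has interpolated.
novel = [0.123, 0.456, 0.789]
print([round(abs(f(x) - f_hat(x)), 3) for x in novel])
```

Contrast this with a point-to-point update, which would adjust only the table entry at the visited point and would say nothing about its neighbors; here the kernel width controls how far each correction spreads.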