Researchers at Heriot-Watt University and Alana AI have developed FurChat, a system that combines large language models (LLMs) with an embodied conversational agent to hold context-specific conversations and provide information in particular settings. The system pairs an LLM, such as ChatGPT or an open-source alternative, with Furhat, an animated, speech-enabled social robot.
LLMs have become increasingly popular because they can interact with people in real time and give human-like answers to a wide range of questions. Most LLMs, however, are generic and not fine-tuned for specific topics, while the chatbots and robots deployed in public spaces typically rely on narrower, task-specific natural language processing models.
The FurChat system aims to bridge this gap by combining the open-domain conversational ability of LLMs with specific information sources. For example, FurChat can answer questions about a building or an organization, such as the UK National Robotarium. A similar version of the system has been developed for the SPRING project in France, where it provides information about the Broca hospital in French using an ARI robot.
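The paper describes the exact prompting setup, but the core idea can be sketched as follows: venue-specific facts are injected into the LLM's system prompt so that open-domain chat stays grounded in the information the robot is meant to provide. The model choice, prompt wording, and facts below are illustrative assumptions, not the authors' exact configuration.

```python
# Sketch: grounding an LLM chat in venue-specific facts via the system prompt.
# The facts, prompt text, and model name are illustrative assumptions,
# not the exact configuration used in FurChat.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

VENUE_FACTS = """\
The UK National Robotarium is a research facility for robotics and AI,
based at Heriot-Watt University in Edinburgh.
"""  # hypothetical knowledge snippet; a real deployment would curate this

SYSTEM_PROMPT = (
    "You are FurChat, a friendly receptionist robot at the National Robotarium. "
    "Answer visitor questions using the facts below when relevant, and fall back "
    "to general conversation otherwise.\n\n" + VENUE_FACTS
)

def chat(user_utterance: str, history: list[dict]) -> str:
    """Send the dialogue history plus the new utterance to the LLM."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}] + history
    messages.append({"role": "user", "content": user_utterance})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",   # stand-in for the GPT-3.5 model mentioned below
        messages=messages,
        temperature=0.7,
    )
    return response.choices[0].message.content

print(chat("What can I see at the Robotarium?", history=[]))
```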
One of the team's main objectives was to test whether an LLM can generate facial expressions that fit the robot's spoken responses. In the FurChat system, both the responses and the accompanying facial expressions of the embodied conversational agent are generated by the GPT-3.5 model, and the Furhat robot then delivers them through speech and facial animation.
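One way this can be realized is to prompt the LLM to append an expression tag to each reply and then map that tag onto one of Furhat's built-in gestures through the furhat-remote-api Python package. The tag scheme, parsing, and gesture names below are assumptions for illustration, not the authors' exact implementation.

```python
# Sketch: asking the LLM to tag each reply with a facial expression, then
# having the Furhat robot speak the reply and perform a matching gesture.
# The tag format and gesture mapping are illustrative assumptions.
import re
from openai import OpenAI
from furhat_remote_api import FurhatRemoteAPI  # Furhat's Python remote API

client = OpenAI()
furhat = FurhatRemoteAPI("localhost")  # address of the robot or SDK simulator

EXPRESSION_TO_GESTURE = {
    # Left: tag the LLM is asked to emit. Right: assumed Furhat gesture name;
    # check the gestures available on your robot before relying on these.
    "happy": "BigSmile",
    "surprised": "Surprise",
    "neutral": "Nod",
}

SYSTEM_PROMPT = (
    "You are a friendly receptionist robot. End every reply with an expression "
    "tag in square brackets, one of [happy], [surprised], or [neutral]."
)

def speak_with_expression(user_utterance: str) -> None:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_utterance},
        ],
    )
    reply = response.choices[0].message.content
    # Pull out the trailing [tag] and strip it from the spoken text.
    match = re.search(r"\[(happy|surprised|neutral)\]\s*$", reply)
    expression = match.group(1) if match else "neutral"
    text = re.sub(r"\[(happy|surprised|neutral)\]\s*$", "", reply).strip()

    furhat.gesture(name=EXPRESSION_TO_GESTURE[expression])
    furhat.say(text=text)

speak_with_expression("Hello! What is this place?")
```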
The FurChat system represents a significant advancement in the field of embodied AI for natural interaction with humans. By combining LLMs with animated speech-enabled robots, it allows for engaging and informative conversations in specific settings. Further research and development of this system could lead to more advanced and context-aware conversational agents in the future.
Sources:
– Cherakara et al., "FurChat: Combining Large Language Models with Embodied Conversational Agents for Context-Specific Conversations" (preprint on arXiv)