BadAgent Extension: Cross-Domain Robustness and Trigger Visibility in LLM Agents

Author(s): Pedro Yanes Garrido (Illinois Institute of Technology, USA) and Diego Fernandez Arias (Illinois Institute of Technology, USA)
Copyright: 2026
Pages: 26
Source title: Examining Vulnerabilities and Adversarial Exploitation of AI and LLMs
Source Author(s)/Editor(s): Puya Pakshad (Illinois Institute of Technology, USA) and Marwan Omar (Illinois Institute of Technology, USA)
DOI: 10.4018/979-8-3373-8252-4.ch010

Keywords: Artificial Intelligence / Engineering Science Reference / Information Security & Privacy / Security & Forensics

Abstract

This paper presents an empirical study on backdoor attacks in large language model agents. We extend a recent attack framework by adding two lightweight benchmarks that measure cross-domain robustness and trigger visibility without changing the model architecture. Our approach fine-tunes AgentLM-based agents with parameter-efficient methods on operating system and web browsing tasks using multiple poisoning ratios and both visible and invisible triggers. We then evaluate the agents with four metrics: Attack Success Rate, Follow Step Ratio, Cross-Domain Robustness, and Trigger Visibility Gap. The results show that backdoors often transfer to unseen domains without a drop in success, while invisible triggers significantly reduce the attack success rate compared to visible ones. These findings highlight the need for stronger evaluation tools and defenses for LLM-based agents.
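
The abstract names the evaluation metrics but does not define them. As a rough illustration only, three of them could be computed as below. The definitions are assumptions, not the chapter's formulas: Attack Success Rate is taken as the fraction of triggered episodes in which the backdoor action fired, Cross-Domain Robustness as the ratio of unseen-domain to in-domain success, and Trigger Visibility Gap as the difference between visible-trigger and invisible-trigger success. All function names are hypothetical.

```python
from typing import Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of triggered episodes where the backdoor action fired (assumed definition)."""
    if not outcomes:
        raise ValueError("need at least one triggered episode")
    return sum(outcomes) / len(outcomes)


def cross_domain_robustness(asr_in_domain: float, asr_unseen: float) -> float:
    """Ratio of unseen-domain ASR to in-domain ASR (assumed definition).

    A value near 1.0 would mean the backdoor transfers without a drop in success.
    """
    if asr_in_domain == 0:
        raise ValueError("in-domain ASR is zero, so the ratio is undefined")
    return asr_unseen / asr_in_domain


def trigger_visibility_gap(asr_visible: float, asr_invisible: float) -> float:
    """Visible-trigger ASR minus invisible-trigger ASR (assumed definition)."""
    return asr_visible - asr_invisible
```

Under these assumed definitions, the reported findings would correspond to a Cross-Domain Robustness close to 1.0 and a positive Trigger Visibility Gap.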
