LLM security is the practice of safeguarding large language models against attacks throughout their entire lifecycle, from development through post-deployment management. Leaks and architectural issues with AI models pose great risks at an unprecedented scale.

Generative AI solutions process a staggering volume of requests and data, with Anthropic’s Claude facilitating over 25 billion monthly API calls. Organizations deploying LLMs want to protect the infrastructure components that power such calls against abuse, unauthorized intrusion, crashes, and newer exploits such as prompt injection. Meanwhile, companies that integrate LLMs into their products want to provide a secure and enjoyable experience for their customers.

LLM security is a relatively new concept stemming from the rise of modern AI. Protecting everyday users (and intellectual property) has always been important, but enterprise applications have rapidly incorporated LLMs to serve those users, which widens the attack surface. LLMs have even become important tools within government and other sensitive industries, heightening the need for stronger, adaptive security.

How does LLM security work?

Large language models face unique threats because they rely on natural language processing to function. Users send prompts through a web or app interface to the backend, where a transformer model interprets the meaning of what’s been said (the input). Such models need a massive volume of prompts and training data to learn, evolve, and provide more accurate responses over time.

AI crawlers scrape data from across the web to enable this, from official or “trusted” sources (such as the WHO for medical knowledge) and independent blogs alike. Training data can also be curated more tightly for a specific use case. What matters next is inference, where the model generates output tokens one at a time, each predicted from the prompt and the tokens produced so far. Using the attention mechanism, the LLM determines which parts of a prompt are important and produces an output for the user. This can include simple text, images, videos, documents, design assets, and more.
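To make the role of the attention mechanism more concrete, here is a minimal sketch of how one query can be scored against several prompt tokens and turned into weights. It assumes the common scaled dot-product formulation, and the vectors are made-up toy values rather than anything a real model would produce.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Score one query against each key (scaled dot product), then softmax."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Toy 2-dimensional vectors standing in for three prompt tokens (assumed values).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 1.0]

weights = attention_weights(query, keys)
print(weights)  # the third token gets the largest weight
```

A higher weight means that token contributes more to the output, which is also why carefully placed text inside a prompt can have an outsized effect on model behavior.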

Unfortunately, attackers target these LLM components. Organizations must plan to counteract common LLM threats, which include the following according to the OWASP Top 10 for LLM Applications:

Threat name: Prompt injection
Description: A user engineers an LLM prompt that causes unintended behavior within the model, such as sensitive data sharing, infrastructure disclosure, elevated LLM access, arbitrary command execution, decision manipulation, and content manipulation. This can be accidental or intentional. Injection can happen directly through the user prompt, or indirectly through attached files and websites.
Recommended solution(s): Constrain model behavior and instructional inputs. Specify and validate output formats. Filter inputs and outputs to avoid sharing sensitive information inadvertently, while evaluating contextual relevance. Require manual human approval for actions that need elevated privileges. Handle API tokens in application code rather than exposing them to the model, and enforce access control based on least privilege. Limit external content to avoid malicious inputs. Run penetration tests against your models to ensure resilience against attack.

Threat name: Sensitive information disclosure
Description: Confidential business intelligence data, personally identifiable information (PII), sensitive files, and security credentials are exposed due to hacking, unauthorized access, and breaches. Users who include such data in their prompts put this information at risk.
Recommended solution(s): Sanitize data to prevent user PII from entering the training data pool. Publish clear terms of use to prevent accidental disclosure while prompting. Restrict outputs to specific data types.

Threat name: Supply chain
Description: LLMs and supporting infrastructure components from third parties can introduce vulnerabilities into the system. LLMs that partially or completely rely on external models face additional risks.
Recommended solution(s): Carefully vet external tools and components before incorporating them. Ensure that data sources and suppliers are trustworthy. Continually maintain a software bill of materials (SBOM) to account for your tools and better understand how they fit together. Audit software solutions to prevent licensing and compliance issues. Create policies for security patching and periodic updates.

Threat name: Data and model poisoning
Description: Data collected and processed before or during model training can undermine LLM security if it is compromised. Attackers can tamper with existing data or feed bad data into the model to produce outputs with less contextual value. Tampering can also introduce vulnerabilities and backdoors that compromise the LLM.
Recommended solution(s): Monitor data uptake and transformations on the backend. Carefully vet all data sources, especially those from external providers (which can contain malicious content). Sandbox your models to prevent exposure to poisoned data sources. Use version control for easy reversion. Implement thresholds and testing methods to highlight suspicious behavior.

Threat name: Improper output handling
Description: Output that is not properly validated, sanitized, and managed can pass faulty content to downstream systems. This can lead to unauthorized privilege escalation, code execution, poor input validation, poor output visibility, security measure ineffectiveness, and vulnerability to prompt injection.
Recommended solution(s): Apply zero-trust principles to the model to avoid unauthorized system actions. Validate responses originating from the model before they reach backend functions. Apply context-aware encoding to LLM outputs. Protect internal databases with parameterized queries and prepared statements. Enforce strict content policies. Log and monitor everything for quicker troubleshooting and security responses.

Threat name: Excessive agency
Description: Giving the LLM unchecked autonomy to make decisions, call functions, and communicate with external systems can make the model less effective, undermine privacy, and introduce vulnerabilities.
Recommended solution(s): Minimize the number of embedded extensions. Limit available functions to prevent excess agency across the model, and restrict those functions to specific tasks. Don’t rely on the LLM itself to decide which actions are allowed or off-limits. Sanitize LLM inputs and outputs.

Threat name: System prompt leakage
Description: Instructions that influence how an LLM behaves or responds to users can contain secrets or sensitive backend details. It’s possible to accidentally leak credentials, system architecture, API keys, tokens, and more to users who submit malicious requests.
Recommended solution(s): Isolate sensitive data from system prompts to avoid accidental leakage. Use external mechanisms to help guide LLM behavior instead of relying on system prompts (which attackers can manipulate). Create and enforce independent guardrails to ensure compliance without impacting accuracy. Separate security measures from the LLM itself and use individualized agents to perform specific tasks.

Threat name: Vector and embedding weaknesses
Description: LLMs that use retrieval-augmented generation (RAG) are vulnerable to improper creation, storage, and retrieval of embeddings and vectors. This makes it easier to access private data, alter LLM outputs, and inject damaging content.
Recommended solution(s): Implement RBAC and similar fine-grained access controls, paired with vectors and embeddings that are permissions-aware. Create systems for reliable data validation and authenticate sources before feeding new data into the model. Tag and classify data within combined data pools to reduce mismatches and closely control access. Monitor and log everything.

Threat name: Misinformation
Description: LLMs can present false information or hallucinations as objectively or factually correct. The model’s responses can be quite inaccurate, impacting users who rely too much on LLM-generated responses to guide their decision making.
Recommended solution(s): Use RAG to ground model responses in data from trusted sources. Fine-tune models to improve accuracy while adding automatic response validation. Encourage fact checking and verification with LLM users. Integrate content filters, clearly label all AI-derived content, and be honest about anticipated response accuracy. Educate users to think critically while evaluating the usefulness of an LLM’s outputs.

Threat name: Unbounded consumption
Description: The application may let users run an excessive number of inferences without any controls. Excessive use or abuse can consume too many backend resources (compute and memory) and increase operational costs.
Recommended solution(s): Validate all inputs and enforce strict limits for prompt sizing. Restrict API responses to essential information. Perform dynamic resource management to scale appropriately with demand, while limiting resource use. Enforce rate limits to stop abuse. Implement timeouts and throttling mechanisms to boost performance (and improve UX). Limit how the LLM can access internal services, network resources, and APIs.
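As an illustration of the parameterized queries recommended for improper output handling, the sketch below passes a value that came out of a model response to a database lookup. It uses Python's standard sqlite3 module; the `orders` table, its contents, and the `find_orders` helper are hypothetical names invented for this example.

```python
import sqlite3

# In-memory database with a made-up table, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.executemany("INSERT INTO orders (customer) VALUES (?)", [("alice",), ("bob",)])


def find_orders(conn, customer_from_llm):
    """Look up orders for a customer name taken from a model response."""
    # The ? placeholder binds the value as data, so it is never parsed as SQL.
    cur = conn.execute(
        "SELECT id, customer FROM orders WHERE customer = ?",
        (customer_from_llm,),
    )
    return cur.fetchall()


safe = find_orders(conn, "alice")
hostile = find_orders(conn, "alice' OR '1'='1")

print(safe)     # [(1, 'alice')]
print(hostile)  # [] because the injection attempt is treated as a literal name
```

Had the query been built with string concatenation, the second call would likely have returned every row. Treating model output like any other untrusted input avoids that.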
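The prompt size limits and rate limits suggested for unbounded consumption could be sketched roughly as follows. This is one possible approach (a token bucket checked before any inference work starts), not a reference implementation. `MAX_PROMPT_CHARS`, `TokenBucket`, and `admit` are hypothetical names, and the limits are assumed values that would need tuning.

```python
import time

MAX_PROMPT_CHARS = 4000  # assumed limit; tune per model and use case


class TokenBucket:
    """Allow `capacity` requests in a burst, refilled at `rate` per second."""

    def __init__(self, capacity, rate, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


def admit(prompt, bucket):
    """Return (ok, reason) before any inference work is done."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False, "prompt too large"
    if not bucket.allow():
        return False, "rate limit exceeded"
    return True, "ok"


# A fake clock keeps the example deterministic.
now = [0.0]
bucket = TokenBucket(capacity=2, rate=1.0, clock=lambda: now[0])

results = [admit("hello", bucket) for _ in range(3)]  # third call exceeds the burst
now[0] += 1.0  # one second later, one token has been refilled
results.append(admit("hello", bucket))
results.append(admit("x" * (MAX_PROMPT_CHARS + 1), bucket))
print(results)
```

In practice there would usually be one bucket per user or API key, and the same checks are often enforced at a proxy or gateway in front of the model rather than in application code.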

What are the benefits of LLM security?

LLM security has grown increasingly important alongside worldwide LLM usage. These protections can help in many ways:

  • Users are protected against malicious behavior originating from the LLM itself, due to hacking or compromised software components.

  • LLMs are protected against malicious user actions, whether those behaviors are intentional or not.

  • LLM companies that prevent breaches and abuse can lower their operating costs while remaining compliant with relevant regulations.

  • Users benefit from improved output accuracy through reduced hallucinations and better contextual awareness.

  • Stronger security means deeper trust, both from end users and organizations that rely on internal LLM integrations.

  • Responsible LLM development and maintenance practices encourage safer innovation.

Does HAProxy support LLM security?

Yes! HAProxy One helps your AI/ML applications scale massively, boosts reliability, strengthens security, and optimizes costs. Organizations that use LLMs, and LLM vendors themselves, can innovate freely and deliver improved user experiences with ease.

HAProxy One also supports stronger LLM security through our multi-layered security features. HAProxy Enterprise WAF, HAProxy Enterprise Bot Management Module, DDoS protection, rate limiting, and a collection of centralized access controls help teams oversee their models more safely.

To see how HAProxy protects LLM traffic in practice, explore our AI gateway solution, then request a demo to test it against your own workloads.

FAQs

What is prompt injection?

Prompt injection occurs when a user crafts an input that makes a model behave in unintended ways, such as revealing sensitive data, escalating access, or executing arbitrary commands. It can happen directly through the user’s prompt or indirectly through attached files and websites. It ranks as the top risk in the OWASP Top 10 for LLM Applications.
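Two of the mitigations listed earlier in this article, least-privilege access and human approval for privileged actions, can be enforced in code that sits outside the model. The sketch below is a minimal illustration under that assumption. The tool names and the `authorize_tool_call` function are hypothetical.

```python
# Hypothetical tool names; a real deployment would define its own.
READ_ONLY_TOOLS = {"search_docs", "get_weather"}
PRIVILEGED_TOOLS = {"send_email", "delete_record"}


def authorize_tool_call(tool_name, approved_by_human=False):
    """Decide whether a tool call requested by the model may run."""
    if tool_name in READ_ONLY_TOOLS:
        return "allow"
    if tool_name in PRIVILEGED_TOOLS:
        # Privileged actions wait for an explicit human decision.
        return "allow" if approved_by_human else "needs_approval"
    # Anything not explicitly listed is refused.
    return "deny"


print(authorize_tool_call("search_docs"))
print(authorize_tool_call("delete_record"))
print(authorize_tool_call("delete_record", approved_by_human=True))
print(authorize_tool_call("run_shell"))
```

Because the decision is made by ordinary code rather than by the model, an injected prompt that asks for a privileged or unknown action still has to pass the same gate.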