CWE-1427

Improper Neutralization of Input Used for LLM Prompting

The product uses externally-provided data to build prompts provided to large language models (LLMs), but the way these prompts are constructed causes the LLM to fail to distinguish between user-supplied inputs and developer provided system directives.

Mitigation

Phase: Architecture and Design

Description:

  • LLM-enabled applications should be designed to ensure proper sanitization of user-controllable input, ensuring that no intentionally misleading or dangerous characters can be included. Additionally, they should be designed in a way that ensures that user-controllable input is identified as untrusted and potentially dangerous.
Mitigation

Phase: Implementation

Description:

  • LLM prompts should be constructed in a way that effectively differentiates between user-supplied input and developer-constructed system prompting to reduce the chance of model confusion at inference-time.
Mitigation

Phase: Architecture and Design

Description:

  • LLM-enabled applications should be designed to ensure proper sanitization of user-controllable input, ensuring that no intentionally misleading or dangerous characters can be included. Additionally, they should be designed in a way that ensures that user-controllable input is identified as untrusted and potentially dangerous.
Mitigation

Phase: Implementation

Description:

  • Ensure that model training includes training examples that avoid leaking secrets and disregard malicious inputs. Train the model to recognize secrets, and label training data appropriately. Note that due to the non-deterministic nature of prompting LLMs, it is necessary to perform testing of the same test case several times in order to ensure that troublesome behavior is not possible. Additionally, testing should be performed each time a new model is used or a model's weights are updated.
Mitigation

Phases: Installation, Operation

Description:

  • During deployment/operation, use components that operate externally to the system to monitor the output and act as a moderator. These components are called different terms, such as supervisors or guardrails.
Mitigation

Phase: System Configuration

Description:

  • During system configuration, the model could be fine-tuned to better control and neutralize potentially dangerous inputs.

No CAPEC attack patterns related to this CWE.

Back to CWE stats page