Improper Validation of Syntactic Correctness of Input

Incomplete Base
Structure: Simple
Description

The product receives input that is expected to be well-formed - i.e., to comply with a certain syntax - but it does not validate or incorrectly validates that the input complies with the syntax.

Extended Description

Often, complex inputs are expected to follow a particular syntax, which is either assumed by the input itself, or declared within metadata such as headers. The syntax could be for data exchange formats, markup languages, or even programming languages. When untrusted input is not properly validated for the expected syntax, attackers could cause parsing failures, trigger unexpected errors, or expose latent vulnerabilities that might not be directly exploitable if the input had conformed to the syntax.

Common Consequences 1
Scope: Other

Impact: Varies by Context

Potential Mitigations 1
Phase: Implementation

Strategy: Input Validation

Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.

Effectiveness: High

Demonstrative Examples 1
The following code loads and parses an XML file.

Code Example:

Bad
Java

// Read DOM* try { ``` ... DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance(); factory.setValidating( false ); .... c_dom = factory.newDocumentBuilder().parse( xmlFile ); } catch(Exception ex) { ... }

The XML file is loaded without validating it against a known XML Schema or DTD.
Observed Examples 2
CVE-2016-4029Chain: incorrect validation of intended decimal-based IP address format (Improper Validation of Syntactic Correctness of Input) enables parsing of octal or hexadecimal formats (Incorrect Parsing of Numbers with Different Radices), allowing bypass of an SSRF protection mechanism (Server-Side Request Forgery (SSRF)).
CVE-2007-5893HTTP request with missing protocol version number leads to crash
Applicable Platforms
Languages:
Not Language-Specific : Often
Modes of Introduction
Implementation
Related Weaknesses
Notes
MaintenanceThis entry is still under development and will continue to see updates and content improvements.