Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')

Stable Base
Structure: Simple
Description

The product uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the product does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.

The product uses external input to construct a pathname that is intended to identify a file or directory that is located underneath a restricted parent directory, but the product does not properly neutralize special elements within the pathname that can cause the pathname to resolve to a location that is outside of the restricted directory.
Extended Description

Many file operations are intended to take place within a restricted directory. By using special elements such as ".." and "/" separators, attackers can escape outside of the restricted location to access files or directories that are elsewhere on the system. One of the most common special elements is the "../" sequence, which in most modern operating systems is interpreted as the parent directory of the current location. This is referred to as relative path traversal. Path traversal also covers the use of absolute pathnames such as "/usr/local/bin" to access unexpected files. This is referred to as absolute path traversal.

Common Consequences 4
Scope: IntegrityConfidentialityAvailability

Impact: Execute Unauthorized Code or Commands

The attacker may be able to create or overwrite critical files that are used to execute code, such as programs or libraries.

Scope: Integrity

Impact: Modify Files or Directories

The attacker may be able to overwrite or create critical files, such as programs, libraries, or important data. If the targeted file is used for a security mechanism, then the attacker may be able to bypass that mechanism. For example, appending a new account at the end of a password file may allow an attacker to bypass authentication.

Scope: Confidentiality

Impact: Read Files or Directories

The attacker may be able read the contents of unexpected files and expose sensitive data. If the targeted file is used for a security mechanism, then the attacker may be able to bypass that mechanism. For example, by reading a password file, the attacker could conduct brute force password guessing attacks in order to break into an account on the system.

Scope: Availability

Impact: DoS: Crash, Exit, or Restart

The attacker may be able to overwrite, delete, or corrupt unexpected critical files such as programs, libraries, or important data. This may prevent the product from working at all and in the case of protection mechanisms such as authentication, it has the potential to lock out product users.

Detection Methods 9
Automated Static AnalysisHigh
Automated techniques can find areas where path traversal weaknesses exist. However, tuning or customization may be required to remove or de-prioritize path-traversal problems that are only exploitable by the product's administrator - or other privileged users - and thus potentially valid behavior or, at worst, a bug instead of a vulnerability.
Manual Static AnalysisHigh
Manual white box techniques may be able to provide sufficient code coverage and reduction of false positives if all file access operations can be assessed within limited time constraints.
Automated Static Analysis - Binary or BytecodeHigh
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Highly cost effective: ``` Bytecode Weakness Analysis - including disassembler + source code weakness analysis ``` Cost effective for partial coverage: ``` Binary Weakness Analysis - including disassembler + source code weakness analysis
Manual Static Analysis - Binary or BytecodeSOAR Partial
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Cost effective for partial coverage: ``` Binary / Bytecode disassembler - then use manual analysis for vulnerabilities & anomalies
Dynamic Analysis with Automated Results InterpretationHigh
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Highly cost effective: ``` Web Application Scanner Web Services Scanner Database Scanners
Dynamic Analysis with Manual Results InterpretationHigh
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Highly cost effective: ``` Fuzz Tester Framework-based Fuzzer
Manual Static Analysis - Source CodeHigh
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Highly cost effective: ``` Manual Source Code Review (not inspections) ``` Cost effective for partial coverage: ``` Focused Manual Spotcheck - Focused manual analysis of source
Automated Static Analysis - Source CodeHigh
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Highly cost effective: ``` Source code Weakness Analyzer Context-configured Source Code Weakness Analyzer
Architecture or Design ReviewHigh
According to SOAR [REF-1479], the following detection techniques may be useful: ``` Highly cost effective: ``` Formal Methods / Correct-By-Construction ``` Cost effective for partial coverage: ``` Inspection (IEEE 1028 standard) (can apply to requirements, design, source code, etc.)
Potential Mitigations 11
Phase: Implementation

Strategy: Input Validation

Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does. When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue." Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright. When validating filenames, use stringent allowlists that limit the character set to be used. If feasible, only allow a single "." character in the filename to avoid weaknesses such as Relative Path Traversal, and exclude directory separators such as "/" to avoid Absolute Path Traversal. Use a list of allowable file extensions, which will help to avoid Unrestricted Upload of File with Dangerous Type. Do not rely exclusively on a filtering mechanism that removes potentially dangerous characters. This is equivalent to a denylist, which may be incomplete (Incomplete List of Disallowed Inputs). For example, filtering "/" is insufficient protection if the filesystem also supports the use of "\" as a directory separator. Another possible error could occur when the filtering is applied in a way that still produces dangerous data (Collapse of Data into Unsafe Value). For example, if "../" sequences are removed from the ".../...//" string in a sequential fashion, two instances of "../" would be removed from the original string, but the remaining characters would still form the "../" string.
Phase: Architecture and Design
For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid Client-Side Enforcement of Server-Side Security. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server.
Phase: Implementation

Strategy: Input Validation

Inputs should be decoded and canonicalized to the application's current internal representation before being validated (Incorrect Behavior Order: Validate Before Canonicalize). Make sure that the application does not decode the same input twice (Double Decoding of the Same Data). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked. Use a built-in path canonicalization function (such as realpath() in C) that produces the canonical version of the pathname, which effectively removes ".." sequences and symbolic links (Relative Path Traversal, Improper Link Resolution Before File Access ('Link Following')). This includes: - realpath() in C - getCanonicalPath() in Java - GetFullPath() in ASP.NET - realpath() or abs_path() in Perl - realpath() in PHP
Phase: Architecture and Design

Strategy: Libraries or Frameworks

Use a vetted library or framework that does not allow this weakness to occur or provides constructs that make this weakness easier to avoid [REF-1482].
Phase: Operation

Strategy: Firewall

Use an application firewall that can detect attacks against this weakness. It can be beneficial in cases in which the code cannot be fixed (because it is controlled by a third party), as an emergency prevention measure while more comprehensive software assurance measures are applied, or to provide defense in depth [REF-1481].

Effectiveness: Moderate

Phase: Architecture and DesignOperation

Strategy: Environment Hardening

Run your code using the lowest privileges that are required to accomplish the necessary tasks [REF-76]. If possible, create isolated accounts with limited privileges that are only used for a single task. That way, a successful attack will not immediately give the attacker access to the rest of the software or its environment. For example, database applications rarely need to run as the database administrator, especially in day-to-day operations.
Phase: Architecture and Design

Strategy: Enforcement by Conversion

When the set of acceptable objects, such as filenames or URLs, is limited or known, create a mapping from a set of fixed input values (such as numeric IDs) to the actual filenames or URLs, and reject all other inputs. For example, ID 1 could map to "inbox.txt" and ID 2 could map to "profile.txt". Features such as the ESAPI AccessReferenceMap [REF-185] provide this capability.
Phase: Architecture and DesignOperation

Strategy: Sandbox or Jail

Run the code in a "jail" or similar sandbox environment that enforces strict boundaries between the process and the operating system. This may effectively restrict which files can be accessed in a particular directory or which commands can be executed by the software. OS-level examples include the Unix chroot jail, AppArmor, and SELinux. In general, managed code may provide some protection. For example, java.io.FilePermission in the Java SecurityManager allows the software to specify restrictions on file operations. This may not be a feasible solution, and it only limits the impact to the operating system; the rest of the application may still be subject to compromise. Be careful to avoid Creation of chroot Jail Without Changing Working Directory and other weaknesses related to jails.

Effectiveness: Limited

Phase: Architecture and DesignOperation

Strategy: Attack Surface Reduction

Store library, include, and utility files outside of the web document root, if possible. Otherwise, store them in a separate directory and use the web server's access control capabilities to prevent attackers from directly requesting them. One common practice is to define a fixed constant in each calling program, then check for the existence of the constant in the library/include file; if the constant does not exist, then the file was directly requested, and it can exit immediately. This significantly reduces the chance of an attacker being able to bypass any protection mechanisms that are in the base program but not in the include files. It will also reduce the attack surface.
Phase: Implementation
Ensure that error messages only contain minimal details that are useful to the intended audience and no one else. The messages need to strike the balance between being too cryptic (which can confuse users) or being too detailed (which may reveal more than intended). The messages should not reveal the methods that were used to determine the error. Attackers can use detailed information to refine or optimize their original attack, thereby increasing their chances of success. If errors must be captured in some detail, record them in log messages, but consider what could occur if the log messages can be viewed by attackers. Highly sensitive information such as passwords should never be saved to log files. Avoid inconsistent messaging that might accidentally tip off an attacker about internal state, such as whether a user account exists or not. In the context of path traversal, error messages which disclose path information can help attackers craft the appropriate attack strings to move through the file system hierarchy.
Phase: OperationImplementation

Strategy: Environment Hardening

When using PHP, configure the application so that it does not use register_globals. During implementation, develop the application so that it does not rely on this feature, but be wary of implementing a register_globals emulation that is subject to weaknesses such as Improper Neutralization of Directives in Dynamically Evaluated Code ('Eval Injection'), Variable Extraction Error, and similar issues.
Demonstrative Examples 6

ID : DX-27

The following code could be for a social networking application in which each user's profile information is stored in a separate file. All files are stored in a single directory.

Code Example:

Bad
Perl
perl
While the programmer intends to access files such as "/users/cwe/profiles/alice" or "/users/cwe/profiles/bob", there is no verification of the incoming user parameter. An attacker could provide a string such as:

Code Example:

Attack
bash
The program would generate a profile pathname like this:

Code Example:

Result
bash
When the file is opened, the operating system resolves the "../" during path canonicalization and actually accesses this file:

Code Example:

Result
bash
As a result, the attacker could read the entire text of the password file.
Notice how this code also contains an error message information leak (Generation of Error Message Containing Sensitive Information) if the user parameter does not produce a file that exists: the full pathname is provided. Because of the lack of output encoding of the file that is retrieved, there might also be a cross-site scripting problem (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')) if profile contains any HTML, but other code would need to be examined.

ID : DX-18

In the example below, the path to a dictionary file is read from a system property and used to initialize a File object.

Code Example:

Bad
Java
java
However, the path is not validated or modified to prevent it from containing relative or absolute path sequences before creating the File object. This allows anyone who can control the system property to determine what file is used. Ideally, the path should be resolved relative to some kind of application or user home directory.

ID : DX-2

The following code takes untrusted input and uses a regular expression to filter "../" from the input. It then appends this result to the /home/user/ directory and attempts to read the file in the final resulting path.

Code Example:

Bad
Perl
perl
Since the regular expression does not have the /g global match modifier, it only removes the first instance of "../" it comes across. So an input value such as:

Code Example:

Attack
bash
will have the first "../" stripped, resulting in:

Code Example:

Result
bash
This value is then concatenated with the /home/user/ directory:

Code Example:

Result
bash
which causes the /etc/passwd file to be retrieved once the operating system has resolved the ../ sequences in the pathname. This leads to relative path traversal (Relative Path Traversal).
The following code attempts to validate a given input path by checking it against an allowlist and once validated delete the given file. In this specific case, the path is considered valid if it starts with the string "/safe_dir/".

Code Example:

Bad
Java
java
An attacker could provide an input such as this:

Code Example:

Attack
bash
The software assumes that the path is valid because it starts with the "/safe_path/" sequence, but the "../" sequence will cause the program to delete the important.dat file in the parent directory

ID : DX-22

The following code demonstrates the unrestricted upload of a file with a Java servlet and a path traversal vulnerability. The action attribute of an HTML form is sending the upload file request to the Java servlet.

Code Example:

Good
HTML
html
When submitted the Java servlet's doPost method will receive the request, extract the name of the file from the Http request header, read the file contents from the request and output the file to the local upload directory.

Code Example:

Bad
Java
java
This code does not perform a check on the type of the file being uploaded (Unrestricted Upload of File with Dangerous Type). This could allow an attacker to upload any executable file or other file with malicious code.
Additionally, the creation of the BufferedWriter object is subject to relative path traversal (Relative Path Traversal). Since the code does not check the filename that is provided in the header, an attacker can use "../" sequences to write to files outside of the intended directory. Depending on the executing environment, the attacker may be able to specify arbitrary files to write to, leading to a wide variety of consequences, from code execution, XSS (Improper Neutralization of Input During Web Page Generation ('Cross-site Scripting')), or system crash.

ID : DX-159

This script intends to read a user-supplied file from the current directory. The user inputs the relative path to the file and the script uses Python's os.path.join() function to combine the path to the current working directory with the provided path to the specified file. This results in an absolute path to the desired file. If the file does not exist when the script attempts to read it, an error is printed to the user.

Code Example:

Bad
Python
python
However, if the user supplies an absolute path, the os.path.join() function will discard the path to the current working directory and use only the absolute path provided. For example, if the current working directory is /home/user/documents, but the user inputs /etc/passwd, os.path.join() will use only /etc/passwd, as it is considered an absolute path. In the above scenario, this would cause the script to access and read the /etc/passwd file.

Code Example:

Good
Python
python
The constructed path string uses os.sep to add the appropriate separation character for the given operating system (e.g. '\' or '/') and the call to os.path.normpath() removes any additional slashes that may have been entered - this may occur particularly when using a Windows path. The path is checked against an expected directory (/home/cwe/documents); otherwise, an attacker could provide relative path sequences like ".." to cause normpath() to generate paths that are outside the intended directory (Relative Path Traversal). By putting the pieces of the path string together in this fashion, the script avoids a call to os.path.join() and any potential issues that might arise if an absolute path is entered. With this version of the script, if the current working directory is /home/cwe/documents, and the user inputs /etc/passwd, the resulting path will be /home/cwe/documents/etc/passwd. The user is therefore contained within the current working directory as intended.
Observed Examples 23
CVE-2024-37032Large language model (LLM) management tool does not validate the format of a digest value (Improper Validation of Specified Type of Input) from a private, untrusted model registry, enabling relative path traversal (Relative Path Traversal), a.k.a. Probllama
CVE-2024-4315Chain: API for text generation using Large Language Models (LLMs) does not include the "\" Windows folder separator in its denylist (Incomplete List of Disallowed Inputs) when attempting to prevent Local File Inclusion via path traversal (Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')), allowing deletion of arbitrary files on Windows systems.
CVE-2024-0520Product for managing datasets for AI model training and evaluation allows both relative (Relative Path Traversal) and absolute (Absolute Path Traversal) path traversal to overwrite files via the Content-Disposition header
CVE-2022-45918Chain: a learning management tool debugger uses external input to locate previous session logs (External Control of File Name or Path) and does not properly validate the given path (Improper Input Validation), allowing for filesystem path traversal using "../" sequences (Path Traversal: '../filedir')
CVE-2019-20916Python package manager does not correctly restrict the filename specified in a Content-Disposition header, allowing arbitrary file read using path traversal sequences such as "../"
CVE-2022-31503Python package constructs filenames using an unsafe os.path.join call on untrusted input, allowing absolute path traversal because os.path.join resets the pathname to an absolute path that is specified as part of the input.
CVE-2022-24877directory traversal in Go-based Kubernetes operator app allows accessing data from the controller's pod file system via ../ sequences in a yaml file
CVE-2021-21972Chain: Cloud computing virtualization platform does not require authentication for upload of a tar format file (Missing Authentication for Critical Function), then uses .. path traversal sequences (Relative Path Traversal) in the file to access unexpected files, as exploited in the wild per CISA KEV.
CVE-2020-4053a Kubernetes package manager written in Go allows malicious plugins to inject path traversal sequences into a plugin archive ("Zip slip") to copy a file outside the intended directory
CVE-2020-3452Chain: security product has improper input validation (Improper Input Validation) leading to directory traversal (Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal')), as exploited in the wild per CISA KEV.
CVE-2019-10743Go-based archive library allows extraction of files to locations outside of the target folder with "../" path traversal sequences in filenames in a zip file, aka "Zip Slip"
CVE-2010-0467Newsletter module allows reading arbitrary files using "../" sequences.
CVE-2006-7079Chain: PHP app uses extract for register_globals compatibility layer (Variable Extraction Error), enabling path traversal (Improper Limitation of a Pathname to a Restricted Directory ('Path Traversal'))
CVE-2009-4194FTP server allows deletion of arbitrary files using ".." in the DELE command.
CVE-2009-4053FTP server allows creation of arbitrary directories using ".." in the MKD command.
CVE-2009-0244FTP service for a Bluetooth device allows listing of directories, and creation or reading of files using ".." sequences.
CVE-2009-4013Software package maintenance program allows overwriting arbitrary files using "../" sequences.
CVE-2009-4449Bulletin board allows attackers to determine the existence of files using the avatar.
CVE-2009-4581PHP program allows arbitrary code execution using ".." in filenames that are fed to the include() function.
CVE-2010-0012Overwrite of files using a .. in a Torrent file.
CVE-2010-0013Chat program allows overwriting files using a custom smiley request.
CVE-2008-5748Chain: external control of values for user's desired language and theme enables path traversal.
CVE-2009-1936Chain: library file sends a redirect if it is directly requested but continues to execute, allowing remote file inclusion and path traversal.
References 11
Writing Secure Code
Michael Howard and David LeBlanc
Microsoft Press
04-12-2002
ID: REF-7
OWASP Enterprise Security API (ESAPI) Project
OWASP
ID: REF-45
Testing for Path Traversal (OWASP-AZ-001)
OWASP
ID: REF-185
Top 25 Series - Rank 7 - Path Traversal
Johannes Ullrich
SANS Software Security Institute
09-03-2010
ID: REF-186
The Art of Software Security Assessment
Mark Dowd, John McDonald, and Justin Schuh
Addison Wesley
2006
ID: REF-62
Automated Source Code Security Measure (ASCSM)
Object Management Group (OMG)
01-2016
ID: REF-962
Secure by Design Alert: Eliminating Directory Traversal Vulnerabilities in Software
Cybersecurity and Infrastructure Security Agency
02-05-2024
ID: REF-1448
State-of-the-Art Resources (SOAR) for Software Vulnerability Detection, Test, and Evaluation
Gregory Larsen, E. Kenneth Hong Fong, David A. Wheeler, and Rama S. Moorthy
07-2014
ID: REF-1479
D3FEND: Application Layer Firewall
D3FEND
ID: REF-1481
D3FEND: D3-TL Trusted Library
D3FEND
ID: REF-1482
Likelihood of Exploit

High

Applicable Platforms
Languages:
Not Language-Specific : Undetermined
Technologies:
AI/ML : Undetermined
Modes of Introduction
Implementation
Alternate Terms

Directory traversal

Path traversal

"Path traversal" is preferred over "directory traversal," but both terms are attack-focused.
Functional Areas
  1. File Processing
Affected Resources
  1. File or Directory
Taxonomy Mapping
  • PLOVER
  • OWASP Top Ten 2007
  • OWASP Top Ten 2004
  • CERT C Secure Coding
  • SEI CERT Perl Coding Standard
  • WASC
  • Software Fault Patterns
  • OMG ASCSM
Notes
OtherIn many programming languages, the injection of a null byte (the 0 or NUL) may allow an attacker to truncate a generated filename to apply to a wider range of files. For example, the product may add ".txt" to any pathname, thus limiting the attacker to text files, but a null injection may effectively remove this restriction.
RelationshipPathname equivalence can be regarded as a type of canonicalization error.
RelationshipSome pathname equivalence issues are not directly related to directory traversal, rather are used to bypass security-relevant checks for whether a file/directory can be accessed by the attacker (e.g. a trailing "/" on a filename could bypass access rules that don't expect a trailing /, causing a server to provide the file when it normally would not).
Terminology Like other weaknesses, terminology is often based on the types of manipulations used, instead of the underlying weaknesses. Some people use "directory traversal" only to refer to the injection of ".." and equivalent sequences whose specific meaning is to traverse directories. Other variants like "absolute pathname" and "drive letter" have the *effect* of directory traversal, but some people may not call it such, since it doesn't involve ".." or equivalent.
Research GapMany variants of path traversal attacks are probably under-studied with respect to root cause. Improper Filtering of Special Elements and Collapse of Data into Unsafe Value begin to cover part of this gap.
Research Gap Incomplete diagnosis or reporting of vulnerabilities can make it difficult to know which variant is affected. For example, a researcher might say that "..\" is vulnerable, but not test "../" which may also be vulnerable. Any combination of directory separators ("/", "\", etc.) and numbers of "." (e.g. "....") can produce unique variants; for example, the "//../" variant is not listed (CVE-2004-0325). See this entry's children and lower-level descendants.