Use of Externally-Controlled Format String

Draft Base

Structure: Simple

Description

The product uses a function that accepts a format string as an argument, but the format string originates from an external source.

Common Consequences

Scope: Confidentiality

Impact: Read Memory

Format string problems allow information disclosure, which can greatly simplify exploitation of the program.

Scope: Integrity, Confidentiality, Availability

Impact: Modify Memory; Execute Unauthorized Code or Commands

Format string problems can result in the execution of arbitrary code, buffer overflows, denial of service, or incorrect data representation.

Detection Methods

Automated Static Analysis

This weakness can often be detected using automated static analysis tools. Many modern tools use data flow analysis or constraint-based techniques to minimize the number of false positives.

Black Box

Effectiveness: Limited

Since format strings often occur in rarely-occurring erroneous conditions (e.g. for error message logging), they can be difficult to detect using black box methods. It is highly likely that many latent issues exist in executables that do not have associated source code (or equivalent source).

Automated Static Analysis - Binary or Bytecode

Effectiveness: High

According to SOAR [REF-1479], the following detection techniques may be useful:

Highly cost effective:
Bytecode Weakness Analysis - including disassembler + source code weakness analysis
Binary Weakness Analysis - including disassembler + source code weakness analysis

Cost effective for partial coverage:
Binary / Bytecode simple extractor - strings, ELF readers, etc.

Manual Static Analysis - Binary or Bytecode

Effectiveness: SOAR Partial

According to SOAR [REF-1479], the following detection techniques may be useful:

Cost effective for partial coverage:
Binary / Bytecode disassembler - then use manual analysis for vulnerabilities & anomalies

Dynamic Analysis with Automated Results Interpretation

Effectiveness: SOAR Partial

According to SOAR [REF-1479], the following detection techniques may be useful:

Cost effective for partial coverage:
Web Application Scanner
Web Services Scanner
Database Scanners

Dynamic Analysis with Manual Results Interpretation

Effectiveness: SOAR Partial

According to SOAR [REF-1479], the following detection techniques may be useful:

Cost effective for partial coverage:
Fuzz Tester
Framework-based Fuzzer

Manual Static Analysis - Source Code

Effectiveness: High

According to SOAR [REF-1479], the following detection techniques may be useful:

Highly cost effective:
Manual Source Code Review (not inspections)

Cost effective for partial coverage:
Focused Manual Spotcheck - Focused manual analysis of source

Automated Static Analysis - Source Code

Effectiveness: High

According to SOAR [REF-1479], the following detection techniques may be useful:

Highly cost effective:
Source code Weakness Analyzer
Context-configured Source Code Weakness Analyzer

Cost effective for partial coverage:
Warning Flags

Architecture or Design Review

Effectiveness: High

According to SOAR [REF-1479], the following detection techniques may be useful:

Highly cost effective:
Formal Methods / Correct-By-Construction

Cost effective for partial coverage:
Inspection (IEEE 1028 standard) (can apply to requirements, design, source code, etc.)

Potential Mitigations

Phase: Requirements

Choose a language that is not subject to this flaw.

Phase: Implementation

Ensure that all format string functions are passed a static string which cannot be controlled by the user, and that the proper number of arguments are always sent to that function as well. If at all possible, use functions that do not support the %n operator in format strings. [REF-116] [REF-117]
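As an illustration of the implementation advice above, the following minimal sketch shows two ways to emit untrusted text without letting it act as a format string. The helper names write_untrusted and write_labeled are hypothetical and chosen only for this example.

```c
#include <stdio.h>

/* Hypothetical helper: writes untrusted text verbatim.
 * fputs() does not interpret '%' sequences, so no format parsing occurs. */
int write_untrusted(FILE *out, const char *text)
{
    return fputs(text, out) == EOF ? -1 : 0;
}

/* Hypothetical helper: the format is a string literal, and the untrusted
 * text is passed only as the single argument that "%s" expects. */
int write_labeled(FILE *out, const char *text)
{
    return fprintf(out, "user said: %s\n", text);
}
```

In both helpers, directives such as %x or %n inside the text are copied to the output as ordinary characters.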

Phase: Build and Compilation

Run compilers and linkers with high warning levels, since they may detect incorrect usage.
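Compiler checks also extend to project-specific wrappers if the wrapper is annotated. The sketch below assumes GCC or Clang, whose format function attribute marks a parameter as a printf-style format string. The names log_msg and log_user are hypothetical.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical logging wrapper. The attribute tells GCC and Clang that
 * parameter 1 is a printf-style format and that the variadic arguments
 * start at parameter 2, so calls are checked like calls to printf(). */
#if defined(__GNUC__) || defined(__clang__)
__attribute__((format(printf, 1, 2)))
#endif
int log_msg(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}

/* With -Wformat -Wformat-security enabled, calls such as
 *     log_msg(user_input);               non-literal format, no arguments
 *     log_msg("%s %d\n", "only one");    too few arguments
 * are reported at compile time. The call below passes the checks. */
int log_user(const char *user_input)
{
    return log_msg("input: %s\n", user_input);
}
```

In GCC, -Wformat is part of -Wall, and -Wformat=2 additionally enables -Wformat-security and -Wformat-nonliteral.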

Demonstrative Examples

The following program prints a string provided as an argument.

Code Example: Bad (C)

The example is exploitable because of the call to printf() in the printWrapper() function. Note: The stack buffer was added to make exploitation simpler.
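A minimal sketch of the pattern described here, not the original listing: only the name printWrapper() comes from the text, while the int return values and printWrapperFixed() are assumptions added for illustration, and the stack buffer is omitted.

```c
#include <stdio.h>

/* Vulnerable shape: the caller's string is used as the format itself,
 * so directives such as %x or %n inside it are interpreted by printf(). */
int printWrapper(const char *string)
{
    return printf(string);
}

/* One possible repair (an assumption, not part of the original example):
 * a constant format with the string passed as data. */
int printWrapperFixed(const char *string)
{
    return printf("%s", string);
}
```

With the repaired wrapper, an argument such as "%x %x %x" is printed literally instead of causing reads from the stack.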

The following code copies a command line argument into a buffer using snprintf().

Code Example: Bad (C)

This code allows an attacker to view the contents of the stack and write to the stack using a command line argument containing a sequence of formatting directives. The attacker can read from the stack by providing more formatting directives, such as %x, than the function takes as arguments to be formatted. (In this example, the function takes no arguments to be formatted.) By using the %n formatting directive, the attacker can write to the stack, causing snprintf() to write the number of bytes output thus far to the specified argument (rather than reading a value from the argument, which is the intended behavior). A sophisticated version of this attack will use four staggered writes to completely control the value of a pointer on the stack.
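The snprintf() misuse described above can be sketched as follows. This is an illustrative reconstruction under stated assumptions: the 128-byte buffer size and the function names copy_arg_bad and copy_arg_fixed are hypothetical.

```c
#include <stdio.h>

#define BUF_SIZE 128  /* assumed buffer size for this sketch */

/* Vulnerable shape: the command line argument is the format string, and
 * no further arguments are supplied for its directives to consume. */
int copy_arg_bad(char buf[BUF_SIZE], const char *arg)
{
    return snprintf(buf, BUF_SIZE, arg);
}

/* Possible repair: a constant "%s" format, so the argument is copied as
 * data and any directives it contains are never interpreted. */
int copy_arg_fixed(char buf[BUF_SIZE], const char *arg)
{
    return snprintf(buf, BUF_SIZE, "%s", arg);
}
```

The size limit passed to snprintf() bounds the output length only. It does not stop the directives in an attacker-supplied format from reading or writing memory, which is why the repair changes the format argument rather than the size.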

Certain implementations make more advanced attacks even easier by providing format directives that control the location in memory to read from or write to. An example of these directives is shown in the following code, written for glibc:

Code Example: Bad (C)

This code produces the following output: 5 9 5 5

It is also possible to use half-writes (%hn) to accurately control arbitrary DWORDS in memory, which greatly reduces the complexity needed to execute an attack that would otherwise require four staggered writes, such as the one mentioned in a separate example.
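A sketch of argument-selecting directives, assuming a C library such as glibc that supports the POSIX numbered-argument syntax (n$). Every directive is numbered here, because POSIX leaves mixing numbered and unnumbered directives undefined, so the format string may differ from the original listing. The function name format_positional is hypothetical.

```c
#include <stdio.h>

/* Formats two integers, then selects the first one twice more by its
 * argument number. On a POSIX-conforming printf family this writes
 * "5 9 5 5" into buf. */
int format_positional(char *buf, size_t size)
{
    return snprintf(buf, size, "%1$d %2$d %1$d %1$d", 5, 9);
}
```

The same selection mechanism is what lets an attacker-controlled format string address a specific stack slot directly, instead of consuming the preceding arguments one at a time.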

Observed Examples

CVE-2002-1825: format string in Perl program

CVE-2001-0717: format string in bad call to syslog function

CVE-2002-0573: format string in bad call to syslog function

CVE-2002-1788: format strings in NNTP server responses

CVE-2006-2480: Format string vulnerability exploited by triggering errors or warnings, as demonstrated via format string specifiers in a .bmp filename.

CVE-2007-2027: Chain: untrusted search path enabling resultant format string by loading malicious internationalization messages

References

Format String Vulnerabilities in Perl Programs

Steve Christey

https://seclists.org/fulldisclosure/2005/Dec/91 (2023-04-07)

ID: REF-116

Programming Language Format String Vulnerabilities

Hal Burch and Robert C. Seacord

https://drdobbs.com/security/programming-language-format-string-vulne/197002914 (2023-04-07)

ID: REF-117

Format String Attacks

Tim Newsham

Guardent

09-09-2000

https://seclists.org/bugtraq/2000/Sep/214 (2025-07-29)

ID: REF-118

Writing Secure Code

Michael Howard and David LeBlanc

Microsoft Press

04-12-2002

https://www.microsoftpressstore.com/store/writing-secure-code-9780735617223

ID: REF-7

24 Deadly Sins of Software Security

Michael Howard, David LeBlanc, and John Viega

McGraw-Hill

2010

ID: REF-44

The Art of Software Security Assessment

Mark Dowd, John McDonald, and Justin Schuh

Addison Wesley

2006

ID: REF-62

Automated Source Code Security Measure (ASCSM)

Object Management Group (OMG)

01-2016

http://www.omg.org/spec/ASCSM/1.0/

ID: REF-962

State-of-the-Art Resources (SOAR) for Software Vulnerability Detection, Test, and Evaluation

Gregory Larsen, E. Kenneth Hong Fong, David A. Wheeler, and Rama S. Moorthy

07-2014

https://www.ida.org/-/media/feature/publications/s/st/stateoftheart-resources-soar-for-software-vulnerability-detection-test-and-evaluation/p-5061.ashx (2025-09-05)

ID: REF-1479

Likelihood of Exploit

High

Applicable Platforms

Languages:

C: Often
C++: Often
Perl: Rarely

Modes of Introduction

Implementation

Related Attack Patterns

Functional Areas

Logging
Error Handling
String Processing
Memory Management

Affected Resources

Memory

Related Weaknesses

ChildOf:

Exposure of Resource to Wrong Sphere (CWE-668)

CanPrecede:

Write-what-where Condition (CWE-123)

ChildOf:

Improper Input Validation (CWE-20)

Taxonomy Mapping

PLOVER
7 Pernicious Kingdoms
CLASP
CERT C Secure Coding
OWASP Top Ten 2004
WASC
The CERT Oracle Secure Coding Standard for Java (2011)
SEI CERT Perl Coding Standard
Software Fault Patterns
OMG ASCSM

Notes

Applicable Platform: This weakness is possible in any programming language that supports format strings.

Other: In some circumstances, such as internationalization, the set of format strings is externally controlled by design. If the source of these format strings is trusted (e.g. only contained in library files that are only modifiable by the system administrator), then the external control might not itself pose a vulnerability.

While format string vulnerabilities are typically grouped under the buffer overflow category, technically they are not overflowed buffers. The format string vulnerability class dates from circa 1999 and stems from the fact that there is no realistic way for a function that takes a variable number of arguments to determine just how many arguments were passed in. The most common functions that take a variable number of arguments, including C-runtime functions, are the printf() family of calls.

The format string problem appears in a number of ways. A *printf() call that passes input directly as the format string is dangerous and can be exploited. For example, printf(input); is exploitable, while printf(y, input); is not exploitable in that context, provided that y is a fixed format string the attacker cannot influence. The first call allows an attacker to peek at stack memory, since the input string is used as the format string. The attacker can stuff the input string with format specifiers and begin reading stack values, since the remaining parameters will be pulled from the stack. In the worst case, this improper use may give away enough control to allow an arbitrary value (or values, in the case of an exploit program) to be written into the memory of the running program.

Frequently targeted entities are file names, process names, and identifiers.

Format string problems are a classic C/C++ issue that is now rare due to the ease of discovery. One main reason format string vulnerabilities can be exploited is the %n operator. The %n operator writes the number of characters that have been output by the format string thus far to the memory pointed to by its argument. Through skilled creation of a format string, a malicious user may use values on the stack to create a write-what-where condition. Once this is achieved, they can execute arbitrary code. Other operators can be used as well; for example, a %9999s operator could also trigger a buffer overflow, or when used in file-formatting functions like fprintf, it can generate a much larger output than intended.
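The behavior of the %n operator can be observed in a benign setting. The following minimal sketch uses a constant format string and a legitimate int pointer, and the function name count_before_marker is hypothetical. It assumes a C library that honors %n; some libraries disable or reject it by default.

```c
#include <stdio.h>

/* Returns the value stored by %n: the number of characters produced
 * by the format before the %n directive is reached. */
int count_before_marker(void)
{
    char buf[32];
    int written = -1;

    /* "hello" is 5 characters long, so %n stores 5 into 'written'. */
    snprintf(buf, sizeof buf, "hello%n world", &written);
    return written;
}
```

When the format string is attacker-controlled, the pointer consumed by %n is whatever value occupies the corresponding argument slot, which is how the directive becomes a memory write primitive.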

Research Gap: Format string issues are under-studied for languages other than C. Memory or disk consumption, control flow or variable alteration, and data corruption may result from format string exploitation in applications written in other languages such as Perl, PHP, Python, etc.