The C preprocessor is a simple text parser/replacer that is run before the actual compilation of the code. Used to extend and ease the use of the C (and later C++) language, it can be used for:
a. Including other files using #include
b. Defining text-replacement macros using #define
c. Conditional compilation using #if and #ifdef
d. Platform- or compiler-specific logic (as an extension of conditional compilation)
# Include Guards
A header file may be included by other header files. A source file (compilation unit) that includes multiple headers may therefore, indirectly, include some headers more than once. If such a header file that is included more than once contains definitions, the compiler (after preprocessing) detects a violation of the One Definition Rule (e.g. §3.2 of the 2003 C++ standard) and therefore issues a diagnostic and compilation fails.
Multiple inclusion is prevented using "include guards", which are sometimes also known as header guards or macro guards. These are implemented using the preprocessor #define , #ifndef , #endif directives.
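A minimal sketch of a guarded header is shown here. The file name foo.h, the Foo struct and the foo_twice function are made up for illustration; the guard name FOO_H_INCLUDED is only a conventional choice.

```cpp
// foo.h (hypothetical header used only for illustration)
#ifndef FOO_H_INCLUDED      // true only the first time this file is seen
#define FOO_H_INCLUDED      // mark the header as already included

struct Foo {
    int value;
};

// inline so the definition may safely live in a header
inline int foo_twice(const Foo& f) {
    return 2 * f.value;
}

#endif // FOO_H_INCLUDED
```

Any later #include of the same file in the same compilation unit finds FOO_H_INCLUDED already defined, so everything up to the #endif is skipped.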
The key advantage of using include guards is that they will work with all standard-compliant compilers and preprocessors.
However, include guards also cause some problems for developers, as it is necessary to ensure the macros are unique within all headers used in a project. Specifically, if two (or more) headers use FOO_H_INCLUDED as their include guard, the first of those headers included in a compilation unit will effectively prevent the others from being included. Particular challenges are introduced if a project uses a number of third-party libraries with header files that happen to use include guards in common.
It is also necessary to ensure that the macros used in include guards do not conflict with any other macros defined in header files.
Most mainstream compilers also support the #pragma once directive, which ensures the file is only included once within a single compilation. This is a de facto standard directive, but it is not part of any ISO C++ standard. For example:
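A sketch of a header that relies on the directive, assuming a hypothetical file bar.h; the Bar type and the bar_id function are illustrative names only:

```cpp
// bar.h (hypothetical header used only for illustration)
#pragma once   // ask the compiler to include this file at most once per compilation

struct Bar {
    int id;
};

// inline so the definition may safely live in a header
inline int bar_id(const Bar& b) {
    return b.id;
}
```

No guard macro has to be invented here, which removes the risk of two headers choosing the same macro name. Some compilers warn if a file containing #pragma once is compiled directly rather than included.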
While #pragma once avoids some problems associated with include guards, a #pragma — by definition in the standards — is inherently a compiler-specific hook, and will be silently ignored by compilers that don’t support it. Projects which use #pragma once are more difficult to port to compilers that don’t support it.
A number of coding guidelines and assurance standards for C++ specifically discourage any use of the preprocessor other than to #include header files or for the purposes of placing include guards in headers.
# Conditional logic and cross-platform handling
In a nutshell, conditional pre-processing logic is about making code-logic available or unavailable for compilation using macro definitions.
Three prominent use-cases are:
- different app profiles (e.g. debug, release, testing, optimised) that can be candidates of the same app (e.g. with extra logging).
- cross-platform compiles — single code-base, multiple compilation platforms.
- utilising a common code-base for multiple application versions (e.g. Basic, Premium and Pro versions of a software) — with slightly different features.
Example a: A cross-platform approach for removing files (illustrative):
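The original listing is not available, so the following is only a sketch of the idea. The function name remove_file is an assumption; DeleteFileA (Windows), unlink (POSIX) and std::remove (standard library fallback) are existing calls.

```cpp
#include <cstdio>
#include <string>

#if defined(_WIN32)
#include <windows.h>
#elif defined(__APPLE__) || defined(__unix__)
#include <unistd.h>
#endif

// Returns true if the file was removed. Only one branch is compiled,
// depending on which platform macro the implementation predefines.
bool remove_file(const std::string& path) {
#if defined(_WIN32)
    return DeleteFileA(path.c_str()) != 0;
#elif defined(__APPLE__) || defined(__unix__)
    return unlink(path.c_str()) == 0;
#else
    return std::remove(path.c_str()) == 0;   // portable fallback
#endif
}
```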
Macros like _WIN32 , __APPLE__ or __unix__ are normally predefined by corresponding implementations.
Example b: Enabling additional logging for a debug build:
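A minimal sketch, assuming a made-up macro DEBUG_BUILD that the build system defines for debug builds; LOG_DEBUG and checked_divide are illustrative names as well:

```cpp
#include <iostream>

#ifdef DEBUG_BUILD   // hypothetical symbol, e.g. passed as -DDEBUG_BUILD
#define LOG_DEBUG(msg) (std::cerr << "[debug] " << (msg) << '\n')
#else
#define LOG_DEBUG(msg) ((void)0)   // compiles to nothing in other builds
#endif

// Integer division that returns 0 instead of dividing by zero.
int checked_divide(int a, int b) {
    LOG_DEBUG("checked_divide called");
    if (b == 0) {
        LOG_DEBUG("division by zero, returning 0");
        return 0;
    }
    return a / b;
}
```

The function behaves identically in both builds; only the logging statements appear or disappear.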
Example c: Enable a premium feature in a separate product build (note: this is illustrative. it is often a better idea to allow a feature to be unlocked without the need to reinstall an application)
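A sketch under the assumption that the premium build is compiled with a made-up PREMIUM_BUILD symbol; the feature names and the available_features function are purely illustrative:

```cpp
#include <string>
#include <vector>

// Lists the features compiled into this build.
std::vector<std::string> available_features() {
    std::vector<std::string> features{"open", "save"};
#ifdef PREMIUM_BUILD   // hypothetical symbol, e.g. passed as -DPREMIUM_BUILD
    features.push_back("export_pdf");   // only present in the premium build
#endif
    return features;
}
```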
Some common tricks:
Defining symbols at invocation time:
The preprocessor can be called with predefined symbols (with optional initialisation). For example, the command gcc -E -DOPTIMISE_FOR_OS_X -DTESTING_MODE=1 Sample.cpp ( gcc -E runs only the preprocessor) processes Sample.cpp in the same way as it would if #define OPTIMISE_FOR_OS_X and #define TESTING_MODE 1 were added to the top of Sample.cpp.
Ensuring a macro is defined:
If a macro isn't defined and its value is compared or checked in an #if expression, the preprocessor silently treats it as 0 . There are a few ways to work with this. One approach is to assume that the default settings are represented as 0, and that any change (e.g. to the app build profile) needs to be made explicitly (e.g. ENABLE_EXTRA_DEBUGGING=0 by default, set -DENABLE_EXTRA_DEBUGGING=1 to override). Another approach is to make all definitions and defaults explicit. This can be achieved using a combination of #ifndef and #error directives:
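One possible arrangement is sketched here. It gives ENABLE_EXTRA_DEBUGGING an explicit default and rejects unexpected values; the stricter variant that refuses to compile when the macro is missing is shown as a comment so that the sketch itself still builds. The helper extra_debugging_enabled is an illustrative name.

```cpp
// Explicit default: an unset option is turned into a visible definition.
#ifndef ENABLE_EXTRA_DEBUGGING
#define ENABLE_EXTRA_DEBUGGING 0
#endif

// Stricter alternative: force every build to state the option explicitly.
// #ifndef ENABLE_EXTRA_DEBUGGING
// #error "ENABLE_EXTRA_DEBUGGING is not defined; pass -DENABLE_EXTRA_DEBUGGING=0 or =1"
// #endif

// Reject values other than 0 or 1 at preprocessing time.
#if ENABLE_EXTRA_DEBUGGING != 0 && ENABLE_EXTRA_DEBUGGING != 1
#error "ENABLE_EXTRA_DEBUGGING must be 0 or 1"
#endif

constexpr bool extra_debugging_enabled() {
    return ENABLE_EXTRA_DEBUGGING == 1;
}
```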
# Macros
Macros are categorized into two main groups: object-like macros and function-like macros. Macros are treated as a token substitution early in the compilation process. This means that large (or repeating) sections of code can be abstracted into a preprocessor macro.
The Qt library makes use of this technique to create a meta-object system by having the user declare the Q_OBJECT macro at the head of the user-defined class extending QObject.
Macro names are usually written in all caps, to make them easier to differentiate from normal code. This isn’t a requirement, but is merely considered good style by many programmers.
When an object-like macro is encountered, it’s expanded as a simple copy-paste operation, with the macro’s name being replaced with its definition. When a function-like macro is encountered, both its name and its parameters are expanded.
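A small sketch of both kinds follows. The names PI and AREA follow the usual textbook example; circle_area and grown_circle_area are illustrative helpers.

```cpp
// Object-like macro: every later use of PI is replaced by the literal.
#define PI 3.14159

// Function-like macro: the parameter r is substituted textually,
// so it is wrapped in parentheses wherever it appears.
#define AREA(r) (PI * (r) * (r))

double circle_area(double radius) {
    return AREA(radius);            // (3.14159 * (radius) * (radius))
}

double grown_circle_area(double radius, double growth) {
    // (3.14159 * (radius + growth) * (radius + growth))
    return AREA(radius + growth);
}
```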
Due to this, function-like macro parameters are often enclosed within parentheses, as in AREA() above. This is to prevent any bugs that can occur during macro expansion, specifically bugs caused by a single macro parameter being composed of multiple actual values.
Also note that due to this simple expansion, care must be taken with the parameters passed to macros, to prevent unexpected side effects. If the parameter is modified during evaluation, it will be modified each time it is used in the expanded macro, which usually isn’t what we want. This is true even if the macro encloses the parameters in parentheses to prevent expansion from breaking anything.
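Both pitfalls can be reproduced with a few lines. The macro and function names in this sketch are made up for the demonstration.

```cpp
// Without parentheses the argument is pasted in as-is.
#define SQUARE_UNSAFE(x) (x * x)
#define SQUARE(x) ((x) * (x))

int unsafe_square() { return SQUARE_UNSAFE(2 + 3); }   // (2 + 3 * 2 + 3)
int safe_square()   { return SQUARE(2 + 3); }          // ((2 + 3) * (2 + 3))

// Parentheses do not help with side effects: a appears twice in the expansion.
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))

// Expands to ((i++) > (j) ? (i++) : (j)), so i may be incremented twice.
int max_with_side_effect(int& i, int j) {
    return MAX_OF(i++, j);
}
```

With i equal to 5 and j equal to 3, the comparison increments i to 6, the selected branch then returns 6 and increments i again, leaving i at 7 although the call site contains a single i++.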
Additionally, macros provide no type-safety, leading to hard-to-understand errors about type mismatch.
As programmers normally terminate lines with a semicolon, macros that are intended to be used as standalone lines are often designed to "swallow" a semicolon; this prevents any unintended bugs from being caused by an extra semicolon.
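The original listing is missing, so this is a reconstruction of the idea with made-up names (record, RECORD, RECORD_BAD, classify). The faulty variant is left as a comment because using it in the if/else would not compile.

```cpp
#include <string>
#include <vector>

std::vector<std::string> recorded_messages;   // illustrative global log

void record(const std::string& message) {
    recorded_messages.push_back(message);
}

// Faulty: the semicolon is part of the macro, so "RECORD_BAD(x);" expands
// to "record(x);;" and the second semicolon ends the if before the else.
// #define RECORD_BAD(msg) record(msg);

// Correct: no trailing semicolon, the caller supplies it.
#define RECORD(msg) record(msg)

void classify(int n) {
    if (n < 0)
        RECORD("negative");
    else
        RECORD("non-negative");
}
```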
In this example, the inadvertent double semicolon breaks the if...else block, preventing the compiler from matching the else to the if . To prevent this, the semicolon is omitted from the macro definition, which will cause it to "swallow" the semicolon immediately following any usage of it.
Leaving off the trailing semicolon also allows the macro to be used without ending the current statement, which can be beneficial.
Normally, a macro definition ends at the end of the line. If a macro needs to cover multiple lines, however, a backslash can be used at the end of a line to indicate this. This backslash must be the last character in the line, which indicates to the preprocessor that the following line should be concatenated onto the current line, treating them as a single line. This can be used multiple times in a row.
This is especially useful in complex function-like macros, which may need to cover multiple lines.
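As an illustration, the following hypothetical macro DEFINE_CLAMP spans several lines and generates one small function per invocation:

```cpp
// Each line except the last ends with a backslash, so the preprocessor
// joins all of them into a single logical line.
#define DEFINE_CLAMP(name, type)                 \
    inline type name(type v, type lo, type hi) { \
        return v < lo ? lo : (v > hi ? hi : v);  \
    }

DEFINE_CLAMP(clamp_int, int)
DEFINE_CLAMP(clamp_double, double)
```

Nothing, not even a space or a comment, may follow a backslash on its line.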
In the case of more complex function-like macros, it can be useful to give them their own scope to prevent possible name collisions or to cause objects to be destroyed at the end of the macro, similar to an actual function. A common idiom for this is do while 0, where the macro is enclosed in a do-while block. This block is generally not followed with a semicolon, allowing it to swallow a semicolon.
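A sketch of the idiom with made-up names (SWAP_INTS, order_pair): the temporary lives only inside the do-while block, and the missing semicolon after while (0) is supplied at the call site.

```cpp
#define SWAP_INTS(a, b)  \
    do {                 \
        int tmp_ = (a);  \
        (a) = (b);       \
        (b) = tmp_;      \
    } while (0)

// Puts x and y in ascending order; returns 1 if they were swapped, else 0.
int order_pair(int& x, int& y) {
    if (x > y)
        SWAP_INTS(x, y);   // a single statement, so the else still matches
    else
        return 0;
    return 1;
}
```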
There are also variadic macros; similarly to variadic functions, these take a variable number of arguments, and then expand them all in place of a special "Varargs" parameter, __VA_ARGS__ .
Note that during expansion, __VA_ARGS__ can be placed anywhere in the definition, and will be expanded correctly.
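A short sketch with illustrative names (sum_ints, SUM, SUM_THEN_ADD), showing __VA_ARGS__ both at the end and in the middle of a definition:

```cpp
#include <initializer_list>

inline int sum_ints(std::initializer_list<int> values) {
    int total = 0;
    for (int v : values) total += v;
    return total;
}

// All arguments are forwarded as the elements of a braced list.
#define SUM(...) sum_ints({__VA_ARGS__})

// __VA_ARGS__ in the middle: the named parameter is appended after it.
#define SUM_THEN_ADD(extra, ...) sum_ints({__VA_ARGS__, (extra)})

int demo_sum() { return SUM(1, 2, 3); }                          // sum_ints({1, 2, 3})
int demo_sum_then_add() { return SUM_THEN_ADD(10, 1, 2, 3); }    // sum_ints({1, 2, 3, (10)})
```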
In the case of a zero-argument variadic parameter, different compilers will handle the trailing comma differently. Some compilers, such as Visual Studio, will silently swallow the comma without any special syntax. Other compilers, such as GCC, require you to place ## immediately before __VA_ARGS__ . Due to this, it is wise to conditionally define variadic macros when portability is a concern.
# X-macros
X-macros are an idiomatic technique for generating repeating code structures at compile time.
An X-macro consists of two parts: the list, and the execution of the list.
Each time the list is used, the preprocessor expands it by substituting the current definition of X for every entry.
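A minimal sketch of both parts, assuming a made-up COLOR_LIST with three entries; the Color enum and the color_name function are illustrative and not taken from the original listing:

```cpp
// Part 1: the list. X is deliberately left undefined at this point.
#define COLOR_LIST \
    X(Red)         \
    X(Green)       \
    X(Blue)

// Part 2: executing the list with different definitions of X.
enum Color {
#define X(name) name,
    COLOR_LIST              // expands to: Red, Green, Blue,
#undef X
    ColorCount
};

const char* color_name(Color c) {
    switch (c) {
#define X(name) case name: return #name;
        COLOR_LIST          // expands to: case Red: return "Red"; ...
#undef X
        default: return "unknown";
    }
}
```

Adding a colour now means touching a single line of COLOR_LIST; the enum and the name lookup follow automatically.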
As lists become bigger (let’s say, more than 100 elements), this technique helps to avoid excessive copy-pasting.
If defining a seemingly irrelevant X before using LIST is not to your liking, you can pass a macro name as an argument as well:
Now, you explicitly specify which macro should be used when expanding the list, e.g.
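A sketch of this variant with hypothetical names (FRUIT_LIST, AS_ENUM, AS_STRING):

```cpp
// The list takes the macro to apply as its parameter.
#define FRUIT_LIST(MACRO) \
    MACRO(Apple)          \
    MACRO(Banana)         \
    MACRO(Cherry)

#define AS_ENUM(name) name,
#define AS_STRING(name) #name,

enum Fruit { FRUIT_LIST(AS_ENUM) FruitCount };              // Apple, Banana, Cherry, FruitCount

const char* const fruit_names[] = { FRUIT_LIST(AS_STRING) };  // "Apple", "Banana", "Cherry",
```

The enum and the name table stay in the same order because both are generated from the same list.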
If each invocation of the MACRO should take additional parameters that are constant with respect to the list, variadic macros can be used.
The first argument is supplied by the LIST , while the rest is provided by the user in the LIST invocation. For example:
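The sketch below uses the LIST and MACRO names from the text; ADD_SCALED and scaled_sum are made up. Older, non-conforming MSVC preprocessor modes are known to forward __VA_ARGS__ as a single argument, so this form may need the conforming preprocessor there.

```cpp
// The list: each entry passes its own first argument, followed by
// whatever extra arguments the caller supplied to LIST.
#define LIST(MACRO, ...)  \
    MACRO(1, __VA_ARGS__) \
    MACRO(2, __VA_ARGS__) \
    MACRO(3, __VA_ARGS__)

// Hypothetical per-entry macro: id comes from LIST, factor and acc from the caller.
#define ADD_SCALED(id, factor, acc) (acc) += (id) * (factor);

int scaled_sum(int factor) {
    int total = 0;
    LIST(ADD_SCALED, factor, total)   // three statements, one per list entry
    return total;
}
```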
# Preprocessor error messages
Compile errors can be generated using the preprocessor. This is useful for a number of reasons, such as notifying a user that they are on an unsupported platform or using an unsupported compiler.
e.g. Return Error if gcc version is 3.0.0 or earlier.
e.g. Return Error if compiling on an Apple computer.
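Both checks might look like the following sketch. __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__ and __APPLE__ are macros predefined by the respective toolchains; the error texts and the platform_checks_passed helper are illustrative. By design, this snippet refuses to compile on an Apple platform.

```cpp
// Reject gcc 3.0.0 or earlier.
#if defined(__GNUC__) && \
    (__GNUC__ < 3 || (__GNUC__ == 3 && __GNUC_MINOR__ == 0 && __GNUC_PATCHLEVEL__ == 0))
#error "gcc 3.0.0 or earlier is not supported"
#endif

// Reject Apple platforms.
#ifdef __APPLE__
#error "Apple platforms are not supported"
#endif

// Only reachable when neither #error fired.
inline bool platform_checks_passed() { return true; }
```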
# Predefined macros
Predefined macros are those that the compiler defines (in contrast to those the user defines in the source file). These macros must not be redefined or undefined by the user.
The following macros are predefined by the C++ standard:
__LINE__ contains the line number of the line this macro is used on, and can be changed by the #line directive.
__FILE__ contains the filename of the file this macro is used in, and can be changed by the #line directive.
__DATE__ contains date (in "Mmm dd yyyy" format) of the file compilation, where Mmm is formatted as if obtained by a call to std::asctime() .
__TIME__ contains time (in "hh:mm:ss" format) of the file compilation.
__cplusplus is defined by (conformant) C++ compilers while compiling C++ files. Its value identifies the standard version the compiler conforms to: 199711L for C++98 and C++03, 201103L for C++11, 201402L for C++14, 201703L for C++17 and 202002L for C++20.
__STDC_HOSTED__ is defined to 1 if the implementation is hosted, or 0 if it is freestanding.
__STDCPP_DEFAULT_NEW_ALIGNMENT__ contains a size_t literal, which is the alignment used for a call to alignment-unaware operator new .
Additionally, the following macros are allowed to be predefined by implementations, and may or may not be present:
__STDC__ has implementation-dependent meaning, and is usually defined only when compiling a file as C, to signify full C standard compliance. (Or never, if the compiler decides not to support this macro.)
__STDC_VERSION__ has implementation-dependent meaning, and its value is usually the C version, similarly to how __cplusplus is the C++ version. (Or is not even defined, if the compiler decides not to support this macro.)
__STDC_MB_MIGHT_NEQ_WC__ is defined to 1 , if values of the narrow encoding of the basic character set might not be equal to the values of their wide counterparts (e.g. if (uintmax_t)'x' != (uintmax_t)L'x' )
__STDC_ISO_10646__ is defined if wchar_t is encoded as Unicode, and expands to an integer constant in the form yyyymmL , indicating the latest Unicode revision supported.
__STDCPP_STRICT_POINTER_SAFETY__ is defined to 1 , if the implementation has strict pointer safety (otherwise it has relaxed pointer safety)
__STDCPP_THREADS__ is defined to 1 , if the program can have more than one thread of execution (applicable to freestanding implementation — hosted implementations can always have more than one thread)
It is also worth mentioning __func__ , which is not a macro, but a predefined function-local variable. It contains the name of the function it is used in, as a static character array in an implementation-defined format.
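A brief sketch of how a few of these are typically used. The HERE macro and both helper functions are illustrative names; the exact text produced by __FILE__ and __func__ depends on the implementation.

```cpp
#include <string>

// Builds "file:line (function)" for the place where the macro is used.
#define HERE() \
    (std::string(__FILE__) + ":" + std::to_string(__LINE__) + " (" + __func__ + ")")

std::string location_in_helper() {
    return HERE();   // __func__ refers to the enclosing function
}

// __cplusplus is an integer constant, so it can be returned or compared.
long language_version() {
    return __cplusplus;
}
```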
C Language: #ifndef Directive
This C tutorial explains how to use the #ifndef preprocessor directive in the C language.
In the C Programming Language, the #ifndef directive allows for conditional compilation. The preprocessor determines if the provided macro does not exist before including the subsequent code in the compilation process.
The syntax for the #ifndef directive in the C language is:
#ifndef macro_definition
Here macro_definition is the macro that must not be defined for the preprocessor to include the C source code into the compiled application.
Note: The #ifndef directive must be closed by an #endif directive.
The following example shows how to use the #ifndef directive in the C language:
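The original listing is not reproduced on this page. A sketch consistent with the surrounding description might look like this; the directives are identical in C and C++, and the years_old function is an illustrative stand-in for the program's printing code.

```cpp
#define YEARS_OLD 12

// Skipped here, because YEARS_OLD is already defined above.
#ifndef YEARS_OLD
#define YEARS_OLD 10
#endif

int years_old() {
    return YEARS_OLD;
}
```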
In this example, if the macro YEARS_OLD is not defined before the #ifndef directive is encountered, it will be defined with a value of 10.
Because the example defines YEARS_OLD as 12 first, the program runs with the value 12. If you remove the line #define YEARS_OLD 12, the #ifndef block takes effect and the program runs with the value 10 instead.
Lesson No. 22. Preprocessor directives
The preprocessor is best thought of as a separate program that runs before compilation. When it runs, the preprocessor scans the code from top to bottom, file by file, looking for directives. Directives are special commands that start with the # symbol and do NOT end with a semicolon. There are several types of directives, which we look at below.
You have already seen the #include directive in action. When you include a file with #include, the preprocessor copies the contents of the included file into the current file in place of the #include line. This is very useful when the same data (for example, forward declarations of functions) is needed in several places.
The #include directive has two forms:
#include <filename> , which tells the preprocessor to look for the file in the system paths (where the C++ system libraries are stored). You will most often use this form when including header files from the C++ Standard Library.
#include "filename" , which tells the preprocessor to look for the file in the current project directory. If it is not found there, the preprocessor goes on to check the system paths and any others you have specified in your IDE settings. This form is used to include user-defined header files.
The #define directive can be used to create macros. A macro is a rule that defines how an identifier is converted into the specified replacement data.
There are two main types of macros: function-like macros and object-like macros.
Function-like macros behave like functions and serve the same purposes. We will not discuss them here, because their use is generally considered dangerous, and almost everything they can do can be done with an ordinary (inline) function.
Object-like macros can be defined in one of the following two ways:
#define identifier
#define identifier substitution_text
The top definition has no substitution_text , while the bottom one does. Because these are preprocessor directives (not ordinary statements), neither form ends with a semicolon.
Object-like macros with substitution_text
When the preprocessor encounters an object-like macro with substitution_text , every subsequent occurrence of identifier is replaced with substitution_text . The identifier is usually written in capital letters, with underscores instead of spaces.
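A small sketch of an object-like macro with substitution text; the name MY_FAVORITE_NUMBER and the helper function are made up for illustration:

```cpp
// Every later occurrence of MY_FAVORITE_NUMBER is replaced with 9.
#define MY_FAVORITE_NUMBER 9

int favorite_plus(int n) {
    return MY_FAVORITE_NUMBER + n;   // the compiler sees: return 9 + n;
}
```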
The One-Definition Rule: Introducing and explaining include guards.
You'll find the #ifndef and #define directives at the top of pretty much every C/C++ header file. However, I could not find any information about this formatting in the books or tutorials I learned from, and where it was mentioned, there was no explanation! So what are they?
For a while, I just accepted this was part of coding standards for formatting or readability and just used it myself. It was only when I took an interest in understanding compilers that I realised the purpose and importance of header guards.
Include guards are a construct used at the top of header files to stop some code from being included twice. For example, for a header file header.h the contents of this file may look similar to the following:
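The original listing is missing here. As a stand-in, a guarded header.h could be sketched as follows; the guard name HEADER_H, the Rectangle struct and the area function are assumptions made for illustration.

```cpp
// header.h (hypothetical contents)
#ifndef HEADER_H
#define HEADER_H

// Everything between #define and #endif is seen at most once per source file.
struct Rectangle {
    int width;
    int height;
};

inline int area(const Rectangle& r) {
    return r.width * r.height;
}

#endif // HEADER_H
```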
Understanding why this is required becomes intuitive when you learn how a C/C++ compiler works. Before a source file is compiled, the preprocessor turns it into one long piece of text (a translation unit) that is ready to be compiled. Whenever the source file contains #include "header.h", the contents of the header are pasted into that text at the point of the #include. Only once this is complete does the compiler check the code. For small projects, this does not pose any issues. However, in larger projects a header is often pulled into the same source file more than once, usually indirectly through other headers. Its contents are then pasted in multiple times, and the compiler reports an error saying that you have redefined some classes or variables.
This is where include guards come in: the preprocessor keeps the body of the header if, and only if, the guard macro has not already been defined. For example, the first time a header is included the preprocessor checks whether header_h has been defined; since it has not, it defines header_h and keeps the code between #define and #endif . The second time the header is included, header_h is already defined, so everything up to #endif is skipped.
This is how headers stay on the right side of the One Definition Rule, which states that:
“A given class, enumeration, and template, etc., must be defined exactly once in a program.” – The C++ Programming Language, Bjarne Stroustrup.
To see this in action, I have created two simple examples for comparison: one without header guards and one with.
Without header guards
Compiling this code results in a redefinition error.
With header guards
Compiling this code results in no errors, and running the executable we get 30 , as expected.
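The original files are not shown, so the following single-file sketch only imitates the situation: the body of a hypothetical header is written out twice, exactly as the preprocessor would paste it for two #include lines. The names NUMBERS_H, Pair, total and guarded_result, and the values 10 and 20, are assumptions chosen to arrive at 30.

```cpp
// ---- first inclusion of the hypothetical header ----
#ifndef NUMBERS_H
#define NUMBERS_H
struct Pair { int first; int second; };
inline int total(const Pair& p) { return p.first + p.second; }
#endif // NUMBERS_H

// ---- second inclusion: NUMBERS_H is already defined, so the body is skipped ----
#ifndef NUMBERS_H
#define NUMBERS_H
struct Pair { int first; int second; };
inline int total(const Pair& p) { return p.first + p.second; }
#endif // NUMBERS_H

int guarded_result() {
    Pair p{10, 20};
    return total(p);
}
```

Deleting the #ifndef, #define and #endif lines from both copies leaves two definitions of Pair in the same file, which the compiler rejects.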
To put it in one sentence: header guards stop the compiler from including the same code multiple times, which stops any variables, functions, classes etc. from being defined more than once.
It’s good to get into the habit of starting your header files with these header guards.