Convert-Binary-C

ucpp/README


Each of these functions receives a pointer to the corresponding
preprocessor object as its first parameter. There is also a
callback_arg member defined in the preprocessor object structure
which can be used to pass an additional pointer to the callback
function.

If you additionally define UCPP_CLONE, you can also clone an
existing preprocessor object:

	clone = clone_cpp(original);

The cloned object will be identical to the original object, except
for its internal lexer states, which means you cannot clone a
preprocessor object while it is preprocessing source code.



COMPATIBILITY NOTES
-------------------

The C language has a long history. Nowadays, C comes in three
flavours:

-- Traditional C, aka "K&R". This is the language first described by
Brian Kernighan and Dennis Ritchie, and implemented in the first C
compiler that was ever coded. There are actually several dialects of
K&R, and all of them are considered deprecated.

-- ISO 9899:1990, aka C90, aka C89, aka ANSI-C. Formalized by ANSI
in 1989 and adopted by ISO the next year, it is the C flavour many C
compilers understand. It is mostly backward compatible with K&R C, but
with enhancements, clarifications and several new features.

-- ISO 9899:1999, aka C99. This is an evolution of C90, almost fully
backward compatible with it. C99 nonetheless introduces many new and
useful features, including in the preprocessor.

There was also a normative addendum in 1995 that added a few features
to C90 (for instance, digraphs) that are also present in C99. It is
sometimes referred to as "C95" or "AMD 1".
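
As an illustration of the digraphs mentioned above, here is a small
sketch. The names COUNT and digraph_demo are invented for the example.
The digraphs <: and :> stand for [ and ], <% and %> stand for { and },
and %: stands for #.

```c
/* %: is the digraph spelling of #, so this line reads "#define COUNT 3". */
%:define COUNT 3

/* Sums a small fixed array; brackets and braces are spelled as digraphs. */
int digraph_demo(void)
<%
	int values<:COUNT:> = <% 1, 2, 3 %>;
	int total = 0;
	int i;

	for (i = 0; i < COUNT; i++)
		total += values<:i:>;
	return total;
%>
```

A preprocessor that knows about digraphs treats this exactly as the
same code written with #, [, ], { and }.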


ucpp implements the C99 standard, but can be used in a stricter mode,
to enforce C90 compatibility (it will, however, still recognize some
constructions that are not in plain C90).

ucpp also knows about several extensions to C99:

-- Assertions: this is an extension to the defined() operator, with
   its own namespace. Assertions seem to be used in several places,
   therefore ucpp knows about them. It is recommended to enable
   assertions by default on Solaris systems.
-- Unicode: the C99 standard specifies that extended characters, from
   the ISO-10646 charset (aka "unicode") can be used in identifiers
   with the notations \u and \U. ucpp also accepts (with the proper
   flag) the UTF-8 encoding in the source file for such characters.
-- #include_next directive: it works as a #include, but will look
   for files only in the directories that come, in the include path,
   after the one in which the current file was found. This is a
   GNU-ism that is useful for writing transparent wrappers around
   header files.

Assertions and unicode are activated by specific flags; the #include_next
support is always active.

The ucpp code itself should be compatible with any ISO-C90 compiler.
The cpp.c file is rather big (~ 64kB) and might confuse old 16-bit C
compilers; the macro.c file is also somewhat large (~ 47kB).

The evaluation of #if expressions is subject to some subtleties, see the
section "cross-compilation".

The lexer code makes only one assumption about the source character
set: source characters (those which have a syntactic value in C; the
contents of comments and string literals are not affected) must have a
strictly positive value that is strictly lower than MAX_CHAR_VAL. The
strict positivity is already guaranteed by the C standard, so you just
need to adjust MAX_CHAR_VAL.
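
The constraint above can be checked on a given platform with a small
sketch like the following. The value 128 given to MAX_CHAR_VAL here is
only an assumption made for the example (use the value from your own
ucpp configuration), and the function source_charset_fits is invented
for this illustration; it is not part of ucpp.

```c
#include <stddef.h>

/* Assumed value, for illustration only. */
#define MAX_CHAR_VAL 128

/*
 * Returns 1 if every character of the C basic source character set
 * has, on this platform, a value v with 0 < v < MAX_CHAR_VAL, and 0
 * otherwise.
 */
int source_charset_fits(void)
{
	static const char basic[] =
		"abcdefghijklmnopqrstuvwxyz"
		"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
		"0123456789"
		"!\"#%&'()*+,-./:;<=>?[\\]^_{|}~"
		" \t\v\f\n";
	size_t i;

	/* sizeof basic - 1 skips the terminating null character. */
	for (i = 0; i < sizeof basic - 1; i++) {
		int v = (unsigned char)basic[i];

		if (v <= 0 || v >= MAX_CHAR_VAL)
			return 0;
	}
	return 1;
}
```

On an ASCII system all of these characters are below 128, so the check
passes with the assumed value. On an EBCDIC system, where digits are
encoded from 0xF0 to 0xF9, a larger MAX_CHAR_VAL would be needed.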

ucpp has been tested successfully on ASCII/ISO-8859-1 and EBCDIC systems.
Beware that UTF-8 is NOT compatible with EBCDIC.

Pragma handling: when used in non-lexer mode, ucpp tries to output a
source text that, when read again, will yield the exact same stream of
tokens. This is not completely true with regard to line numbering in
some tricky macro replacements, but it should work correctly otherwise,
especially with pragma directives if the compile-time option PRAGMA_DUMP
was set: #pragma directives are dumped, and non-void _Pragma() operators
are converted to the corresponding #pragma and dumped as well.

ucpp does not macro-replace the contents of #pragma and _Pragma().
If you want a macro-replaced pragma, use this:

#define pragma_(x)	_Pragma(#x)
#define pragma(x)	pragma_(x)

Anyway, pragmas do not nest (a _Pragma() cannot be evaluated if it is
inside a #pragma or another _Pragma()).
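
To see why the pragma()/pragma_() pair needs two levels, here is a
sketch that uses the same shape but yields the stringified text instead
of feeding it to _Pragma(), so that the result can be inspected. The
names ALIGNMENT, text_, text, one_level and two_levels are invented for
this example.

```c
#define ALIGNMENT	4

/* Same two-level shape as pragma_()/pragma(), but yielding the string. */
#define text_(x)	#x
#define text(x)		text_(x)

/* One level only: the argument is stringified before any replacement. */
const char *one_level(void)
{
	return text_(pack(ALIGNMENT));	/* "pack(ALIGNMENT)" */
}

/* Two levels: ALIGNMENT is replaced by 4 first, then stringified. */
const char *two_levels(void)
{
	return text(pack(ALIGNMENT));	/* "pack(4)" */
}
```

The argument of a macro is not macro-replaced when it is the operand of
the # operator, which is why the extra level of indirection is needed.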


I wrote ucpp according to what is found in "The C Programming Language"
by Brian Kernighan and Dennis Ritchie (2nd edition) and the C99
standard; but I could have misinterpreted some points. On some tricky
points I got help from the people of the comp.std.c newsgroup.
For assertions and #include_next, I mimicked the behaviour of GNU cpp,
as described in the GNU cpp info documentation. An open question is
related to the following code:

#define undefined	!
#define makeun(x)	un ## x
#if makeun(defined foo)
qux
#else
bar
#endif

ucpp will replace 'defined foo' with 0 first (since foo is not defined),
then it will replace the macro makeun, and the expression will become
'un0', which is replaced by 0 since this is a remaining identifier. The
expression evaluates to false, and 'bar' is emitted.
However, some other preprocessors will replace makeun first, considering


