source: EcnlProtoTool/trunk/tcc-0.9.26/tcc-doc.texi@ 279

Last change on this file since 279 was 279, checked in by coas-nagasima, 7 years ago

Files added and updated.

File size: 34.2 KB
Line 
1\input texinfo @c -*- texinfo -*-
2@c %**start of header
3@setfilename tcc-doc.info
4@settitle Tiny C Compiler Reference Documentation
5@dircategory Software development
6@direntry
7* TCC: (tcc-doc). The Tiny C Compiler.
8@end direntry
9@c %**end of header
10
11@include config.texi
12
13@iftex
14@titlepage
15@afourpaper
16@sp 7
17@center @titlefont{Tiny C Compiler Reference Documentation}
18@sp 3
19@end titlepage
20@headings double
21@end iftex
22
23@contents
24
25@node Top, Introduction, (dir), (dir)
26@top Tiny C Compiler Reference Documentation
27
28This manual documents version @value{VERSION} of the Tiny C Compiler.
29
30@menu
31* Introduction:: Introduction to tcc.
32* Invoke:: Invocation of tcc (command line, options).
33* Clang:: ANSI C and extensions.
34* asm:: Assembler syntax.
35* linker:: Output file generation and supported targets.
36* Bounds:: Automatic bounds-checking of C code.
37* Libtcc:: The libtcc library.
38* devel:: Guide for Developers.
39@end menu
40
41
42@node Introduction
43@chapter Introduction
44
45TinyCC (aka TCC) is a small but hyper fast C compiler. Unlike other C
46compilers, it is meant to be self-contained: you do not need an
47external assembler or linker because TCC does the assembling and linking itself.
48
49TCC compiles so @emph{fast} that even for big projects @code{Makefile}s may
50not be necessary.
51
52TCC not only supports ANSI C, but also most of the new ISO C99
53standard and many GNU C extensions, including inline assembly.
54
55TCC can also be used to make @emph{C scripts}, i.e. pieces of C source
56that you run like a Perl or Python script. Compilation is so fast that
57your script will be as fast as if it were an executable.
58
59TCC can also automatically generate memory and bounds checks
60(@pxref{Bounds}) while allowing all C pointer operations. TCC can do
61these checks even if unpatched libraries are used.
62
63With @code{libtcc}, you can use TCC as a backend for dynamic code
64generation (@pxref{Libtcc}).
65
66TCC mainly supports the i386 target on Linux and Windows. There are alpha
67ports for the ARM (@code{arm-tcc}) and the TMS320C67xx targets
68(@code{c67-tcc}). More information about the ARM port is available at
69@url{http://lists.gnu.org/archive/html/tinycc-devel/2003-10/msg00044.html}.
70
71For usage on Windows, see also @url{tcc-win32.txt}.
72
73@node Invoke
74@chapter Command line invocation
75
76@section Quick start
77
78@example
79@c man begin SYNOPSIS
80usage: tcc [options] [@var{infile1} @var{infile2}@dots{}] [@option{-run} @var{infile} @var{args}@dots{}]
81@c man end
82@end example
83
84@noindent
85@c man begin DESCRIPTION
86TCC options are very much like gcc options. The main difference is that TCC
87can also directly execute the resulting program and give it runtime
88arguments.
89
90Here are some examples to understand the logic:
91
92@table @code
93@item @samp{tcc -run a.c}
94Compile @file{a.c} and execute it directly
95
96@item @samp{tcc -run a.c arg1}
97Compile @file{a.c} and execute it directly. arg1 is given as the first argument to
98the @code{main()} of @file{a.c}.
99
100@item @samp{tcc a.c -run b.c arg1}
101Compile @file{a.c} and @file{b.c}, link them together and execute them. arg1 is given
102as first argument to the @code{main()} of the resulting program.
103@ignore
104Because multiple C files are specified, @option{--} are necessary to clearly
105separate the program arguments from the TCC options.
106@end ignore
107
108@item @samp{tcc -o myprog a.c b.c}
109Compile @file{a.c} and @file{b.c}, link them and generate the executable @file{myprog}.
110
111@item @samp{tcc -o myprog a.o b.o}
112Link @file{a.o} and @file{b.o} together and generate the executable @file{myprog}.
113
114@item @samp{tcc -c a.c}
115Compile @file{a.c} and generate object file @file{a.o}.
116
117@item @samp{tcc -c asmfile.S}
118Preprocess @file{asmfile.S} with the C preprocessor, assemble it and generate
119object file @file{asmfile.o}.
120
121@item @samp{tcc -c asmfile.s}
122Assemble (but not preprocess) @file{asmfile.s} and generate object file
123@file{asmfile.o}.
124
125@item @samp{tcc -r -o ab.o a.c b.c}
126Compile @file{a.c} and @file{b.c}, link them together and generate the object file @file{ab.o}.
127
128@end table
129
130Scripting:
131
132TCC can be invoked from @emph{scripts}, just like shell scripts. You just
133need to add @code{#!/usr/local/bin/tcc -run} at the start of your C source:
134
135@example
136#!/usr/local/bin/tcc -run
137#include <stdio.h>
138
139int main()
140@{
141 printf("Hello World\n");
142 return 0;
143@}
144@end example
145
146TCC can read C source code from @emph{standard input} when @option{-} is used in
147place of @option{infile}. Example:
148
149@example
150echo 'main()@{puts("hello");@}' | tcc -run -
151@end example
152@c man end
153
154@section Option summary
155
156General Options:
157
158@c man begin OPTIONS
159@table @option
160@item -c
161Generate an object file.
162
163@item -o outfile
164Put object file, executable, or dll into output file @file{outfile}.
165
166@item -run source [args...]
167Compile file @var{source} and run it with the command line arguments
168@var{args}. In order to be able to give more than one argument to a
169script, several TCC options can be given @emph{after} the
170@option{-run} option, separated by spaces:
171@example
172tcc "-run -L/usr/X11R6/lib -lX11" ex4.c
173@end example
174In a script, it gives the following header:
175@example
176#!/usr/local/bin/tcc -run -L/usr/X11R6/lib -lX11
177@end example
178
179@item -dumpversion
180Print only the compiler version and nothing else.
181
182@item -v
183Display TCC version.
184
185@item -vv
186Show included files. As sole argument, print search dirs (as below).
187
188@item -bench
189Display compilation statistics.
190
191@item -print-search-dirs
192Print the configured installation directory and a list of library
193and include directories tcc will search.
194
195@end table
196
197Preprocessor options:
198
199@table @option
200@item -Idir
201Specify an additional include path. Include paths are searched in the
202order they are specified.
203
204System include paths are always searched after. The default system
205include paths are: @file{/usr/local/include}, @file{/usr/include}
206and @file{PREFIX/lib/tcc/include}. (@file{PREFIX} is usually
207@file{/usr} or @file{/usr/local}).
208
209@item -Dsym[=val]
210Define preprocessor symbol @samp{sym} to
211val. If val is not present, its value is @samp{1}. Function-like macros can
212also be defined: @option{-DF(a)=a+1}
213
214@item -Usym
215Undefine preprocessor symbol @samp{sym}.
216@end table
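
As a rough illustration of @option{-D}, the sketch below writes out the source-level equivalent of passing @option{-DDEBUG} and @option{-DF(a)=a+1} on the command line. The names @code{DEBUG}, @code{F} and the three functions are only illustrative. Note that the macro body is substituted textually, so an unparenthesized body such as @code{a+1} interacts with the surrounding operators.

```c
/* Source-level equivalent of: tcc -DDEBUG "-DF(a)=a+1" file.c
   (DEBUG, F and the function names are illustrative only). */
#ifndef DEBUG
#define DEBUG 1      /* -DDEBUG with no value defines the symbol as 1 */
#endif
#ifndef F
#define F(a) a+1     /* -DF(a)=a+1: the body is substituted textually */
#endif

/* Expands to x+1. */
int f_of(int x)
{
    return F(x);
}

/* Expands to 2 * x+1, i.e. (2 * x) + 1, because the body has no parentheses. */
int twice_f_of(int x)
{
    return 2 * F(x);
}

/* Returns the value the definition above gives to DEBUG. */
int debug_level(void)
{
    return DEBUG;
}
```

With these definitions @code{twice_f_of(3)} evaluates to 7 rather than 8; defining the macro as @option{-DF(a)=((a)+1)} avoids that surprise.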
217
218Compilation flags:
219
220Note: each of the following options has a negative form beginning with
221@option{-fno-}.
222
223@table @option
224@item -funsigned-char
225Let the @code{char} type be unsigned.
226
227@item -fsigned-char
228Let the @code{char} type be signed.
229
230@item -fno-common
231Do not generate common symbols for uninitialized data.
232
233@item -fleading-underscore
234Add a leading underscore at the beginning of each C symbol.
235
236@end table
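
The effect of @option{-funsigned-char} and @option{-fsigned-char} can be observed from C code. The following is a minimal sketch, assuming 8-bit bytes, a two's complement target and a @file{limits.h} that follows the compiler setting; the function names are hypothetical.

```c
#include <limits.h>

/* Returns 1 when plain char is unsigned in this compilation, 0 otherwise. */
int char_is_unsigned(void)
{
    return CHAR_MIN == 0;
}

/* Widens the byte 0xFF through plain char: 255 when char is unsigned,
   -1 when it is signed (on the usual two's complement targets). */
int widen_ff(void)
{
    char c = (char)0xFF;
    return c;
}
```

Code that stores raw bytes in plain @code{char} and then compares or widens them is where the two settings typically differ.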
237
238Warning options:
239
240@table @option
241@item -w
242Disable all warnings.
243
244@end table
245
246Note: each of the following warning options has a negative form beginning with
247@option{-Wno-}.
248
249@table @option
250@item -Wimplicit-function-declaration
251Warn about implicit function declaration.
252
253@item -Wunsupported
254Warn about unsupported GCC features that are ignored by TCC.
255
256@item -Wwrite-strings
257Make string constants be of type @code{const char *} instead of @code{char
258*}.
259
260@item -Werror
261Abort compilation if warnings are issued.
262
263@item -Wall
264Activate all warnings, except @option{-Werror}, @option{-Wunsupported} and
265@option{-Wwrite-strings}.
266
267@end table
268
269Linker options:
270
271@table @option
272@item -Ldir
273Specify an additional static library path for the @option{-l} option. The
274default library paths are @file{/usr/local/lib}, @file{/usr/lib} and @file{/lib}.
275
276@item -lxxx
277Link your program with dynamic library libxxx.so or static library
278libxxx.a. The library is searched in the paths specified by the
279@option{-L} option.
280
281@item -Bdir
282Set the path where the tcc internal libraries (and include files) can be
283found (default is @file{PREFIX/lib/tcc}).
284
285@item -shared
286Generate a shared library instead of an executable.
287
288@item -soname name
289Set the name of the shared library to be used at runtime.
290
291@item -static
292Generate a statically linked executable (default is a dynamically linked
293executable).
294
295@item -rdynamic
296Export global symbols to the dynamic linker. It is useful when a library
297opened with @code{dlopen()} needs to access executable symbols.
298
299@item -r
300Generate an object file combining all input files.
301
302@item -Wl,-rpath=path
303Put a custom search path for dynamic libraries into the executable.
304
305@item -Wl,--oformat=fmt
306Use @var{fmt} as output format. The supported output formats are:
307@table @code
308@item elf32-i386
309ELF output format (default)
310@item binary
311Binary image (only for executable output)
312@item coff
313COFF output format (only for executable output for TMS320C67xx target)
314@end table
315
316@item -Wl,-subsystem=console/gui/wince/...
317Set type for PE (Windows) executables.
318
319@item -Wl,-[Ttext=# | section-alignment=# | file-alignment=# | image-base=# | stack=#]
320Modify executable layout.
321
322@item -Wl,-Bsymbolic
323Set DT_SYMBOLIC tag.
324
325@end table
326
327Debugger options:
328
329@table @option
330@item -g
331Generate run time debug information so that you get clear run time
332error messages: @code{ test.c:68: in function 'test5()': dereferencing
333invalid pointer} instead of the laconic @code{Segmentation
334fault}.
335
336@item -b
337Generate additional support code to check
338memory allocations and array/pointer bounds. @option{-g} is implied. Note
339that the generated code is slower and bigger in this case.
340
341Note: @option{-b} is only available on i386 for the moment.
342
343@item -bt N
344Display N callers in stack traces. This is useful with @option{-g} or
345@option{-b}.
346
347@end table
348
349Misc options:
350
351@table @option
352@item -MD
353Generate makefile fragment with dependencies.
354
355@item -MF depfile
356Use @file{depfile} as output for -MD.
357
358@end table
359
360Note: GCC options @option{-Ox}, @option{-fx} and @option{-mx} are
361ignored.
362@c man end
363
364@ignore
365
366@setfilename tcc
367@settitle Tiny C Compiler
368
369@c man begin SEEALSO
370gcc(1)
371@c man end
372
373@c man begin AUTHOR
374Fabrice Bellard
375@c man end
376
377@end ignore
378
379@node Clang
380@chapter C language support
381
382@section ANSI C
383
384TCC implements all the ANSI C standard, including structure bit fields
385and floating point numbers (@code{long double}, @code{double}, and
386@code{float} fully supported).
387
388@section ISOC99 extensions
389
390TCC implements many features of the new C standard: ISO C99. Currently
391missing items are: complex and imaginary numbers and variable length
392arrays.
393
394Currently implemented ISOC99 features:
395
396@itemize
397
398@item 64 bit @code{long long} types are fully supported.
399
400@item The boolean type @code{_Bool} is supported.
401
402@item @code{__func__} is a string variable containing the current
403function name.
404
405@item Variadic macros: @code{__VA_ARGS__} can be used for
406 function-like macros:
407@example
408 #define dprintf(level, ...) printf(__VA_ARGS__)
409@end example
410
411@noindent
412@code{dprintf} can then be used with a variable number of parameters.
413
414@item Declarations can appear anywhere in a block (as in C++).
415
416@item Array and struct/union elements can be initialized in any order by
417 using designators:
418@example
419 struct @{ int x, y; @} st[10] = @{ [0].x = 1, [0].y = 2 @};
420
421 int tab[10] = @{ 1, 2, [5] = 5, [9] = 9@};
422@end example
423
424@item Compound initializers are supported:
425@example
426 int *p = (int [])@{ 1, 2, 3 @};
427@end example
428to initialize a pointer pointing to an initialized array. The same
429works for structures and strings.
430
431@item Hexadecimal floating point constants are supported:
432@example
433 double d = 0x1234p10;
434@end example
435
436@noindent
437is the same as writing
438@example
439 double d = 4771840.0;
440@end example
441
442@item @code{inline} keyword is ignored.
443
444@item @code{restrict} keyword is ignored.
445@end itemize
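
The C99 features listed above can be exercised together in a short sketch. The names @code{LOGF}, @code{struct point} and the functions are invented for this example, and @code{snprintf} is assumed to be provided by the C library.

```c
#include <stdio.h>

/* C99 variadic macro; LOGF is a hypothetical name for this sketch. */
#define LOGF(buf, size, ...) snprintf(buf, size, __VA_ARGS__)

struct point { int x, y; };

/* Designated initializers: elements that are not named are zero-initialized. */
int designator_sum(void)
{
    int tab[10] = { 1, 2, [5] = 5, [9] = 9 };
    struct point st[2] = { [1].y = 7, [0].x = 3 };
    int i, sum = 0;

    for (i = 0; i < 10; i++)
        sum += tab[i];
    return sum + st[0].x + st[0].y + st[1].x + st[1].y;
}

/* Compound literal: p points to an unnamed array of three ints. */
int compound_literal_sum(void)
{
    int *p = (int []){ 1, 2, 3 };
    return p[0] + p[1] + p[2];
}

/* Hexadecimal floating point constant: 0x1234 times 2 to the power 10. */
double hex_float(void)
{
    return 0x1234p10;
}

/* __func__ names the enclosing function. */
const char *current_function(void)
{
    return __func__;
}

/* Formats two integers through the variadic macro; returns the length written. */
int format_pair(char *buf, size_t size, int a, int b)
{
    return LOGF(buf, size, "%d,%d", a, b);
}
```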
446
447@section GNU C extensions
448
449TCC implements some GNU C extensions:
450
451@itemize
452
453@item array designators can be used without '=':
454@example
455 int a[10] = @{ [0] 1, [5] 2, 3, 4 @};
456@end example
457
458@item Structure field designators can be a label:
459@example
460 struct @{ int x, y; @} st = @{ x: 1, y: 1@};
461@end example
462instead of
463@example
464 struct @{ int x, y; @} st = @{ .x = 1, .y = 1@};
465@end example
466
467@item @code{\e} is ASCII character 27.
468
469@item Case ranges: ranges can be used in @code{case}s:
470@example
471 switch(a) @{
472 case 1 @dots{} 9:
473 printf("range 1 to 9\n");
474 break;
475 default:
476 printf("unexpected\n");
477 break;
478 @}
479@end example
480
481@cindex aligned attribute
482@cindex packed attribute
483@cindex section attribute
484@cindex unused attribute
485@cindex cdecl attribute
486@cindex stdcall attribute
487@cindex regparm attribute
488@cindex dllexport attribute
489
490@item The keyword @code{__attribute__} is handled to specify variable or
491function attributes. The following attributes are supported:
492 @itemize
493
494 @item @code{aligned(n)}: align a variable or a structure field to n bytes
495(must be a power of two).
496
497 @item @code{packed}: force alignment of a variable or a structure field to
498 1.
499
500 @item @code{section(name)}: generate function or data in assembly section
501name (name is a string containing the section name) instead of the default
502section.
503
504 @item @code{unused}: specify that the variable or the function is unused.
505
506 @item @code{cdecl}: use standard C calling convention (default).
507
508 @item @code{stdcall}: use Pascal-like calling convention.
509
510 @item @code{regparm(n)}: use fast i386 calling convention. @var{n} must be
511between 1 and 3. The first @var{n} function parameters are respectively put in
512registers @code{%eax}, @code{%edx} and @code{%ecx}.
513
514 @item @code{dllexport}: export function from dll/executable (win32 only)
515
516 @end itemize
517
518Here are some examples:
519@example
520 int a __attribute__ ((aligned(8), section(".mysection")));
521@end example
522
523@noindent
524align variable @code{a} to 8 bytes and put it in section @code{.mysection}.
525
526@example
527 int my_add(int a, int b) __attribute__ ((section(".mycodesection")))
528 @{
529 return a + b;
530 @}
531@end example
532
533@noindent
534generate function @code{my_add} in section @code{.mycodesection}.
535
536@item GNU style variadic macros:
537@example
538 #define dprintf(fmt, args@dots{}) printf(fmt, ## args)
539
540 dprintf("no arg\n");
541 dprintf("one arg %d\n", 1);
542@end example
543
544@item @code{__FUNCTION__} is interpreted as C99 @code{__func__}
545(so it does not have exactly the same semantics as in GNU C,
546where it is a string literal).
547
548@item The @code{__alignof__} keyword can be used like @code{sizeof}
549to get the alignment of a type or an expression.
550
551@item @code{typeof(x)} returns the type of @code{x}.
552@code{x} is an expression or a type.
553
554@item Computed gotos: @code{&&label} returns a pointer of type
555@code{void *} to the goto label @code{label}. @code{goto *expr} can be
556used to jump to the address resulting from @code{expr}.
557
558@item Inline assembly with asm instruction:
559@cindex inline assembly
560@cindex assembly, inline
561@cindex __asm__
562@example
563static inline void * my_memcpy(void * to, const void * from, size_t n)
564@{
565int d0, d1, d2;
566__asm__ __volatile__(
567 "rep ; movsl\n\t"
568 "testb $2,%b4\n\t"
569 "je 1f\n\t"
570 "movsw\n"
571 "1:\ttestb $1,%b4\n\t"
572 "je 2f\n\t"
573 "movsb\n"
574 "2:"
575 : "=&c" (d0), "=&D" (d1), "=&S" (d2)
576 :"0" (n/4), "q" (n),"1" ((long) to),"2" ((long) from)
577 : "memory");
578return (to);
579@}
580@end example
581
582@noindent
583@cindex gas
584TCC includes its own x86 inline assembler with a @code{gas}-like (GNU
585assembler) syntax. No intermediate files are generated. GCC 3.x named
586operands are supported.
587
588@item @code{__builtin_types_compatible_p()} and @code{__builtin_constant_p()}
589are supported.
590
591@item @code{#pragma pack} is supported for win32 compatibility.
592
593@end itemize
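
Three of the GNU C extensions above (case ranges, @code{typeof} and computed gotos) are combined in the sketch below. The opcode numbering, the @code{SWAP} macro and the function names are made up for illustration, and @code{run()} assumes that the program contains only valid opcodes and ends with a stop opcode.

```c
/* Case ranges: classify a character code (assumes an ASCII-like charset). */
int classify(int c)
{
    switch (c) {
    case '0' ... '9':
        return 1;   /* digit */
    case 'a' ... 'z':
    case 'A' ... 'Z':
        return 2;   /* letter */
    default:
        return 0;   /* anything else */
    }
}

/* typeof: swap two objects without naming their type. It is spelled
   __typeof__ here so that GCC also accepts it in strict ISO modes. */
#define SWAP(a, b) \
    do { __typeof__(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Swaps x and y, then returns x - y. */
int swapped_difference(int x, int y)
{
    SWAP(x, y);
    return x - y;
}

/* Computed gotos: a tiny interpreter dispatching through a label table.
   Opcodes (invented for this sketch): 0 = stop, 1 = increment, 2 = double. */
int run(const unsigned char *code, int acc)
{
    void *dispatch[] = { &&op_stop, &&op_inc, &&op_dbl };

    goto *dispatch[*code++];
op_inc:
    acc += 1;
    goto *dispatch[*code++];
op_dbl:
    acc *= 2;
    goto *dispatch[*code++];
op_stop:
    return acc;
}
```

Running the program @code{1, 2, 1, 0} with an initial accumulator of 3 increments to 4, doubles to 8 and increments again, so @code{run()} returns 9.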
594
595@section TinyCC extensions
596
597@itemize
598
599@item @code{__TINYC__} is a macro predefined to @code{1} to
600indicate that you use TCC.
601
602@item @code{#!} at the start of a line is ignored to allow scripting.
603
604@item Binary digits can be entered (@code{0b101} instead of
605@code{5}).
606
607@item @code{__BOUNDS_CHECKING_ON} is defined if bound checking is activated.
608
609@end itemize
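
A short sketch of how @code{__TINYC__} and binary constants might be used; the function names are hypothetical. GCC and Clang also accept the @code{0b} prefix as an extension, so the example is not specific to TCC.

```c
/* Returns 1 when this file is compiled by TCC, 0 otherwise. */
int built_with_tcc(void)
{
#ifdef __TINYC__
    return 1;
#else
    return 0;
#endif
}

/* Binary constant: 0b101 is the same value as 5. */
int binary_five(void)
{
    return 0b101;
}

/* Keeps only the four lowest bits of v, with the mask written in binary. */
unsigned mask_low_nibble(unsigned v)
{
    return v & 0b1111;
}
```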
610
611@node asm
612@chapter TinyCC Assembler
613
614Since version 0.9.16, TinyCC integrates its own assembler. TinyCC
615assembler supports a gas-like syntax (GNU assembler). You can
616deactivate assembler support if you want a smaller TinyCC executable
617(the C compiler does not rely on the assembler).
618
619TinyCC Assembler is used to handle files with @file{.S} (C
620preprocessed assembler) and @file{.s} extensions. It is also used to
621handle the GNU inline assembler with the @code{asm} keyword.
622
623@section Syntax
624
625TinyCC Assembler supports most of the gas syntax. The tokens are the
626same as C.
627
628@itemize
629
630@item C and C++ comments are supported.
631
632@item Identifiers are the same as C, so you cannot use '.' or '$'.
633
634@item Only 32 bit integer numbers are supported.
635
636@end itemize
637
638@section Expressions
639
640@itemize
641
642@item Integers in decimal, octal and hexadecimal are supported.
643
644@item Unary operators: +, -, ~.
645
646@item Binary operators in decreasing priority order:
647
648@enumerate
649@item *, /, %
650@item &, |, ^
651@item +, -
652@end enumerate
653
654@item A value is either an absolute number or a label plus an offset.
655All operators require absolute values, except '+' and '-'. '+' or '-' can be
656used to add an offset to a label. '-' supports two labels only if they
657are the same or if they are both defined and in the same section.
658
659@end itemize
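
The priority order above differs from C, where @code{+} and @code{-} bind tighter than @code{&}, @code{|} and @code{^}. The following is a minimal sketch of an evaluator that follows the documented order for absolute integer values only. It is not TCC's own parser: all names are hypothetical, labels are not handled, and division by zero simply yields 0 here.

```c
#include <ctype.h>
#include <stdlib.h>

/* Cursor into the expression being evaluated (sketch-only global). */
static const char *cur;

static void skip_spaces(void)
{
    while (isspace((unsigned char)*cur))
        cur++;
}

/* Unary level: +x, -x, ~x, or a decimal, octal or hexadecimal integer. */
static long parse_unary(void)
{
    char *end;
    long v;

    skip_spaces();
    switch (*cur) {
    case '+': cur++; return parse_unary();
    case '-': cur++; return -parse_unary();
    case '~': cur++; return ~parse_unary();
    }
    v = strtol(cur, &end, 0);   /* base 0 accepts 10, 010 and 0x10 */
    cur = end;
    return v;
}

/* Highest binary priority: * / % */
static long parse_mul(void)
{
    long v = parse_unary();

    for (;;) {
        long rhs;
        char op;

        skip_spaces();
        op = *cur;
        if (op != '*' && op != '/' && op != '%')
            return v;
        cur++;
        rhs = parse_unary();
        if (op == '*')
            v *= rhs;
        else if (rhs == 0)
            v = 0;              /* sketch: avoid dividing by zero */
        else if (op == '/')
            v /= rhs;
        else
            v %= rhs;
    }
}

/* Middle priority: & | ^ (tighter than + and -, unlike in C). */
static long parse_bits(void)
{
    long v = parse_mul();

    for (;;) {
        char op;

        skip_spaces();
        op = *cur;
        if (op != '&' && op != '|' && op != '^')
            return v;
        cur++;
        if (op == '&')
            v &= parse_mul();
        else if (op == '|')
            v |= parse_mul();
        else
            v ^= parse_mul();
    }
}

/* Lowest priority: + and -. Evaluates a whole expression string. */
long asm_eval(const char *text)
{
    long v;

    cur = text;
    v = parse_bits();
    for (;;) {
        char op;

        skip_spaces();
        op = *cur;
        if (op != '+' && op != '-')
            return v;
        cur++;
        if (op == '+')
            v += parse_bits();
        else
            v -= parse_bits();
    }
}
```

With this order, @code{asm_eval("4 + 1 & 3")} computes 4 + (1 & 3) and returns 5, whereas the same text read as a C expression would give (4 + 1) & 3, which is 1.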
660
661@section Labels
662
663@itemize
664
665@item All labels are considered as local, except undefined ones.
666
667@item Numeric labels can be used as local @code{gas}-like labels.
668They can be defined several times in the same source. Use 'b'
669(backward) or 'f' (forward) as suffix to reference them:
670
671@example
672 1:
673 jmp 1b /* jump to '1' label before */
674 jmp 1f /* jump to '1' label after */
675 1:
676@end example
677
678@end itemize
679
680@section Directives
681@cindex assembler directives
682@cindex directives, assembler
683@cindex align directive
684@cindex skip directive
685@cindex space directive
686@cindex byte directive
687@cindex word directive
688@cindex short directive
689@cindex int directive
690@cindex long directive
691@cindex quad directive
692@cindex globl directive
693@cindex global directive
694@cindex section directive
695@cindex text directive
696@cindex data directive
697@cindex bss directive
698@cindex fill directive
699@cindex org directive
700@cindex previous directive
701@cindex string directive
702@cindex asciz directive
703@cindex ascii directive
704
705All directives are preceded by a '.'. The following directives are
706supported:
707
708@itemize
709@item .align n[,value]
710@item .skip n[,value]
711@item .space n[,value]
712@item .byte value1[,...]
713@item .word value1[,...]
714@item .short value1[,...]
715@item .int value1[,...]
716@item .long value1[,...]
717@item .quad immediate_value1[,...]
718@item .globl symbol
719@item .global symbol
720@item .section section
721@item .text
722@item .data
723@item .bss
724@item .fill repeat[,size[,value]]
725@item .org n
726@item .previous
727@item .string string[,...]
728@item .asciz string[,...]
729@item .ascii string[,...]
730@end itemize
731
732@section X86 Assembler
733@cindex assembler
734
735All X86 opcodes are supported. Only AT&T syntax is supported (source
736then destination operand order). If no size suffix is given, TinyCC
737tries to guess it from the operand sizes.
738
739Currently, MMX opcodes are supported but not SSE ones.
740
741@node linker
742@chapter TinyCC Linker
743@cindex linker
744
745@section ELF file generation
746@cindex ELF
747
748TCC can directly output relocatable ELF files (object files),
749executable ELF files and dynamic ELF libraries without relying on an
750external linker.
751
752Dynamic ELF libraries can be output but the C compiler does not generate
753position independent code (PIC). This means that the dynamic library
754code generated by TCC cannot be shared among processes yet.
755
756TCC linker eliminates unreferenced object code in libraries. A single pass is
757done on the object and library list, so the order in which object files and
758libraries are specified is important (same constraint as GNU ld). No grouping
759options (@option{--start-group} and @option{--end-group}) are supported.
760
761@section ELF file loader
762
763TCC can load ELF object files, archives (.a files) and dynamic
764libraries (.so).
765
766@section PE-i386 file generation
767@cindex PE-i386
768
769TCC for Windows supports the native Win32 executable file format (PE-i386). It
770generates EXE files (console and gui) and DLL files.
771
772For usage on Windows, see also tcc-win32.txt.
773
774@section GNU Linker Scripts
775@cindex scripts, linker
776@cindex linker scripts
777@cindex GROUP, linker command
778@cindex FILE, linker command
779@cindex OUTPUT_FORMAT, linker command
780@cindex TARGET, linker command
781
782Because on many Linux systems some dynamic libraries (such as
783@file{/usr/lib/libc.so}) are in fact GNU ld link scripts (horrible!),
784the TCC linker also supports a subset of GNU ld scripts.
785
786The @code{GROUP} and @code{FILE} commands are supported. @code{OUTPUT_FORMAT}
787and @code{TARGET} are ignored.
788
789Example from @file{/usr/lib/libc.so}:
790@example
791/* GNU ld script
792 Use the shared library, but some functions are only in
793 the static library, so try that secondarily. */
794GROUP ( /lib/libc.so.6 /usr/lib/libc_nonshared.a )
795@end example
796
797@node Bounds
798@chapter TinyCC Memory and Bound checks
799@cindex bound checks
800@cindex memory checks
801
802This feature is activated with the @option{-b} option (@pxref{Invoke}).
803
804Note that pointer size is @emph{unchanged} and that code generated
805with bound checks is @emph{fully compatible} with unchecked
806code. When a pointer comes from unchecked code, it is assumed to be
807valid. Even very obscure C code with casts should work correctly.
808
809For more information about the ideas behind this method, see
810@url{http://www.doc.ic.ac.uk/~phjk/BoundsChecking.html}.
811
812Here are some examples of caught errors:
813
814@table @asis
815
816@item Invalid range with standard string function:
817@example
818@{
819 char tab[10];
820 memset(tab, 0, 11);
821@}
822@end example
823
824@item Out-of-bounds error in global or local arrays:
825@example
826@{
827 int tab[10];
828 for(i=0;i<11;i++) @{
829 sum += tab[i];
830 @}
831@}
832@end example
833
834@item Out-of-bounds error in malloc'ed data:
835@example
836@{
837 int *tab;
838 tab = malloc(20 * sizeof(int));
839 for(i=0;i<21;i++) @{
840 sum += tab[i];
841 @}
842 free(tab);
843@}
844@end example
845
846@item Access of freed memory:
847@example
848@{
849 int *tab;
850 tab = malloc(20 * sizeof(int));
851 free(tab);
852 for(i=0;i<20;i++) @{
853 sum += tab[i];
854 @}
855@}
856@end example
857
858@item Double free:
859@example
860@{
861 int *tab;
862 tab = malloc(20 * sizeof(int));
863 free(tab);
864 free(tab);
865@}
866@end example
867
868@end table
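
For comparison, the sketch below shows what a hand-written check against the out-of-bounds cases looks like when the array size is passed around explicitly. This is not how @option{-b} works internally; it only illustrates the class of error that the option is meant to catch without any source changes. The helper names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical helper: reads tab[i] only when i is inside [0, count).
   Returns 1 and stores the element in *out on success, 0 otherwise. */
int checked_read(const int *tab, size_t count, size_t i, int *out)
{
    if (tab == NULL || i >= count)
        return 0;
    *out = tab[i];
    return 1;
}

/* Sums up to `wanted` elements, stopping at the first invalid index.
   Returns the number of elements actually read. */
size_t checked_sum(const int *tab, size_t count, size_t wanted, int *sum)
{
    size_t i;
    int value;

    *sum = 0;
    for (i = 0; i < wanted; i++) {
        if (!checked_read(tab, count, i, &value))
            break;
        *sum += value;
    }
    return i;
}
```

Asking for 11 elements of a 10-element array stops after 10 reads instead of touching memory past the end.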
869
870@node Libtcc
871@chapter The @code{libtcc} library
872
873The @code{libtcc} library enables you to use TCC as a backend for
874dynamic code generation.
875
876Read @file{libtcc.h} for an overview of the API. Read
877@file{libtcc_test.c} for a very simple example.
878
879The idea is to give a C string containing the program you want
880to compile directly to @code{libtcc}. You can then access any global
881symbol (function or variable) it defines.
882
883@node devel
884@chapter Developer's guide
885
886This chapter gives some hints to understand how TCC works. You can skip
887it if you do not intend to modify the TCC code.
888
889@section File reading
890
891The @code{BufferedFile} structure contains the context needed to read a
892file, including the current line number. @code{tcc_open()} opens a new
893file and @code{tcc_close()} closes it. @code{inp()} returns the next
894character.
895
896@section Lexer
897
898@code{next()} reads the next token in the current
899file. @code{next_nomacro()} reads the next token without macro
900expansion.
901
902@code{tok} contains the current token (see the @code{TOK_xxx}
903constants). Identifiers and keywords are also tokens. @code{tokc}
904contains additional information about the token (for example a constant value
905if it is a number or string token).
906
907@section Parser
908
909The parser is hardcoded (yacc is not necessary). It does only one pass,
910except:
911
912@itemize
913
914@item For initialized arrays with unknown size, a first pass
915is done to count the number of elements.
916
917@item For architectures where arguments are evaluated in
918reverse order, a first pass is done to reverse the argument order.
919
920@end itemize
921
922@section Types
923
924The types are stored in a single 'int' variable. This representation was chosen in the
925first stages of development when tcc was much simpler. Now, it may not
926be the best solution.
927
928@example
929#define VT_INT 0 /* integer type */
930#define VT_BYTE 1 /* signed byte type */
931#define VT_SHORT 2 /* short type */
932#define VT_VOID 3 /* void type */
933#define VT_PTR 4 /* pointer */
934#define VT_ENUM 5 /* enum definition */
935#define VT_FUNC 6 /* function type */
936#define VT_STRUCT 7 /* struct/union definition */
937#define VT_FLOAT 8 /* IEEE float */
938#define VT_DOUBLE 9 /* IEEE double */
939#define VT_LDOUBLE 10 /* IEEE long double */
940#define VT_BOOL 11 /* ISOC99 boolean type */
941#define VT_LLONG 12 /* 64 bit integer */
942#define VT_LONG 13 /* long integer (NEVER USED as type, only
943 during parsing) */
944#define VT_BTYPE 0x000f /* mask for basic type */
945#define VT_UNSIGNED 0x0010 /* unsigned type */
946#define VT_ARRAY 0x0020 /* array type (also has VT_PTR) */
947#define VT_VLA 0x20000 /* VLA type (also has VT_PTR and VT_ARRAY) */
948#define VT_BITFIELD 0x0040 /* bitfield modifier */
949#define VT_CONSTANT 0x0800 /* const modifier */
950#define VT_VOLATILE 0x1000 /* volatile modifier */
951#define VT_SIGNED 0x2000 /* signed type */
952
953#define VT_STRUCT_SHIFT 18 /* structure/enum name shift (14 bits left) */
954@end example
955
956When a reference to another type is needed (for pointers, functions and
957structures), the @code{32 - VT_STRUCT_SHIFT} high order bits are used to
958store an identifier reference.
959
960The @code{VT_UNSIGNED} flag can be set for chars, shorts, ints and long
961longs.
962
963Arrays are considered as pointers @code{VT_PTR} with the flag
964@code{VT_ARRAY} set. Variable length arrays are considered as special
965arrays and have flag @code{VT_VLA} set instead of @code{VT_ARRAY}.
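
As an illustration of this encoding, the sketch below tests and builds type words using a few of the constants from the listing above. The helper functions are hypothetical and are not part of the TCC sources.

```c
/* Constants copied from the listing above. */
#define VT_INT        0
#define VT_BYTE       1
#define VT_PTR        4
#define VT_BTYPE      0x000f   /* mask for basic type */
#define VT_UNSIGNED   0x0010
#define VT_ARRAY      0x0020
#define VT_CONSTANT   0x0800

/* Basic type stored in the low four bits of the type word. */
int basic_type(int t)
{
    return t & VT_BTYPE;
}

/* Non-zero when the unsigned flag is set. */
int is_unsigned(int t)
{
    return (t & VT_UNSIGNED) != 0;
}

/* Arrays are pointers (VT_PTR) with the VT_ARRAY flag set. */
int is_array(int t)
{
    return basic_type(t) == VT_PTR && (t & VT_ARRAY) != 0;
}

/* Builds the type word for "const unsigned char". */
int make_const_uchar(void)
{
    return VT_BYTE | VT_UNSIGNED | VT_CONSTANT;
}
```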
966
967The @code{VT_BITFIELD} flag can be set for chars, shorts, ints and long
968longs. If it is set, then the bitfield position is stored from bits
969VT_STRUCT_SHIFT to VT_STRUCT_SHIFT + 5 and the bit field size is stored
970from bits VT_STRUCT_SHIFT + 6 to VT_STRUCT_SHIFT + 11.
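
The bit field layout just described amounts to two 6-bit fields above @code{VT_STRUCT_SHIFT}. A minimal sketch of packing and unpacking them follows; @code{BF_MASK} and the function names are assumptions made for this example, not TCC identifiers.

```c
#define VT_BITFIELD     0x0040
#define VT_STRUCT_SHIFT 18

/* Six bits each for position and size (hypothetical helper constant). */
#define BF_MASK 0x3f

/* Builds a type word for a bit field of bit_size bits starting at bit_pos. */
unsigned pack_bitfield(unsigned base_type, unsigned bit_pos, unsigned bit_size)
{
    return base_type | VT_BITFIELD
         | ((bit_pos  & BF_MASK) << VT_STRUCT_SHIFT)
         | ((bit_size & BF_MASK) << (VT_STRUCT_SHIFT + 6));
}

/* Position: bits VT_STRUCT_SHIFT to VT_STRUCT_SHIFT + 5. */
unsigned bitfield_pos(unsigned t)
{
    return (t >> VT_STRUCT_SHIFT) & BF_MASK;
}

/* Size: bits VT_STRUCT_SHIFT + 6 to VT_STRUCT_SHIFT + 11. */
unsigned bitfield_size(unsigned t)
{
    return (t >> (VT_STRUCT_SHIFT + 6)) & BF_MASK;
}
```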
971
972@code{VT_LONG} is never used except during parsing.
973
974During parsing, the storage of an object is also stored in the type
975integer:
976
977@example
978#define VT_EXTERN 0x00000080 /* extern definition */
979#define VT_STATIC 0x00000100 /* static variable */
980#define VT_TYPEDEF 0x00000200 /* typedef definition */
981#define VT_INLINE 0x00000400 /* inline definition */
982#define VT_IMPORT 0x00004000 /* win32: extern data imported from dll */
983#define VT_EXPORT 0x00008000 /* win32: data exported from dll */
984#define VT_WEAK 0x00010000 /* weak symbol */
985@end example
986
987@section Symbols
988
989All symbols are stored in hashed symbol stacks. Each symbol stack
990contains @code{Sym} structures.
991
992@code{Sym.v} contains the symbol name (remember
993an identifier is also a token, so a string is never necessary to store
994it). @code{Sym.t} gives the type of the symbol. @code{Sym.r} is usually
995the register in which the corresponding variable is stored. @code{Sym.c} is
996usually a constant associated to the symbol like its address for normal
997symbols, and the number of entries for symbols representing arrays.
998Variable length array types use @code{Sym.c} as a location on the stack
999which holds the runtime sizeof for the type.
1000
1001Five main symbol stacks are defined:
1002
1003@table @code
1004
1005@item define_stack
1006for the macros (@code{#define}s).
1007
1008@item global_stack
1009for the global variables, functions and types.
1010
1011@item local_stack
1012for the local variables, functions and types.
1013
1014@item global_label_stack
1015for the local labels (for @code{goto}).
1016
1017@item label_stack
1018for GCC block local labels (see the @code{__label__} keyword).
1019
1020@end table
1021
1022@code{sym_push()} is used to add a new symbol in the local symbol
1023stack. If no local symbol stack is active, it is added in the global
1024symbol stack.
1025
1026@code{sym_pop(st,b)} pops symbols from the symbol stack @var{st} until
1027the symbol @var{b} is on the top of stack. If @var{b} is NULL, the stack
1028is emptied.
1029
1030@code{sym_find(v)} returns the symbol associated with the identifier
1031@var{v}. The local stack is searched first from top to bottom, then the
1032global stack.
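
The push, pop and find behaviour described above can be sketched with plain linked lists. This is a simplified model, not the TCC implementation: it omits the hashing, keeps only two stacks, and all @code{demo_} and @code{DemoSym} names are invented for the example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-in for the Sym structure. */
typedef struct DemoSym {
    int v;                  /* identifier token, as in Sym.v */
    int t;                  /* type word, as in Sym.t */
    struct DemoSym *prev;   /* next older entry on the same stack */
} DemoSym;

DemoSym *demo_global_stack;
DemoSym *demo_local_stack;

/* Pushes a new symbol on the stack *st and returns it. */
DemoSym *demo_push(DemoSym **st, int v, int t)
{
    DemoSym *s = malloc(sizeof *s);

    if (s == NULL)
        abort();
    s->v = v;
    s->t = t;
    s->prev = *st;
    *st = s;
    return s;
}

/* Pops symbols until b is on top of the stack; b == NULL empties it. */
void demo_pop(DemoSym **st, DemoSym *b)
{
    while (*st != NULL && *st != b) {
        DemoSym *s = *st;

        *st = s->prev;
        free(s);
    }
}

/* Searches the local stack from top to bottom, then the global stack. */
DemoSym *demo_find(int v)
{
    DemoSym *s;

    for (s = demo_local_stack; s != NULL; s = s->prev)
        if (s->v == v)
            return s;
    for (s = demo_global_stack; s != NULL; s = s->prev)
        if (s->v == v)
            return s;
    return NULL;
}
```

Because the local stack is searched first, a local symbol shadows a global one with the same identifier until it is popped.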
1033
1034@section Sections
1035
1036The generated code and data are written in sections. The structure
1037@code{Section} contains all the necessary information for a given
1038section. @code{new_section()} creates a new section. ELF file semantics
1039is assumed for each section.
1040
1041The following sections are predefined:
1042
1043@table @code
1044
1045@item text_section
1046is the section containing the generated code. @var{ind} contains the
1047current position in the code section.
1048
1049@item data_section
1050contains initialized data
1051
1052@item bss_section
1053contains uninitialized data
1054
1055@item bounds_section
1056@itemx lbounds_section
1057are used when bound checking is activated
1058
1059@item stab_section
1060@itemx stabstr_section
1061are used when debugging is activated to store debug information
1062
1063@item symtab_section
1064@itemx strtab_section
1065contain the exported symbols (currently only used for debugging).
1066
1067@end table
1068
1069@section Code generation
1070@cindex code generation
1071
1072@subsection Introduction
1073
1074The TCC code generator directly generates linked binary code in one
1075pass. This is rather unusual these days (see gcc for example, which
1076generates text assembly), but it can be very fast and is surprisingly
1077uncomplicated.
1078
1079The TCC code generator is register based. Optimization is only done at
1080the expression level. No intermediate representation of expression is
1081kept except the current values stored in the @emph{value stack}.
1082
1083On x86, three temporary registers are used. When more registers are
1084needed, one register is spilled into a new temporary variable on the stack.
1085
@subsection The value stack
@cindex value stack, introduction

When an expression is parsed, its value is pushed on the value stack
(@var{vstack}). The top of the value stack is @var{vtop}. Each value
stack entry is an @code{SValue} structure.

@code{SValue.t} is the type. @code{SValue.r} indicates how the value is
currently stored in the generated code. It is usually a CPU register
index (@code{REG_xxx} constants), but additional values and flags are
defined:

@example
#define VT_CONST         0x00f0
#define VT_LLOCAL        0x00f1
#define VT_LOCAL         0x00f2
#define VT_CMP           0x00f3
#define VT_JMP           0x00f4
#define VT_JMPI          0x00f5
#define VT_LVAL          0x0100
#define VT_SYM           0x0200
#define VT_MUSTCAST      0x0400
#define VT_MUSTBOUND     0x0800
#define VT_BOUNDED       0x8000
#define VT_LVAL_BYTE     0x1000
#define VT_LVAL_SHORT    0x2000
#define VT_LVAL_UNSIGNED 0x4000
#define VT_LVAL_TYPE     (VT_LVAL_BYTE | VT_LVAL_SHORT | VT_LVAL_UNSIGNED)
@end example

@table @code

@item VT_CONST
indicates that the value is a constant. It is stored in the union
@code{SValue.c}, depending on its type.

@item VT_LOCAL
indicates a local variable pointer at offset @code{SValue.c.i} in the
stack.

@item VT_CMP
indicates that the value is actually stored in the CPU flags (i.e. the
value is the result of a test). The value is either 0 or 1. The CPU
flag actually used is indicated in @code{SValue.c.i}.

If any code is generated which destroys the CPU flags, this value MUST be
put in a normal register.

@item VT_JMP
@itemx VT_JMPI
indicates that the value is the consequence of a conditional jump. For
@code{VT_JMP}, it is 1 if the jump is taken, 0 otherwise. For
@code{VT_JMPI} it is inverted.

These values are used to compile the @code{||} and @code{&&} logical
operators.

If any code is generated, this value MUST be put in a normal
register. Otherwise, the generated code won't be executed if the jump is
taken.

@item VT_LVAL
is a flag indicating that the value is actually an lvalue (left value of
an assignment). It means that the value stored is actually a pointer to
the wanted value.

Understanding the use of @code{VT_LVAL} is very important if you want to
understand how TCC works.

@item VT_LVAL_BYTE
@itemx VT_LVAL_SHORT
@itemx VT_LVAL_UNSIGNED
if the lvalue has an integer type, then these flags give its real
type. The type alone is not enough in case of cast optimisations.

@item VT_LLOCAL
is a saved lvalue on the stack. @code{VT_LLOCAL} should be eliminated
ASAP because its semantics are rather complicated.

@item VT_MUSTCAST
indicates that a cast to the value type must be performed if the value
is used (lazy casting).

@item VT_SYM
indicates that the symbol @code{SValue.sym} must be added to the constant.

@item VT_MUSTBOUND
@itemx VT_BOUNDED
are only used for optional bound checking.

@end table

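The way a storage field such as @code{SValue.r} combines a storage kind
with flags can be sketched as follows. The constants are copied from the
listing above. The mask @code{TOY_VALMASK} and the helper functions are
hypothetical, and the sketch assumes that the low byte selects the
storage kind, that the higher bits are independent flags, and that every
value below @code{VT_CONST} is a register index.

```c
#include <assert.h>

/* Values copied from the listing above. */
#define VT_CONST  0x00f0
#define VT_LLOCAL 0x00f1
#define VT_LOCAL  0x00f2
#define VT_CMP    0x00f3
#define VT_JMP    0x00f4
#define VT_JMPI   0x00f5
#define VT_LVAL   0x0100
#define VT_SYM    0x0200

/* Assumption: the low byte of r selects the storage kind (a register
   index or one of the VT_xxx values), the higher bits are flags. */
#define TOY_VALMASK 0x00ff

/* Storage kind with all flags stripped (hypothetical helper). */
static int toy_storage(int r)
{
    assert(r >= 0);
    return r & TOY_VALMASK;
}

/* Assumption: every storage kind below VT_CONST is a register index. */
static int toy_in_register(int r)
{
    return toy_storage(r) < VT_CONST;
}

/* True when the stored value is a pointer to the wanted value. */
static int toy_is_lvalue(int r)
{
    return (r & VT_LVAL) != 0;
}

/* True when SValue.sym must be added to the constant. */
static int toy_needs_symbol(int r)
{
    return (r & VT_SYM) != 0;
}
```

Under these assumptions, a local variable used as an lvalue would be
described by @code{VT_LOCAL | VT_LVAL}, and the address of a global
symbol by @code{VT_CONST | VT_SYM}.
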
@subsection Manipulating the value stack
@cindex value stack

@code{vsetc()} and @code{vset()} push a new value on the value
stack. If the previous @var{vtop} was stored in a very unsafe place (for
example in the CPU flags), then some code is generated to put the
previous @var{vtop} in safe storage.

@code{vpop()} pops @var{vtop}. In some cases, it also generates cleanup
code (for example if stacked floating point registers are used, as on
x86).

The @code{gv(rc)} function generates code to evaluate @var{vtop} (the
top value of the stack) into registers. @var{rc} selects the register
class in which the value should be put. @code{gv()} is the @emph{most
important function} of the code generator.

@code{gv2()} is the same as @code{gv()} but for the top two stack
entries.

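The behaviour described above can be imitated with a toy value stack
that only counts the instructions it would generate. Everything in this
sketch is hypothetical (the @code{toy_} names, the three storage kinds,
the absence of register classes and of spilling). It only aims to show
why pushing a new value forces a value held in the CPU flags into a
register.

```c
#include <assert.h>

#define TOY_NB_REGS 3          /* three temporaries, as mentioned for x86 */
#define TOY_VSTACK_SIZE 16

enum ToyLoc { TOY_CONST, TOY_FLAGS, TOY_REG };

typedef struct ToyValue {
    enum ToyLoc loc;  /* where the value currently lives */
    int c;            /* constant, flag id, or register index */
} ToyValue;

static ToyValue toy_vstack[TOY_VSTACK_SIZE];
static int toy_vtop = -1;      /* index of the top entry, -1 when empty */
static int toy_emitted;        /* count of pretend instructions */

/* Find a register that no stack entry currently uses. */
static int toy_get_reg(void)
{
    int r, i;
    for (r = 0; r < TOY_NB_REGS; r++) {
        int used = 0;
        for (i = 0; i <= toy_vtop; i++)
            if (toy_vstack[i].loc == TOY_REG && toy_vstack[i].c == r)
                used = 1;
        if (!used)
            return r;
    }
    assert(0 && "out of registers: a real generator would spill here");
    return -1;
}

/* Make sure entry i lives in a register and return that register. */
static int toy_gv_at(int i)
{
    if (toy_vstack[i].loc != TOY_REG) {
        int r = toy_get_reg();
        toy_vstack[i].loc = TOY_REG;
        toy_vstack[i].c = r;
        toy_emitted++;         /* pretend a load was generated */
    }
    return toy_vstack[i].c;
}

/* Evaluate the top of the stack into a register. */
static int toy_gv(void)
{
    assert(toy_vtop >= 0);
    return toy_gv_at(toy_vtop);
}

/* Push a value; first rescue a previous top that lives in the flags. */
static void toy_vset(enum ToyLoc loc, int c)
{
    if (toy_vtop >= 0 && toy_vstack[toy_vtop].loc == TOY_FLAGS)
        toy_gv_at(toy_vtop);
    assert(toy_vtop + 1 < TOY_VSTACK_SIZE);
    toy_vtop++;
    toy_vstack[toy_vtop].loc = loc;
    toy_vstack[toy_vtop].c = c;
}

static void toy_vpop(void)
{
    assert(toy_vtop >= 0);
    toy_vtop--;
}
```

Calling @code{toy_gv()} twice in a row generates code only once, because
the second call finds the value already in a register.
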
@subsection CPU dependent code generation
@cindex CPU dependent
See the @file{i386-gen.c} file for an example.

@table @code

@item load()
must generate the code needed to load a stack value into a register.

@item store()
must generate the code needed to store a register into a stack value
lvalue.

@item gfunc_start()
@itemx gfunc_param()
@itemx gfunc_call()
should generate a function call.

@item gfunc_prolog()
@itemx gfunc_epilog()
should generate a function prolog/epilog.

@item gen_opi(op)
must generate the binary integer operation @var{op} on the two top
entries of the stack, which are guaranteed to contain integer types.

The result value should be put on the stack.

@item gen_opf(op)
same as @code{gen_opi()} for floating point operations. The two top
entries of the stack are guaranteed to contain floating point values of
the same type.

@item gen_cvt_itof()
integer to floating point conversion.

@item gen_cvt_ftoi()
floating point to integer conversion.

@item gen_cvt_ftof()
conversion between floating point types of different sizes.

@item gen_bounded_ptr_add()
@itemx gen_bounded_ptr_deref()
are only used for bounds checking.

@end table

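To give an idea of what a binary integer operation hook has to do, here
is a minimal sketch for 32-bit x86 that handles only @code{+} and
@code{-} on operands already held in registers. The @code{toy_} names,
the operand stack of register numbers and the code buffer are
hypothetical. The two-byte encodings are the standard register-to-register
forms of the x86 @code{add} and @code{sub} instructions.

```c
#include <assert.h>
#include <stddef.h>

static unsigned char toy_code[64];
static size_t toy_ind;          /* current position in the code buffer */

/* Output one byte of machine code. */
static void toy_o(unsigned char b)
{
    assert(toy_ind < sizeof toy_code);
    toy_code[toy_ind++] = b;
}

/* Emit "op dst, src" for two 32-bit registers (0 = eax, 1 = ecx,
   2 = edx). opc is the x86 ALU operation number: 0 = add, 5 = sub. */
static void toy_gen_alu_rr(int opc, int dst, int src)
{
    toy_o((unsigned char)((opc << 3) | 0x01));        /* op r/m32, r32 */
    toy_o((unsigned char)(0xc0 | (src << 3) | dst));  /* ModRM, register form */
}

/* Toy operand stack: each entry is the register holding a value. */
static int toy_regs[8];
static int toy_top = -1;

/* Pop two register operands, emit the operation, and leave the result
   on the stack in the register of the first operand. */
static void toy_gen_opi(int op)
{
    int src, dst;
    assert(op == '+' || op == '-');
    assert(toy_top >= 1);
    src = toy_regs[toy_top--];
    dst = toy_regs[toy_top];
    toy_gen_alu_rr(op == '-' ? 5 : 0, dst, src);
}
```

With eax and ecx on the toy stack, @code{toy_gen_opi('+')} emits the
bytes @code{01 c8}, which encode @code{add eax, ecx}.
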
@section Optimizations done
@cindex optimizations
@cindex constant propagation
@cindex strength reduction
@cindex comparison operators
@cindex caching processor flags
@cindex flags, caching
@cindex jump optimization
Constant propagation is done for all operations. Multiplications and
divisions are optimized to shifts when appropriate. Comparison
operators are optimized by maintaining a special cache for the
processor flags. @code{&&}, @code{||} and @code{!} are optimized by
maintaining a special 'jump target' value. No other jump optimization is
currently performed because it would require storing the code in a more
abstract fashion.

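The first two optimizations can be illustrated on a multiplication. The
sketch below decides, for two operands, whether the product can be
computed at compile time, replaced by a left shift, or must be generated
as a real multiplication. The types and the function
@code{toy_plan_mul()} are hypothetical and only cover this one operator;
they do not reproduce the decisions TCC actually takes.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ToyOperand {
    int is_const;   /* nonzero when the value is known at compile time */
    int value;      /* the constant, meaningful only if is_const */
} ToyOperand;

typedef enum { TOY_FOLDED, TOY_SHIFT, TOY_MUL } ToyPlan;

/* Return log2(n) if n is a power of two, else -1. */
static int toy_log2_exact(unsigned n)
{
    int k = 0;
    if (n == 0 || (n & (n - 1)) != 0)
        return -1;
    while ((n >>= 1) != 0)
        k++;
    return k;
}

/* Decide how to compile a * b.
   TOY_FOLDED: *out is the product, no code is needed.
   TOY_SHIFT:  *out is the shift count to apply to the non-constant a.
   TOY_MUL:    a real multiplication is needed, *out is unused. */
static ToyPlan toy_plan_mul(ToyOperand a, ToyOperand b, int *out)
{
    assert(out != NULL);
    if (a.is_const && b.is_const) {
        /* constant propagation: unsigned math avoids signed overflow */
        *out = (int)((unsigned)a.value * (unsigned)b.value);
        return TOY_FOLDED;
    }
    if (b.is_const && b.value > 0) {
        /* strength reduction: x * 2^k becomes x << k */
        int k = toy_log2_exact((unsigned)b.value);
        if (k >= 0) {
            *out = k;
            return TOY_SHIFT;
        }
    }
    return TOY_MUL;
}
```

For example, a variable multiplied by the constant 8 yields
@code{TOY_SHIFT} with a shift count of 3.
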
@unnumbered Concept Index
@printindex cp

@bye

@c Local variables:
@c fill-column: 78
@c texinfo-column-for-description: 32
@c End: