LinuxSir.cn, the Linuxsir that travels through time!

Original poster: libra_kevin

How do I link C and C++ object files together?

Posted 2006-4-17 15:52:01
Let's check the generated assembly:
1. First add extern "C" to me.h and compare the names that gcc and g++ produce (they turn out to be identical)

/*
 * me.h
 */
#include <stdio.h>

extern "C" void CppPrintf(void);

[root@root GCC]# gcc -S me.cpp
[root@root GCC]# less me.s
        ...
        subl    $8, %esp
        movl    12(%ebp), %eax
        decl    %eax
        movl    %eax, -12(%ebp)
...skipping...
.globl CppPrintf           // note the name of this function
        .type   CppPrintf, @function
CppPrintf:
.LFB1442:
        pushl   %ebp
.LCFI4:
        movl    %esp, %ebp
        ...
[root@root GCC]# g++ -S me.cpp
[root@root GCC]# less me.s
        ...
        movl    12(%ebp), %eax
        decl    %eax
        movl    %eax, -12(%ebp)
...skipping...
.globl CppPrintf           // note the name of this function
        .type   CppPrintf, @function
CppPrintf:
.LFB1442:
        pushl   %ebp
.LCFI4:
        movl    %esp, %ebp
.LCFI5:
        subl    $8, %esp
        ...


2. Then remove extern "C" from me.h and compare the names that gcc and g++ produce (again they are identical)

[root@root GCC]# gcc -S me.cpp
[root@root GCC]# less me.s
        ...
        subl    $8, %esp
        movl    12(%ebp), %eax
        decl    %eax
        movl    %eax, -12(%ebp)
...skipping...
.globl _Z9CppPrintfv       // note the name of this function
        .type   _Z9CppPrintfv, @function
_Z9CppPrintfv:
.LFB1442:
        pushl   %ebp
.LCFI4:
        movl    %esp, %ebp
        ...
[root@root GCC]# g++ -S me.cpp
[root@root GCC]# less me.s
        ...
        subl    $8, %esp
        movl    12(%ebp), %eax
        decl    %eax
        movl    %eax, -12(%ebp)
...skipping...
.globl _Z9CppPrintfv       // note the name of this function
        .type   _Z9CppPrintfv, @function
_Z9CppPrintfv:
.LFB1442:
        pushl   %ebp
.LCFI4:
        ...

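Summary: with extern "C", gcc and g++ produce the same name, following the C naming convention.
   Without it, gcc and g++ again agree, following the C++ naming convention.
Corrections welcome.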

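Posted 2006-4-18 13:30:47
The purpose of extern "C" is to stop the C++ compiler from mangling a function name the C++ way, so that the resulting object file can be linked with object files produced by a C compiler.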
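
The difference between the two linkages can be sketched in a single C++ file. This is a minimal illustration, not the original me.cpp: the functions return int so they can be checked, CppPrintfMangled is a hypothetical name, and the symbol names in the comments are what the Itanium-style scheme used by GCC 3.x and later would be expected to produce (the code itself does not verify them).

```cpp
// C linkage: the symbol is expected to be plain "CppPrintf",
// as in the first assembly listing above.
extern "C" int CppPrintf(void) { return 1; }

// C++ linkage: parameter types are encoded in the symbol,
// so overloads end up with distinct names.
int CppPrintfMangled(void) { return 2; }  // expected symbol: _Z16CppPrintfMangledv
int CppPrintfMangled(int)  { return 3; }  // expected symbol: _Z16CppPrintfMangledi

// A second definition such as `extern "C" int CppPrintf(int)` would be
// rejected by the compiler: with C linkage both functions would need the
// same unmangled symbol, so extern "C" functions cannot be overloaded.
```

Calling CppPrintf(), CppPrintfMangled() and CppPrintfMangled(0) from C++ works the same way in all three cases; only the names seen by the linker differ.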

Original poster | Posted 2006-4-18 14:03:25
Thanks, everyone! I learned a lot this time.

Posted 2006-4-18 14:07:29
Post by sybaselu

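Summary: with extern "C", gcc and g++ produce the same name, following the C naming convention.
   Without it, gcc and g++ again agree, following the C++ naming convention.
Corrections welcome.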

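What I said earlier about the difference between g++ and gcc was mainly about the __cplusplus macro:
g++ compiles the source as C++ and defines this macro; gcc compiling a C source file does not define it.

Writing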
#ifdef __cplusplus
extern "C" {
#endif
       int printf(const char *format, ...);
#ifdef __cplusplus
}
#endif
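has the following effect:
Compiled with gcc (as C): extern "C" is not seen, and the symbol name in the object file is left unchanged.
Compiled with g++ (as C++): extern "C" is seen, so the symbol name is likewise left unmangled.

Do not add extern "C" to a function declaration unconditionally, because a C compiler does not accept it.
Wrap it in #ifdef __cplusplus ... #endif instead.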

The glibc headers all do this.
For example, near the top of /usr/include/stdio.h in glibc-2.4
there is a
__BEGIN_DECLS
This macro is defined in /usr/include/sys/cdefs.h:
#ifdef  __cplusplus
# define __BEGIN_DECLS  extern "C" {
# define __END_DECLS    }
#else
# define __BEGIN_DECLS
# define __END_DECLS
#endif
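
The same pattern can be applied to one's own headers. The sketch below is only an illustration: MY_BEGIN_DECLS, MY_END_DECLS and my_add are hypothetical names, and the header and implementation parts are shown in one file for brevity.

```cpp
// Hypothetical macros modelled on glibc's __BEGIN_DECLS / __END_DECLS.
#ifdef __cplusplus
# define MY_BEGIN_DECLS extern "C" {
# define MY_END_DECLS   }
#else
# define MY_BEGIN_DECLS
# define MY_END_DECLS
#endif

// --- part that would live in a shared header (hypothetical my_add.h) ---
MY_BEGIN_DECLS
int my_add(int a, int b);
MY_END_DECLS

// --- part that would live in the implementation file ---
// When compiled as C++, this definition keeps the C linkage given by the
// declaration above, so the symbol is expected to be plain "my_add".
// When compiled as C, the macros expand to nothing and nothing changes.
int my_add(int a, int b) { return a + b; }
```

With this layout the header can be included unchanged from both .c and .cpp files, and a call such as my_add(2, 3) resolves to the same symbol from either language.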

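Posted 2006-4-19 16:40:43
extern "C"
tells the compiler that the compiled library must be compatible with C.
Without it, the compiled library and the intermediate (object) files cannot be used by a C compiler.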

Posted 2006-4-19 19:40:47
Post by darkise
extern "C"
tells the compiler that the compiled library must be compatible with C

https://secure.wikimedia.org/wikipedia/en/wiki/Name_mangling
Name mangling in C++

C++ compilers are the most widespread, and yet least standard, users of name mangling. The first C++ compilers were implemented as translators to C source code, which would then be compiled by a C compiler to object code; because of this, symbol names had to conform to C identifier rules. Even later, with the emergence of compilers which produced machine code or assembler directly, the system's linker generally did not support C++ symbols, and mangling was still required.

The C++ language does not define a standard decoration scheme, so each compiler uses its own. Combined with the fact that C++ decoration can become fairly complex (storing information about classes, default arguments, variable ownership, operator overloading, etc.), this means that object code produced by different compilers is not usually linkable.

Simple example

Consider the following two definitions of f() in a C++ program:

int f (void) { return 1; }
int f (int)  { return 0; }
void g (void) { int i = f(), j = f(0); }

These are distinct functions, with no relation to each other apart from the name. If they were naïvely translated into C with no changes, the result would be an error — C does not permit two functions with the same name. The compiler therefore will encode the type information in the symbol name, the result being something resembling:

int __f_v (void) { return 1; }
int __f_i (int)  { return 0; }
void __g_v (void) { int i = __f_v(), j = __f_i(0); }

Notice that g() is mangled even though there is no conflict; name mangling applies to all symbols.

Complex example

For a more complex example, we'll consider an example of a real-world name mangling implementation: that used by GNU GCC 3.x, and how it mangles the following example class. The mangled symbol is shown below the respective identifier name.

namespace wikipedia {
    class article {
    public:
       std::string format (void);
                /* = _ZN9wikipedia7article6formatEv */

       bool print_to (std::ostream&);
                /* = _ZN9wikipedia7article8print_toERSo */

       class wikilink {
       public:
           wikilink (std::string const& name);
                    /* = _ZN9wikipedia7article8wikilinkC1ERKSs */
       };
    };
}

The name mangling scheme used here is relatively simple. All mangled symbols begin with _Z (note that an underscore followed by a capital is a reserved identifier in C and C++, so conflict with user identifiers is avoided); for nested names (including both namespaces and classes), this is followed by N, then a series of <length,id> pairs (the length being the length of the next identifier), and finally E. For example, wikipedia::article::format becomes

_ZN·9wikipedia·7article·6format·E  

For functions, this is then followed by the parameter type information; as format() takes no parameters (void), this is simply v; hence:

_ZN·9wikipedia·7article·6format·E·v

For print_to, a standard type std::ostream (or more properly std::basic_ostream<char, std::char_traits<char> >) is used, which has the special alias So; a reference to this type is therefore RSo, with the complete name for the function being:

_ZN·9wikipedia·7article·8print_to·E·RSo
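
The <length,id> construction described above can be sketched as a small helper. This is a simplified illustration only: mangle_nested is a hypothetical function, it takes the parameter type codes (such as v or RSo) ready-made, and it ignores substitutions, templates and qualifiers that a real compiler has to handle.

```cpp
#include <string>
#include <vector>

// Builds a mangled name for a nested name in the style described above:
//   "_Z" + "N" + <length><id> for every component + "E" + parameter codes.
std::string mangle_nested(const std::vector<std::string>& parts,
                          const std::string& param_codes) {
    std::string out = "_ZN";
    for (const std::string& id : parts) {
        out += std::to_string(id.size());  // length of the next identifier
        out += id;                         // the identifier itself
    }
    out += "E";          // end of the nested name
    out += param_codes;  // e.g. "v" for (void), "RSo" for (std::ostream&)
    return out;
}
```

For the class above, mangle_nested({"wikipedia", "article", "format"}, "v") yields _ZN9wikipedia7article6formatEv, and the print_to case with "RSo" yields _ZN9wikipedia7article8print_toERSo, matching the comments in the example.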


How different compilers mangle the same functions

There isn't a standard scheme by which even trivial C++ identifiers are mangled, and consequently different compiler vendors (or even different versions of the same compiler, or the same compiler on different platforms) mangle public symbols in radically different (and thus totally incompatible) ways. Consider how different C++ compilers mangle the same functions:
Compiler                      void h(int)          void h(int, char)     void h(void)
GNU GCC 3.x                   _Z1hi                _Z1hic                _Z1hv
GNU GCC 2.9x                  h__Fi                h__Fic                h__Fv
Intel C++ 8.0 for Linux       _Z1hi                _Z1hic                _Z1hv
Microsoft VC++ v6/v7          ?h@@YAXH@Z           ?h@@YAXHD@Z           ?h@@YAXXZ
Borland C++ v3.1              @h$qi                @h$qizc               @h$qv
OpenVMS C++ V6.5 (ARM mode)   H__XI                H__XIC                H__XV
OpenVMS C++ V6.5 (ANSI mode)  CXX$__7H__FI0ARG51T  CXX$__7H__FIC26CDH77  CXX$__7H__FV2CB06E8
OpenVMS C++ X7.1 IA-64        CXX$_Z1HI2DSQ26A     CXX$_Z1HIC2NP3LI4     CXX$_Z1HV0BCA19V
Digital Mars C++              ?h@@YAXH@Z           ?h@@YAXHD@Z           ?h@@YAXXZ
SunPro CC                     __1cBh6Fi_v_         __1cBh6Fic_v_         __1cBh6F_v_
HP aC++ A.05.55 IA-64         _Z1hi                _Z1hic                _Z1hv
HP aC++ A.03.45 PA-RISC       h__Fi                h__Fic                h__Fv
Tru64 C++ V6.5 (ARM mode)     h__Xi                h__Xic                h__Xv
Tru64 C++ V6.5 (ANSI mode)    __7h__Fi             __7h__Fic             __7h__Fv

Notes:

    * The Compaq C++ compiler on OpenVMS VAX and Alpha (but not IA-64) and Tru64 has two name mangling schemes. The original, pre-standard scheme is known as the ARM model, and is based on the name mangling described in the C++ Annotated Reference Manual (ARM). With the advent of new features in standard C++, particularly templates, the ARM scheme became more and more unsuitable: it could not encode certain function types, or produced identical mangled names for different functions. It was therefore replaced by the newer "ANSI" model, which supported all ANSI template features, but was not backwards compatible.
    * On IA-64, a standard ABI exists (see external links), which defines (among other things) a standard name-mangling scheme, and which is used by all the IA-64 compilers. GNU GCC 3.x, in addition, has adopted the name mangling scheme defined in this standard for use on other, non-Intel platforms.
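
Names in this standard scheme can be turned back into readable declarations at run time. The sketch below assumes a GCC or Clang toolchain that ships <cxxabi.h> with abi::__cxa_demangle; demangle is a hypothetical wrapper name, and the sketch does not apply to the other schemes in the table.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC and Clang toolchains)
#include <cstdlib>   // std::free
#include <string>

// Returns the readable form of a mangled symbol in the standard ABI scheme,
// or the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer the function allocates the result with malloc.
    char* buf = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || buf == nullptr) {
        return mangled;
    }
    std::string result(buf);
    std::free(buf);  // the caller owns the malloc'ed buffer
    return result;
}
```

Applied to the GNU GCC 3.x row of the table, demangle("_Z1hi") gives h(int) and demangle("_Z1hic") gives h(int, char).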



Handling of C symbols when linking from C++

The job of the common C++ idiom:

#ifdef __cplusplus
extern "C" {
#endif
    /* ... */
#ifdef __cplusplus
}
#endif

is to ensure that the symbols following are "unmangled" - that the compiler emits a binary file with their names undecorated, as a C compiler would do. As C language definitions are unmangled, the C++ compiler needs to avoid mangling references to these identifiers.

For example, the standard string library header, <string.h>, usually contains something resembling:

#ifdef __cplusplus
extern "C" {
#endif

void *memset (void *, int, size_t);
char *strcat (char *, const char *);
int   strcmp (const char *, const char *);
char *strcpy (char *, const char *);

#ifdef __cplusplus
}
#endif

Thus, code such as:

if (strcmp(argv[1], "-x") == 0)
    strcpy(a, argv[2]);
else
    memset(a, 0, sizeof(a));

uses the correct, unmangled strcmp and memset. If the extern had not been used, the C++ compiler would produce code equivalent to:

if (__1cGstrcmp6Fpkc1_i_(argv[1], "-x") == 0)
    __1cGstrcpy6Fpcpkc_0_(a, argv[2]);
else
    __1cGmemset6FpviI_0_(a, 0, sizeof(a));

Since those symbols do not exist in the C runtime library (e.g. libc), link errors would result.



Standardised name mangling in C++

While it is a relatively common belief that standardised name mangling in the C++ language would lead to greater interoperability between implementations, this is not really the case. Name mangling is only one of several ABI issues in a C++ implementation, and other details such as exception handling, virtual table layout and structure padding would still render differing implementations incompatible. Further, requiring a particular form of mangling would cause issues for systems where implementation limits (e.g. length of symbols) dictate a particular mangling scheme. A standardised requirement for name mangling would also prevent an implementation where mangling was not required at all, for example a linker which understood the C++ language.

The C++ standard therefore does not attempt to standardise name mangling. On the contrary, the Annotated C++ Reference Manual (also known as ARM, ISBN 0-201-51459-1, section 7.2.1c) actively encourages the use of different mangling schemes to prevent linking when other aspects of the ABI, such as exception handling and virtual table layout, are incompatible.

Real-world effects of C++ name mangling

As C++ symbols are routinely exported from DLL and shared object files, the name mangling scheme is not merely a compiler-internal matter. Different compilers (or different versions of the same compiler, in many cases) produce such binaries under different name decoration schemes, meaning that symbols are frequently unresolved if the compilers used to create the library and the program using it employed different schemes. For example, if a system with multiple C++ compilers installed (e.g. GNU GCC and the OS vendor's compiler) wished to install the Boost library, it would have to be compiled twice — once for the vendor compiler and once for GCC.

For this reason, name decoration is an important aspect of any C++-related ABI.

Original poster | Posted 2006-4-19 21:58:59
So how can I see the program after mangling?
Like in this passage:
Thus, code such as:

if (strcmp(argv[1], "-x") == 0)
    strcpy(a, argv[2]);
else
    memset(a, 0, sizeof(a));

uses the correct, unmangled strcmp and memset. If the extern had not been used, the C++ compiler would produce code equivalent to:

if (__1cGstrcmp6Fpkc1_i_(argv[1], "-x") == 0)
    __1cGstrcpy6Fpcpkc_0_(a, argv[2]);
else
    __1cGmemset6FpviI_0_(a, 0, sizeof(a));

Since those symbols do not exist in the C runtime library (e.g. libc), link errors would result.

Posted 2006-4-19 23:03:20
Compile to an object file, for example mangle.o,
then run readelf -s mangle.o