LEFT | RIGHT |
(no file at all) | |
1 // Copyright 2009 The Go Authors. All rights reserved. | 1 // Copyright 2009 The Go Authors. All rights reserved. |
2 // Use of this source code is governed by a BSD-style | 2 // Use of this source code is governed by a BSD-style |
3 // license that can be found in the LICENSE file. | 3 // license that can be found in the LICENSE file. |
4 | 4 |
5 /* | 5 /* |
6 | 6 |
7 Cgo enables the creation of Go packages that call C code. | 7 Cgo enables the creation of Go packages that call C code. |
8 | 8 |
9 Usage: | 9 Usage: |
10 go tool cgo [compiler options] file.go | 10 go tool cgo [compiler options] file.go |
(...skipping 116 matching lines...)
127 The standard package construction rules of the go command | 127 The standard package construction rules of the go command |
128 automate the process of using cgo. See $GOROOT/misc/cgo/stdio | 128 automate the process of using cgo. See $GOROOT/misc/cgo/stdio |
129 and $GOROOT/misc/cgo/gmp for examples. | 129 and $GOROOT/misc/cgo/gmp for examples. |
130 | 130 |
131 Cgo does not yet work with gccgo. | 131 Cgo does not yet work with gccgo. |
132 | 132 |
133 See "C? Go? Cgo!" for an introduction to using cgo: | 133 See "C? Go? Cgo!" for an introduction to using cgo: |
134 http://golang.org/doc/articles/c_go_cgo.html | 134 http://golang.org/doc/articles/c_go_cgo.html |
135 */ | 135 */ |
136 package main | 136 package main |
| 137 |
| 138 /* |
| 139 Implementation details. |
| 140 |
| 141 Cgo provides a way for Go programs to call C code linked into the same |
| 142 address space. This comment explains the operation of cgo. |
| 143 |
| 144 Cgo reads a set of Go source files and looks for statements saying |
| 145 import "C". If the import has a doc comment, that comment is |
| 146 taken as literal C code to be used as a preamble to any C code |
| 147 generated by cgo. A typical preamble #includes necessary definitions: |
| 148 |
| 149 // #include <stdio.h> |
| 150 import "C" |
| 151 |
| 152 For more details about the usage of cgo, see the documentation |
| 153 comment at the top of this file. |
| 154 |
| 155 Understanding C |
| 156 |
| 157 Cgo scans the Go source files that import "C" for uses of that |
| 158 package, such as C.puts. It collects all such identifiers. The next |
| 159 step is to determine the kind of each name. In C.xxx the xxx might refer |
| 160 to a type, a function, a constant, or a global variable. Cgo must |
| 161 decide which. |
| 162 |
| 163 The obvious thing for cgo to do is to process the preamble, expanding |
| 164 #includes and processing the corresponding C code. That would require |
| 165 a full C parser and type checker that was also aware of any extensions |
| 166 known to the system compiler (for example, all the GNU C extensions) as |
| 167 well as the system-specific header locations and system-specific |
| 168 pre-#defined macros. This is certainly possible to do, but it is an |
| 169 enormous amount of work. |
| 170 |
| 171 Cgo takes a different approach. It determines the meaning of C |
| 172 identifiers not by parsing C code but by feeding carefully constructed |
| 173 programs into the system C compiler and interpreting the generated |
| 174 error messages, debug information, and object files. In practice, |
| 175 parsing these is significantly less work and more robust than parsing |
| 176 C source. |
| 177 |
| 178 Cgo first invokes gcc -E -dM on the preamble, in order to find out |
| 179 about simple #defines for constants and the like. These are recorded |
| 180 for later use. |
| 181 |
| 182 Next, cgo needs to identify the kind of each identifier. For the |
| 183 identifiers C.foo and C.bar, cgo generates this C program: |
| 184 |
| 185 <preamble> |
| 186 void __cgo__f__(void) { |
| 187 #line 1 "cgo-test" |
| 188 foo; |
| 189 enum { _cgo_enum_0 = foo }; |
| 190 bar; |
| 191 enum { _cgo_enum_1 = bar }; |
| 192 } |
| 193 |
| 194 This program will not compile, but cgo can look at the error messages |
| 195 to infer the kind of each identifier. The line number given in the |
| 196 error tells cgo which identifier is involved. |
| 197 |
| 198 An error like "unexpected type name" or "useless type name in empty |
| 199 declaration" or "declaration does not declare anything" tells cgo that |
| 200 the identifier is a type. |
| 201 |
| 202 An error like "statement with no effect" or "expression result unused" |
| 203 tells cgo that the identifier is not a type, but not whether it is a |
| 204 constant, function, or global variable. |
| 205 |
| 206 An error like "not an integer constant" tells cgo that the identifier |
| 207 is not a constant. If it is also not a type, it must be a function or |
| 208 global variable. For now, those can be treated the same. |
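The three rules above amount to a small decision procedure. The sketch below is one possible rendering, not cgo's implementation: the function classify and its string results are hypothetical, the matching is a plain substring test, and treating "not a type and no not-a-constant error" as a constant is an inference by elimination from the text.

```go
package main

import (
	"fmt"
	"strings"
)

// classify infers the kind of one identifier from the compiler error
// messages reported on its two probe lines. Hypothetical helper.
func classify(msgs []string) string {
	isType, notType, notConst := false, false, false
	for _, m := range msgs {
		switch {
		case strings.Contains(m, "unexpected type name"),
			strings.Contains(m, "useless type name in empty declaration"),
			strings.Contains(m, "declaration does not declare anything"):
			isType = true
		case strings.Contains(m, "statement with no effect"),
			strings.Contains(m, "expression result unused"):
			notType = true
		case strings.Contains(m, "not an integer constant"):
			notConst = true
		}
	}
	switch {
	case isType:
		return "type"
	case notType && notConst:
		return "func or var" // treated the same for now
	case notType:
		return "const" // assumption: constant by elimination
	}
	return "unknown"
}

func main() {
	fmt.Println(classify([]string{
		"cgo-test:1: warning: statement with no effect",
		"cgo-test:2: error: enumerator value is not an integer constant",
	}))
}
```

The message texts in main are illustrative; real compiler output varies by compiler and version, which is why several alternative phrasings are matched for each rule.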
| 209 |
| 210 Next, cgo must learn the details of each type, variable, function, or |
| 211 constant. It can do this by reading object files. If cgo has decided |
| 212 that t1 is a type, v2 and v3 are variables or functions, and c4, c5, |
| 213 and c6 are constants, it generates: |
| 214 |
| 215 <preamble> |
| 216 typeof(t1) *__cgo__1; |
| 217 typeof(v2) *__cgo__2; |
| 218 typeof(v3) *__cgo__3; |
| 219 typeof(c4) *__cgo__4; |
| 220 enum { __cgo_enum__4 = c4 }; |
| 221 typeof(c5) *__cgo__5; |
| 222 enum { __cgo_enum__5 = c5 }; |
| 223 typeof(c6) *__cgo__6; |
| 224 enum { __cgo_enum__6 = c6 }; |
| 225 |
| 226 long long __cgo_debug_data[] = { |
| 227 0, // t1 |
| 228 0, // v2 |
| 229 0, // v3 |
| 230 c4, |
| 231 c5, |
| 232 c6, |
| 233 1 |
| 234 }; |
| 235 |
| 236 and again invokes the system C compiler, to produce an object file |
| 237 containing debug information. Cgo parses the DWARF debug information |
| 238 for __cgo__N to learn the type of each identifier. (The types also |
| 239 distinguish functions from global variables.) If using a standard gcc, |
| 240 cgo can parse the DWARF debug information for the __cgo_enum__N to |
| 241 learn the identifier's value. The LLVM-based gcc on OS X emits |
| 242 incomplete DWARF information for enums; in that case cgo reads the |
| 243 constant values from the __cgo_debug_data array in the object file's data |
| 244 segment. |
| 245 |
| 246 At this point cgo knows the meaning of each C.xxx well enough to start |
| 247 the translation process. |
| 248 |
| 249 Translating Go |
| 250 |
| 251 [The rest of this comment refers to 6g and 6c, the Go and C compilers |
| 252 that are part of the amd64 port of the gc Go toolchain. Everything here |
| 253 applies to another architecture's compilers as well.] |
| 254 |
| 255 Given the input Go files x.go and y.go, cgo generates these source |
| 256 files: |
| 257 |
| 258 x.cgo1.go # for 6g |
| 259 y.cgo1.go # for 6g |
| 260 _cgo_gotypes.go # for 6g |
| 261 _cgo_defun.c # for 6c |
| 262 x.cgo2.c # for gcc |
| 263 y.cgo2.c # for gcc |
| 264 _cgo_export.c # for gcc |
| 265 _cgo_main.c # for gcc |
| 266 |
| 267 The file x.cgo1.go is a copy of x.go with the import "C" removed and |
| 268 references to C.xxx replaced with names like _Cfunc_xxx or _Ctype_xxx. |
| 269 The definitions of those identifiers, written as Go functions, types, |
| 270 or variables, are provided in _cgo_gotypes.go. |
| 271 |
| 272 Here is a _cgo_gotypes.go containing definitions for C.flush (provided |
| 273 in the preamble) and C.puts (from stdio): |
| 274 |
| 275 type _Ctype_char int8 |
| 276 type _Ctype_int int32 |
| 277 type _Ctype_void [0]byte |
| 278 |
| 279 func _Cfunc_CString(string) *_Ctype_char |
| 280 func _Cfunc_flush() _Ctype_void |
| 281 func _Cfunc_puts(*_Ctype_char) _Ctype_int |
| 282 |
| 283 For functions, cgo only writes an external declaration in the Go |
| 284 output. The implementation is in a combination of C for 6c (meaning |
| 285 any gc-toolchain compiler) and C for gcc. |
| 286 |
| 287 The 6c file contains the definitions of the functions. They all have |
| 288 similar bodies that invoke runtime·cgocall to make a switch from the |
| 289 Go runtime world to the system C (GCC-based) world. |
| 290 |
| 291 For example, here is the definition of _Cfunc_puts: |
| 292 |
| 293 void _cgo_be59f0f25121_Cfunc_puts(void*); |
| 294 |
| 295 void |
| 296 ·_Cfunc_puts(struct{uint8 x[1];}p) |
| 297 { |
| 298 runtime·cgocall(_cgo_be59f0f25121_Cfunc_puts, &p); |
| 299 } |
| 300 |
| 301 The hexadecimal number is a hash of cgo's input, chosen to be |
| 302 deterministic yet unlikely to collide with other uses. The actual |
| 303 function _cgo_be59f0f25121_Cfunc_puts is implemented in a C source |
| 304 file compiled by gcc, the file x.cgo2.c: |
| 305 |
| 306 void |
| 307 _cgo_be59f0f25121_Cfunc_puts(void *v) |
| 308 { |
| 309 struct { |
| 310 char* p0; |
| 311 int r; |
| 312 char __pad12[4]; |
| 313 } __attribute__((__packed__)) *a = v; |
| 314 a->r = puts((void*)a->p0); |
| 315 } |
| 316 |
| 317 It extracts the arguments from the pointer to _Cfunc_puts's argument |
| 318 frame, invokes the system C function (in this case, puts), stores the |
| 319 result in the frame, and returns. |
| 320 |
| 321 Linking |
| 322 |
| 323 Once the _cgo_export.c and *.cgo2.c files have been compiled with gcc, |
| 324 they need to be linked into the final binary, along with the libraries |
| 325 they might depend on (in the case of puts, stdio). 6l has been |
| 326 extended to understand basic ELF files, but it does not understand ELF |
| 327 in the full complexity that modern C libraries embrace, so it cannot |
| 328 in general generate direct references to the system libraries. |
| 329 |
| 330 Instead, the build process generates an object file using dynamic |
| 331 linkage to the desired libraries. The main function is provided by |
| 332 _cgo_main.c: |
| 333 |
| 334 int main() { return 0; } |
| 335 void crosscall2(void(*fn)(void*, int), void *a, int c) { } |
| 336 void _cgo_allocate(void *a, int c) { } |
| 337 void _cgo_panic(void *a, int c) { } |
| 338 |
| 339 The extra functions here are stubs to satisfy the references in the C |
| 340 code generated for gcc. The build process links this stub, along with |
| 341 _cgo_export.c and *.cgo2.c, into a dynamic executable and then lets |
| 342 cgo examine the executable. Cgo records the list of shared library |
| 343 references and resolved names and writes them into a new file |
| 344 _cgo_import.c, which looks like: |
| 345 |
| 346 #pragma dynlinker "/lib64/ld-linux-x86-64.so.2" |
| 347 #pragma dynimport puts puts#GLIBC_2.2.5 "libc.so.6" |
| 348 #pragma dynimport __libc_start_main __libc_start_main#GLIBC_2.2.5 "libc.so.6" |
| 349 #pragma dynimport stdout stdout#GLIBC_2.2.5 "libc.so.6" |
| 350 #pragma dynimport fflush fflush#GLIBC_2.2.5 "libc.so.6" |
| 351 #pragma dynimport _ _ "libpthread.so.0" |
| 352 #pragma dynimport _ _ "libc.so.6" |
| 353 |
| 354 In the end, the compiled Go package, which will eventually be |
| 355 presented to 6l as part of a larger program, contains: |
| 356 |
| 357 _go_.6 # 6g-compiled object for _cgo_gotypes.go *.cgo1.go |
| 358 _cgo_defun.6 # 6c-compiled object for _cgo_defun.c |
| 359 _all.o # gcc-compiled object for _cgo_export.c, *.cgo2.c |
| 360 _cgo_import.6 # 6c-compiled object for _cgo_import.c |
| 361 |
| 362 The final program will be a dynamic executable, so that 6l can avoid |
| 363 needing to process arbitrary .o files. It only needs to process the .o |
| 364 files generated from C files that cgo writes, and those are much more |
| 365 limited in the ELF or other features that they use. |
| 366 |
| 367 In essence, the _cgo_import.6 file includes the extra linking |
| 368 directives that 6l is not sophisticated enough to derive from _all.o |
| 369 on its own. Similarly, the _all.o uses dynamic references to real |
| 370 system object code because 6l is not sophisticated enough to process |
| 371 the real code. |
| 372 |
| 373 The main benefits of this system are that 6l remains relatively simple |
| 374 (it does not need to implement a complete ELF and Mach-O linker) and |
| 375 that gcc is not needed after the package is compiled. For example, |
| 376 package net uses cgo for access to name resolution functions provided |
| 377 by libc. Although gcc is needed to compile package net, gcc is not |
| 378 needed to link programs that import package net. |
| 379 |
| 380 Runtime |
| 381 |
| 382 When using cgo, Go must not assume that it owns all details of the |
| 383 process. In particular it needs to coordinate with C in the use of |
| 384 threads and thread-local storage. The runtime package, in its own |
| 385 (6c-compiled) C code, declares a few uninitialized (default bss) |
| 386 variables: |
| 387 |
| 388 bool runtime·iscgo; |
| 389 void (*libcgo_thread_start)(void*); |
| 390 void (*initcgo)(G*); |
| 391 |
| 392 Any package using cgo imports "runtime/cgo", which provides |
| 393 initializations for these variables. It sets iscgo to 1, initcgo to a |
| 394 gcc-compiled function that can be called early during program startup, |
| 395 and libcgo_thread_start to a gcc-compiled function that can be used to |
| 396 create a new thread, in place of the runtime's usual direct system |
| 397 calls. |
| 398 |
| 399 */ |