If you plan on contributing code or submitting patches to fix bugs then PLEASE adhere to the following coding conventions. Doing so makes my life easier (since I *will* reformat myself any code that doesn't conform) and makes your code much more likely to be accepted (since any code that is difficult to read because of formatting problems simply won't get read, and if I haven't read it then it isn't going into my sources -- period). Of course, if you're submitting an entire file full of stuff (and you are therefore the original author of the file) then you can format it any way you like (and tell the rest of us how we have to format our code to get it included in your file. ;-) The following guidelines are sufficiently close to the GNU Coding Standards that you shouldn't need to change any of Emacs's C-mode variables from their default settings. FORMATTING CODE --------------- Tabs stops are every TWO characters. Regardless of what Torvalds might say, if tab stops were meant to be 8 characters apart then humans would have evolved wide-angle vision shortly after the invention of the Newbury terminal. (If your editor doesn't remind you about the matching delimiter when a block is more than a page long then it's time you evolved to Emacs. ;-) Always use C-style comment delimiters. Unix Squeak runs on lots of platforms where GNU gcc is not the standard compiler. Many of these compilers choke (quite rightly) on C++ style comments. Similarly, always comment-out any additional explanatory text appearing after `#else' and `#endif'. Use `#if defined(...)' in preference to `#ifdef ...'. Indent preprocessor commands to reflect the nesting of conditional sections, and place a comment after any `#else' or `#endif' that isn't utterly obvious: #if defined(LAZY) # include <sleep.h> # ... # if defined(VERY_LAZY) # include <vacation.h> # ... # endif # ... #else /* !LAZY */ # include <schedule.h> #endif Declarations and external definitions should start in column zero. The only exception is for small conditional sections where indentation helps to show the "scope" of the conditional code. Hence: #if defined(EAT_DONUTS) # include <donut/supplies.h> int donutCounter= 0; /* number of donuts consumed since startup */ #endif /* EAT_DONUTS */ But don't overdo this -- the start/end of the conditional section should remain clearly visible while looking at indented declarations. Always use ANSI prototypes, and put the return types of functions and their identifiers on the same line (unless you rely on tools that prefer function names to start in column zero, in which case you're forgiven in advance). Hence: static char *concat(char *s1, char *s2) { ... } If the arguments don't fit nicely on one line, and the function identifier is short, split the declaration like this: int lotsOfArgs(int anInteger, long aLong, short aShort, double aDouble, float aFloat) ... If the identifier is enormous (which sometimes happens because of the frequent association between C functions and Squeak selectors) then put the opening parenthesis on the next line, indented by four characters. (That way Emacs knows how to indent the rest of the arguments nicely.) Like this: int ioDoSomethingObscureWithFarTooMuchBabbleInTheIdentifier (int anInteger, long aLong, short aShort, double aDouble, float aFloat) ... Put spaces on either side of an arithmetic or relational operator. One of the most common (and most difficult to debug) problems in C is caused by writing `=' when you meant `=='. To avoid such mistakes, it is a great help to omit the space before an operator that side-effects its l-value. For example: if (x == 42) x= 666; while (ptr >= base) processWord(ptr-= 4); (In my three years of writing C before adopting this convention I spent many tens of hours tracking down `=' vs. `==' bugs. In the fifteen years since the instant that I adopted the above convention I have suffered from that bug precisely ONCE.) For similar reasons it's also a good idea to place constant values on the left side of a comparison. Most compilers will happily let this this pass: int c= getchar(); if (c= EOF) ... but will detect the problem if the comparison is written the other way round: int c= getchar(); if (EOF= c) ... Character constants are characters: void terminate(char *string, size_t len) { string[len]= '\0'; /* NOT `string[len]= 0;' !!! */ } Always initialise a variable, even if you're just about to give it a value: int min= 0; if (x < y) min= x; else min= y; ... Just about every compiler in existence can optimise away the assignment. On the hand, always assigning an initial value avoids warning messages about possibly uninitialised variables from several compilers when their optimisation dials are `cranked up to 11'. Vertically align related elements in dense program sections, such as a long series of assignments or declarations: int a= 42; /* current data value */ int counter= 0; /* how many we've counted */ int moreToDo= 1; /* 1 if we've got more to count */ This applies to the _identifiers_ when indirection is present: int a= 42; /* current data value */ int *current= 0; /* current data location */ int **dataLists= 0; /* current data list */ Don't be tempted to write `C++ style' types. Respect also the proper placement of whitespace in casts, which should have spaces in exactly the same places as would the equivalent declaration. Hence: /* ptr to (ptr to int) fn of ptr to ptr to int */ typedef int *(*ppifpi)(int **); ppifpi myPpifpi= 0; int ioSetPpifpi(int sqFunctionIndex) { myPpifpi= (int *(*)(int **))sqFunctionIndex; } (Outside Squeak code too, I guarantee that you'll come up against many more brick walls trying to use the `C++ style' "int* foo" [ugh -- it _isn't_ the `int' that's the pointer in there!!] and suchlike than you will if you stick to the old-fashioned, and _rational_, C syntax.) Put a space between a keyword and a parenthesised argument. Put spaces after commas. Don't put a space between a function name and the argument list, and never put a space after an opening parenthesis or before a closing parenthesis: if (x < foo(y, z)) haha= bar[4] + 5; else { while (z) { haha+= foo(z, z); --z; } return ++x + bar(); } When you split an expression into multiple lines, split it BEFORE an operator, not after one. Here is the right way: if (fooThisIsLong && bar > win(x, y, z) && remaining_condition) ... Avoid having two operators of different precedence at the same level of indentation. For example, don't write this: mode= (inmode[j] == VOIDmode || GET_MODE_SIZE(outmode[j]) > GET_MODE_SIZE(inmode[j]) ? outmode[j] : inmode[j]); Instead, use extra parentheses so that the indentation shows the logical nesting: mode= ((inmode[j] == VOIDmode || (GET_MODE_SIZE(outmode[j]) > GET_MODE_SIZE(inmode[j]))) ? outmode[j] : inmode[j]); Insert extra parentheses so that Emacs will indent the code properly. For example, the following indentation looks nice if you do it by hand, but Emacs would mess it up: v = rup->ru_utime.tv_sec*1000 + rup->ru_utime.tv_usec/1000 + rup->ru_stime.tv_sec*1000 + rup->ru_stime.tv_usec/1000; But adding a set of parentheses solves the problem: v = (rup->ru_utime.tv_sec*1000 + rup->ru_utime.tv_usec/1000 + rup->ru_stime.tv_sec*1000 + rup->ru_stime.tv_usec/1000); With very few exceptions, braces should always go on a line by themselves. Instead of this: if (x < y) { z= x; } else { z= y; } write this: if (x < y) { z= x; } else { z= y; } Never, ever, put an `else' and a brace on the same line. Format do-while statements like this: do { a= foo(a); } while (a > 0); Empty `while' and `for' loops should have their trailing semicolon on a line by itself, to make the absence of the body utterly obvious: for (i= 1; i < argc; processArg(i++)) ; while ('\0' != (*dst++= *src++)) ; Introduce redundant braces to restrict the scope of temporaries to the smallest possible region. This is WRONG: void foo(void) { int i; /* ... lots of code ... */ for (i= 0; i < LIMIT; ++i) { ... } /* ... lots of code ... */ for (i= 0; i < LIMIT; ++i) { ... } } while this is right: void foo(void) { /* ... lots of code ... */ { int i= 0; for (i= 0; i < LIMIT; ++i) { ... } } /* ... lots of code ... */ { int i= 0; for (i= 0; i < LIMIT; ++i) { ... } } } The above is not just to help small-brained coders (like me) remember to delete variables when the code that uses them is removed. It's preferable because some compilers are hopelessly bad at live-range splitting. Reducing the scope of variables to the absolute minimum encourages them to produce _much_ better code. (You'll also be far less tempted to use the value of an iteration variable after the end of the loop, which is unforgivably bad programming style.) Just like a good composer, a good programmer knows that rarefied zones are the most effective media in which to produce contrast. Leave lots of vertical whitespace between logically independent program sections, with a minimum of comment `decoration' to make them stand out. For example: 1 line of whitespace within function bodies to separate variable declarations from statements 1 line of whitespace between groups of external declarations. 2 lines of whitespace between function definitions. Begin a new "logical section" of the code with three blank lines, a comment explaining what the section is all about, and then three more blank lines before the section itself. and so on. Just use common sense. In particular, when there are many lines of declarations/definitions in a given section, the following is much kinder on the reader: ... /*** variable declarations ***/ /* counters */ int a= 0; int b= 0; /* pointers */ void *start= 0; /* initial address */ void *position= 0; /* current address, post-incremented */ void *end= 0; /* last address + 1 */ /*** function definitions ***/ /* accessors */ int getA(void) { return a; } /* predicates */ bool atEnd(void) { return position == end; } ... (and more effective) than the `noisy' equivalent: ... /*****************************/ /*****************************/ /*** ***/ /*** variable declarations ***/ /*** ***/ /*****************************/ /*****************************/ /****************/ /*** counters ***/ /****************/ int a= 0; int b= 0; /****************************/ /****************************/ /*** ***/ /*** function definitions ***/ /*** ***/ /****************************/ /****************************/ /*****************/ /*** accessors ***/ /*****************/ int getA(void) { return a; } /******************/ /*** predicates ***/ /******************/ bool atEnd(void) { return position == end; } ... where the details are hopelessly lost in the noisy comments, which serve only to make the program HARDER to read. (Besides, it looks just plain ugly. Typographers have known for centuries about the importance of maintaining a consistent `density' in the characters that make up an aesthetically-pleasing and easy to read font. The same applies to programs. Avoid large horizontal blobs of asterisks and other `dense' material intended to make things stand out -- they simply distract the reader's attention away from the essential. Use whitespace instead!) One final comment: remember that a crystal clear program is a MUCH more impressive achievement in the eyes of the vast majority of readers compared to a program designed to show off the author's cleverness in using obscure and obfuscating `C-isms' and cramming the maximum possible amount of functionality in the minimum number of source lines. Also, the time you spend making your program visually pleasing (and consistent) will be paid back many times over when you eventually have to go bug hunting. PORTABILITY ----------- Stick to ANSI Standard C in source code. Resist the temptation to introduce GNU-specific extensions, even if you think your code is only ever going to run in a GNU environment. If you use gcc then be sure your code compiles without warnings when `-Wall' is in effect (which is the case for all source files in the "unix" source directory). Ideally, your code should compile without warnings when "-pedantic" is turned on too, but I'm not insisting on this -- yet. ABSOLUTELY avoid GNU make-specific stuff in Makefiles. If your wizz-bang Squeak feature requires a GNU make-specific rule or conditional section in a Makefile someplace then go away and rethink your feature. (If the entire GNU compiler suite can be built using the tiniest imaginable subset of Makefile features, then so can Squeak.) Another way to avoid GNU make features in Makefiles is to run a script from `configure' to make a `fragment' and then AC_SUBST_FILE() it into the Makefile.in. If you do this then please, please make sure that the script is *strict* Bourne shell. (If you're on a system that has bash as its standard shell then you must ABSOLUTELY install ash [or something similar] and exercise every path through the script with the first line set to "#!/bin/ash", before changing it to "#!/bin/sh".) If you normally develop on a GNU-based Linux system (or any similarly well-documented system) then check the "CONFORMING TO" or "STANDARDS" section of the manual page for every library function and system call that you use. Avoid functions that don't conform to ANSI or POSIX, and pay particular attention to functions that are missing in either BSD or SysV. (These should be used with extreme caution, and only if the necessary conditional stuff [with a suitable fallback] is present.)