Showing posts with label C++. Show all posts

Monday, April 2, 2012

GCC lossy conversion warnings for C and C++

Have you ever been bitten by the bug where you, for example, return a long long from a function, but set the return type of the function to int (by mistake)? I sure have, and off the top of my head, I can remember this bug costing me a failed solution at least twice at TopCoder. Just to make this more concrete, here's a simple example function that has this problem.

int foo() {
  long long x = 1LL<<42;
  return x;
}

This kind of conversion is actually perfectly legal (for example, by section 4.7 of C++11 n3337), and the result is implementation-defined if the target type (here int) is signed and can't represent the source value (here a long long). GCC defines the result to be the value reduced modulo 2^n so that it fits into the target type. In practice this means the lowest n bits are kept and the rest are discarded, which makes sense in two's complement (which is what GCC and the large majority of "conventional" systems use). So if int is 32-bit and long long is 64-bit, you get the lower 32 bits of the original value. In this specific case only bit 42 is set, so the lower 32 bits are all zero and the result is zero.
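To make the "keep the lower 32 bits" rule concrete, here is a minimal sketch that computes the same result by hand, without relying on the implementation-defined conversion itself. The function name low32_as_signed is made up for this illustration, and the code assumes a platform that provides int32_t and uint32_t.

```c
#include <stdint.h>

/* Hypothetical helper: computes what GCC documents for an out-of-range
   long long -> 32-bit signed conversion, using only well-defined steps. */
int32_t low32_as_signed(long long x) {
  /* Converting to unsigned is well defined: it reduces modulo 2^64. */
  uint32_t low = (uint32_t)((unsigned long long)x & 0xFFFFFFFFULL);

  if (low <= (uint32_t)INT32_MAX) {
    return (int32_t)low; /* value fits, nothing to reinterpret */
  }
  /* Top bit set: map [2^31, 2^32) onto [-2^31, 0) without overflow. */
  return (int32_t)(low - 0x80000000u) + INT32_MIN;
}
```

Calling low32_as_signed(1LL << 42) gives 0, matching what foo() returns under GCC with a 32-bit int, while low32_as_signed(0xFFFFFFFFLL) gives -1.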

Now, you'd really like to be warned about this kind of thing (at least I would :), but neither -Wextra (a.k.a. -W) nor -Wall enables such a warning. It turns out you have to ask for it explicitly using -Wconversion. The argument they make for leaving it off by default, namely that plenty of code does these kinds of conversions intentionally (without casts), is definitely valid, but I will be adding this flag to my default set for sure, at least for TopCoder :).
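Once -Wconversion points at a function like foo, there are roughly two ways to respond, sketched below. Both function names (foo_fixed and narrow_checked) are invented for this example. The first simply makes the return type match the value. The second is for code that really does want an int: it narrows only after a range check, and the explicit cast tells GCC the conversion is intended, so -Wconversion stays quiet.

```c
#include <limits.h>

/* Hypothetical fix for foo(): the return type now matches the value. */
long long foo_fixed(void) {
  long long x = 1LL << 42;
  return x;
}

/* Hypothetical helper: stores x in *out and returns 1 if x fits in an int,
   otherwise leaves *out untouched and returns 0. */
int narrow_checked(long long x, int *out) {
  if (x < INT_MIN || x > INT_MAX) {
    return 0;
  }
  *out = (int)x; /* explicit cast: no -Wconversion warning, value is in range */
  return 1;
}
```

Something like gcc -Wall -Wextra -Wconversion -c on a file containing these two functions should compile without conversion warnings, whereas the original foo would be flagged.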

Friday, April 10, 2009

Standard C function for reading arbitrarily long lines from files

This is a function I wrote back when I was learning C (and understanding why standards matter, thanks mostly to the wonderful Usenet group comp.lang.c). I translated the variable names and comments from Croatian to English and thought I'd post it here in case someone might find it useful. Basically, it reads a file one character at a time, attempting to allocate more memory as needed. It's written in standards-compliant C and packaged with a small test driver program. I'm not including a header file for it since it's only one function, but you can easily write one yourself.

You can compile it with something like

gcc read_line.c -W -Wall -ansi -pedantic

and should get no warnings. If you want to test it, you should probably reduce ALLOC_INCREMENT to something like 4 or so.
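For readers who just want the general shape of the technique, here is a minimal sketch of a character-at-a-time reader that grows its buffer in fixed steps. This is not the code from the download; the names read_line_sketch and SKETCH_INCREMENT are made up, and details such as dropping the trailing newline and returning 0 at end of file are assumptions of this sketch.

```c
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_INCREMENT 4 /* hypothetical; kept small to exercise growth */

/* Reads characters from fp up to a newline or EOF into a freshly allocated,
   '\0'-terminated buffer stored in *s (the newline is not stored).
   Returns the number of characters stored, or -1 if allocation fails.
   An empty line and EOF both yield 0; callers can tell them apart with feof. */
int read_line_sketch(char **s, FILE *fp) {
  char *buf = NULL;
  char *tmp;
  size_t cap = 0;
  int len = 0;
  int c;

  *s = NULL;
  for (;;) {
    /* Make sure buf[len] exists, for the next character or for '\0'. */
    if ((size_t)len + 1 > cap) {
      tmp = realloc(buf, cap + SKETCH_INCREMENT);
      if (tmp == NULL) {
        free(buf);
        return -1;
      }
      buf = tmp;
      cap += SKETCH_INCREMENT;
    }
    c = getc(fp);
    if (c == EOF || c == '\n') {
      break;
    }
    buf[len++] = (char)c;
  }
  buf[len] = '\0';
  *s = buf;
  return len;
}
```

The caller owns the returned buffer and is expected to free it. Like the real function, this sketch returns an int and assumes lines are short enough for that to be safe.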

Most of the code is self-explanatory and has detailed comments. There are a few design decisions I'd like to talk about. First off, the function returns the number of characters read as an int, which is not really in line with the standard library, which uses size_t for buffer sizes. The simple reason is that I personally don't like working with unsigned values, except when doing modulo 2^n arithmetic or bitmask handling. A signed return type is also a nice way to report an error by returning -1 (you could return (size_t)-1 instead, but that is less nice). Also, in the contexts where I use this function, int is more than enough to represent all the possible line sizes.

Second, the final amount of allocated memory might be higher than the number of bytes read. You can solve this easily by creating a wrapper function, similar to this one:

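int read_line_with_realloc(char **s, FILE *fp) {
  char *p = NULL;
  int len = read_line(s, fp);
  if (len <= 0) {
    /* 0: nothing was read; negative: read_line reported an error */
    return len;
  }
  p = realloc(*s, (size_t)len + 1); /* room for '\0' */
  if (p == NULL) {
    free(*s);
    *s = NULL; /* don't leave the caller with a dangling pointer */
    return -1;
  }
  *s = p;
  return len;
}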


Finally, there's no way to reuse the memory already allocated in a buffer, that is, it is assumed that the passed buffer is initially empty. This is basically to keep the interface clean, but can also be easily fixed with a wrapper.

You can get the code here.

I've used this function many times, but as always (even with trivial code), there might be some errors in there. If you find any, please let me know.