main resources links contact

Recitation 18: Intro to C

Strings:

[strcat.h] [strcat.c] [strcat-correct.c] [strcat-crash.c] [strcat.no-crash]

These examples center on a function that concatenates two strings, similar to the string.h library function strcat. Note that we're not allocating a new block of memory for the resulting string. We're adding src to the end of dest. The problems center around the need for enough memory to do this. Note that out of bounds array accesses are undefined, so crash and no-crash may not behave the same for you if you run the code.

Strings in C are quite a bit different in than in C0. Well, strings don't really exist in C. We have arrays of characters. These arrays are special because the end in a NULL terminator, represented as '\0'. So, if we wanted to print a char array, the program would read characters until hitting this NULL terminator. We need to explicitly manage this NULL terminator. This has many consequences. Suppose I want to store the string "abc" in a char array. Well, I need to allocate 4 bytes--one extra for the NULL terminator. Also, if you lose the NULL terminator, bad things will happen.

Memory Allocation:

In C0, we're used to calling either alloc or alloc_array. In C, we have malloc and calloc. They are defined as:

 void *calloc(size_t nmemb, size_t size);
 void *malloc(size_t size);

Despite visual similarities (pattern matching tricks us), don't equate malloc with alloc and calloc with alloc_array. The only difference between the two is that calloc sets all memory to 0 before giving it to you, like C0 does. Since values of 0 have a meaning in some sense for all types, this can be useful. So, it probably sounds like you should always use calloc. Not so fast! Setting everything to 0 is expensive. If you're just going to immediately overwrite that memory with your data, then there's no sense in zeroing it first. When I say that memory from malloc is not zeroed, this might mean that you're getting memory that your program used previously, but was finished with. So, the old data is still there. So, just because calloc has two arguments and malloc one, doesn't mean that calloc is for arrays and malloc everything else. To further drive home the point, you could write calloc as follows: multiply the two arguments together, call malloc, loop of the memory and zero it, return.

The other important thing is what the arguments actually are. All are of type size_t. This is an unsigned integer, but it may have a different number of bytes than the usual int type. By definition, it's the maximum size any object is allowed to be. So, we're asking for an integral number of bytes, not a type name like in C0. To help with this, we use the sizeof function that takes in a type and tells us how big it is. Suppose I want to allocate an array of 10 ints. I could do the following:

  malloc(10 * sizeof(int));
  calloc(10, sizeof(int));

Don't just say, "well, I know how big an int is" and hardcode 4. The truth is that some types will change their size depending on the platform. For example, a long int is 8 bytes on a 64 bit system and 4 bytes on a 32 bit system.

Other Examples:

These are some examples that I've accumulated over several semesters of teaching. Some of these I showed today. I didn't get to all of them in recitation. Note that some of these relate to next week's lecture topics, but I'd rather keep everything in the same place. I highly recommend going through these. I have a suggested order to do so in README.txt. They are heavily commented and hopefully explain what's going on sufficiently. Most of them highlight new things that we didn't have in C0 or behavior that was defined in C0 that is undefined in C. While poking fun at weird things C does is rather amusing, there's also a lot of educational value to understanding why these things happen, especially when you're debugging.

That being said, don't feel responsible for everything in them. We're throwing a lot all at once--the transition to C is pretty crazy. Hopefully you can appreciate why we've been using C0 up until this point--so that we wouldn't have to worry about all of this from day 1. Topics that are important will be revisited, so don't feel like you need to learn everything there is about C on your own.

If you have any comments or see any errors, please let me know.