Char Data Type in C/C++

Time now to learn about C++'s data type for characters and strings: char. Chars are, for all intents and purposes, small ints whose values are printed as characters rather than numbers, based on the ASCII table. Look up asciitable.com for a full listing of available chars; the character set technically depends on your compiler and platform, but nearly all of them follow ASCII for values 0 through 127. Oh, and when looking at the ASCII table, the Dec value is the number and the Char value is, of course, the character that will be displayed. You'll notice that many of the chars prior to 32 are rather strange; these are non-printing control characters, and my advice is to ignore most of them, as you probably won't be using them much, if at all.
But I digress; chars are numbers that are displayed as characters based on that chart. On virtually every modern system a char is 8 bits wide, so it holds 256 distinct values: 0 to 255 if your compiler treats char as unsigned, or -128 to 127 if it treats char as signed (write unsigned char if you want to be explicit). Go past the top of the unsigned range and the value simply wraps back to 0 and starts again (the math would be value % 256, if you're interested). Chars can be assigned integer values and vice versa, but if you don't want to have that ASCII table sitting in front of you at all times, you can simply assign the character representation. However, you must use single quotes if you're assigning or otherwise indicating a single character. Using double quotes would indicate a string, which I'll describe below. For now, though, here are a few char declarations and assignments.
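char a = 'a';      // lowercase a (ASCII 97)
char b = 'A';      // uppercase A (ASCII 65)
char c = 126;      // an integer value; prints as ~
char d = a + 14;   // 97 + 14 = 111; prints as o
char e = 300;      // too large for a char; wraps around to 44 (most compilers warn here)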
The first two remind you that chars are case sensitive. The third example shows that you can assign a char an integer value and it will work just fine; in this example, variable c would print out as the tilde (~). You can also perform math on chars, just as you would any other integers. Variable d would be o -- 14 units after 'a' on the ASCII table. Finally, you have variable e, which comes out to be a comma (,). The value went past 255 and rolled back around (i.e., performed the arithmetic 300 % 256), which leaves 44, and 44 is the comma on the ASCII chart. Most compilers will warn you about an assignment like this one, since 300 does not fit in a char.
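If you'd like to poke at this yourself, here is a minimal sketch of the same ideas. The helper names (wrapToByte, shiftChar, showChar) are my own inventions for illustration, and the sketch assumes an 8-bit char and an ASCII character set, which is what you'll find on nearly every desktop compiler.

```cpp
#include <iostream>

// Hypothetical helper: squeeze any int into the 0-255 range the way an
// 8-bit unsigned char does. Converting to an unsigned type wraps
// modulo 256 (assuming 8-bit chars).
unsigned char wrapToByte(int value) {
    return static_cast<unsigned char>(value);
}

// Hypothetical helper: move `offset` positions along the ASCII table.
char shiftChar(char start, int offset) {
    return static_cast<char>(start + offset);
}

// Prints a character next to the number that is actually stored.
void showChar(char ch) {
    std::cout << ch << " = " << static_cast<int>(ch) << '\n';
}
```

Calling showChar(shiftChar('a', 14)) prints "o = 111", and wrapToByte(300) gives back 44, the comma. I used unsigned char for the wraparound on purpose: that conversion is well defined, whereas stuffing 300 into a plain char depends on whether your compiler treats char as signed.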
There are also special characters that can be placed in a char. They live in the first 32 slots of the table (newline is 10 and tab is 9, for instance), and they do special things. You've already been introduced to the most common one, '\n', which indicates a new line. As you might have guessed, the backslash indicates its status as a special character. When a char containing this character is printed, it will simply drop the cursor to the next line. Next is '\t', which is a tab stop. There is another often used one, but I'll save it for later this lesson.
Knowing how to store and use single characters at a time is all well and good, but eventually you'll want to store entire strings -- that is, multiple characters -- under a single variable. This is done by way of arrays of characters, or character arrays: whichever you want to call them. Due to the limitations of arrays, though, strings (multiple characters) must possess a few characteristics that you might not expect, namely the null terminator. The null terminator is one of those special characters (it has the value 0), it is used to indicate the end of the string, and it can be inserted by two different means. The first and easiest way is to simply make the assignment using double quotes. This implicitly places the null terminator after the string. However, if you find that you must do it manually, the code for it is '\0'. You can place it wherever you want to in a character array, and cout and the string functions will treat the first occurrence of it as the end of the string.
Before throwing examples out there, allow me to explain how character arrays work. They are unique in that you're allowed to print the entire array in one cout statement; cout prints each individual character in the array automatically. As you remember, arrays are fixed in size; they stay the same size for their entire lifespan. That is where the null terminator comes in; without it, whenever you went to print out the character array, cout would keep printing whatever characters came after your actual string ended, possibly running right past the end of the array. With the null terminator, though, cout knows to stop printing when it is reached. Here are some examples of character arrays and strings.
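To see that the newline and tab characters (written '\n' and '\t' in source code) really are just small numbers, here is a quick sketch. The function names codeOf and printEscapes are hypothetical, and the numeric values assume ASCII.

```cpp
#include <iostream>

// Hypothetical helper: returns the number stored for a character.
int codeOf(char ch) {
    return static_cast<int>(ch);
}

// Prints a tab-separated line, then the numeric codes of both
// special characters.
void printEscapes() {
    char newline = '\n';
    char tab = '\t';
    std::cout << "Name:" << tab << "Ryan" << newline;
    std::cout << "newline is " << codeOf(newline) << newline;
    std::cout << "tab is " << codeOf(tab) << newline;
}
```

On an ASCII system, codeOf('\n') is 10 and codeOf('\t') is 9, so both sit comfortably below 32 on the chart.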
char a[] = "abcde";
char b[500] = "fghijk";
cout << b;
The first creates a character array containing 6 elements: one for each of the visible characters, and an additional one for the null terminator. The second creates what is loosely called a buffer: a character array with reusability. It can hold any string you throw at it, up to 499 characters (remember, it needs one extra for the null terminator). This is useful for taking user input, since you never know how much someone will write, and writing past the end of an array can crash or corrupt your program. The cout statement prints "fghijk" without the quotes. The array still has 493 unused elements after the string and its terminator (C++ fills them with zeros in this case), but the null terminator tells cout to stop before it ever reaches them. Unfortunately, these character arrays are not quite reusable yet. Once you've initialized an array like this, you cannot assign it another string with the = operator. I'll show you how to work around this limitation momentarily.
Now that you know what strings are, there is an existing library that is utterly necessary for anybody who uses them. In C it is called string.h, and C++ provides the same functions under the name cstring. To begin using it, you must include it in your file, so just place #include <cstring> (or #include <string.h> in C) at the top of any file that uses it, and you're off. A full listing of the functions and their uses can be found here, but I'll go over the ones you'll use frequently. By far the most frequently used function in the library is strcpy: string copy. This is the function you'll use to place (or replace) text into a pre-existing string buffer. The function takes two arguments: the first is the variable receiving the string, and the second is the string to be placed into the variable. The receiving buffer must be large enough to hold the string plus its null terminator; strcpy does not check this for you.
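Here is a small sketch of how array size and the null terminator relate. The function names arraySizeDemo and truncateAt are made up for this example; sizeof, strlen, and '\0' are the real thing.

```cpp
#include <cstddef>
#include <cstring>

// sizeof counts every element in the array, terminator included.
std::size_t arraySizeDemo() {
    char a[] = "abcde";
    return sizeof(a);   // 5 visible characters + 1 null terminator
}

// Hypothetical helper: end a string early by writing a null terminator
// at `index`. The caller must make sure `index` is inside the array.
void truncateAt(char* text, std::size_t index) {
    text[index] = '\0';
}
```

For example, given char b[500] = "fghijk", calling truncateAt(b, 3) makes cout << b print only "fgh". The characters after index 3 are still sitting in the array, but nothing that works with strings will look past the first terminator.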
char a[500];
strcpy(a,"Ryan Boyer");
cout << a;
After this code executes, variable a will contain my name. You can also pass another character array (string buffer) as the second argument in place of the quoted text.
Next is strlen: string length. This just tells you how many characters occur prior to the null terminator. It takes one argument -- the string -- and returns the length as an unsigned integer value (the type is size_t, which converts to int without fuss for short strings).
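Since strcpy will happily write past the end of a buffer that is too small, you may want to check the size first. Here is one way to sketch that; copyIfFits is a hypothetical helper of my own, not something from the library.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy `src` into `dest` only if the text plus its
// null terminator fits in `capacity` chars. Returns true when it copied.
bool copyIfFits(char* dest, std::size_t capacity, const char* src) {
    if (std::strlen(src) + 1 > capacity) {
        return false;   // too long; leave dest untouched
    }
    std::strcpy(dest, src);
    return true;
}
```

With char a[500], calling copyIfFits(a, sizeof(a), "Ryan Boyer") copies the name and returns true, and you can call it again later with different text to reuse the same buffer. Note that sizeof(a) only gives the capacity when a is a real array in scope, not a pointer.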
int myName = strlen(a);
Variable myName will be 10 after this code executes; "Ryan Boyer" contains 10 characters (spaces included).
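If you're curious what strlen is doing, it amounts to walking the array until it hits the null terminator. The following is only a sketch of the idea under the hypothetical name countUntilNull; real library versions are usually more heavily optimized.

```cpp
#include <cstddef>

// Hypothetical stand-in for strlen: counts the characters that come
// before the first null terminator.
std::size_t countUntilNull(const char* text) {
    std::size_t count = 0;
    while (text[count] != '\0') {
        ++count;
    }
    return count;
}
```

This is also why the length of a string and the size of its buffer are different things: for a 500-element array holding "Ryan Boyer", countUntilNull reports 10 while sizeof reports 500.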
Next, strchr, which checks for the occurrence of a character in a string. This is an odd function, though; it returns a pointer to the character if it finds it and a null pointer if it doesn't. That reference site I linked to uses the following algorithm to glean the location of the character in the string: returned pointer - original string + 1. This is simply doing math on memory locations. The characters of an array sit at consecutive, increasing addresses, so subtracting the start of the string from the returned pointer gives the number of characters that come before the match, which is the same as its zero-based index. Adding 1 turns that into a position counted from 1. If you want the index for some reason, be sure to leave off that +1 at the end. And check for the null pointer before doing the subtraction, or the result will be meaningless.
char *ptr = strchr(a,'n');
int loc = ptr - a + 1;
Variable loc will now contain 4, which means that n is the 4th character (index 3) in the array. Likewise, the function strrchr finds the LAST occurrence of a character in a string, and it is used the same way as above.
The last function I'll go over here is strcat, or concatenation. That means to tack one string onto the end of the other. This function takes two strings as arguments; the first string will receive the second one, so it must have a large enough capacity to accommodate both strings plus the null terminator.
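Because strchr and strrchr hand back a null pointer when the character isn't there, it's worth wrapping the pointer math in a check. Here is a minimal sketch; indexOfFirst and indexOfLast are hypothetical names of mine, and returning -1 for "not found" is just a convention I picked.

```cpp
#include <cstring>

// Hypothetical helper: zero-based index of the first `target` in `text`,
// or -1 if it does not occur.
int indexOfFirst(const char* text, char target) {
    const char* found = std::strchr(text, target);
    if (found == nullptr) {
        return -1;
    }
    return static_cast<int>(found - text);
}

// Same idea, but strrchr searches for the last occurrence.
int indexOfLast(const char* text, char target) {
    const char* found = std::strrchr(text, target);
    if (found == nullptr) {
        return -1;
    }
    return static_cast<int>(found - text);
}
```

With "Ryan Boyer", indexOfFirst for 'n' gives 3, while 'y' gives 1 from indexOfFirst and 7 from indexOfLast, since the name contains two of them.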
strcat(a," is my name!");
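Like strcpy, strcat does no size checking of its own, so here is a sketch of a guarded version. appendIfFits is a hypothetical helper, not a library function.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: append `extra` to the string already in `dest`,
// but only if both strings plus the null terminator fit in `capacity`.
bool appendIfFits(char* dest, std::size_t capacity, const char* extra) {
    if (std::strlen(dest) + std::strlen(extra) + 1 > capacity) {
        return false;   // not enough room; leave dest untouched
    }
    std::strcat(dest, extra);
    return true;
}
```

Starting from a 500-element buffer holding "Ryan Boyer", appendIfFits(a, sizeof(a), " is my name!") leaves the buffer holding "Ryan Boyer is my name!", which is 22 characters long. Try the same thing with a 12-element buffer and it returns false instead of overflowing.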
If you find that these functions do not do what you want to do, take a look at the reference site. There are a few more options available to you there.
