Wednesday, October 04, 2006

Memory Marching Madly

5.0 Memory Marching Madly

We now know that when you declare a variable, you are reserving memory to use. The size of that reservation is fixed, but it lands in whatever section of the computer’s memory happens to be available. You could declare two variables one right after the other and they could have dramatically different addresses. Every variable has an address where its data is stored; you can even reference that variable to read the hexadecimal address of where it is located for study.
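To see this for yourself, here is a minimal C++ sketch. The names `first`, `second` and `addressesDiffer` are made up for illustration. It prints where two back-to-back variables live; the exact addresses change between machines and runs, so no output is shown.

```cpp
#include <iostream>

// Hypothetical helper: declares two variables one after the other,
// prints where each one lives, and reports whether the addresses differ.
bool addressesDiffer() {
    int first = 10;
    int second = 20;

    // The & operator "references" a variable: it gives you its address.
    std::cout << "first  lives at " << &first  << "\n";
    std::cout << "second lives at " << &second << "\n";

    return &first != &second;
}
```

Calling `addressesDiffer()` from `main` prints two hexadecimal addresses. How far apart they are is entirely up to the compiler and the operating system.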

But what about cases when you want a bunch of information collected together in a row? You could declare 52 variables to store the amount of money you spent in each week of the year, but that gets clumsy fast. The same goes for collecting a bunch of characters together to spell out actual words. The tool for this is called an “Array.”

5.1 Arrays

Array variables work in a similar way to normal variables in that they reserve space. However, they are specially designed to reserve a group of contiguous spaces of memory. The variable itself effectively just points to the first “space” in that array. One feature of an array is that it must hold “homogeneous” data types…that just means every element has to be the same type. No mixing allowed.
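The “one right after the other” idea can be checked with a small C++ sketch. The array `weeks` and the helper `elementsAreNeighbours` are invented names, not anything from a library. It measures the gap in bytes between each element and the next and compares it to the size of one `int`.

```cpp
#include <cstddef>
#include <iostream>

// Hypothetical helper: checks that each element of a small int array
// sits exactly one int-sized step after the one before it.
bool elementsAreNeighbours() {
    int weeks[4] = {10, 20, 30, 40};
    bool contiguous = true;

    for (std::size_t i = 0; i + 1 < 4; ++i) {
        std::cout << "weeks[" << i << "] lives at " << &weeks[i] << "\n";

        // Measure the distance to the next element in bytes.
        const char* here = reinterpret_cast<const char*>(&weeks[i]);
        const char* next = reinterpret_cast<const char*>(&weeks[i + 1]);
        if (next - here != static_cast<std::ptrdiff_t>(sizeof(int))) {
            contiguous = false;
        }
    }
    return contiguous;
}
```

Because every element is the same type, the compiler always knows the size of that step. That is the practical reason behind the “no mixing” rule.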

5.2 Getting the words out

One special type of array is an array of characters, also called a “String.” While…my first impulse is to chase strings, these are just strings of characters. The main idea is that one of the most common things any computer will ever do is put words together to display information. Most compilers come with whole libraries dedicated to just dealing with “strings” (C/C++ has ). Some common string functions are: “Trim,” which cleans whitespace such as tabs or spaces off the ends of a string, and “Substring,” which tells you where inside a big string an instance of a smaller string is, so you can find patterns or locations.
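As a rough sketch of what those two functions do, here is one way to write them in C++ on top of `std::string`. The helper names `trimSpaces` and `findInside` are invented for this example. The standard library has no ready-made trim, so this one is hand-rolled; the search uses the real `std::string::find`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: strips leading and trailing spaces, tabs and newlines.
std::string trimSpaces(const std::string& text) {
    const std::string whitespace = " \t\r\n";
    const std::size_t start = text.find_first_not_of(whitespace);
    if (start == std::string::npos) return "";   // nothing but whitespace
    const std::size_t end = text.find_last_not_of(whitespace);
    return text.substr(start, end - start + 1);
}

// Hypothetical helper: index where `needle` first appears inside
// `haystack`, or -1 if it is not there at all.
int findInside(const std::string& haystack, const std::string& needle) {
    const std::size_t pos = haystack.find(needle);
    return pos == std::string::npos ? -1 : static_cast<int>(pos);
}
```

For example, `findInside("Rent and Food", "Food")` gives 9, because the “F” sits nine places after the first letter.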

5.3 “Hello Array!” Addressing of Arrays and Address Jumping

Variables that are arrays must be addressed differently than your average non-array variable. Let’s say your array is named “strName.” If you just use the word “strName” in your program, you are just dealing with the ADDRESS of the FIRST space in the row of variables you declared. Look at it this way:

Actual Container of Info strName
___ ___
_a_ << = = = = = = = = = = =___
_b_
_c_
_d_
_e_
_f_
_g_

The variable strName just contains the address of where the actual content of the array, the letters “abcdefg,” is stored in memory. So if you want to deal with the content, you have to address the array differently. You must provide an “index” into the array, based upon the position counted from the first element. So “a” is located at index zero, because it’s in the first box, right where the address is pointing. The letter “c” is 2 away from the first, and is therefore at index 2. So to access the “a” in the variable “strName” you would write “strName[0]” and to access the “c” you would write “strName[2].” (NOTE: most string handling functions do this indexing for you, so you will only need to pass in “strName” to get anything done; however, if you need to address individual parts, then you need to figure out the indexing yourself.)
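Here is the picture above as a small C++ sketch. The array name `strName` comes from the text; the helpers `letterAt` and `firstViaAddress` are made up for illustration.

```cpp
// Hypothetical helper: returns the character stored at `index` in the
// array "abcdefg" from the picture above.
char letterAt(int index) {
    const char strName[] = "abcdefg";   // 7 letters plus the terminating '\0'
    return strName[index];
}

// Hypothetical helper: follows the address held by the bare name and
// reads whatever sits there, which is the same thing as index 0.
char firstViaAddress() {
    const char strName[] = "abcdefg";
    return *strName;
}
```

So `letterAt(0)` gives 'a', `letterAt(2)` gives 'c', and `firstViaAddress()` also gives 'a', since the bare name leads straight to the first box.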

One thing to remember: since strName has the address, and the compiler knows how big each element is, what would happen if you incremented that address? The nifty trick is: it points to the next spot on the list! So in the above picture, if you did “strName++” the new picture would look like this:

Actual Container of Info strName
___
_a_ ___
_b_ << = = = = = = = = = = =___
_c_
_d_
_e_
_f_
_g_

The address has changed, and you now have a new set of indexes. This is a common shortcut for faster access to memory; however, it only works in some languages, and in C/C++ the increment has to be done on a pointer holding the array’s address, because the array’s own name can’t be changed. But the principle is important to know: addresses can be added and subtracted. This is called “pointer arithmetic,” but it’s not important for now.
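A minimal C++ sketch of the address jump. Since C++ will not let you increment an array’s own name, this version first copies the address into a pointer; the names `cursor` and `letterAfterSteps` are assumptions for the example.

```cpp
// Hypothetical helper: walks a pointer `steps` places along "abcdefg"
// and returns the character it ends up pointing at.
char letterAfterSteps(int steps) {
    const char strName[] = "abcdefg";
    const char* cursor = strName;   // starts at the address of 'a'

    for (int i = 0; i < steps; ++i) {
        ++cursor;                   // jump to the next element
    }
    return cursor[0];               // index 0 is now relative to the new address
}
```

After one step, index 0 is 'b', just like the second picture. Step too far and you walk off the end of the array, which nothing will stop you from doing.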

5.4 How do you know you are done?

One important idea to go over again is the fact that even though you declared a variable, even an array, it doesn’t mean the memory is empty. It most likely will contain junk values left over from another program. So…if you have an array, how do you know you are done? There are 2 schools of thought on solving this. The first idea goes back to our sentinel value: a character that couldn’t possibly be used. What they have chosen is the “NULL” character. The null value is universally an unusable character, which is essentially zero. What this means is that you always have to add one to every size calculation (although most string-handling functions will do this for you) to account for the null character at the end. If you want to fit the alphabet in an array, you will need space for 27 characters, instead of just 26. The second school of thought is called a “Pascal” string (named after the Pascal programming language, itself named after the mathematician Blaise Pascal), which uses the first character in the string to tell the length of the string. Therefore, the first index will always be a number, but the others will be the actual characters. The drawback is that for normal one-byte characters, no string can be longer than 255 in length. (Except with “wide” characters, where the first character is bigger and can hold a much larger number.)
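Both schools of thought can be sketched in a few lines of C++. The function names and the sample array `kGasPrefixed` are invented for illustration; real programs would normally use `std::strlen` for the first style.

```cpp
#include <cstddef>

// Sentinel style: count characters until the terminating '\0' shows up.
std::size_t lengthBySentinel(const char* text) {
    std::size_t count = 0;
    while (text[count] != '\0') ++count;
    return count;
}

// Length-prefix ("Pascal") style: the first byte IS the length,
// which is why one byte caps the string at 255 characters.
std::size_t lengthByPrefix(const unsigned char* text) {
    return text[0];
}

// Hypothetical sample: "Gas" stored the length-prefix way, no null needed.
const unsigned char kGasPrefixed[] = {3, 'G', 'a', 's'};
```

Note that `sizeof("abcdefghijklmnopqrstuvwxyz")` is 27, not 26: the compiler quietly adds the null character for you.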

5.5 Memory of Memory of Memory

So what about going multi-dimensional? Let’s say you have a spreadsheet that shows how much you spent on stuff over the month of November. For one, we want an array of names telling what we spent money on. For another, we want the amounts we spent, and the days we are talking about.

Let’s start with the list of things we spend on in our budget. Let’s assume no more than 20 items, each no more than 100 characters long (99 letters plus the null). So, in C++, it would look like this:

char budget[20][100];

If we sketched this out, it would look something like this:

[ ] ----> [ ] ------> [R][e][n][t][NULL][ ][ ][ ][ ]
[ ] ------> [F][o][o][d][NULL][ ][ ][ ][ ]
[ ] ------> [G][a][s][NULL][ ][ ][ ][ ][ ]
[ ] ------> [ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ] ------> [ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ] ------> [ ][ ][ ][ ][ ][ ][ ][ ][ ]

The idea is that the first cell points to the first cell of the first array of 20 entries, so basically it’s the “root” or “origin” of the variable. Each of those 20 entries isn’t really a character, but again points to the NEXT set of arrays, each being 100 characters wide. These are the spots in memory that actually have the information. (This is the mental picture; for an array declared this way, C++ actually lays the 20 rows out back to back in one block and works out where each row starts, rather than storing real pointers.) So, if you wanted to find the word “Gas,” you would start at the origin, move to the 3rd entry (which is index 2) and you find the word “Gas.” In C++, it would look like this:

cout << budget[2];
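Here is a small C++ sketch that fills in the first three rows from the drawing and reads one back. The `budget` array is the one declared earlier; the helper `budgetLabel` is a made-up name.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: fills the first three rows as in the sketch and
// returns the label stored in the requested row.
std::string budgetLabel(int row) {
    char budget[20][100] = {};            // every cell starts out as '\0'
    std::strcpy(budget[0], "Rent");
    std::strcpy(budget[1], "Food");
    std::strcpy(budget[2], "Gas");

    // budget[row] is the address of the first character in that row.
    return std::string(budget[row]);
}
```

`budgetLabel(2)` gives “Gas,” and any row that was never filled comes back empty, because the `= {}` zeroed everything instead of leaving junk.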

Now let’s think about the actual money. Budgeting November would include a total of 30 days and 20 different items (referring back to our budget names). Let’s make it 31 days, just so we can use the same source code for other months. But what kind of variable should we use to represent money? You might think float, because…there’s fractions in dem dar dollars. But really, what happens when you suddenly have 0.125 dollars? You can’t…most people just round off fractions of a penny…except for banks & governments…so a better idea is to use int to count pennies, and when we want to print things out, we just divide by 100 to put the decimal point in the right place. So we might see it like this:

int intSpentItems[20][31];

This might look like this:

[ ] ----> [ ] ------> [100][123][233][4000][ ][ ][ ][ ][ ]
[ ] ------> [123][334][4999][5999][ ][ ][ ][ ][ ]
[ ] ------> [ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ] ------> [ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ] ------> [ ][ ][ ][ ][ ][ ][ ][ ][ ]

Very similar to our budget, and we’d access it in a similar way:

cout << "budget item \"" << budget[2] << "\" had " << intSpentItems[2][x] << " on the " << x << " of November.\n";
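The “count pennies, divide by 100 when printing” idea from earlier can be sketched like this in C++. The helper name `formatPennies` is invented, and it assumes the amount is never negative.

```cpp
#include <string>

// Hypothetical helper: turns a count of pennies into text such as "40.00".
// Assumes the amount is not negative.
std::string formatPennies(int pennies) {
    const int dollars = pennies / 100;   // whole dollars
    const int cents = pennies % 100;     // leftover pennies

    std::string text = std::to_string(dollars) + ".";
    if (cents < 10) text += "0";         // keep two digits after the point
    return text + std::to_string(cents);
}
```

Everything stays a whole number until the very last moment, so an entry of 4000 prints as 40.00 and there is never a stray fraction of a penny to worry about.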

But what if we wanted to record not only how much we spent, but also how much we planned on spending? Do we need another array? We could, but how about adding another dimension to that array to make things easier?

int intSpentItems[20][2][31];

Now we have a 3-dimensional array. Let’s try to draw how this will work:


[ ] ---->[ ]------>[]--->[104][150][200][4000][ ][ ][ ][ ][ ]
[]--->[100][123][233][4000][ ][ ][ ][ ][ ]
[ ]------>[]--->[123][334][4999][5999][ ][ ][ ][ ][ ]
[]--->[100][123][233][4000][ ][ ][ ][ ][ ]
[ ]------>[]---> [ ][ ][ ][ ][ ][ ][ ][ ][ ]
[]---> [ ][ ][ ][ ][ ][ ][ ][ ][ ]
[ ]------>[]---> [ ][ ][ ][ ][ ][ ][ ][ ][ ]
[]---> [ ][ ][ ][ ][ ][ ][ ][ ][ ]

Again, we have the same origin point, pointing to the start of an array of more origins, each pointing to another array of origins, each pointing to an array of integers. Now we have a useful, multi-dimensional array full of possibilities.
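A short C++ sketch of using the three indexes together. The text does not say which slot of the middle dimension is “actual” and which is “planned,” so the constants `kActual` and `kPlanned` are assumptions, as are the helper names. The numbers are the first item from the drawing.

```cpp
// Assumed layout: slot 0 of the middle dimension holds what was actually
// spent, slot 1 holds what was planned. The text does not fix this.
const int kActual = 0;
const int kPlanned = 1;

// Hypothetical helper: how far over (positive) or under (negative) plan
// an item was on a given day, in pennies.
int overspend(int spent[20][2][31], int item, int day) {
    return spent[item][kActual][day] - spent[item][kPlanned][day];
}

// Hypothetical demo: loads the first item from the drawing and compares.
int demoOverspend(int day) {
    int intSpentItems[20][2][31] = {};   // all zero to start, no junk values
    const int actual[4]  = {104, 150, 200, 4000};
    const int planned[4] = {100, 123, 233, 4000};

    for (int d = 0; d < 4; ++d) {
        intSpentItems[0][kActual][d]  = actual[d];
        intSpentItems[0][kPlanned][d] = planned[d];
    }
    return overspend(intSpentItems, 0, day);
}
```

On the first day the item is 4 pennies over plan, and on the third day it is 33 under. Having to remember what slot 0 and slot 1 mean is exactly the sort of thing that gets confusing as a program grows.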

5.6 Memory Clubbing

There are often times when you have many different variables that are directly related to each other, even if they aren’t the same kind. The list of expenditures and the list of labels is a perfect example. One was integers and the other was strings of words. Now, you can just take those variables and declare them separately and everything will work just fine. But if you wanted to be very clear about what you were doing, especially if you wanted to reuse your source code over and over again, you might consider something called a “memory structure.”

A Memory Structure is a group of variables stuck together, much like arrays are--one right next to the other--but they can be of different types of data. The structure becomes one variable made up of different members of its own little party. (This is often referred to as a single “object,” but it’s not EXACTLY what you would consider an object; remember that name for later.) That way, you can declare a single variable to contain a whole bunch of information to use all over the place.

5.7 Structured Examples

Let’s redo our budget as a structure, using C++ wording:

struct MyBudget {
char budget[20][100];
int intSpentItems[20][31];
};

That simple. And it looks just like what we did before…just within the “struct” keyword. When you use it, you just declare a variable the same way you would any other variable, but of type “MyBudget.”

MyBudget data;

To get to the different parts or members of that structure, you use “dot” notation (a fancy way of saying “put a period”).

strcpy( data.budget[0], "Food" );

Neato speado, eh? SO…how would this map out? VEERY similar to the array! The info gets all bundled together so you can grab at it the same way. In fact, since we’ve got 20 items grouped together this way, we can get even fancier and clearer in our programming. Why have a multi-dimensional array when we want to be VERY clear that each label gets tagged along with what’s getting spent on it? How does this look?

struct MyBudget {
char label[100];
int intActualSpent[31];
int intPlannedSpent[31];
};

MyBudget data[20];

It works the same way, but now each label has both the actual spent and the planned spent spelled out by name, so there’s no confusion as to what each part is gonna be used for. This is a very common way for programmers to make things clearer and easier to maintain, even though both ways still work.
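To round things off, here is a minimal C++ sketch that uses the second version of the structure. The struct is the one from the text; the helpers `totalActual` and `makeFoodExample` and the penny amounts are invented for illustration.

```cpp
#include <cstring>

struct MyBudget {
    char label[100];
    int intActualSpent[31];
    int intPlannedSpent[31];
};

// Hypothetical helper: adds up everything actually spent on one item.
int totalActual(const MyBudget& item) {
    int total = 0;
    for (int day = 0; day < 31; ++day) {
        total += item.intActualSpent[day];
    }
    return total;
}

// Hypothetical helper: builds a "Food" entry with two made-up days of spending.
MyBudget makeFoodExample() {
    MyBudget item = {};                 // zero everything first, no junk values
    std::strcpy(item.label, "Food");
    item.intActualSpent[0] = 123;       // pennies, invented for illustration
    item.intActualSpent[1] = 334;
    item.intPlannedSpent[0] = 100;
    return item;
}
```

With `MyBudget data[20] = {};` you could then write `data[1] = makeFoodExample();` and ask for `totalActual(data[1])`. Each name in the dot notation says what it is, which is the whole point of grouping things this way.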
