Memory

Memory management – the process of controlling what your program keeps in RAM and releasing it when it’s no longer needed. Many newer programming languages handle most of this for you. Cloud storage, hard drives, SSDs, and flash drives are persistent storage, meaning they keep your data even when the device is powered off. By contrast, RAM is volatile, meaning it loses its contents when powered off. RAM holds temporary working data, and it’s also much faster than persistent storage. Your program will do most of its work in RAM, but it will also need to use storage for file I/O now and then. Memory management refers to RAM, not storage space.

These days, computers have a lot of RAM, so you can get away with being less efficient than in the past, when software was much more resource-constrained. Even so, you should still manage memory carefully. Garbage-collected languages do the heavy lifting for you, but if you want to learn C++, you will need to get comfortable with managing memory yourself.

Address space – to access something in memory, a program needs its memory address. Modern programming languages abstract these details away from the developer, but they’re still used under the hood. In the following example, I used the python3 shell to make a variable called x, and then print out the hexadecimal representation of its address.

>>> x = 456

>>> print(hex(id(x)))

0x1016caa90

id(x) returns the identity of the object that x refers to. In CPython, the standard Python implementation, that identity is the object’s memory address. hex() returns a hexadecimal (base 16) representation of the integer you pass to it, so hex(id(x)) gives the memory address in hexadecimal. The 0x prefix indicates that the number is hexadecimal rather than decimal.

0x1016caa90 is the memory address in that example, though if you try it, you will see a different one. If you quit the python3 shell and do it again, it will usually be different again. The number doesn’t mean anything other than where the object is being stored.
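As a small sketch of the same idea (assuming CPython, where id() happens to be the object’s memory address), two names bound to the same object report the same address, while an equal but separate object gets its own. The variable names here are only for illustration.

```python
# Assumes CPython, where id() returns the object's memory address.
first = [1, 2, 3]
second = first          # a second name for the same list object
third = [1, 2, 3]       # an equal but separate list object

print(hex(id(first)))
print(hex(id(second)))  # same address as first
print(hex(id(third)))   # a different address

same_object = id(first) == id(second)
separate_object = id(first) != id(third)
```

The exact addresses printed will differ from run to run, which is why the sketch compares them instead of showing fixed values.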

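So that’s what a memory address is, but what about an address space? An address space is the range of memory addresses available to something, such as a running process. On modern operating systems, each process typically gets its own virtual address space.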

There is a security technique called ASLR, or Address Space Layout Randomization, which randomizes where things such as the stack, the heap, and libraries are placed in a process’s address space. This makes memory locations harder to predict, which makes memory-related attacks such as buffer overflow exploits harder to carry out.

Memory leak – if your program allocates memory but doesn’t free it when it’s done, it can use more and more RAM as time goes on, eventually running out of memory or crashing. At the very least, it will use far more resources than it needs to. In languages with manual memory management, like C++, every new needs a matching delete. Setting a pointer to null does not free anything in C++; it only discards the address, so do it after the delete, not instead of it. In a garbage-collected language, on the other hand, setting a reference to null is one way to let the collector reclaim an object. When you create things, like variables or instances of your own classes, you might use them briefly and then never use them again. In that case, it’s important to release them so they don’t keep occupying RAM. In a garbage-collected language this is less of a problem, although memory can still pile up if you hold on to references you no longer need.
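Here is a rough sketch of how a leak can build up even in a garbage-collected language, assuming CPython and its standard tracemalloc module. The names leaky_cache and handle_request are hypothetical and only there for illustration: the function keeps every payload in a list that nothing ever reads again, so memory grows until the references are dropped.

```python
import tracemalloc

# Hypothetical cache that only ever grows: a common source of leaks.
leaky_cache = []

def handle_request(payload):
    # Bug: every payload is kept forever, even though it is never read again.
    leaky_cache.append(payload)
    return len(payload)

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()

for _ in range(1000):
    handle_request(bytearray(10_000))   # about 10 KB per call

after_leak, _ = tracemalloc.get_traced_memory()

leaky_cache.clear()                     # the fix: drop references you no longer need
after_fix, _ = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"grew by about {(after_leak - before) // 1024} KiB while leaking")
print(f"holding about {(after_fix - before) // 1024} KiB after clearing")
```

The exact numbers depend on your Python version and platform, but the first figure should be in the thousands of KiB and the second close to zero.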

There are many ways to monitor memory usage. On Windows, use Task Manager. A quick way to open it is to press Ctrl+Shift+Esc. On macOS, you can use htop, top, or Activity Monitor. You can find Activity Monitor by going to Launchpad and searching for it, or in Finder, you can go to Applications and then Utilities. htop doesn’t come with macOS by default, but if you install Homebrew, you can install it with the command brew install htop. On Linux, you can use htop or your desktop’s system monitor. You can also use command line tools like free or vmstat.

In addition to basic system monitoring tools, Valgrind is a tool for finding memory leaks and other memory errors in compiled programs, such as those written in C or C++.

There is a kind of security vulnerability related to freed memory called use-after-free, where a program keeps using memory after it has been freed. Beginners don’t need to focus on more advanced concepts like that yet.

Garbage collection – some languages automatically manage memory for you so that you don’t have to deal with deletions or destructors. This only applies to RAM, not to files on disk. Some people say that garbage collection makes you lazy, and others say it’s good because you’re less likely to have memory leaks. Many popular newer languages, such as Java, Python, and JavaScript, have garbage collection, while older languages like C and C++ do not, though there are exceptions in both directions.
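To see automatic cleanup in action, here is a minimal sketch assuming CPython. The Report class is hypothetical. A weak reference lets us check whether the object still exists without keeping it alive ourselves.

```python
import gc
import weakref

class Report:
    """Hypothetical class standing in for any object your program creates."""
    def __init__(self, name):
        self.name = name

report = Report("quarterly")
watcher = weakref.ref(report)    # a weak reference does not keep the object alive

alive_before = watcher() is not None

del report                       # remove the only strong reference
gc.collect()                     # ask the collector to run now (normally automatic)

alive_after = watcher() is not None
print(alive_before, alive_after)
```

Notice that there is no explicit call to free the object. Once nothing refers to it anymore, the runtime reclaims it on its own.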

It’s like the automatic vs. manual transmission debate. Some people say they think they can do it better on their own, while others would rather have a machine do it for them.

On the whole, though, I think automation and convenience will win out over the old school hard and “right” way of doing it.

Memory address – the location of something in RAM, typically a 32-bit or 64-bit number. You will see addresses a lot more in lower-level languages like assembly, C, and C++. They come up less often in more abstracted languages such as Python or JavaScript, though it’s still possible to see them.

Pointer – a variable that stores a memory address. Pointers are used directly in C and C++, while Java, Python, and JavaScript use references and don’t expose raw pointers. A lot of CS students hate pointers because they’re easy to mess up. This will be covered more in the C++ chapter, as it’s a complicated topic.

Pointer dereference – getting whatever is stored at the memory address a pointer holds. In C and C++, an asterisk (*) dereferences a pointer, and an ampersand (&) gets the address of a variable.
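Python doesn’t normally expose pointers, but its standard ctypes module can imitate them, which makes for a rough sketch of what taking an address and dereferencing look like. The comments show the approximate C++ equivalent, and the variable names are only for illustration.

```python
import ctypes

number = ctypes.c_int(42)            # a C int living at some memory address
address = ctypes.addressof(number)   # roughly what &number gives you in C++
pointer = ctypes.pointer(number)     # roughly: int* pointer = &number;

value = pointer.contents.value       # dereference, like *pointer in C++
print(hex(address), value)

pointer.contents.value = 7           # write through the pointer, like *pointer = 7;
print(number.value)                  # the original variable sees the change
```

Writing through the pointer changes number itself, because the pointer holds the address of number and not a copy of its value.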

Namespace – a named container for identifiers, which determines what a name refers to. A name can be something like the identifier for a variable or function. In C++, you will probably be using the standard namespace, std, most of the time, but if you are using an external library, you might use a different one.

:: – the scope resolution operator, used in C++ and some other languages. To avoid namespace collisions, you can write a namespace or package name followed by :: and then the function name. Let’s say you’re using two third-party libraries for your program, science_lib and math_lib, and they both have a function called calculate(). Without specifying the namespace, there would be a namespace collision. A namespace collision means that there are two things with the same name and the compiler can’t figure out which one you want to use.

In this example, you would write either science_lib::calculate() or math_lib::calculate() to disambiguate. If you just called calculate() on its own, the call would be ambiguous and wouldn’t work.
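The same idea can be sketched in Python, which qualifies names with a dot where C++ uses ::. The two objects below are hypothetical stand-ins for the science_lib and math_lib libraries from the example, and their calculate() bodies are made up. One difference worth noting: where C++ would reject an ambiguous call, Python silently lets the later name replace the earlier one.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for the two third-party libraries in the example.
# In a real project these would be separate modules you import.
science_lib = SimpleNamespace(calculate=lambda x: x * 9.81)
math_lib = SimpleNamespace(calculate=lambda x: x ** 2)

# Qualifying the name makes it clear which calculate() is meant.
# Python writes this with a dot; C++ would write science_lib::calculate(3).
science_result = science_lib.calculate(3)
math_result = math_lib.calculate(3)
print(science_result, math_result)

# If both were bound to the same bare name, the second would replace the first.
calculate = science_lib.calculate
calculate = math_lib.calculate
shadowed_result = calculate(3)   # uses math_lib's version
```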

Pass by value – the function receives a copy of the value, not the original variable. If x = 2 and you pass x by value, the function is only dealing with its own copy of 2, so any changes it makes don’t affect x. Depending on what you are doing, this may or may not be what you want.

Pass by reference – the function receives a reference to the original variable rather than a copy. If you pass something by reference, any changes the function makes affect the original, and they persist after the function returns.
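Python doesn’t strictly use either model (arguments are references to objects), so treat the following as an approximation of the two behaviors and not an exact match for C++. Reassigning a number inside a function leaves the caller’s variable alone, much like pass by value, while changing a list in place is visible to the caller, much like pass by reference. The function names are hypothetical.

```python
def add_one_to_number(n):
    n = n + 1            # rebinds the local name only; the caller's variable is untouched
    return n

def add_one_to_list(values):
    values.append(1)     # changes the same list object the caller refers to

x = 2
returned = add_one_to_number(x)

items = [0]
add_one_to_list(items)

print(x, returned, items)   # prints: 2 3 [0, 1]
```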
