At some point or another, your code will need to do something over a network instead of locally. It’s good to be aware of networking concepts, but again, your first programs probably won’t have this much complexity to begin with.
TCP/IP model – a network stack consisting of different layers, each with different associated protocols. TCP stands for Transmission Control Protocol, which is a layer 4 transport protocol. IP stands for Internet Protocol and is a layer 3 protocol. There are many more protocols in this stack, but it’s still referred to as TCP/IP.
Here are the layers in TCP/IP:
Application (7, 6, 5)
Transport (4)
Internet (3)
Network access (2, 1)
The numbers in parentheses show which OSI model layers correspond to each TCP/IP layer.
OSI model – stands for Open Systems Interconnection. It’s similar to the TCP/IP model. Here are the layers:
7. Application
6. Presentation
5. Session
4. Transport
3. Network
2. Data link (with LLC (Logical Link Control) and MAC (Media Access Control) sub-layers)
1. Physical
IP addresses are associated with layer 3. MAC addresses are layer 2.
When you are sending data from your device to some other device over a network, it starts from layer 7 and works its way down, adding encapsulation along the way. Network encapsulation, not to be confused with object-oriented programming encapsulation, is when a piece of data has headers and trailers added to it, kind of like putting a letter in an envelope, and then another envelope, and another one, and then mailing it. When your device receives data, it comes with all these headers and trailers, so what your device does is take it from layer 1 to layer 7, stripping away headers and trailers, kind of like taking mail out of an envelope, except that it has to take the mail out of multiple envelopes instead of just one.
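To make the envelope analogy concrete, here is a toy sketch of encapsulation and decapsulation. The bracketed header strings, the addresses, and the function names are all made up for readability; real headers are binary and carry many more fields.

```python
# Toy illustration only: each layer wraps the data handed down from the layer above.
TCP_HEADER = "[TCP dst_port=80]"            # layer 4 header (hypothetical text form)
IP_HEADER = "[IP dst=203.0.113.5]"          # layer 3 header (documentation address)
ETH_HEADER = "[ETH dst=aa:bb:cc:dd:ee:ff]"  # layer 2 header (made-up MAC address)
ETH_TRAILER = "[FCS]"                       # layer 2 trailer (frame check sequence)


def encapsulate(message):
    """Sending side: work down the stack, adding one envelope per layer."""
    segment = TCP_HEADER + message
    packet = IP_HEADER + segment
    frame = ETH_HEADER + packet + ETH_TRAILER
    return frame


def decapsulate(frame):
    """Receiving side: work up the stack, removing one envelope per layer."""
    packet = frame[len(ETH_HEADER):-len(ETH_TRAILER)]
    segment = packet[len(IP_HEADER):]
    message = segment[len(TCP_HEADER):]
    return message


frame = encapsulate("hello")
print(frame)
print(decapsulate(frame))
```

The receiving side undoes the wrapping in the opposite order from the sending side, which is the same idea as opening the outermost envelope first.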
Router – I should make the distinction between a router in general, and a home router, which has more features than normal. A router by itself is in charge of routing, which means forwarding packets between networks. Routers use routing tables to keep track of where things are. A routing table lists destination networks (ranges of IP addresses) along with the physical port, or interface, used to reach each one (not to be confused with software ports). So when a router receives a packet, it looks up the destination IP address in its routing table to decide which port to send it out. Routers deal with IP addresses.
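Here is a minimal sketch of a routing table lookup using Python’s standard ipaddress module. The networks and the interface names (eth0, eth1, eth2) are hypothetical. When more than one entry matches, the sketch picks the most specific one (the longest prefix), which is how real routers choose between overlapping routes.

```python
import ipaddress

# Hypothetical routing table: (destination network, outgoing interface).
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "eth1"),
    (ipaddress.ip_network("10.0.0.0/8"), "eth2"),
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),  # default route, matches everything
]


def pick_interface(destination):
    """Return the interface of the most specific route that contains the address."""
    address = ipaddress.ip_address(destination)
    matches = [(network, port) for network, port in ROUTES if address in network]
    best_network, best_port = max(matches, key=lambda match: match[0].prefixlen)
    return best_port


print(pick_interface("192.168.1.50"))  # eth1
print(pick_interface("8.8.8.8"))       # eth0
```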
A home router is a Swiss army knife and often has a DHCP server, wireless access point, wired switch, a web server for the web interface (like if you go to something like 192.168.1.1 in your browser), firewall, and more. Sometimes they even have file server software and USB ports so you can plug in a hard drive or flash drive and share it on the network. Not all routers have the same features as home routers. A larger-scale enterprise might have separate and dedicated devices, like a Ubiquiti wireless access point, a Cisco router, an HP switch, and a Fortinet firewall. While a home router does have lots of features all in one device, they usually aren’t as good as separate dedicated devices.
Switch – a switch builds up something called a CAM table, or Content Addressable Memory table, which has a list of MAC addresses and which physical switch ports they correspond to. A CAM table is gradually built up over time as devices use the network, because the switch notes the source MAC address of every frame it receives. When a switch receives a frame, it looks in its table to see if it knows which port to send it out. If not, it floods the frame out every port except the one it came in on, and it learns where the destination is once that device sends something back. Routers are concerned with IP addresses, and switches deal with MAC addresses.
Switches are useful within a local area network. You can connect devices or make things faster, such as by using link aggregation (which some vendors call trunking), which combines multiple Ethernet ports and cables to form one faster connection.
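The learning behavior of a switch can be sketched in a few lines. This is a simplified model, not how any particular vendor implements it: the Switch class, its method names, and the short MAC strings are made up for illustration, and flooding here means sending out every port except the one the frame arrived on.

```python
class Switch:
    """Toy model of a learning switch (hypothetical class, for illustration)."""

    def __init__(self, port_count):
        self.ports = list(range(1, port_count + 1))
        self.cam_table = {}  # MAC address -> physical port number

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the sender's port, then return the list of ports to send the frame out."""
        self.cam_table[src_mac] = in_port
        if dst_mac in self.cam_table:
            return [self.cam_table[dst_mac]]
        # Unknown destination: flood out every port except the incoming one.
        return [port for port in self.ports if port != in_port]


switch = Switch(4)
print(switch.receive(1, "aa", "bb"))  # [2, 3, 4] because "bb" is unknown
print(switch.receive(2, "bb", "aa"))  # [1] because "aa" was learned on port 1
print(switch.receive(1, "aa", "bb"))  # [2] because "bb" was learned on port 2
```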
One of my professors summed it up quite succinctly: “switch when you can, route when you must.”
Modem – stands for modulator-demodulator. A modem is a device which is concerned with the physical signaling of network communications, getting your ISP’s signals from coaxial cable and converting them into something your home network can use. However, modems also have MAC addresses. These have to be registered with your ISP. That’s because if a modem with an unregistered MAC address were allowed to get internet access, many people could steal free internet that way. So modems are partially for signaling, but they are also useful for ISPs protecting their bottom line. Sometimes, a modem might even have a web interface, such as 192.168.100.1.
For your ISP, the only MAC address that matters is your modem’s MAC address. With IP addresses, the source IP address belongs to the device sending a packet, and the destination IP address belongs to the device that receives it. But for every hop, even an intermediary one, the source MAC address is the device that sent the frame on that hop (not the original sender), and the destination MAC is the next hop. So the source and destination MAC addresses constantly change along the way, while the source and destination IP addresses generally stay the same (network address translation, covered below, is the main exception).
Gateway (device) – a gateway is a combination of a modem and a router, and sometimes has a wireless access point built-in too. Sometimes, ISPs will rent gateways out to customers, but they have complete control over it, and also charge you monthly for it. If you want more control over your network and don’t want a monthly fee, then you can get your own modem and router, and possibly install third-party firmware on it, such as DD-WRT. Gateways are better options for less technical people, but if you’re willing to do more in-depth stuff, it’s worth looking into a different option.
Default gateway – the IP address your computer sends something to if it doesn’t know how to get to the destination. If you want to go to google.com, your computer doesn’t know how to get there, but it does know your default gateway, which is usually your router, such as 192.168.1.1.
Port – a networking port (software, not physical ports) can be 0-65535. Ports 0-1023 are reserved for well-known services, and application developers can use higher-numbered ports. An IP address alone isn’t enough to facilitate data exchange. You also need to know which service on that device the data is for. Some common examples of ports include port 80 for HTTP, port 443 for HTTPS, and port 22 for SSH. When you go to http://example.com in a browser, it is implied that you’re using port 80. If you go to https://example.com, you will be using port 443. Sometimes hackers will do port scans to see what services are running on a device.
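As a small sketch of the implied port idea, the function below works out which port a URL would use. The function name effective_port is made up for this example; urlsplit, HTTP_PORT, and HTTPS_PORT come from Python’s standard library.

```python
from http.client import HTTP_PORT, HTTPS_PORT  # 80 and 443
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": HTTP_PORT, "https": HTTPS_PORT}


def effective_port(url):
    """Return the explicit port in the URL, or the scheme's default if none is given."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("http://example.com"))        # 80
print(effective_port("https://example.com"))       # 443
print(effective_port("http://example.com:8080/"))  # 8080
```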
Socket – a socket is a combination of an IP address, a colon, and a port number. For example, if you have a local web server stack, such as WAMP, installed and running on your computer so you can do web development without a dedicated server, you might visit something like 127.0.0.1:3000 in your browser.
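Here is a rough sketch of sockets in action, using only the loopback address so nothing leaves your computer. Binding to port 0 asks the operating system to pick any free port. The function name loopback_demo is made up, and a real program would add error handling and usually a loop around recv.

```python
import socket


def loopback_demo():
    """Open a server socket on 127.0.0.1, connect to it, and pass one message."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 means "let the OS choose a free port"
    server.listen(1)
    ip, port = server.getsockname()  # the socket: an IP address plus a port

    client = socket.create_connection((ip, port))
    connection, client_address = server.accept()

    client.sendall(b"hello")
    received = connection.recv(1024)

    for sock in (client, connection, server):
        sock.close()
    return ip, port, received


ip, port, received = loopback_demo()
print(f"{ip}:{port} received {received!r}")
```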
Packet – a piece of data that is used for sending and receiving data over a network. It is concerned with source and destination IP addresses. Instead of sending a big file over a network all in one go, it is broken up into lots of small packets.
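The idea of breaking data into small pieces can be sketched like this. It is a simplification: the dictionary layout, the function names, and the sequence numbers are assumptions for illustration, and in a real network the splitting and reordering are handled by the protocol stack rather than by your own code.

```python
def split_into_packets(data, payload_size):
    """Break data into numbered chunks of at most payload_size bytes."""
    packets = []
    for seq, start in enumerate(range(0, len(data), payload_size)):
        packets.append({"seq": seq, "payload": data[start:start + payload_size]})
    return packets


def reassemble(packets):
    """Put the chunks back in order, even if they arrived out of order."""
    ordered = sorted(packets, key=lambda packet: packet["seq"])
    return b"".join(packet["payload"] for packet in ordered)


packets = split_into_packets(b"abcdefghij", 4)
print(packets)
print(reassemble(reversed(packets)))
```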
IP address – IP stands for Internet Protocol. An IP address is a way to uniquely identify a device, such as a network printer, router, smartphone, or computer. There are two main versions: IPv4 and IPv6. IPv4 is older and uses 32-bit addresses, which means it can only support up to 4.3 billion (2^32) unique addresses. That’s not enough anymore. IPv6 uses 128-bit addresses, which means there are more addresses to go around: 2^128 or 340 undecillion of them. It’s a number so large that it’s hard to grasp just how many that is.
Octet – an IPv4 address is really just 32 1s and 0s. We use 4 octets, which are groups of 8 bits converted into decimal numbers that we can understand easily, just to make it easier to read. After all, 192.168.1.1 is relatively easy to read and understand, but 11000000101010000000000100000001 is a bit more cumbersome to deal with. They both refer to the same IP address though, just represented in different ways. In 192.168.1.1, the four octets are 192, 168, 1, and 1. Because an octet is only 8 bits, that means it can only have 2^8, or 256, unique values. But because you start with 0, the possible values of an octet are 0-255.
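The address counts are easy to check yourself, since Python handles arbitrarily large integers:

```python
ipv4_addresses = 2 ** 32   # 32-bit addresses
ipv6_addresses = 2 ** 128  # 128-bit addresses

print(f"{ipv4_addresses:,}")  # 4,294,967,296
print(f"{ipv6_addresses:,}")  # 340,282,366,920,938,463,463,374,607,431,768,211,456
```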
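Converting between the two representations takes only a couple of lines. The function names here are made up for the example, and the sketch assumes a well-formed IPv4 address with no validation.

```python
def to_binary(address):
    """Turn dotted decimal into a 32-character string of 1s and 0s."""
    return "".join(f"{int(octet):08b}" for octet in address.split("."))


def to_dotted(bits):
    """Turn 32 bits back into four decimal octets."""
    return ".".join(str(int(bits[i:i + 8], 2)) for i in range(0, 32, 8))


print(to_binary("192.168.1.1"))                       # 11000000101010000000000100000001
print(to_dotted("11000000101010000000000100000001"))  # 192.168.1.1
```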
LAN – Local Area Network. Your home network, which is separate from the internet. You can leave your LAN to get to the internet (by sending stuff to your default gateway, which is usually your home router), but internal IP addresses on a LAN are not accessible via the internet.
WLAN – Wireless Local Area Network. Home wifi is an example of a WLAN.
WAN – Wide Area Network. A big, long-distance network. The internet is a WAN, though you could technically have a WAN that is not on the public internet.
4G/LTE/5G – your laptop connects to a wireless router, but a phone connects to a cell tower. They use different underlying technology, but the end result is the same: being able to access the internet. Cell tower connections are often slower and more expensive than wired broadband or wifi. You will often have lower speeds and lower data caps as well. For example, with a Comcast internet connection, you can get 1TB of data per month, which is quite a lot.
“Unlimited” data plans are false advertising. They are actually quite limited. I looked up one cell carrier that throttles “unlimited” customers after about 20GB. This might seem off-topic for the book, but you really should be aware of how economics plays into technology. It’s not all about code and whatnot. If you decide to become a mobile app developer, you should try to make your app use as little data as possible because mobile users have significantly less data and lower speeds than desktop/laptop users. You also need to consider mobile users when you’re making a website. A web page that might load very quickly on a laptop on a fast wifi connection might take a while to load on a phone, so compressing and optimizing things is very important.
Network address translation – your router has an IP address like 192.168.1.1, and your neighbor can have the same exact address for their router. That’s because that’s a private IP address, which is not something that can be used on the public internet. You use LAN addressing internally and then use NAT, or Network Address Translation, so that you can translate between your internal IP addresses and your external IP address. A single home with multiple devices needs a way to uniquely address each of them internally, but your ISP will only give you one IP address at a time, and that’s where NAT comes in.
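Here is a toy sketch of the bookkeeping behind NAT. It models the common home-router variant where many internal devices share one public address and are told apart by port number. The Nat class, the starting port 40000, and the addresses are all assumptions for illustration; real NAT also tracks protocols, timeouts, and the remote endpoint.

```python
class Nat:
    """Toy translation table: one public IP shared by many private devices."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.next_port = 40000  # arbitrary starting point for public-side ports
        self.table = {}         # public port -> (private IP, private port)

    def outbound(self, private_ip, private_port):
        """Record a mapping and return the public address and port the internet sees."""
        public_port = self.next_port
        self.next_port += 1
        self.table[public_port] = (private_ip, private_port)
        return self.public_ip, public_port

    def inbound(self, public_port):
        """Find which internal device a reply belongs to (None if there is no mapping)."""
        return self.table.get(public_port)


nat = Nat("203.0.113.7")
print(nat.outbound("192.168.1.10", 51000))  # ('203.0.113.7', 40000)
print(nat.outbound("192.168.1.11", 51000))  # ('203.0.113.7', 40001)
print(nat.inbound(40001))                   # ('192.168.1.11', 51000)
```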
Private addresses – the following ranges can only be used in internal LANs, not on the internet:
10.0.0.0 to 10.255.255.255
172.16.0.0 to 172.31.255.255
192.168.0.0 to 192.168.255.255
You can use a private address, and your neighbor or friend can use the same one. They don’t interfere with each other because they are only locally relevant. Your house has a master bedroom (a locally relevant address), but so do other houses. But they are not the same master bedroom. My 192.168.1.1 is not the same as your 192.168.1.1. But 8.8.8.8 (a server owned by Google) is a public IP address, so it’s the same no matter where you go.
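If you ever need to check in code whether an address falls in one of those three private ranges, a sketch like this works. It uses Python’s standard ipaddress module; the function name is made up and deliberately checks only the three ranges listed above.

```python
import ipaddress

# The three private ranges listed above, written in CIDR notation.
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def in_private_range(address):
    """Return True if the address is inside any of the three private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


print(in_private_range("192.168.1.1"))  # True
print(in_private_range("8.8.8.8"))      # False
```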
localhost – 127.0.0.0 to 127.255.255.255, also known as the loopback range. These are addresses on your local computer, not on a LAN or the internet. Most people use 127.0.0.1, but you don’t have to stick to just that. However, keep in mind that all 0s in the host portion of an IP address is the network address, and all 1s in the host portion of the address is the broadcast address. Because the loopback network is 127.0.0.0 with 8 bits for the network and 24 bits for hosts (a subnet mask of 255.0.0.0), the 2nd, 3rd, and 4th octets are all host. So with all 0s in binary, that would give you 0.0.0, hence 127.0.0.0, which is the network address, which can’t be used for a host. All 1s in binary in the host octets would result in 255.255.255, for a broadcast address of 127.255.255.255, which is also not a host address. So the real usable localhost addresses, which you can try out with ping, are 127.0.0.1 to 127.255.255.254. If you’re running a local server, such as a LAMP server for local web development before you push to production on an actual internet-facing web server, you can visit it using localhost, 127.0.0.1, 127.0.0.2, 127.0.0.3, and so on. They all point back to your own computer, though whether addresses other than 127.0.0.1 work out of the box depends on your operating system and on which address the server is listening on.
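The network address, broadcast address, and usable count for the loopback range can be checked with Python’s standard ipaddress module. This is just a quick sketch of the arithmetic described above.

```python
import ipaddress

loopback = ipaddress.ip_network("127.0.0.0/8")  # 8 network bits, 24 host bits

network_address = loopback.network_address      # all host bits set to 0
broadcast_address = loopback.broadcast_address  # all host bits set to 1
usable = loopback.num_addresses - 2             # everything in between

print(network_address)    # 127.0.0.0
print(broadcast_address)  # 127.255.255.255
print(usable)             # 16777214
print(ipaddress.ip_address("127.0.0.2").is_loopback)  # True
```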
If you have a local server running on your computer but it’s configured to accept remote connections, you might visit it with localhost or 127.0.0.1 on your computer, but someone else on the same LAN might visit it using something like 192.168.1.124, which is the internal (LAN-only) IP address assigned to that computer by a DHCP server (such as a home router).
0.0.0.0 – if you want network traffic to go nowhere, send it to 0.0.0.0. Some ways of blocking sites include editing your computer’s hosts file and adding an entry like this:
0.0.0.0 facebook.com
The location of the hosts file on Windows is C:\Windows\System32\Drivers\etc\hosts. If you want to edit it on Windows, you will need to right click on Notepad and hit “Run as administrator.” The hosts file on Linux is at /etc/hosts. You will have to edit it with root privileges, such as sudo vim /etc/hosts. On a Mac, it’s also located at /etc/hosts, and you will have to edit it with elevated privileges, doing something like sudo nano /etc/hosts.
Just keep in mind that some websites will have multiple subdomains, so if you really want to block a website with a hosts file, you might have to make multiple entries, one for each subdomain. For example:
0.0.0.0 www.example.com
0.0.0.0 example.com
0.0.0.0 m.example.com
0.0.0.0 something.example.com
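To show why each subdomain needs its own entry, here is a small sketch that parses hosts-style text into a lookup table. The parse_hosts function and the sample text are made up for this example, and it reads a string rather than your real hosts file. Lookups are exact matches, so a name that isn’t listed isn’t blocked.

```python
def parse_hosts(text):
    """Build a name -> address table from hosts-file style lines."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping[name] = address
    return mapping


SAMPLE = """
# blocked sites
0.0.0.0 www.example.com
0.0.0.0 example.com
0.0.0.0 m.example.com
"""

hosts = parse_hosts(SAMPLE)
print(hosts.get("m.example.com"))     # 0.0.0.0
print(hosts.get("blog.example.com"))  # None, so this subdomain is not blocked
```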
DNS – a way to resolve domain names to IP addresses. A domain name doesn’t tell you where the server is. A DNS server is like a phone book.
DHCP – Dynamic Host Configuration Protocol. A way to automatically obtain an IP address configuration without having to set it up yourself. A DHCP server is in charge of giving out IP addresses on a LAN. A home router has a DHCP server built into it.
Frame – a piece of data on a network that is concerned with source and destination MAC addresses. Frames are associated with switching. Frames contain packets.
MAC address – a 48-bit physical address assigned to a device’s network interface. Unlike an IP address, it normally doesn’t change. MAC stands for Media Access Control.
Subnet – a subnet is a network split into smaller networks.
Subnet mask – separates the network portion of the IP address from the host portion.
To understand the differences between network portion and host portion, think of it like this: let’s say there’s a street called Lincoln Blvd, and there is a specific block on the street where the houses are 1301, 1303, 1305, 1307, 1309, and 1311. In that case, you could say 1300 is the network portion and the remaining part is the host portion. A network portion is the same for all devices on the same network. A host portion is unique to the host device.
A router can figure out roughly where something should be routed based on its network portion. If you know a house is on Lincoln Blvd and has 1300 in its number, you know roughly where to drive to get there. Only when you’re on the specific street block, or network, do you look for the host portion.
I think it’s easier to understand subnet masks when you look at them in binary:
255.0.0.0: 11111111000000000000000000000000
255.255.0.0: 11111111111111110000000000000000
255.255.255.0: 11111111111111111111111100000000
The zeroes are host bits, and the 1s are network bits. You can have networks of different sizes, and the subnet mask tells you the size.
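Here is a short sketch of how a subnet mask splits an address, using a bitwise AND. The function name split_address is made up, and the host portion is returned as a plain number for simplicity.

```python
import ipaddress


def split_address(address, mask):
    """Return (network address as text, host portion as an integer)."""
    ip_bits = int(ipaddress.ip_address(address))
    mask_bits = int(ipaddress.ip_address(mask))
    network = ip_bits & mask_bits             # keep only the bits where the mask is 1
    host = ip_bits & ~mask_bits & 0xFFFFFFFF  # keep only the bits where the mask is 0
    return str(ipaddress.ip_address(network)), host


print(split_address("192.168.1.124", "255.255.255.0"))  # ('192.168.1.0', 124)
print(split_address("192.168.1.124", "255.255.0.0"))    # ('192.168.0.0', 380)
```

The same address gives a different network and host portion depending on the mask, which is why an IP address on its own doesn’t tell you how big the network is.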
Topology – the layout of a network.
Network segmentation – breaking up a network into smaller pieces for administration, privacy, or security. If you have a company with many different departments, each can have a separate network segment which is not allowed to communicate with the other ones.
Flat network – a terrible way to set up a network is to have a flat network topology. This means that everything is on the same network, with no separation or hierarchy whatsoever. Every device on the network can talk to every other device without any restrictions. This can lead to broadcast storms, and it can also allow malware to propagate through the entire network, such as wormable ransomware. Wormable means malware that can spread itself over a network. A flat network is a lazy and insecure way of doing things. Many home networks have flat topologies, but it’s inexcusable for a business or school to have one. A better alternative to a flat network is to set up subnets and firewall rules.
Let’s say there’s a business with a waiting room and free wifi for customers. The business doesn’t want its private business assets to be accessible by random customers, so it creates a separate subnet and firewall rules for the public wifi. The firewall rules are set up so that traffic from the public subnet is not allowed into the business subnet. VLANs are another way to add network segmentation.
NTP – Network Time Protocol. Back in the old days, people would have to manually set many different clocks, and they might be off by anywhere from a few seconds to a minute. That method was manual, error-prone, and couldn’t automatically account for things like daylight saving time, or resetting after a power outage (on certain appliances). But now, instead of you setting the time on every device, each device can contact an NTP server to ask it what time it is. If all your devices use the same NTP server, their clocks will stay closely in sync.
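The waiting room example boils down to one rule, which can be sketched like this. The two subnets and the is_allowed function are hypothetical, and a real firewall is configured on the network device itself rather than written in Python; this only shows the logic of the rule.

```python
import ipaddress

# Hypothetical subnets for the example business.
PUBLIC_WIFI = ipaddress.ip_network("192.168.50.0/24")
BUSINESS = ipaddress.ip_network("192.168.10.0/24")


def is_allowed(source, destination):
    """Block traffic from the public wifi subnet into the business subnet."""
    src = ipaddress.ip_address(source)
    dst = ipaddress.ip_address(destination)
    if src in PUBLIC_WIFI and dst in BUSINESS:
        return False
    return True


print(is_allowed("192.168.50.23", "192.168.10.5"))  # False: customer to business asset
print(is_allowed("192.168.50.23", "8.8.8.8"))       # True: customer to the internet
```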