Now that you have *AMP (and MySQL Workbench if you’re on Windows) set up on your computer, it’s time to learn about SQL queries. SQL is Structured Query Language, so queries are the core focus of the language. Some people say S-Q-L, while others pronounce it as “sequel.” There are just a handful of queries that you will need to learn to get started. I’ve only listed the most important queries in this book, but there are some more advanced ones you might want to learn later on. But this is an introduction rather than a super in-depth book about SQL. SQL queries must end in semicolons.
- CREATE DATABASE – the CREATE DATABASE statement will let you make a new database. Also, you don’t need to have statements be in all caps, but it’s a common convention to do so. But keep in mind that only the statement-related words should be uppercase, not things like variable or table names.
- For this book example, let’s create a database to mess around in for learning purposes.
- In the Query 1 window (on Windows), or at the mysql> prompt in Linux and macOS (following the instructions from before for how to get to it), type the following text:
CREATE DATABASE learning;
Then click the thunderbolt icon (Windows) or hit enter (Linux and macOS), which will run the query.
IF NOT EXISTS – you can do something like CREATE DATABASE IF NOT EXISTS learning; to only create a database called learning if there isn’t one there already.
- USE – USE will let you switch to a certain database.
In MySQL Workbench (Windows), for every new query you make, you first need to delete the text that was in it and ran for a previous query. This is not a problem for macOS or Linux, as you will just be typing in queries at the mysql> prompt directly.
Type and run the following query:
USE learning;
Then hit the thunderbolt icon (Windows) or hit enter (Linux and macOS).
Windows: you can tell if it was successful or not based on the bottom section of the screen that says “Output.” A green checkmark means the query was successful. If you see a red circle with an X in it, that means your query failed.
And for future reference, if you close MySQL Workbench (Windows), or if you quit out of mysql> (Linux and macOS), such as if you turn off or restart your computer, you will need to use the USE statement again to get back to where you were before. If you try to use a table without having selected a database to use with USE database_name, you will get an error message. Because this chapter uses two different database examples, and if you’re spacing out this chapter over multiple days rather than all at once, you will have to remember to use the proper USE query so that you will be able to continue following the commands in this chapter.
DROP DATABASE and SHOW DATABASES – if you want to delete a database, you will have to use the DROP DATABASE some_name statement.
Firstly, create a new example database that will be deleted to demonstrate this statement:
CREATE DATABASE delete_this;
Windows only instruction: before every query, delete the previous contents in the Query 1 window.
Hit the thunderbolt icon again (in Windows) or hit enter (in Linux or macOS) to run it.
Now, to check that a new database was made, use the following query:
SHOW DATABASES;
Here’s what you should see:
delete_this
information_schema
learning
mysql
performance_schema
sys
Now run the following query:
DROP DATABASE delete_this;
Now run the following query to see that the database was deleted:
SHOW DATABASES;
And sure enough, you should see that the delete_this database is now gone:
information_schema
learning
mysql
performance_schema
sys
CREATE TABLE and DROP TABLE – now that you know how to create, view, use, and delete databases, it’s time to make a table. You can have many tables within a single database. In Microsoft Excel, when you make a workbook, you can have multiple sheets within the same workbook, which are like different pages, but they are all still within the same project. Spreadsheets are basically just overly-simplified databases.
Let’s create a temporary table called temp, just so it can be deleted so you know how to drop tables:
CREATE TABLE temp_table(
example_column CHAR(5)
);
On Windows, MySQL Workbench will show no extra stuff between the lines. But on Linux and macOS, it will look like this (with arrows indicating a continued query that spans multiple lines):
mysql> CREATE TABLE temp_table(
-> example_column CHAR(5)
->);
They’re the same thing though.
On Windows, MySQL Workbench shows the status of the queries in the bottom portion of the screen called Output. But on Linux, it will give you a message, like the following:
Query OK, 0 rows affected (0.19 sec)
Now drop it with the following query:
DROP TABLE temp_table;
Now temp_table no longer exists.
Let’s say you want to make an online shop. It wouldn’t make sense to have customers and products in the same table. So you’d have one table for customers and one for products.
Now let’s create a table called users:
/*creating a new table*/
CREATE TABLE users (
userID INT NOT NULL PRIMARY KEY AUTO_INCREMENT,
phone_number CHAR(10),
username VARCHAR(30) NOT NULL,
email VARCHAR(20) NOT NULL,
confirmed_email BOOLEAN NOT NULL,
signup_date DATE NOT NULL,
user_password VARCHAR(32) NOT NULL,
account_locked BOOLEAN NOT NULL
);
A lot of new concepts were introduced with the above query. In the next couple pages, I will explain each new concept individually.
Comments – in SQL, there are many different ways to write comments. Use /*comment goes here*/ for multi-line comments or — for single-line comments. Later in the chapter, you’ll learn about how commenting out the end of a query can be used in SQL injection.
users (); – when you use the CREATE TABLE statement, you have to provide a name for the table, and then in parentheses (on different lines, if you want), you can specify the columns to create in the database. And like any query, it must end in a semicolon.
userID INT – this means to create a column in the new table called userID with a data type of INT. You can optionally specify the size of a column, at least for some data types. After the data type and optional length, you can also list additional options for a column, such as NOT NULL and/or PRIMARY KEY. If you specify multiple columns in the same table, you will need to separate them with commas. But the last column in the list of the table creation query cannot end in a comma, or else you will get a syntax error when you try to run the query.
NOT NULL – if something says NOT NULL, that means it always needs a value. So it can’t be left blank. If something is mandatory in your data, make it NOT NULL.
PRIMARY KEY – a unique identifier. Some people have the same first name. Some people have the same last name. Two products on Amazon could be called the same thing, like printer paper. So some identifiers might not be unique. But the thing that always needs to be unique is a PRIMARY KEY. In the US, a social security number is kind of like a primary key. If there are two types of apples called gala apples at a grocery store, one might have a primary key (aka SKU or product ID) of 13453453 and the other might have a key of 645345342. This is how you can tell them apart. Yes, even groceries and inventory are managed using databases. The entire world relies on databases.
One important thing to remember about primary keys is that there can only be one primary key per table. It would not make sense for a person to have two separate social security numbers, and it wouldn’t make sense for a user record to have two separate primary keys.
When you call a customer support line, and they ask for your account number, and then say they’re “pulling up your info,” what they mean is that they’re using a front-end program that allows them to search a database by certain terms. An account number might be a primary key in a MySQL database, and by entering an account number, the front-end program might pass the customer support representative’s input to a database server that might be making a query that looks something like this (don’t run this though):
SELECT *
FROM users
WHERE account_number = 165443362;
AUTO_INCREMENT – for numeric data types, such as integers, AUTO_INCREMENT will increment them automatically for every record that gets created in that table. If you have a user with an ID of 1, the next user should be 2, then 3, and so on. That being said, the Open Web Application Security Project (OWASP) website mentions a security concern called “forced browsing” aka “predictable resource location.” Some might argue that randomized user IDs would be better than automatically incrementing them. But for the sake of simplicity, this demonstrative chapter will use AUTO_INCREMENT. More info about predictable resource locations here:
https://www.owasp.org/index.php/Forced_browsing
There are tons of different potential security issues that relate to databases, server back-end code, installed dependencies, and server operating systems. If security was easy, we wouldn’t be hearing about well-known companies getting hacked all the time in the news. But security is hard. I’ve mentioned little security tidbits throughout this writing, though this is not primarily a security book.
CHAR(10) – fixed length characters. Basically a string. The number in parentheses is the number of bytes. If you have less than 10 characters in this column with this length, it will pad the rest of it with spaces. The maximum length of a CHAR column is 255.
VARCHAR(30) – variable-length characters. A string that can resize. In the example for the table that was created previously, 30 means it’s up to 30 bytes long.
TEXT – text. A text column could be useful for things like a user comment, thread, tweet, blog post, user description/bio, report form, product description, and so on. A TEXT column can be longer than CHAR or VARCHAR.
ENUM – an enum type column in MySQL is like an enum in most other programming languages.
It would have to be something like this in a table creation query:
name_of_enum ENUM(‘value1’, ‘value2’, ‘value3’)
DECIMAL – a numeric type that can support non-integer numbers, such as 6.2 instead of just 6.
BOOLEAN and TINYINT(1) – if you want to represent true/false values in MySQL, you can declare something as a BOOLEAN column, or TINYINT(1). BOOLEAN doesn’t really exist per se, but when you create a column with a type of BOOLEAN, it actually creates a column with type TINYINT(1). 0 means false and 1 means true. It’s a tiny integer with a size of 1 because it can only be 0 or 1.
Warning messages for queries that still go through – if you were to create the users table with TINYINT(1) instead of BOOLEAN, it would result in the same exact thing, except that the Output section of MySQL Workbench (in Windows) would give you a warning saying that it’s a feature that will be deprecated in the future. But even though a warning is bad, it will still do it anyway. If it was super severe, it would just give you an error and not do anything at all. Warnings are for issues that are not very severe. You can tell when a MySQL Workbench query has a warning because it will have a yellow triangle icon with an exclamation point in it.
For Linux, it doesn’t seem to care about using BOOLEAN. It will automatically switch it to TINYINT(1) and give no warnings.
MySQL vs. SQL – SQL is supposed to be semi-generic, but you will find that DBMSes often add their own slight little customizations to things. So PostgresQL will be slightly different than SQLite, which will also be slightly different from MySQL. But they are all more or less the same, with only small differences. But just like in the C++ chapter where you learned about some OS-specific things, where not everything was compatible with every platform, it’s the same case here, but for database queries.
DATE – DATE is a data type in SQL that has the year, month, and day. There are many date-related data types, but for now, knowing one should be good enough.
DESCRIBE – all of the features of the lengthy and complicated table creation example query have been explained, so now it’s time to run a new query:
DESCRIBE users;
The above query will show you information about the users table. You can use DESCRIBE as a way to verify that you typed in the query that you actually wanted, rather than making a mistake for a data type or name or something. EXPLAIN does the same thing as DESCRIBE.
You might be wondering… what is the account_locked column for? Well, if it’s a social media type website, maybe you want the ability to ban users. Or if it’s an ecommerce site, where users can buy and sell things, you want the ability to lock someone’s account so that they can’t use it, such as if someone steals their account. In either case, just a Boolean is enough to make a simple system of disabling an account. Instead of deleting it, things like logins could fail if it checks the account_locked value associated with that account. If a customer has an account with payment information associated with it, then locking their account could mean preventing someone from changing their password, looking at the payment information, or buying things. The code to prevent those things would have to be on the business logic side of things, such as in PHP. But the users database would at least need a column that marks an account this way.
That’s just one example. You can get creative and do all sorts of things with databases and accounts.
UNIQUE – aside from primary keys, which are implicitly unique, you might want other columns in your table to be unique. Let’s say you’re making software that lets someone make blog posts. One piece of a blog post that isn’t a primary key that you might want to keep unique is the url. If your site is example.com, you don’t want 2 articles trying to have the same url, like example.com/article.html. So you could create suitable column, like so:
blogpost_url VARCHAR(20) UNIQUE,
DEFAULT – some columns might have default values. Have you ever made an account on a website and it had a default profile picture? Even one that looks sort of blank, that’s still an image. Maybe it was using a database with a column that had a default value for a url for a picture, and maybe it’s something like ‘/profile_pictures/default.png’.
Editing tables after you’ve made them – if you created a table and messed something up, don’t worry. You can edit the table. You can also change things later, like if you change your website and want to add new features, or maybe remove old ones.
ALTER TABLE – if you want to make some sort of alteration to a table, you’ll have to start your alteration query with ALTER TABLE and then follow it with one of many different possibilities: add, drop, or change.
ADD COLUMN – you can add columns to a table.
Run the following SQL query:
ALTER TABLE users
ADD test_column VARCHAR(2) NOT NULL;
Now view it with DESCRIBE:
DESCRIBE users;
But maybe you intended to make the new column size 20 instead of 2. Do you have to delete it and then remake it all from scratch? No. You can just modify it.
MODIFY – to change a table column, you use MODIFY.
Run the following query:
ALTER TABLE users
MODIFY COLUMN test_column VARCHAR(20);
Now verify the changes with EXPLAIN:
EXPLAIN users;
And now you should see that test_column’s type is VARCHAR(20). Keep in mind that EXPLAIN and DESCRIBE only tell you about a table and its columns, not the data within it (the records). For that, you will need to use a different query which will be covered later.
DROP COLUMN – aside from adding new columns or editing existing ones, something else you can do is delete existing columns. The test_column column was useful for learning about ALTER TABLE, but it has no real use, so you should delete it now.
Run the following query:
ALTER TABLE users
DROP COLUMN test_column;
Keep in mind that, even though the above SQL is two separate lines, it is still only one query. You can tell if multiple lines are just a single query based not only on the words, but also the semicolon. There is one semicolon per query.
You can break up queries into as many lines as you want, though it makes no difference, like so (don’t run this one):
ALTER
TABLE
users
DROP
COLUMN
test_column;
Because you already dropped the test_column earlier, there is no need to run the same query a second time. If you tried to drop a column that doesn’t exist, you’d get an error message. But the point of the above query is to show that you can format it however you want. But there are certain conventions for when and where you should use line breaks.
Now run the following query to verify that the test_column column has been dropped:
DESCRIBE users;
Changing a primary key within a table – so far, you’ve been able to make, edit, or delete columns. But what about designating a different primary key? Well, as there can only be one primary key per table, first you must drop the primary key, and then add one. But don’t worry, dropping a primary key won’t drop the column. It will merely drop the primary key status of the column.
Also, in the case of the table that was created earlier in the chapter, the current PRIMARY KEY of our users table is designated as AUTO_INCREMENT. Before we can stop using userID as the PRIMARY KEY, it will need to no longer be AUTO_INCREMENT.
Run the following query to modify userID:
ALTER TABLE users
MODIFY userID INT NOT NULL;
The above command takes away userID’s AUTO_INCREMENT status, but it is still the designated PRIMARY KEY. Now we need to drop it as the primary key before being able to set another column as PRIMARY KEY. Run the following query to make the users table have no PRIMARY KEY:
ALTER TABLE users
DROP PRIMARY KEY;
Now once again view the information about the columns in users:
EXPLAIN users;
userID still exists, but it is no longer the PRIMARY KEY.
Now it’s time to finally set another column as PRIMARY KEY:
ALTER TABLE users
ADD PRIMARY KEY (email);
That demonstrated changing primary keys. But it would actually make more sense for userID to be the primary key, so let’s change it back. Some of these commands are seemingly useless, but remember that this is about education rather than creating an actual database. Not only that, but repetition makes it easier.
Run the following query to unset the primary key in the users table:
ALTER TABLE users
DROP PRIMARY KEY;
Now run the following query to change the userID column back to normal:
ALTER TABLE users
MODIFY userID INT NOT NULL PRIMARY KEY AUTO_INCREMENT;
And once again verify it. Make sure userID is now PRIMARY KEY and AUTO_INCREMENT:
DESCRIBE users;
Instead of typing out DESCRIBE users again, you can always just hit the up arrow key to scroll back to previous commands you’ve entered, kind of like in Bash.
Also, did you know that, in a MySQL prompt (such as in a terminal in Linux, not MySQL Workbench on Windows), you can use tab to autocomplete a fully disambiguated string? For example, if you type DESCR and then hit tab, it will fill in the remaining letters of DESCRIBE. But that wouldn’t work for DESC because that can mean either descending or describe, as it is ambiguous.
But now that you think about it, email should be unique, shouldn’t it? Only one account per email address. So let’s do one more modification, to get you used to doing ALTER TABLE queries, and make email UNIQUE:
UNIQUE is called a constraint in SQL. There are many different constraints.
ALTER TABLE users
MODIFY email VARCHAR(20) UNIQUE NOT NULL;
Now a user account can only use an email address if it isn’t already used by another account.
- INSERT INTO and VALUES – so enough about altering tables and columns. How about some data? The point of a database isn’t to spend time just messing around with columns, but to put records, or rows, into the tables you created.
- Here is a query you should run to add a user to the users table:
INSERT INTO users (userID, phone_number, username, email,
confirmed_email, signup_date, user_password, account_locked)
VALUES (1, ‘555-1234’, ‘bearsfan22’, ‘[email protected]’,
0, CURDATE(), ‘hunter2’, 0);
- If you make a typo, such as typing phon_number instead of phone_number, it will give you an error message and tell you where the error is (assuming that it’s a syntax error rather than a logic error). It won’t run the command, but you will be able to fix it and run it again. Sometimes you’ll make a typo, other times you’ll forget a semicolon. Instead of hoping that you’ll never make a mistake (which is impossible), it’s better to know what to do when you do make one. There’s the old cliché saying of “the master has failed more times than the beginner has even tried.” It might sound silly, but it’s true.
- Please note that the above method is a bad way to store passwords. Passwords should be hashed rather than stored in plain text. But for the time being, let’s keep it simple and just do it a bad way. Later in the chapter, I will cover password hashing.
- Now run the following query to see that the account was created in the users table:
SELECT * FROM users;
- And the output should show you the record you just made.
- CURDATE() – if you want to insert the current date into something, use the CURDATE() function. Yes, SQL supports functions, but this chapter doesn’t get into them too much.
- Download and copy and paste this dummy data to populate your database – I’ve created some “dummy” data for you to put into your users table. You can download it from a GitHub repository I made just for this section of this chapter:
https://github.com/0x416c616e/book_sql_demo/blob/master/dummy_data.txt
Copy the entire long query from the above link, and then run it in in MySQL Workbench (Windows) or the mysql> prompt (Linux/macOS). This is important for subsequent concepts and queries. When you copy and paste multi-line things into a mysql> prompt on Linux or macOS, you might have to hit enter to finish it.
Dummy data is a common concept in computer science, for when you want to practice or test something but don’t have any real data to use with it. In this case, the dummy data is being used for practicing SQL concepts, but there are many other uses of dummy data too.
Then use a SELECT query with a wildcard to view all the data:
SELECT * FROM users;
And then you’ll see a long list of users. Notice how no userID values were inserted manually. That’s where AUTO_INCREMENT comes into play. The users all have IDs despite not having them specified manually.
SELECT – allows you to pick and choose data from a database. The simplest SELECT statement uses a wildcard, which means select everything. But you have to choose FROM a table, such as FROM users.
- * – wildcard character. It’s a lazy way of selecting stuff and might not be good for performance, especially in real-world situations with large amounts of data.
- The following SELECT query will only show a single column from the users table:
SELECT username FROM users;
You can also select multiple columns in a single query, such as the following example that you should try out for yourself:
SELECT username, email, phone_number FROM users;
ORDER BY – the previous SELECT queries gave you data in no particular order. If you want the results to look a little better, use the ORDER BY clause. You can choose to order by ASC for ascending or DESC for descending.
Try out this query:
SELECT username, email
FROM users
ORDER BY username ASC;
- You not only have to choose ascending or descending, but also which column to sort with. Even if you only SELECT a single column, you still need to specify the one to sort by.
- WHERE – now you can get even more specific with your queries. Maybe you only want to view users who haven’t confirmed their email addresses so you can send them an email. In that case, run the following query:
SELECT email, confirmed_email FROM users WHERE confirmed_email = ‘0’;
- Now you might be thinking… sending emails? Like, manually? No. Even though so far you’ve just been typing out SQL queries in MySQL Workbench (Windows) or a mysql> prompt (Linux and macOS), most SQL queries are really handled by other code. So if you have a website with a user table in your MySQL database, you can use something like PHP to periodically query a table and look for user records of users who haven’t verified their email addresses yet, and then you can have PHP send them emails automatically, reminding them that they should confirm their email address. And if they click the link contained within the email, it will confirm their email address for them, updating the record in the table. Later in the book, in the PHP section, you will learn about how to connect to a MySQL database using PHP, and how to query it programmatically rather than manually. But it’s good to learn SQL on its own first rather than trying to learn both PHP and SQL at the same time.
SELECT DISTINCT – if you want to only view distinct results, use SELECT DISTINCT column_name FROM table_name. Sometimes, certain columns might be duplicates.
Copy, paste, and run the SQL queries in MySQL Workbench or the mysql> prompt from this file for the next few query demonstrations in this section:
https://github.com/0x416c616e/book_sql_demo/blob/master/dummy_data_2.txt
Now, to demonstrate SELECT DISTINCT, run the following query:
SELECT DISTINCT * from union_demo;
Notice how there are 6 results.
Try it again, but without distinct:
SELECT * from union_demo;
Now there are 7 results, two of which are the same.
You can also use DISTINCT with individual columns rather than a wildcard:
SELECT num FROM union_demo;
Compare that to the result from the DISTINCT version:
SELECT DISTINCT num FROM union_demo;
UNION – if you want to combine multiple queries into one, use UNION. There are many different types of unions. UNION excludes duplicates, but UNION ALL does not.
Try the following query to see how UNION only shows unique values:
SELECT num
FROM union_demo
UNION
SELECT num
FROM table_2;
Now try it again, this time including duplicates:
SELECT num
FROM union_demo
UNION ALL
SELECT num
FROM table_2;
Not only that, but you can also order unions:
SELECT num
FROM union_demo
UNION ALL
SELECT num
FROM table_2
ORDER BY num ASC;
Something worth noting about UNION vs. UNION ALL is that, if there are duplicates within a single table, they will be excluded. Take a look at how there are duplicate chars column values even in a single table:
SELECT chars
FROM union_demo
ORDER BY chars ASC;
Notice how ‘abc’ shows up three times in the single table.
But when you add a UNION into the mix, it only shows the distinct values from union_demo:
SELECT chars
FROM union_demo
UNION
SELECT chars
FROM table_2
ORDER BY chars ASC;
So the definition of “duplicate” in UNION terms doesn’t just apply to values that are shared by both tables, like a Venn diagram. It even applies to duplicates within a single table. Depending on what you’re doing, you may or may not want duplicate values.
DELETE – so far, you’ve learned how to use INSERT INTO table_name (column1, column2) VALUES(‘value1’, ‘value2’) to put records into a table. But what about getting rid of them? That’s where the DELETE statement comes into play.
Use the following query to go back to the learning database:
USE learning;
To refresh your memory about what’s in it, use the following query:
SHOW TABLES;
Now view all of the contents of the users table again, as “before” part of a before and after deleting stuff:
SELECT * FROM users;
Linux/macOS only instructions (if you’re on Windows, skip to the next bold text section): There’s a setting you’ll need to change for Bitnami LAMP/MAMP. You need to enable safe updating, which takes a few steps to enable.
Type exit; to exit the mysqld prompt. Then close the terminal and open it again. Then do cd ~/lampstack-7.2.24-0 or whatever your version number is. Then do cd mysql. Then do touch safe.sql. Then, if you don’t have vim installed, do sudo apt-get install vim on Debian-based Linux distros, or brew install vim on macOS, or use whatever package manager you have if you’re using another Linux distro. Then do vim safe.sql. Hit i to enter insert mode, then type this in safe.sql:
SET GLOBAL SQL_SAFE_UPDATES = ON;
Then hit escape, then type :wq to save and quit.
Now do vim my.cnf. Hit i to enter insert mode. Then go to the blank line beneath collation-server=utf8mb4_general_ci. Now make a new line by hitting enter, and type the following (for Linux):
init-file=”/home/yourname/lampstack-7.2.24-0/mysql/safe.sql”
If you’re on macOS instead of Linux, use this line instead of the above line:
init-file=”/Applications/mampstack-7.2.24-0/mysql/safe.sql”
Replace yourname with your actual account name on Linux. If you don’t know your account name, open a new terminal window and type whoami and hit enter. Replace 7.2.24-0 with whatever version you have installed. You should be able to see other stuff within the my.cnf that has your correct version number.
Then hit escape and then type :wq to save and quit.
Then, if you’re on Linux, do cd .. and then do ./manager-linux-x64.run
If you’re on macOS, go to the manager-osx icon from your dock. You will also need to do cd .. to get into the right directory for the next step (the mampstack directory rather than the mysql subfolder).
Click on MySQL Database and click Stop. Do the same for Apache Web Server. Wait a while until the Status column goes from saying “Stopping” to saying “Stopped.” Now start both of them again. Once it says “Running” for both of them, close the window. Now use ./use_lampstack. Then use mysql -u root -p and hit enter. Now type your MySQL root password. If everything was successful, you should see a mysql> prompt again. Now, to verify that your configurations were correct, use the following command:
SHOW VARIABLES LIKE ‘sql_safe_updates’;
It should say that the value is ON. If not, you did something wrong. If it doesn’t work, try restarting your computer, and if it still says that sql_safe_updates is OFF, then look through the previous instructions again and make sure you followed them correctly.
The above instructions enable safe updating, which prevents the use of unbounded deleting and editing. In Windows, using Bitnami WAMP and MySQL Workbench, safe updating is on by default. But for some reason, in Bitnami LAMP, it’s disabled by default, which is very bad. Why aren’t they all on by default? I have no idea. Technology isn’t always consistent.
Resuming instructions for all OSes:
Now that you’ve edited the config files (on macOS and Linux, anyway), get back to your mysql> prompt and then use USE learning; again.
Now I want to show you an example of a DELETE query that will fail (and I’ll explain why it’s important after you run it):
DELETE FROM users WHERE username = ‘bearsfan22’;
Notice how it doesn’t work. You get an error message in the Output section below the Query 1 tab (Windows) or the terminal (Linux/macOS):
Error Code: 1175. You are using safe update mode and you tried to update a table without a WHERE that uses a KEY column.
This is important because you need to know how certain queries can be dangerous, especially if you make a mistake. You could accidentally delete data that you didn’t intend to delete. If you google the error message, you will find results of people telling you how to disable it. But that’s an XY problem again (you have problem X and then think Y is the solution, so you concentrate on how to figure out how to do Y rather than figuring out if there is a better solution to X). If your queries won’t work with safe mode enabled, then you need to change your queries. Disabling safe mode is not the solution. It can be dangerous to use database queries without safe mode enabled.
There are really two types of queries: bounded and unbounded. With bounded queries, you are very sure of which records you’re trying to change. When something is unbounded, it can be a lot of things that you didn’t explicitly specify. Bounded queries are safe, but unbounded ones are not. Safe mode prevents you from making unbounded UPDATE or DELETE queries, which is actually a good thing.
If you want to delete something, and the table has a primary key, you will need to specify the primary key of the record to delete, such as the following query (which actually will work):
DELETE FROM users WHERE userID = ‘2’;
userID is the PRIMARY KEY for the users table, and it’s UNIQUE and NOT NULL. This is how you should refer to things to DELETE. If you disabled safe update mode, you would be able to make sweeping queries that delete tons of stuff all at once when you don’t even mean to. Referencing the primary key in a query means you’re relatively sure of which rows you want to modify, especially considering that primary keys are never duplicates.
Use the wildcard select to view all records in the table, and look at the userIDs:
SELECT * FROM users
ORDER BY userID ASC;
Notice how the row with a userID value of 2 is gone now.
- UPDATE and SET – now you know how to create and delete records. But what if you just want to edit them? Use UPDATE to change something.
- Run the following query even though it will fail, just to show that safe mode protects UPDATE queries as well as DELETE queries:
UPDATE users
SET username = ‘cubsfan’
WHERE username = ‘cubsfan99’;
- The username column is not the primary key, and as such, you can’t use it for an UPDATE.
- Try it again with the userID of the cubsfan99 account:
UPDATE users
SET username = ‘cubsfan’
WHERE userID = ‘3’;
- Now view the table again to see that it was updated:
SELECT userID, username
FROM users
ORDER BY userID ASC;
- LIMIT – Let’s say there are lots of users on your website, meaning lots of rows in your users table. If you were to fetch every single result every time, that could be very slow. To increase performance, and also decrease the amount of output of a query, you can use LIMIT followed by an integer to limit the output to that many results.
- Try this example query to get the last 5 accounts (alphabetically speaking) in your users table:
SELECT username
FROM users
ORDER BY username DESC
LIMIT 5;
- Numeric operators
- Run the following queries separately and then view the results:
USE duplicate_demo;
SELECT num + 1000 AS numAdd
FROM table_2;
SELECT num – 500 AS numSub
FROM table_2;
SELECT num * 123 AS numMult
FROM table_2;
SELECT num / 5
FROM table_2;
SELECT num % 4
FROM table_2;
Comparison operators
In other languages, = means “set equal to” or “assign.” In most languages, you’d need to use == to compare one value to another, like x == 5 to check if x is equal to 5. But in SQL = is used for both assignment and comparison. This is an example of operator overloading, as its meaning depends on its context.
SQL is strange in the sense that it doesn’t have a Boolean data type (TINYINT(1) isn’t exactly a “real” Boolean), but it does have things that can evaluate to either true or false.
Run the following queries in MySQL Workbench (Windows) or mysql> (Linux/macOS) to learn about comparison operators in SQL:
USE learning;
SELECT * FROM users WHERE userID = 5;
In the above example, you’re not assigning userID to 5, but rather you’re checking all the stuff from the users table and evaluating whether or not userID = 5 is true or false. If it’s true, it will be put in the results from the query.
Run the following query:
ALTER TABLE users
ADD num_column SMALLINT;
The above query adds a new column in the users table that will be used in the following queries to demonstrate SQL operators.
UPDATE users
SET num_column = ‘500’
WHERE userId > 10;
Then:
UPDATE users
SET num_column = num_column + 127
WHERE userID < 13;
Here’s something I want to point out about the above query: it looks as though all num_column values for userIDs of less than 13 will be increased by 127, right? But that’s wrong. It’s a little tricky at first, until you realize what’s wrong with it. MySQL will accept that query and run it with no error messages at all. But you know what will actually happen? In SQL, NULL + 127 is still NULL. NULL is not zero. NULL is the absence of a value. Zero is still a value, so adding 127 to 0 will yield 127. Any sort of arithmetic operations applied to a NULL value will still result in NULL, at least in SQL. Many of the records in the users table have a num_column value of NULL still.
When you created the new num_column column, its values were all NULL by default. You can change defaults or also enforce that things won’t be null if you really want though.
You can even verify this with a SELECT:
SELECT * FROM users;
Here’s how you can make them all 0 to start, so that you’ll be able to add stuff to them later:
UPDATE users
SET num_column = 0
WHERE userID <= 10;
Now use the following query to verify that they’re not NULL anymore:
SELECT * FROM users
WHERE num_column = NULL;
Notice how there are no results, which means they aren’t NULL anymore.
Run the following queries so that subsequent queries can demonstrate things like ranges and numeric comparison:
UPDATE users
SET num_column = num_column + 55
WHERE userID BETWEEN 0 AND 5;
The following query only displays odd-numbered IDs and demonstrates <>, which means not equal to:
SELECT * FROM users
WHERE userID % 2 <> 0;
- The following query selects userIDs 1, 6, 8, and 15 and applies the UPDATE to those records only:
UPDATE users
SET num_column = num_column * 3
WHERE userID IN (1, 6, 8, 15);
Here are a few more queries to change num_column values a little more, just to make them a little more varied and to demonstrate some new statements:
UPDATE users
SET num_column = num_column + 144
WHERE userID <= 8;
UPDATE users
SET num_column = num_column – 46
WHERE confirmed_email = 0
AND userID < 11;
If you want to combine multiple WHERE clauses, use AND. It will only apply to records where both conditions are true.
UPDATE users
SET num_column = num_column DIV 2
WHERE userID < 4
OR userID BETWEEN 10 AND 11
OR userID >= 14;
OR is for when you want a WHERE clause to apply to one thing or another. You can use multiple ORs together. In this case, the UPDATE will apply if any of the WHERE conditions are met: if the userID is less than 4, between 10 and 11 (which includes boundaries), or where the userID is greater than or equal to 14. AND requires both operands to be true in order for it to be true, such as 1+1=2 AND 2+2=4. But for OR, only one operand needs to be true in order for the whole statement to be true, such as 1+1=3 OR 1+1=2.
DIV is floor division, also known as integer division. Regular division in SQL will give you a remainder, such as 8 / 5 = 1.6, a.k.a. 1 remainder 3. But if you did 8 DIV 5, you’d get 1 because it discards the remainder.
The next query demonstrates XOR, which is exclusive OR:
UPDATE users
SET num_column = num_column * 2
WHERE userID > 10
AND (num_column > 0
XOR confirmed_email = 1);
In A OR B, it will be true if A is true, B is true, or if both A and B are true. But in A XOR B, it will only be true if A is true and B is false, or if A is false and B is true. It can be one or the other, but not both.
NOT – you can choose to select something that is true, or in the case of NOT, false.
The following query means that you will only get back results for accounts that are not locked, because in TINYINT(1), a 0 means false and a 1 means true. So it means select all from users where account_locked is not true:
SELECT * FROM users WHERE NOT account_locked = 1;
It’s similar to other programming languages that have operators like != or !.
JOIN – so far, we’ve only dealt with queries that deal with data from a single table. But what if you want to combine data from multiple tables?
To demonstrate a join, first run the following two queries to make a new table for sales (you don’t have to type in the multi-line comment):
USE learning;
CREATE TABLE sales (
quantity INT NOT NULL,
price_per DECIMAL(8,2) NOT NULL,
/*above means 8 places
and 2 digits for decimal places
because there are two places in cents
i.e. $10.50, but not 10.5000000
*/
product_name VARCHAR(40) NOT NULL,
product_sku INT NOT NULL,
purchase_time DATETIME NOT NULL,
userID INT
);
The above table is for keeping track of when a customer orders something at a shop. The users table is for users who have signed up. Maybe they can get discounts as a result of signing up. There can be some sort of incentive to get customers to give their information to you so that you can contact them and try to get more business from them. Both parties benefit from it. That being said, not every customer will be registered, which is why the userID column in the sales table can actually be null. You wouldn’t want to lose a sale just because a customer isn’t registered with your retail store. userID in the sales table is also not unique, because a user can make many sales.
It would not make sense to put orders in a users table, or vice versa. So you want the tables to be separate. But at the same time, you will want to be able to put data from both tables together. This is why JOIN exists.
Now that there are two tables in the learning database, we will be able to try out JOIN queries, but not until the new table is populated. Copy and paste the query from the git repo here:
https://github.com/0x416c616e/book_sql_demo/blob/master/dummy_data_3.txt
After copying and pasting the above example, run this query:
SELECT *
FROM sales
JOIN users
ON sales.userID = users.userID;
In the above example, on Linux, the output might look a little messed up if you have a low screen resolution, because it’s trying to display so many columns side by side, and if your screen is too small, things from one line will wrap over to the next line. If your screen resolution is less than 1920×1080, on Linux, in your terminal window, you can go to View -> Zoom Out a couple times until it fits for your personal screen resolution. Then you can see the entire output without any issues.
On macOS, in Terminal, go to View -> Smaller to zoom out to see more if it looks distorted because of there being too many things on a single line, leading to text wrapping.
The above query will show each order and the registered customer associated with it.
But that’s when everything has a value. What if you try to join tables when there are things that are null?
Try adding a new purchase record with the following query:
INSERT INTO sales (quantity, price_per, product_name,
product_sku, purchase_time)
VALUES (1, 149.99, “monitor”, 8595307, CURRENT_TIMESTAMP());
Because it doesn’t feature a userID, the userID column for this row will be null. It’s acceptable in this table because the column did not specify NOT NULL. And as I said before, you don’t want to turn down a sale at a business just because a user doesn’t want to sign up for an account.
So what happens when you run the JOIN again, considering that one record has nothing to JOIN something to? Find out by running the JOIN again, but this time, a LEFT JOIN, which will show purchases that do not have associated user accounts:
SELECT *
FROM sales
LEFT JOIN users
ON sales.userID = users.userID;
There are two other types of joins too.
RIGHT JOIN will show all the purchases associated with user accounts, in addition to accounts with no purchases, but not purchases that are not associated with accounts:
SELECT *
FROM sales
RIGHT JOIN users
ON sales.userID = users.userID;
What if you want to show everything? That is, users with purchases, users without purchases, and purchases without users. That would be a FULL JOIN in other versions of SQL, but MySQL doesn’t support it. This is one example of how not all DBMS software is the same, even if SQL is supposed to be the same across different implementations. But the reality of the matter is that they are slightly different. SQLite, PostgreSQL, MySQL, Microsoft SQL Server, Oracle Database, and IBM DB2 have some small differences between them. However, you can still view everything in MySQL, but you will need to use UNION to combine both a RIGHT JOIN and a LEFT JOIN:
SELECT *
FROM sales
RIGHT JOIN users
ON sales.userID = users.userID
UNION
SELECT *
FROM sales
LEFT JOIN users
ON sales.userID = users.userID;
The above query has a lot of results, but if you scroll down, you will notice the three categories of records all there: users with sales, users with null sales, and null users with sales. Keep in mind: JOIN does not change tables. It just shows you results from multiple tables joined together.
It’s easier to think of JOINs with Venn diagrams.