parent
f3575235c6
commit
d42a09c9aa
@ -0,0 +1,3 @@
|
|||||||
|
.DS_Store
|
||||||
|
.venv
|
||||||
|
site/
|
@ -0,0 +1,13 @@
|
|||||||
|
# Conclusion
|
||||||
|
We have covered the basic concepts of SQL databases, along with some of the tasks that an SRE may be responsible for. There is much more to learn and do; we hope this course gives you a good start and inspires you to explore further.
|
||||||
|
|
||||||
|
|
||||||
|
### Further reading
|
||||||
|
|
||||||
|
* More practice with online resources like [this one](https://www.w3resource.com/sql-exercises/index.php)
|
||||||
|
* [Normalization](https://beginnersbook.com/2015/05/normalization-in-dbms/)
|
||||||
|
* [Routines](https://dev.mysql.com/doc/refman/8.0/en/stored-routines.html), [triggers](https://dev.mysql.com/doc/refman/8.0/en/trigger-syntax.html)
|
||||||
|
* [Views](https://www.essentialsql.com/what-is-a-relational-database-view/)
|
||||||
|
* [Transaction isolation levels](https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html)
|
||||||
|
* [Sharding](https://www.digitalocean.com/community/tutorials/understanding-database-sharding)
|
||||||
|
* [Setting up HA](https://severalnines.com/database-blog/introduction-database-high-availability-mysql-mariadb), [monitoring](https://blog.serverdensity.com/how-to-monitor-mysql/), [backups](https://dev.mysql.com/doc/refman/8.0/en/backup-methods.html)
|
Binary file not shown.
After Width: | Height: | Size: 36 KiB |
Binary file not shown.
After Width: | Height: | Size: 24 KiB |
@ -0,0 +1,21 @@
|
|||||||
|
# Relational Databases
|
||||||
|
|
||||||
|
### What to expect from this training
|
||||||
|
By the end of this training, you will understand what relational databases are, their advantages, and some MySQL-specific concepts.
|
||||||
|
|
||||||
|
### What is not covered under this course
|
||||||
|
* In-depth implementation details
|
||||||
|
|
||||||
|
* Advanced topics like normalization, sharding
|
||||||
|
|
||||||
|
* Specific tools for administration
|
||||||
|
|
||||||
|
### Introduction
|
||||||
|
The main purpose of database systems is to manage data. This includes storing data, adding new data, deleting unused data, updating existing data, retrieving data within a reasonable response time, and other maintenance tasks that keep the system running.
|
||||||
|
|
||||||
|
### Prerequisites
|
||||||
|
* Complete [Linux course](/linux_basics/intro/)
|
||||||
|
* Install Docker (for lab section)
|
||||||
|
|
||||||
|
### Pre-reads
|
||||||
|
[RDBMS Concepts](https://beginnersbook.com/2015/04/rdbms-concepts/)
|
@ -0,0 +1,207 @@
|
|||||||
|
**Prerequisites**
|
||||||
|
|
||||||
|
Install Docker
|
||||||
|
|
||||||
|
|
||||||
|
**Setup**
|
||||||
|
|
||||||
|
Create a working directory named sos or something similar, and cd into it.
|
||||||
|
|
||||||
|
Enter the following into a file named my.cnf under a directory named custom.
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
sos $ cat custom/my.cnf
|
||||||
|
[mysqld]
|
||||||
|
# These settings apply to MySQL server
|
||||||
|
# You can set port, socket path, buffer size etc.
|
||||||
|
# Below, we are configuring slow query settings
|
||||||
|
slow_query_log=1
|
||||||
|
slow_query_log_file=/var/log/mysqlslow.log
|
||||||
|
long_query_time=0.1
|
||||||
|
```
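If you prefer not to open an editor, the same file can be written from the shell. This is a minimal sketch that assumes you are already inside the working directory; it only reproduces the settings shown above.

```shell
# Create the config directory and write the slow query settings into it.
mkdir -p custom
cat > custom/my.cnf <<'EOF'
[mysqld]
# Below, we are configuring slow query settings
slow_query_log=1
slow_query_log_file=/var/log/mysqlslow.log
long_query_time=0.1
EOF

# Show the result to confirm what was written.
cat custom/my.cnf
```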
|
||||||
|
|
||||||
|
|
||||||
|
Start a container and enable slow query log with the following:
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
sos $ docker run --name db -v custom:/etc/mysql/conf.d -e MYSQL_ROOT_PASSWORD=realsecret -d mysql:8
|
||||||
|
sos $ docker cp custom/my.cnf $(docker ps -qf "name=db"):/etc/mysql/conf.d/custom.cnf
|
||||||
|
sos $ docker restart $(docker ps -qf "name=db")
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
Import a sample database
|
||||||
|
|
||||||
|
|
||||||
|
```
|
||||||
|
sos $ git clone git@github.com:datacharmer/test_db.git
|
||||||
|
sos $ docker cp test_db $(docker ps -qf "name=db"):/home/test_db/
|
||||||
|
sos $ docker exec -it $(docker ps -qf "name=db") bash
|
||||||
|
root@3ab5b18b0c7d:/# cd /home/test_db/
|
||||||
|
root@3ab5b18b0c7d:/# mysql -uroot -prealsecret mysql < employees.sql
|
||||||
|
root@3ab5b18b0c7d:/etc# touch /var/log/mysqlslow.log
|
||||||
|
root@3ab5b18b0c7d:/etc# chown mysql:mysql /var/log/mysqlslow.log
|
||||||
|
```
|
||||||
|
|
||||||
|
|
||||||
|
_Workshop 1: Run some sample queries_
|
||||||
|
Run the following
|
||||||
|
```
|
||||||
|
$ mysql -uroot -prealsecret mysql
|
||||||
|
mysql>
|
||||||
|
|
||||||
|
# inspect DBs and tables
|
||||||
|
# the last 4 are MySQL internal DBs
|
||||||
|
|
||||||
|
mysql> show databases;
|
||||||
|
+--------------------+
|
||||||
|
| Database |
|
||||||
|
+--------------------+
|
||||||
|
| employees |
|
||||||
|
| information_schema |
|
||||||
|
| mysql |
|
||||||
|
| performance_schema |
|
||||||
|
| sys |
|
||||||
|
+--------------------+
|
||||||
|
|
||||||
|
mysql> use employees;
|
||||||
|
mysql> show tables;
|
||||||
|
+----------------------+
|
||||||
|
| Tables_in_employees |
|
||||||
|
+----------------------+
|
||||||
|
| current_dept_emp |
|
||||||
|
| departments |
|
||||||
|
| dept_emp |
|
||||||
|
| dept_emp_latest_date |
|
||||||
|
| dept_manager |
|
||||||
|
| employees |
|
||||||
|
| salaries |
|
||||||
|
| titles |
|
||||||
|
+----------------------+
|
||||||
|
|
||||||
|
# read a few rows
|
||||||
|
mysql> select * from employees limit 5;
|
||||||
|
|
||||||
|
# filter data by conditions
|
||||||
|
mysql> select * from employees where gender = 'M' limit 5;
|
||||||
|
|
||||||
|
# find count of particular data
|
||||||
|
mysql> select count(*) from employees where first_name = 'Sachin';
|
||||||
|
```
|
||||||
|
|
||||||
|
_Workshop 2: Use explain and explain analyze to profile a query, identify and add indexes required for improving performance_
|
||||||
|
```
|
||||||
|
# View all indexes on table
|
||||||
|
# (\G prints each row vertically, one field per line; replace it with a ; to get table output)
|
||||||
|
mysql> show index from employees from employees\G
|
||||||
|
*************************** 1. row ***************************
|
||||||
|
Table: employees
|
||||||
|
Non_unique: 0
|
||||||
|
Key_name: PRIMARY
|
||||||
|
Seq_in_index: 1
|
||||||
|
Column_name: emp_no
|
||||||
|
Collation: A
|
||||||
|
Cardinality: 299113
|
||||||
|
Sub_part: NULL
|
||||||
|
Packed: NULL
|
||||||
|
Null:
|
||||||
|
Index_type: BTREE
|
||||||
|
Comment:
|
||||||
|
Index_comment:
|
||||||
|
Visible: YES
|
||||||
|
Expression: NULL
|
||||||
|
|
||||||
|
# This query uses an index, identified by the 'key' field
|
||||||
|
# By prefixing explain keyword to the command,
|
||||||
|
# we get query plan (including key used)
|
||||||
|
mysql> explain select * from employees where emp_no < 10005\G
|
||||||
|
*************************** 1. row ***************************
|
||||||
|
id: 1
|
||||||
|
select_type: SIMPLE
|
||||||
|
table: employees
|
||||||
|
partitions: NULL
|
||||||
|
type: range
|
||||||
|
possible_keys: PRIMARY
|
||||||
|
key: PRIMARY
|
||||||
|
key_len: 4
|
||||||
|
ref: NULL
|
||||||
|
rows: 4
|
||||||
|
filtered: 100.00
|
||||||
|
Extra: Using where
|
||||||
|
|
||||||
|
# Compare that to the next query which does not utilize any index
|
||||||
|
mysql> explain select first_name, last_name from employees where first_name = 'Sachin'\G
|
||||||
|
*************************** 1. row ***************************
|
||||||
|
id: 1
|
||||||
|
select_type: SIMPLE
|
||||||
|
table: employees
|
||||||
|
partitions: NULL
|
||||||
|
type: ALL
|
||||||
|
possible_keys: NULL
|
||||||
|
key: NULL
|
||||||
|
key_len: NULL
|
||||||
|
ref: NULL
|
||||||
|
rows: 299113
|
||||||
|
filtered: 10.00
|
||||||
|
Extra: Using where
|
||||||
|
|
||||||
|
# Let's see how much time this query takes
|
||||||
|
mysql> explain analyze select first_name, last_name from employees where first_name = 'Sachin'\G
|
||||||
|
*************************** 1. row ***************************
|
||||||
|
EXPLAIN: -> Filter: (employees.first_name = 'Sachin') (cost=30143.55 rows=29911) (actual time=28.284..3952.428 rows=232 loops=1)
|
||||||
|
-> Table scan on employees (cost=30143.55 rows=299113) (actual time=0.095..1996.092 rows=300024 loops=1)
|
||||||
|
|
||||||
|
|
||||||
|
# Cost(estimated by query planner) is 30143.55
|
||||||
|
# actual time=28.284ms for first row, 3952.428ms for all rows
|
||||||
|
# Now let's try adding an index and running the query again
|
||||||
|
mysql> create index idx_firstname on employees(first_name);
|
||||||
|
Query OK, 0 rows affected (1.25 sec)
|
||||||
|
Records: 0 Duplicates: 0 Warnings: 0
|
||||||
|
|
||||||
|
mysql> explain analyze select first_name, last_name from employees where first_name = 'Sachin';
|
||||||
|
+--------------------------------------------------------------------------------------------------------------------------------------------+
|
||||||
|
| EXPLAIN |
|
||||||
|
+--------------------------------------------------------------------------------------------------------------------------------------------+
|
||||||
|
| -> Index lookup on employees using idx_firstname (first_name='Sachin') (cost=81.20 rows=232) (actual time=0.551..2.934 rows=232 loops=1)
|
||||||
|
|
|
||||||
|
+--------------------------------------------------------------------------------------------------------------------------------------------+
|
||||||
|
1 row in set (0.01 sec)
|
||||||
|
|
||||||
|
# Actual time=0.551ms for first row
|
||||||
|
# 2.934ms for all rows. A huge improvement!
|
||||||
|
# Also notice that the query involves only an index lookup,
|
||||||
|
# and no table scan (reading all rows of table)
|
||||||
|
# ..which vastly reduces load on the DB.
|
||||||
|
```
|
||||||
|
|
||||||
|
_Workshop 3: Identify slow queries on a MySQL server_
|
||||||
|
```
|
||||||
|
# Run the command below in two terminal tabs to open two shells into the container.
|
||||||
|
docker exec -it $(docker ps -qf "name=db") bash
|
||||||
|
|
||||||
|
# Open a mysql prompt in one of them and execute this command
|
||||||
|
# We have configured the server to log queries that take longer than 0.1s,
|
||||||
|
# so this sleep(3) will be logged
|
||||||
|
mysql -uroot -prealsecret mysql
|
||||||
|
mysql> select sleep(3);
|
||||||
|
|
||||||
|
# Now, in the other terminal, tail the slow log to find details about the query
|
||||||
|
root@62c92c89234d:/etc# tail -f /var/log/mysqlslow.log
|
||||||
|
/usr/sbin/mysqld, Version: 8.0.21 (MySQL Community Server - GPL). started with:
|
||||||
|
Tcp port: 3306 Unix socket: /var/run/mysqld/mysqld.sock
|
||||||
|
Time Id Command Argument
|
||||||
|
# Time: 2020-11-26T14:53:44.822348Z
|
||||||
|
# User@Host: root[root] @ localhost [] Id: 9
|
||||||
|
# Query_time: 5.404938 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
|
||||||
|
use employees;
|
||||||
|
# Time: 2020-11-26T14:53:58.015736Z
|
||||||
|
# User@Host: root[root] @ localhost [] Id: 9
|
||||||
|
# Query_time: 10.000225 Lock_time: 0.000000 Rows_sent: 1 Rows_examined: 1
|
||||||
|
SET timestamp=1606402428;
|
||||||
|
select sleep(3);
|
||||||
|
```
|
||||||
|
|
||||||
|
These were simulated examples with minimal complexity. In real life, queries are much more complex, and the explain/explain analyze output and slow query logs contain more detail.
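On a busy server the slow log grows quickly, so it helps to filter it instead of reading it with `tail`. The sketch below is one possible approach, not a standard tool: `slow_over` is a hypothetical helper name, and the sample file simply reuses the log entries shown in Workshop 3.

```shell
# Sample slow-log excerpt, copied from the Workshop 3 output into a temp file.
log=$(mktemp)
cat > "$log" <<'EOF'
# Time: 2020-11-26T14:53:44.822348Z
# User@Host: root[root] @ localhost []  Id:     9
# Query_time: 5.404938  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 1
use employees;
# Time: 2020-11-26T14:53:58.015736Z
# User@Host: root[root] @ localhost []  Id:     9
# Query_time: 10.000225  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 1
SET timestamp=1606402428;
select sleep(3);
EOF

# slow_over THRESHOLD FILE
# Print the Query_time of every log entry slower than THRESHOLD seconds.
slow_over() {
  awk -v t="$1" '/^# Query_time:/ && $3 + 0 > t { print $3 }' "$2"
}

slow_over 6 "$log"   # prints 10.000225
```

Inside the lab container you would point the helper at `/var/log/mysqlslow.log` instead of the temp file. MySQL also ships a `mysqldumpslow` utility that summarizes slow logs, which is likely a better fit for real investigations.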
|
@ -0,0 +1,38 @@
|
|||||||
|
### MySQL architecture
|
||||||
|
|
||||||
|
![alt_text](images/mysql_architecture.png "MySQL architecture diagram")
|
||||||
|
|
||||||
|
MySQL architecture enables you to select the right storage engine for your needs, and abstracts away all implementation details from the end users (application engineers and [DBA](https://en.wikipedia.org/wiki/Database_administrator)) who only need to know a consistent stable API.
|
||||||
|
|
||||||
|
Application layer:
|
||||||
|
|
||||||
|
* Connection handling - each client gets its own connection, which is cached for the duration of access
|
||||||
|
* Authentication - server checks (username,password,host) info of client and allows/rejects connection
|
||||||
|
* Security - server determines whether the client has privileges to execute each query (check a user's privileges with the _show grants_ command)
|
||||||
|
|
||||||
|
Server layer:
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
* Services and utilities - backup/restore, replication, cluster etc
|
||||||
|
* SQL interface - clients run queries for data access and manipulation
|
||||||
|
* SQL parser - creates a parse tree from the query (lexical/syntactic/semantic analysis and code generation)
|
||||||
|
* Optimizer - optimizes queries using various algorithms and the data available to it (table-level stats); it may rewrite queries and choose the order of scanning, the indexes to use etc. (check with the explain command)
|
||||||
|
* Caches and buffers - the query cache stores query results (it was removed in MySQL 8.0), the buffer pool (InnoDB) stores table and index data in [LRU](https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)) fashion
|
||||||
|
|
||||||
|
Storage engine options:
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
* InnoDB: most widely used, transaction support, ACID compliant, supports row-level locking, crash recovery and multi-version concurrency control. Default since MySQL 5.5.
|
||||||
|
* MyISAM: fast, does not support transactions, provides table-level locking, great for read-heavy workloads, mostly in web and data warehousing. Default up to MySQL 5.1.
|
||||||
|
* Archive: optimised for high speed inserts, compresses data as it is inserted, does not support transactions, ideal for storing and retrieving large amounts of seldom referenced historical, archived data
|
||||||
|
* Memory: tables in memory. Fastest engine, supports table-level locking, does not support transactions, ideal for creating temporary tables or quick lookups, data is lost after a shutdown
|
||||||
|
* CSV: stores data in CSV files, great for integrating into other applications that use this format
|
||||||
|
* … etc.
|
||||||
|
|
||||||
|
It is possible to migrate from one storage engine to another. But this migration locks tables for all operations and is not online, as it changes the physical layout of the data. It takes a long time and is generally not recommended. Hence, choosing the right storage engine at the beginning is important.
|
||||||
|
|
||||||
|
The general guideline is to use InnoDB unless you have a specific need for one of the other storage engines.
|
||||||
|
|
||||||
|
Running `SHOW ENGINES;` at the `mysql>` prompt shows you the supported engines on your MySQL server.
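To make the engine discussion concrete, here is a small sketch that writes a few engine-related statements to a file. The file name `engines.sql` is an arbitrary choice, and the statements assume the `employees` sample database from the lab is loaded.

```shell
cat > engines.sql <<'EOF'
-- Engines supported by this server, and which one is the default.
SHOW ENGINES;

-- Engine currently used by each table in the employees database.
SELECT table_name, engine
  FROM information_schema.tables
 WHERE table_schema = 'employees';

-- Changing the engine rebuilds the table. departments is already InnoDB,
-- so this statement only demonstrates the syntax.
ALTER TABLE employees.departments ENGINE = InnoDB;
EOF

# To run it inside the lab container (assumes the lab's root password):
#   mysql -uroot -prealsecret < engines.sql
```

On a large production table the `ALTER TABLE` would be the expensive step, which is why the choice of engine is best made up front.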
|
@ -0,0 +1,64 @@
|
|||||||
|
* Explain and explain+analyze
|
||||||
|
|
||||||
|
`EXPLAIN <query>` displays the query plan chosen by the optimizer, including how tables are joined, which tables/rows are scanned etc.
|
||||||
|
|
||||||
|
`EXPLAIN ANALYZE` actually runs the query and shows the above along with additional info such as the estimated cost, the actual number of rows returned and the time taken.
|
||||||
|
|
||||||
|
This knowledge is useful to tweak queries and add indexes.
|
||||||
|
|
||||||
|
Watch this performance tuning [tutorial video](https://www.youtube.com/watch?v=pjRTLPeUOug).
|
||||||
|
|
||||||
|
Check out the [lab section](../lab.md) for a hands-on exercise about indexes.
|
||||||
|
|
||||||
|
* [Slow query logs](https://dev.mysql.com/doc/refman/5.7/en/slow-query-log.html)
|
||||||
|
|
||||||
|
Used to identify slow queries (the threshold is configurable). It can be enabled in the config or dynamically with a query.
|
||||||
|
|
||||||
|
Check out the [lab section](../lab.md) about identifying slow queries.
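As a sketch of the dynamic route, the statements below switch the slow query log on at runtime instead of through `my.cnf`. The file name `enable_slow_log.sql` is an arbitrary choice, and the 0.1 second threshold just mirrors the value used in the lab.

```shell
cat > enable_slow_log.sql <<'EOF'
-- Turn the slow query log on without restarting the server.
SET GLOBAL slow_query_log = 1;
-- Log statements that run longer than 0.1 seconds.
SET GLOBAL long_query_time = 0.1;
-- Confirm the current settings and the log file location.
SHOW VARIABLES LIKE 'slow_query_log%';
EOF

# To run it inside the lab container (assumes the lab's root password):
#   mysql -uroot -prealsecret < enable_slow_log.sql
```

Settings changed with `SET GLOBAL` do not survive a server restart, so anything you want to keep should still go into the config file.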
|
||||||
|
|
||||||
|
* User management
|
||||||
|
|
||||||
|
This includes creation and changes to users, like managing privileges, changing password etc.
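A minimal sketch of these tasks follows. The account name `reader`, both passwords and the file name `create_reader.sql` are placeholders made up for illustration, and the grant assumes the `employees` sample database from the lab.

```shell
cat > create_reader.sql <<'EOF'
-- Create an account that may connect from any host ('%').
CREATE USER 'reader'@'%' IDENTIFIED BY 'change-me';
-- Allow read-only access to every table in the employees database.
GRANT SELECT ON employees.* TO 'reader'@'%';
-- Change the account's password.
ALTER USER 'reader'@'%' IDENTIFIED BY 'new-secret';
-- List the privileges the account now holds.
SHOW GRANTS FOR 'reader'@'%';
EOF

# To run it inside the lab container (assumes the lab's root password):
#   mysql -uroot -prealsecret < create_reader.sql
```

Granting only `SELECT` on a single database keeps the account to the least privilege it needs.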
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
* Backup and restore strategies, pros and cons
|
||||||
|
|
||||||
|
Logical backup using mysqldump - slower but can be done online
|
||||||
|
|
||||||
|
Physical backup (copy data directory or use xtrabackup) - quick backup/recovery. Copying data directory requires locking or shut down. xtrabackup is an improvement because it supports backups without shutting down (hot backup).
|
||||||
|
|
||||||
|
Others - PITR, snapshots etc.
|
||||||
|
|
||||||
|
|
||||||
|
* Crash recovery process using redo logs
|
||||||
|
|
||||||
|
After a crash, when you restart the server, it reads the redo logs and replays the modifications to recover.
|
||||||
|
|
||||||
|
|
||||||
|
* Monitoring MySQL
|
||||||
|
|
||||||
|
Key MySQL metrics: reads, writes, query runtime, errors, slow queries, connections, running threads, InnoDB metrics
|
||||||
|
|
||||||
|
Key OS metrics: CPU, load, memory, disk I/O, network
|
||||||
|
|
||||||
|
|
||||||
|
* Replication
|
||||||
|
|
||||||
|
Copies data from one instance to one or more instances. Helps in horizontal scaling, data protection, analytics and performance. It uses a binlog dump thread on the primary, and replication I/O and SQL threads on the secondary. Strategies include the standard asynchronous, semi-synchronous and group replication.
|
||||||
|
|
||||||
|
* High Availability
|
||||||
|
|
||||||
|
Ability to cope with failure at software, hardware and network level. Essential for anyone who needs 99.9%+ uptime. Can be implemented with replication or clustering solutions from MySQL, Percona, Oracle etc. Requires expertise to set up and maintain. Failover can be manual, scripted or using tools like Orchestrator.
|
||||||
|
|
||||||
|
* [Data directory](https://dev.mysql.com/doc/refman/8.0/en/data-directory.html)
|
||||||
|
|
||||||
|
Data is stored in a particular directory, with nested directories for the data contained in each database. There are also MySQL log files, InnoDB log files, server process ID file and some other configs. The data directory is configurable.
|
||||||
|
|
||||||
|
* [MySQL configuration](https://dev.mysql.com/doc/refman/5.7/en/server-configuration.html)
|
||||||
|
|
||||||
|
This can be done by passing [parameters during startup](https://dev.mysql.com/doc/refman/5.7/en/server-options.html), or in a [file](https://dev.mysql.com/doc/refman/8.0/en/option-files.html). There are a few [standard paths](https://dev.mysql.com/doc/refman/8.0/en/option-files.html#option-file-order) where MySQL looks for config files; `/etc/my.cnf` is one of the commonly used paths. These options are organized under headers (mysqld for server and mysql for client); you can explore them more in the lab that follows.
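To illustrate how options are grouped under headers, here is a small sketch that lists the options belonging to one section of an option file. `section_opts` is a hypothetical helper written for this example, and the sample file is made up; it is not how MySQL itself parses its config.

```shell
# A made-up option file with a server section and a client section.
cnf=$(mktemp)
cat > "$cnf" <<'EOF'
[mysqld]
# server settings
slow_query_log=1
long_query_time=0.1

[mysql]
user=root
EOF

# section_opts SECTION FILE
# Print the option lines that appear under [SECTION] in FILE.
section_opts() {
  awk -v s="[$1]" '
    /^\[/ { in_sec = ($0 == s); next }       # a header starts a new section
    in_sec && NF && $0 !~ /^#/ { print }     # skip blanks and comments
  ' "$2"
}

section_opts mysqld "$cnf"
```

The server reads only the sections meant for it (such as `[mysqld]`), while the `mysql` client reads `[mysql]`, so the same file can safely carry settings for both.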
|
||||||
|
|
||||||
|
* [Logs](https://dev.mysql.com/doc/refman/5.7/en/server-logs.html)
|
||||||
|
|
||||||
|
MySQL has logs for various purposes - general query log, error log, binary logs (for replication), slow query log. Only the error log is enabled by default (to reduce I/O and storage requirements), although in MySQL 8.0 the binary log is also enabled by default. The others can be enabled when required, by specifying config parameters at startup or running commands at runtime. [Log destination](https://dev.mysql.com/doc/refman/5.7/en/log-destinations.html) can also be tweaked with config parameters.
|