# Big Data

## Prerequisites

- Basics of Linux file systems
- Basic understanding of system design

## What to expect from this course

This course covers the basics of Big Data and how it has evolved to become what it is today. We will take a look at a few realistic scenarios where Big Data would be a perfect fit. An interesting assignment on designing a Big Data system is followed by understanding the architecture of Hadoop and the tooling around it.

## What is not covered under this course

Writing programs to draw analytics from data.

## Course Contents

1. [Overview of Big Data](https://linkedin.github.io/school-of-sre/big_data/intro/#overview-of-big-data)
2. [Usage of Big Data techniques](https://linkedin.github.io/school-of-sre/big_data/intro/#usage-of-big-data-techniques)
3. [Evolution of Hadoop](https://linkedin.github.io/school-of-sre/big_data/evolution/)
4. [Architecture of Hadoop](https://linkedin.github.io/school-of-sre/big_data/evolution/#architecture-of-hadoop)
    1. HDFS
    2. YARN
5. [MapReduce framework](https://linkedin.github.io/school-of-sre/big_data/evolution/#mapreduce-framework)
6. [Other tooling around Hadoop](https://linkedin.github.io/school-of-sre/big_data/evolution/#other-tooling-around-hadoop)
    1. Hive
    2. Pig
    3. Spark
    4. Presto
7. [Data Serialisation and storage](https://linkedin.github.io/school-of-sre/big_data/evolution/#data-serialisation-and-storage)
# Overview of Big Data

1. Big Data is a collection of large datasets that cannot be processed using traditional computing techniques. It is not a single technique or tool; rather, it has become a complete subject involving various tools, techniques, and frameworks.
2. Big Data could consist of:
    1. Structured data
    2. Unstructured data
    3. Semi-structured data
3. Characteristics of Big Data:
    1. Volume
    2. Variety
    3. Velocity
    4. Variability
4. Examples of Big Data generation include stock exchanges, social media sites, jet engines, etc.
# Usage of Big Data Techniques

1. Take the example of the traffic lights problem.
    1. There were more than 300,000 traffic lights in the US as of 2018.
    2. Let us assume that we placed a device on each of them to collect metrics and send them to a central metrics collection system.
    3. If each of the IoT devices sends 10 events per minute, we have 300,000 x 10 x 60 x 24 = 432 x 10^7 events per day.
    4. How would you go about processing that and telling me how many of the signals were "green" at 10:45 AM on a particular day?
2. Consider the next example on Unified Payments Interface (UPI) transactions:
    1. India had about 1.15 billion UPI transactions in the month of October 2019.
    2. If we try to extrapolate this data to about a year and try to find some common payments that were happening through a particular UPI ID, how do you suggest we go about that?
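As a quick sanity check, the daily event volume in the traffic lights example can be computed directly:

```shell
# Back-of-the-envelope volume for the traffic lights example:
# 300,000 lights x 10 events/min x 60 min x 24 h
echo $((300000 * 10 * 60 * 24))   # 4320000000 events/day, i.e. 432 x 10^7
```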
# Tasks and conclusion

## Post-training tasks

1. Try setting up your own 3-node Hadoop cluster.
    1. A VM-based solution can be found [here](http://hortonworks.com/wp-content/uploads/2015/04/Import_on_VBox_4_07_2015.pdf)
2. Write a simple Spark/MapReduce job of your choice and understand how to generate analytics from data.
    1. A sample dataset can be found [here](https://grouplens.org/datasets/movielens/)

## References

1. [Hadoop documentation](http://hadoop.apache.org/docs/current/)
2. [HDFS Architecture](http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)
3. [YARN Architecture](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)
4. [Google GFS paper](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/035fc972c796d33122033a0614bc94cff1527999.pdf)
# Conclusion

We have covered the basic concepts of NoSQL databases. There is much more to learn and do. We hope this course gives you a good start and inspires you to explore further.

# Further reading

NoSQL:

- [https://hostingdata.co.uk/nosql-database/](https://hostingdata.co.uk/nosql-database/)
- [https://www.mongodb.com/nosql-explained](https://www.mongodb.com/nosql-explained)
- [https://www.mongodb.com/nosql-explained/nosql-vs-sql](https://www.mongodb.com/nosql-explained/nosql-vs-sql)

CAP theorem:

- [http://www.julianbrowne.com/article/brewers-cap-theorem](http://www.julianbrowne.com/article/brewers-cap-theorem)

Scalability:

- [http://www.slideshare.net/jboner/scalability-availability-stability-patterns](http://www.slideshare.net/jboner/scalability-availability-stability-patterns)

Eventual consistency:

- [https://www.allthingsdistributed.com/2008/12/eventually_consistent.html](https://www.allthingsdistributed.com/2008/12/eventually_consistent.html)
- [https://www.toptal.com/big-data/consistent-hashing](https://www.toptal.com/big-data/consistent-hashing)
- [https://web.stanford.edu/class/cs244/papers/chord_TON_2003.pdf](https://web.stanford.edu/class/cs244/papers/chord_TON_2003.pdf)
# Conclusion

We have covered the basic concepts of SQL databases. We have also covered some of the tasks that an SRE may be responsible for; there is much more to learn and do. We hope this course gives you a good start and inspires you to explore further.

### Further reading

* More practice with online resources like [this one](https://www.w3resource.com/sql-exercises/index.php)
* [Normalization](https://beginnersbook.com/2015/05/normalization-in-dbms/)
* [Routines](https://dev.mysql.com/doc/refman/8.0/en/stored-routines.html), [triggers](https://dev.mysql.com/doc/refman/8.0/en/trigger-syntax.html)
* [Views](https://www.essentialsql.com/what-is-a-relational-database-view/)
* [Transaction isolation levels](https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html)
* [Sharding](https://www.digitalocean.com/community/tutorials/understanding-database-sharding)
* [Setting up HA](https://severalnines.com/database-blog/introduction-database-high-availability-mysql-mariadb), [monitoring](https://blog.serverdensity.com/how-to-monitor-mysql/), [backups](https://dev.mysql.com/doc/refman/8.0/en/backup-methods.html)
# Relational Databases

### Prerequisites

* Complete the [Linux course](https://linkedin.github.io/school-of-sre/linux_basics/intro/)
* Install Docker (for the lab section)

### What to expect from this course

You will gain an understanding of what relational databases are, their advantages, and some MySQL-specific concepts.

### What is not covered under this course

* In-depth implementation details
* Advanced topics like normalization and sharding
* Specific tools for administration

### Introduction

The main purpose of database systems is to manage data. This includes storage, adding new data, deleting unused data, updating existing data, retrieving data within a reasonable response time, and other maintenance tasks that keep the system running.
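These day-to-day data management operations map onto the four basic SQL statements. A minimal illustrative sketch (the `emp` table here is hypothetical, not the sample database used later in the lab):

```
-- hypothetical table, for illustration only
CREATE TABLE emp (emp_no INT PRIMARY KEY, first_name VARCHAR(50));

INSERT INTO emp VALUES (1, 'Sachin');                      -- add new data
SELECT * FROM emp WHERE emp_no = 1;                        -- retrieve data
UPDATE emp SET first_name = 'Rahul' WHERE emp_no = 1;      -- update existing data
DELETE FROM emp WHERE emp_no = 1;                          -- delete unused data
```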
### Pre-reads

[RDBMS Concepts](https://beginnersbook.com/2015/04/rdbms-concepts/)

### Course Contents

- [Key Concepts](https://linkedin.github.io/school-of-sre/databases_sql/concepts/)
- [MySQL Architecture](https://linkedin.github.io/school-of-sre/databases_sql/mysql/#mysql-architecture)
- [InnoDB](https://linkedin.github.io/school-of-sre/databases_sql/innodb/)
- [Backup and Recovery](https://linkedin.github.io/school-of-sre/databases_sql/backup_recovery/)
- [MySQL Replication](https://linkedin.github.io/school-of-sre/databases_sql/replication/)
- Operational Concepts
- [SELECT Query](https://linkedin.github.io/school-of-sre/databases_sql/select_query/)
- [Query Performance](https://linkedin.github.io/school-of-sre/databases_sql/query_performance/)
- [Lab](https://linkedin.github.io/school-of-sre/databases_sql/lab/)
- [Further Reading](https://linkedin.github.io/school-of-sre/databases_sql/conclusion/#further-reading)
**Prerequisites**

Install Docker.

**Setup**

Create a working directory named sos (or something similar), and cd into it.

Enter the following into a file named my.cnf under a directory named custom:

```
sos $ cat custom/my.cnf
[mysqld]
# These settings apply to the MySQL server
# You can set port, socket path, buffer size, etc.
# Below, we are configuring slow query settings
slow_query_log=1
slow_query_log_file=/var/log/mysqlslow.log
long_query_time=1
```

Start a container and enable the slow query log with the following:

```
sos $ docker run --name db -v custom:/etc/mysql/conf.d -e MYSQL_ROOT_PASSWORD=realsecret -d mysql:8
sos $ docker cp custom/my.cnf $(docker ps -qf "name=db"):/etc/mysql/conf.d/custom.cnf
sos $ docker restart $(docker ps -qf "name=db")
```

Import a sample database:

```
sos $ git clone git@github.com:datacharmer/test_db.git
sos $ docker cp test_db $(docker ps -qf "name=db"):/home/test_db/
sos $ docker exec -it $(docker ps -qf "name=db") bash
root@3ab5b18b0c7d:/# cd /home/test_db/
root@3ab5b18b0c7d:/# mysql -uroot -prealsecret mysql < employees.sql
root@3ab5b18b0c7d:/etc# touch /var/log/mysqlslow.log
root@3ab5b18b0c7d:/etc# chown mysql:mysql /var/log/mysqlslow.log
```
_Workshop 1: Run some sample queries_

Run the following:

```
$ mysql -uroot -prealsecret mysql
mysql>

# inspect DBs and tables
# the last 4 are MySQL internal DBs
mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| employees          |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+

mysql> use employees;
mysql> show tables;
+----------------------+
| Tables_in_employees  |
+----------------------+
| current_dept_emp     |
| departments          |
| dept_emp             |
| dept_emp_latest_date |
| dept_manager         |
| employees            |
| salaries             |
| titles               |
+----------------------+

# read a few rows
mysql> select * from employees limit 5;

# filter data by conditions
mysql> select count(*) from employees where gender = 'M' limit 5;

# find the count of particular data
mysql> select count(*) from employees where first_name = 'Sachin';
```
_Workshop 2: Use explain and explain analyze to profile a query, identify and add indexes required for improving performance_

```
# View all indexes on the table
# (\G outputs vertically; replace it with a ; to get table output)
mysql> show index from employees from employees\G
*************************** 1. row ***************************
        Table: employees
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: emp_no
    Collation: A
  Cardinality: 299113
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
      Visible: YES
   Expression: NULL

# This query uses an index, identified by the 'key' field
# By prefixing the explain keyword to the command,
# we get the query plan (including the key used)
mysql> explain select * from employees where emp_no < 10005\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 4
     filtered: 100.00
        Extra: Using where

# Compare that to the next query, which does not utilize any index
mysql> explain select first_name, last_name from employees where first_name = 'Sachin'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299113
     filtered: 10.00
        Extra: Using where

# Let's see how much time this query takes
mysql> explain analyze select first_name, last_name from employees where first_name = 'Sachin'\G
*************************** 1. row ***************************
EXPLAIN: -> Filter: (employees.first_name = 'Sachin')  (cost=30143.55 rows=29911) (actual time=28.284..3952.428 rows=232 loops=1)
    -> Table scan on employees  (cost=30143.55 rows=299113) (actual time=0.095..1996.092 rows=300024 loops=1)

# Cost (estimated by the query planner) is 30143.55
# actual time=28.284ms for the first row, 3952.428ms for all rows
# Now let's try adding an index and running the query again
mysql> create index idx_firstname on employees(first_name);
Query OK, 0 rows affected (1.25 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain analyze select first_name, last_name from employees where first_name = 'Sachin';
+--------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                    |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index lookup on employees using idx_firstname (first_name='Sachin')  (cost=81.20 rows=232) (actual time=0.551..2.934 rows=232 loops=1)  |
+--------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

# Actual time=0.551ms for the first row
# 2.934ms for all rows. A huge improvement!
# Also notice that the query involves only an index lookup,
# and no table scan (reading all rows of the table),
# which vastly reduces load on the DB.
```
_Workshop 3: Identify slow queries on a MySQL server_

```
# Run the command below in two terminal tabs to open two shells into the container.
docker exec -it $(docker ps -qf "name=db") bash

# Open a mysql prompt in one of them and execute this command
# We have configured logging of queries that take longer than 1s,
# so this sleep(3) will be logged
mysql -uroot -prealsecret mysql
mysql> select sleep(3);

# Now, in the other terminal, tail the slow log to find details about the query
root@62c92c89234d:/etc# tail -f /var/log/mysqlslow.log
/usr/sbin/mysqld, Version: 8.0.21 (MySQL Community Server - GPL). started with:
Tcp port: 3306  Unix socket: /var/run/mysqld/mysqld.sock
Time                 Id Command    Argument
# Time: 2020-11-26T14:53:44.822348Z
# User@Host: root[root] @ localhost []  Id:     9
# Query_time: 5.404938  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 1
use employees;
# Time: 2020-11-26T14:53:58.015736Z
# User@Host: root[root] @ localhost []  Id:     9
# Query_time: 10.000225  Lock_time: 0.000000  Rows_sent: 1  Rows_examined: 1
SET timestamp=1606402428;
select sleep(3);
```

These were simulated examples with minimal complexity. In real life, queries would be much more complex, and the explain/analyze output and slow query logs would have more details.
### MySQL architecture

![alt_text](images/mysql_architecture.png "MySQL architecture diagram")

MySQL's architecture enables you to select the right storage engine for your needs, and abstracts away all implementation details from the end users (application engineers and [DBAs](https://en.wikipedia.org/wiki/Database_administrator)), who only need to know a consistent, stable API.

Application layer:

* Connection handling - each client gets its own connection, which is cached for the duration of access
* Authentication - the server checks the (username, password, host) info of the client and allows/rejects the connection
* Security - the server determines whether the client has privileges to execute each query (check with the _show privileges_ command)

Server layer:

* Services and utilities - backup/restore, replication, cluster, etc.
* SQL interface - clients run queries for data access and manipulation
* SQL parser - creates a parse tree from the query (lexical/syntactic/semantic analysis and code generation)
* Optimizer - optimizes queries using various algorithms and the data available to it (table-level stats); modifies queries, the order of scanning, indexes to use, etc. (check with the explain command)
* Caches and buffers - the cache stores query results, and the buffer pool (InnoDB) stores table and index data in [LRU](https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)) fashion

Storage engine options:

* InnoDB: the most widely used; transaction support, ACID compliant, supports row-level locking, crash recovery and multi-version concurrency control. Default since MySQL 5.5.
* MyISAM: fast, does not support transactions, provides table-level locking, great for read-heavy workloads, used mostly in web and data warehousing. Default up to MySQL 5.1.
* Archive: optimised for high-speed inserts, compresses data as it is inserted, does not support transactions; ideal for storing and retrieving large amounts of seldom-referenced historical, archived data.
* Memory: tables in memory. The fastest engine; supports table-level locking, does not support transactions; ideal for creating temporary tables or quick lookups; data is lost after a shutdown.
* CSV: stores data in CSV files; great for integrating with other applications that use this format.
* ... etc.

It is possible to migrate from one storage engine to another. But this migration locks tables for all operations and is not online, as it changes the physical layout of the data. It takes a long time and is generally not recommended. Hence, choosing the right storage engine at the beginning is important.

The general guideline is to use InnoDB unless you have a specific need for one of the other storage engines.

Running `mysql> SHOW ENGINES;` shows you the supported engines on your MySQL server.
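As an illustrative sketch (assuming the lab's `employees` sample database from the lab section), you can check a table's current engine and convert it; note that the `ALTER` rewrites the whole table and locks it for the duration:

```
mysql> SHOW TABLE STATUS LIKE 'employees'\G
*************************** 1. row ***************************
  Name: employees
Engine: InnoDB
...

# Rewrites the table in the new engine; table is locked while this runs
mysql> ALTER TABLE employees ENGINE = MyISAM;
```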
* Explain and explain analyze

EXPLAIN <query> analyzes query plans from the optimizer, including how tables are joined, which tables/rows are scanned, etc.

Explain analyze shows the above plus additional info like execution cost, number of rows returned, time taken, etc.

This knowledge is useful for tweaking queries and adding indexes.

Watch this performance tuning [tutorial video](https://www.youtube.com/watch?v=pjRTLPeUOug).

Check out the [lab section](https://linkedin.github.io/school-of-sre/databases_sql/lab/) for hands-on practice with indexes.

* [Slow query logs](https://dev.mysql.com/doc/refman/5.7/en/slow-query-log.html)

Used to identify slow queries (with a configurable threshold); enabled in config or dynamically with a query.

Check out the [lab section](https://linkedin.github.io/school-of-sre/databases_sql/lab/) about identifying slow queries.

* User management

This includes creation of and changes to users, like managing privileges, changing passwords, etc.

* Backup and restore strategies, pros and cons

Logical backup using mysqldump - slower, but can be done online.

Physical backup (copy the data directory or use xtrabackup) - quick backup/recovery. Copying the data directory requires locking or a shutdown. xtrabackup is an improvement because it supports backups without shutting down (hot backup).

Others - PITR, snapshots, etc.
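A minimal sketch of the logical approach, assuming the lab container and the `employees` sample database from the lab section (`--single-transaction` keeps the backup online for InnoDB tables; the `employees_copy` database name is illustrative):

```
# Logical backup of one database, then restore it into a fresh database
$ mysqldump -uroot -prealsecret --single-transaction employees > employees_dump.sql
$ mysql -uroot -prealsecret -e "create database employees_copy"
$ mysql -uroot -prealsecret employees_copy < employees_dump.sql
```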
* Crash recovery process using redo logs

After a crash, when you restart the server, it reads the redo logs and replays modifications to recover.

* Monitoring MySQL

Key MySQL metrics: reads, writes, query runtime, errors, slow queries, connections, running threads, InnoDB metrics.

Key OS metrics: CPU, load, memory, disk I/O, network.

* Replication

Copies data from one instance to one or more instances. Helps in horizontal scaling, data protection, analytics and performance. There is a binlog dump thread on the primary, and replication I/O and SQL threads on the secondary. Strategies include the standard async, semi-async or group replication.

* High availability

The ability to cope with failure at the software, hardware and network level. Essential for anyone who needs 99.9%+ uptime. Can be implemented with replication or clustering solutions from MySQL, Percona, Oracle, etc. Requires expertise to set up and maintain. Failover can be manual, scripted or done using tools like Orchestrator.

* [Data directory](https://dev.mysql.com/doc/refman/8.0/en/data-directory.html)

Data is stored in a particular directory, with nested directories for the data contained in each database. There are also MySQL log files, InnoDB log files, the server process ID file and some other configs. The data directory is configurable.

* [MySQL configuration](https://dev.mysql.com/doc/refman/5.7/en/server-configuration.html)

This can be done by passing [parameters during startup](https://dev.mysql.com/doc/refman/5.7/en/server-options.html) or in a [file](https://dev.mysql.com/doc/refman/8.0/en/option-files.html). There are a few [standard paths](https://dev.mysql.com/doc/refman/8.0/en/option-files.html#option-file-order) where MySQL looks for config files; `/etc/my.cnf` is one of the commonly used paths. These options are organized under headers (mysqld for the server and mysql for the client); you can explore them more in the lab that follows.

* [Logs](https://dev.mysql.com/doc/refman/5.7/en/server-logs.html)

MySQL has logs for various purposes - general query log, errors, binary logs (for replication), slow query log. Only the error log is enabled by default (to reduce I/O and storage requirements); the others can be enabled when required - by specifying config parameters at startup or running commands at runtime. The [log destination](https://dev.mysql.com/doc/refman/5.7/en/log-destinations.html) can also be tweaked with config parameters.
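For example, the slow query log can be switched on at runtime without a restart, using standard MySQL system variables:

```
mysql> SET GLOBAL slow_query_log = 'ON';
mysql> SET GLOBAL long_query_time = 1;
mysql> SHOW VARIABLES LIKE 'slow_query_log%';
```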
# Working With Branches

Coming back to our local repo, which has two commits. So far, what we have is a single line of history; commits are chained in a single line. But sometimes you may need to work on two different features in parallel in the same repo. One option could be making a new folder/repo with the same code and using that for the other feature's development. But there's a better way: use _branches._ Since git follows a tree-like structure for commits, we can use branches to work on different sets of features. From a commit, two or more branches can be created, and branches can also be merged.

Using branches, there can exist multiple lines of history, and we can checkout any of them and work on it. Checking out, as we discussed earlier, simply means replacing the contents of the directory (repo) with the snapshot at the checked-out version.

Let's create a branch and see what it looks like:

```bash
$ git branch b1
$ git log --oneline --graph
* 7f3b00e (HEAD -> master, b1) adding file 2
* df2fb7a adding file 1
```

We create a branch called `b1`. Git log tells us that b1 also points to the last commit (7f3b00e), but `HEAD` is still pointing to master. If you remember, HEAD points to the commit/reference you are checked out on. So if we checkout to `b1`, HEAD should point to that. Let's confirm:

```bash
$ git checkout b1
Switched to branch 'b1'
$ git log --oneline --graph
* 7f3b00e (HEAD -> b1, master) adding file 2
* df2fb7a adding file 1
```

`b1` still points to the same commit, but HEAD now points to `b1`. Since we created the branch at commit `7f3b00e`, there will be two lines of history starting from this commit. Depending on which branch you are checked out on, that line of history will progress.

At this moment, we are checked out on branch `b1`, so making a new commit will advance the branch reference `b1` to that commit, and the current `b1` commit will become its parent. Let's do that.

```bash
# Creating a file and making a commit
$ echo "I am a file in b1 branch" > b1.txt
$ git add b1.txt
$ git commit -m "adding b1 file"
[b1 872a38f] adding b1 file
 1 file changed, 1 insertion(+)
 create mode 100644 b1.txt

# The new line of history
$ git log --oneline --graph
* 872a38f (HEAD -> b1) adding b1 file
* 7f3b00e (master) adding file 2
* df2fb7a adding file 1
$
```

Do note that master is still pointing to the old commit it was pointing to. We can now checkout the master branch and make commits there. This will result in another line of history starting from commit 7f3b00e.

```bash
# checkout to master branch
$ git checkout master
Switched to branch 'master'

# Creating a new commit on master branch
$ echo "new file in master branch" > master.txt
$ git add master.txt
$ git commit -m "adding master.txt file"
[master 60dc441] adding master.txt file
 1 file changed, 1 insertion(+)
 create mode 100644 master.txt

# The history line
$ git log --oneline --graph
* 60dc441 (HEAD -> master) adding master.txt file
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

Notice how branch b1 is not visible here, since we are on master. Let's try to visualize both to get the whole picture:

```bash
$ git log --oneline --graph --all
* 60dc441 (HEAD -> master) adding master.txt file
| * 872a38f (b1) adding b1 file
|/
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

The tree structure above should make things clear. Notice a clear branch/fork at commit 7f3b00e. This is how we create branches. They are now two separate lines of history on which feature development can be done independently.

**To reiterate, internally, git is just a tree of commits. Branch names (human readable) are pointers to those commits in the tree. We use various git commands to work with the tree structure and references. Git accordingly modifies the contents of our repo.**
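You can see this for yourself: a branch is literally a small file under `.git/refs/heads` containing a commit hash. A quick sketch in a throwaway repo (the repo name, identity and commit message here are illustrative):

```shell
# Create a scratch repo with one commit and one branch
git init scratch && cd scratch
git -c user.name=demo -c user.email=demo@example.com commit --allow-empty -m "first"
git branch b1

# The branch is just a file holding the commit hash that HEAD points to
cat .git/refs/heads/b1
git rev-parse HEAD          # prints the same hash
```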
||||
|
||||
## Merges
|
||||
|
||||
Now say the feature you were working on branch `b1` is complete and you need to merge it on master branch, where all the final version of code goes. So first you will checkout to branch master and then you pull the latest code from upstream (eg: GitHub). Then you need to merge your code from `b1` into master. There could be two ways this can be done.
|
||||
|
||||
Here is the current history:
|
||||
|
||||
```bash
|
||||
$ git log --oneline --graph --all
|
||||
* 60dc441 (HEAD -> master) adding master.txt file
|
||||
| * 872a38f (b1) adding b1 file
|
||||
|/
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
```
|
||||
|
||||
**Option 1: Directly merge the branch.** Merging the branch b1 into master will result in a new merge commit. This will merge changes from two different lines of history and create a new commit of the result.
|
||||
|
||||
```bash
|
||||
$ git merge b1
|
||||
Merge made by the 'recursive' strategy.
|
||||
b1.txt | 1 +
|
||||
1 file changed, 1 insertion(+)
|
||||
create mode 100644 b1.txt
|
||||
$ git log --oneline --graph --all
|
||||
* 8fc28f9 (HEAD -> master) Merge branch 'b1'
|
||||
|\
|
||||
| * 872a38f (b1) adding b1 file
|
||||
* | 60dc441 adding master.txt file
|
||||
|/
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
```
|
||||
|
||||
You can see a new merge commit created (8fc28f9). You will be prompted for the commit message. If there are a lot of branches in the repo, this result will end-up with a lot of merge commits. Which looks ugly compared to a single line of history of development. So let's look at an alternative approach
|
||||
|
||||
First let's [reset](https://git-scm.com/docs/git-reset) our last merge and go to the previous state.
|
||||
|
||||
```bash
|
||||
$ git reset --hard 60dc441
|
||||
HEAD is now at 60dc441 adding master.txt file
|
||||
$ git log --oneline --graph --all
|
||||
* 60dc441 (HEAD -> master) adding master.txt file
|
||||
| * 872a38f (b1) adding b1 file
|
||||
|/
|
||||
* 7f3b00e adding file 2
|
||||
* df2fb7a adding file 1
|
||||
```
|
||||
|
||||
**Option 2: Rebase.** Now, instead of merging the two branches, which have a common base (commit 7f3b00e), let us rebase branch b1 onto the current master. **What this means is: take branch `b1` (from commit 7f3b00e to commit 872a38f) and rebase it (put its commits on top of) master (60dc441).**

```bash
# Switch to b1
$ git checkout b1
Switched to branch 'b1'

# Rebase (b1 which is current branch) on master
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: adding b1 file

# The result
$ git log --oneline --graph --all
* 5372c8f (HEAD -> b1) adding b1 file
* 60dc441 (master) adding master.txt file
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

You can see `b1` had one commit, whose parent was `7f3b00e`. But since we rebased it on master, `60dc441` becomes the parent now. As a side effect, you can also see it has become a single line of history. Now, if we were to merge `b1` into `master`, it would simply mean changing `master` to point to `5372c8f`, which is `b1`'s head. Let's try it:
```bash
# checkout master since we want to merge code into master
$ git checkout master
Switched to branch 'master'

# the current history, where b1 is based on master
$ git log --oneline --graph --all
* 5372c8f (b1) adding b1 file
* 60dc441 (HEAD -> master) adding master.txt file
* 7f3b00e adding file 2
* df2fb7a adding file 1

# Performing the merge, notice the "fast-forward" message
$ git merge b1
Updating 60dc441..5372c8f
Fast-forward
 b1.txt | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 b1.txt

# The result
$ git log --oneline --graph --all
* 5372c8f (HEAD -> master, b1) adding b1 file
* 60dc441 adding master.txt file
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

Now you see both `b1` and `master` are pointing to the same commit. Your code has been merged into the master branch and it can be pushed. And we have a clean line of history! :D
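The whole rebase-then-fast-forward flow above can be replayed end to end in a throwaway repository. The script below is a sketch of that flow; the temp directory, identity config, and `-q` flags are additions needed to make it self-contained, while file names and commit messages mirror the walkthrough.

```shell
#!/bin/sh
# Sketch: replay the rebase + fast-forward merge flow in a throwaway repo.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b master               # make the branch name match the examples
git config user.email "demo@example.com"   # identity is required for commits
git config user.name "Demo"

echo "I am file 1" > file1.txt; git add file1.txt; git commit -qm "adding file 1"
echo "I am file 2" > file2.txt; git add file2.txt; git commit -qm "adding file 2"

git checkout -q -b b1                   # branch off and add a commit
echo "b1" > b1.txt; git add b1.txt; git commit -qm "adding b1 file"

git checkout -q master                  # master moves ahead independently
echo "m" > master.txt; git add master.txt; git commit -qm "adding master.txt file"

git checkout -q b1
git rebase -q master                    # replay b1's commit on top of master
git checkout -q master
git merge b1 | grep Fast-forward        # the merge is now a fast-forward
git log --oneline                       # four commits, a single line of history
```

After the rebase, both refs end up on the same commit, which is exactly what makes the merge a fast-forward.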
## What next from here?

There are a lot of git commands and features which we have not explored here. With the base built up, be sure to explore concepts like:

- Cherry-pick
- Squash
- Amend
- Stash
- Reset
# Git

## Prerequisites

1. Have Git installed: [https://git-scm.com/downloads](https://git-scm.com/downloads)
2. Have taken a high-level git tutorial, such as one of the following LinkedIn Learning courses:
    - [https://www.linkedin.com/learning/git-essential-training-the-basics/](https://www.linkedin.com/learning/git-essential-training-the-basics/)
    - [https://www.linkedin.com/learning/git-branches-merges-and-remotes/](https://www.linkedin.com/learning/git-branches-merges-and-remotes/)
    - [The Official Git Docs](https://git-scm.com/doc)

## What to expect from this course

As an engineer in the field of computer science, knowledge of version control tools is almost a requirement. While there are a lot of version control tools that exist today, like SVN and Mercurial, Git is perhaps the most widely used one, and in this course we will be working with Git. While this course does not start with Git 101 and expects basic knowledge of git as a prerequisite, it will reintroduce the git concepts you already know, with details covering what is happening under the hood as you execute various git commands. So that next time you run a git command, you will be able to press enter more confidently!

## What is not covered under this course

Advanced usage and specifics of internal implementation details of Git.

## Course Contents

1. [Git Basics](https://linkedin.github.io/school-of-sre/git/git-basics/#git-basics)
2. [Working with Branches](https://linkedin.github.io/school-of-sre/git/branches/)
3. [Git with Github](https://linkedin.github.io/school-of-sre/git/github-hooks/#git-with-github)
4. [Hooks](https://linkedin.github.io/school-of-sre/git/github-hooks/#hooks)

## Git Basics
Though you might be aware already, let's revisit why we need a version control system. As a project grows and multiple developers start working on it, an efficient method for collaboration is needed. Git helps the team collaborate easily and also maintains the history of the changes happening to the codebase.

### Creating a Git Repo

Any folder can be converted into a git repository. After executing the following command, we will see a `.git` folder within our folder, which makes it a git repository. **All the magic that git does is enabled by the `.git` folder.**
```bash
# creating an empty folder and changing the current dir to it
$ cd /tmp
$ mkdir school-of-sre
$ cd school-of-sre/

# initialize a git repo
$ git init
Initialized empty Git repository in /private/tmp/school-of-sre/.git/
```
As the output says, an empty git repo has been initialized in our folder. Let's take a look at what is there.

```bash
$ ls .git/
HEAD        config      description hooks       info        objects     refs
```

There are a bunch of folders and files in the `.git` folder. As said, all of these enable git to do its magic. We will look into some of these folders and files, but for now, what we have is an empty git repository.
### Tracking a File

Now, let us create a new file in our repo (we will refer to the folder as _repo_ from now on) and see git status.
```bash
$ echo "I am file 1" > file1.txt
$ git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)

	file1.txt

nothing added to commit but untracked files present (use "git add" to track)
```
The current git status says `No commits yet` and there is one untracked file. Since we just created the file, git is not tracking it. We explicitly need to ask git to track files and folders (also check out [gitignore](https://git-scm.com/docs/gitignore)), and we do that via the `git add` command, as suggested in the above output. Then we go ahead and create a commit.
```bash
$ git add file1.txt
$ git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)

	new file:   file1.txt

$ git commit -m "adding file 1"
[master (root-commit) df2fb7a] adding file 1
 1 file changed, 1 insertion(+)
 create mode 100644 file1.txt
```
Notice how after adding the file, git status says `Changes to be committed:`. Whatever is listed there will be included in the next commit. Then we go ahead and create a commit, with a message attached via `-m`.
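The same untracked, staged, and committed states can also be watched with `git status --porcelain`, a script-friendly form of the status output. A small self-contained sketch (the temp repo and identity config are setup additions, not part of the walkthrough):

```shell
#!/bin/sh
# Sketch: watch a file go from untracked -> staged -> committed.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # identity is required for commits
git config user.name "Demo"

echo "I am file 1" > file1.txt
git status --porcelain     # "?? file1.txt" -> untracked
git add file1.txt
git status --porcelain     # "A  file1.txt" -> staged for the next commit
git commit -qm "adding file 1"
git status --porcelain     # no output -> working tree clean
```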
### More About a Commit

A commit is a snapshot of the repo. Whenever a commit is made, a snapshot of the current state of the repo (the folder) is taken and saved. Each commit has a unique ID (`df2fb7a` for the commit we made in the previous step). As we keep adding/changing contents and keep making commits, all those snapshots are stored by git. Again, all this magic happens inside the `.git` folder. This is where all the snapshots or versions are stored, _in an efficient manner._
### Adding More Changes

Let us create one more file and commit the change. It will look similar to the previous commit we made.

```bash
$ echo "I am file 2" > file2.txt
$ git add file2.txt
$ git commit -m "adding file 2"
[master 7f3b00e] adding file 2
 1 file changed, 1 insertion(+)
 create mode 100644 file2.txt
```
A new commit with ID `7f3b00e` has been created. You can issue `git status` at any time to see the state of the repository.

**IMPORTANT: Note that commit IDs are long strings (SHAs), but we can also refer to a commit by its first few (8 or more) characters. We will use shorter and longer commit IDs interchangeably.**

Now that we have two commits, let's visualize them:

```bash
$ git log --oneline --graph
* 7f3b00e (HEAD -> master) adding file 2
* df2fb7a adding file 1
```

`git log`, as the name suggests, prints the log of all the git commits. Here you see two additional arguments: `--oneline` prints the shorter version of the log, i.e. the commit message only, and not who made the commit and when; `--graph` prints it in graph format.

**Right now the commits might look like just one per line, but all commits are stored as a tree-like data structure internally by git. That means there can be two or more child commits of a given commit, not just a single line of commits. We will look more into this in the Branches section. For now, this is our commit history:**

```bash
df2fb7a ===> 7f3b00e
```
### Are commits really linked?

As just said, the two commits we made are linked via a tree-like data structure, and we saw how they are linked. But let's actually verify it. Everything in git is an object. Newly created files are stored as objects. Changes to files are stored as objects. Even commits are objects. To view the contents of an object, we can use the following command with the object's ID. Let's take a look at the contents of the second commit.

```bash
$ git cat-file -p 7f3b00e
tree ebf3af44d253e5328340026e45a9fa9ae3ea1982
parent df2fb7a61f5d40c1191e0fdeb0fc5d6e7969685a
author Sanket Patel <spatel1@linkedin.com> 1603273316 -0700
committer Sanket Patel <spatel1@linkedin.com> 1603273316 -0700

adding file 2
```

Take note of the `parent` attribute in the above output. It points to the commit ID of the first commit we made. So this proves that they are linked! Additionally, you can see the second commit's message in this object. As said, all this magic is enabled by the `.git` folder, and the object we are looking at is also in that folder.

```bash
$ ls .git/objects/7f/3b00eaa957815884198e2fdfec29361108d6a9
.git/objects/7f/3b00eaa957815884198e2fdfec29361108d6a9
```
It is stored in the `.git/objects/` folder. All the files, and the changes to them, are stored in this folder as well.
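Besides `-p` (pretty-print), `git cat-file -t` prints an object's type. A small sketch in a fresh repo, showing that a commit object points to a tree object (the snapshot), which in turn lists the file's blob; the temp-repo setup is an addition to keep it runnable:

```shell
#!/bin/sh
# Sketch: inspect the object types behind a commit.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"   # identity is required for commits
git config user.name "Demo"
echo "I am file 1" > file1.txt
git add file1.txt
git commit -qm "adding file 1"

commit=$(git rev-parse HEAD)            # ID of the commit object
tree=$(git rev-parse "HEAD^{tree}")     # ID of the tree (snapshot) it points to

git cat-file -t "$commit"               # prints: commit
git cat-file -t "$tree"                 # prints: tree
git cat-file -p "$tree"                 # lists the blob entry for file1.txt
```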
### The Version Control part of Git

We can already see two commits (versions) in our git log. One thing a version control tool gives you is the ability to browse back and forth in history. For example: some of your users are running an old version of the code and they are reporting an issue. In order to debug the issue, you need access to the old code; the one in your current repo is the latest code. In this example, you are working on the second commit (7f3b00e) and someone reported an issue with the code snapshot at commit (df2fb7a). This is how you would get access to the code at any older commit:

```bash
# Current contents, two files present
$ ls
file1.txt file2.txt

# checking out to (an older) commit
$ git checkout df2fb7a
Note: checking out 'df2fb7a'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at df2fb7a adding file 1

# checking contents, can verify it has old contents
$ ls
file1.txt
```

So this is how we get access to old versions/snapshots. All we need is a _reference_ to that snapshot. Upon executing `git checkout ...`, what git does for you is use the `.git` folder, see what the state of things (files and folders) was at that version/reference, and replace the contents of the current directory with those contents. The then-existing contents will no longer be present in the local dir (repo), but we can and will still get access to them, because they are tracked via a git commit and the `.git` folder has them stored.
### Reference

I mentioned in the previous section that we need a _reference_ to the version. By default, a git repo is made of a tree of commits, and each commit has a unique ID. But the unique ID is not the only way to reference a commit; there are multiple ways. For example: `HEAD` is a reference to the current commit. _Whatever commit your repo is checked out at, `HEAD` will point to that._ `HEAD~1` is a reference to the previous commit. So while checking out the previous version in the section above, we could have done `git checkout HEAD~1`.

Similarly, `master` is also a reference (to a branch). Since git uses a tree-like structure to store commits, there of course will be branches. And the default branch is called `master`. `master` (or any branch reference) will point to the latest commit in the branch. Even though we have checked out the previous commit in our repo, `master` still points to the latest commit, and we can get back to the latest version by checking out the `master` reference.

```bash
$ git checkout master
Previous HEAD position was df2fb7a adding file 1
Switched to branch 'master'

# now we will see latest code, with two files
$ ls
file1.txt file2.txt
```

Note: instead of `master` in the above command, we could have used the commit's ID as well.
### References and The Magic

Let's look at the state of things: two commits, with the `master` and `HEAD` references pointing to the latest commit.

```bash
$ git log --oneline --graph
* 7f3b00e (HEAD -> master) adding file 2
* df2fb7a adding file 1
```

The magic? Let's examine these files:

```bash
$ cat .git/refs/heads/master
7f3b00eaa957815884198e2fdfec29361108d6a9
```

Voila! Where master points to is stored in a file. **Whenever git needs to know where the master reference points, or needs to update where master points, it just has to read or update this file.** So when you create a new commit, it is created on top of the current commit and the master file is updated with the new commit's ID.

Similarly, for the `HEAD` reference:

```bash
$ cat .git/HEAD
ref: refs/heads/master
```

We can see `HEAD` points to a reference called `refs/heads/master`. So `HEAD` will point wherever `master` points.
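Instead of cat-ing files inside `.git`, git also ships plumbing commands that read the same information: `git symbolic-ref` resolves `HEAD` to the branch it follows, and `git rev-parse` resolves any reference to a commit ID. A sketch (the temp-repo setup and identity config are additions):

```shell
#!/bin/sh
# Sketch: read HEAD and branch references via plumbing commands.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b master                  # match the branch name used above
git config user.email "demo@example.com"   # identity is required for commits
git config user.name "Demo"
echo "hello" > file1.txt; git add file1.txt; git commit -qm "first commit"

git symbolic-ref HEAD     # prints: refs/heads/master (same as cat .git/HEAD)
git rev-parse master      # the commit ID stored in the master ref
git rev-parse HEAD        # identical, since HEAD follows master
```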
### Little Adventure

We discussed how git updates these files as we execute commands. But let's try to do it ourselves, by hand, and see what happens.

```bash
$ git log --oneline --graph
* 7f3b00e (HEAD -> master) adding file 2
* df2fb7a adding file 1
```

Now let's change master to point to the previous/first commit.

```bash
$ echo df2fb7a61f5d40c1191e0fdeb0fc5d6e7969685a > .git/refs/heads/master
$ git log --oneline --graph
* df2fb7a (HEAD -> master) adding file 1

# RESETTING TO ORIGINAL
$ echo 7f3b00eaa957815884198e2fdfec29361108d6a9 > .git/refs/heads/master
$ git log --oneline --graph
* 7f3b00e (HEAD -> master) adding file 2
* df2fb7a adding file 1
```

We just edited the `master` reference file, and now we can see only the first commit in the git log. Undoing the change to the file brings the state back to the original. Not so much magic, is it?
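Editing files under `.git` by hand is fine for a demo, but git also provides `git update-ref` to move a reference safely (it validates the object ID and records the move in the reflog). A sketch of the same adventure using it; the temp-repo setup is an addition:

```shell
#!/bin/sh
# Sketch: move a branch reference back and forth with update-ref.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git checkout -q -b master
git config user.email "demo@example.com"   # identity is required for commits
git config user.name "Demo"
echo "1" > file1.txt; git add .; git commit -qm "adding file 1"
first=$(git rev-parse HEAD)
echo "2" > file2.txt; git add .; git commit -qm "adding file 2"
second=$(git rev-parse HEAD)

git update-ref refs/heads/master "$first"    # move master back, safely
git log --oneline                            # only the first commit shows now

git update-ref refs/heads/master "$second"   # restore the original state
```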
# Git with Github

Till now, all the operations we did were in our local repo, but git also helps us in a collaborative environment. GitHub is one place on the internet where you can centrally host your git repos and collaborate with other developers.

Most of the workflow remains the same as discussed, with the addition of a couple of things:

1. Pull: to pull the latest changes from the github (the central) repo
2. Push: to push your changes to the github repo so that they are available to everyone

GitHub has written nice guides and tutorials about this, and you can refer to them here:

- [GitHub Hello World](https://guides.github.com/activities/hello-world/)
- [Git Handbook](https://guides.github.com/introduction/git-handbook/)

## Hooks

Git has another nice feature called hooks. Hooks are basically scripts which will be called when a certain event happens. Here is where hooks are located:

```bash
$ ls .git/hooks/
applypatch-msg.sample     fsmonitor-watchman.sample pre-applypatch.sample     pre-push.sample           pre-receive.sample        update.sample
commit-msg.sample         post-update.sample        pre-commit.sample         pre-rebase.sample         prepare-commit-msg.sample
```

The names are self-explanatory. These hooks are useful when you want to do certain things when a certain event happens. For example, if you want to run tests before pushing code, you would set up a `pre-push` hook. Let's try to create a pre-commit hook.

```bash
$ echo "echo this is from pre commit hook" > .git/hooks/pre-commit
$ chmod +x .git/hooks/pre-commit
```

We basically created a file called `pre-commit` in the hooks folder and made it executable. Now if we make a commit, we should see the message getting printed.

```bash
$ echo "sample file" > sample.txt
$ git add sample.txt
$ git commit -m "adding sample file"
this is from pre commit hook # <===== THE MESSAGE FROM HOOK EXECUTION
[master 9894e05] adding sample file
 1 file changed, 1 insertion(+)
 create mode 100644 sample.txt
```
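As a sketch of the "run tests before pushing" idea, here is a minimal `pre-push` hook. Note that `./run_tests.sh` is a hypothetical test runner used for illustration; substitute whatever command runs your project's tests. A non-zero exit code from the hook aborts the push.

```shell
#!/bin/sh
# Sketch: a pre-push hook that would gate pushes on a test run.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Write the hook. "./run_tests.sh" is a placeholder, not a real script here.
cat > .git/hooks/pre-push <<'EOF'
#!/bin/sh
echo "running tests before push"
if [ -x ./run_tests.sh ]; then
    ./run_tests.sh || exit 1    # failing tests abort the push
fi
EOF
chmod +x .git/hooks/pre-push

.git/hooks/pre-push             # invoking it directly to see its output
```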
# Command Line Basics

## Lab Environment Setup

One can use an online bash interpreter to run all the commands that are provided as examples in this course. This will also help you get hands-on experience with various linux commands.

[REPL](https://repl.it/languages/bash) is one of the popular online bash interpreters for running linux commands. We will be using it for running all the commands mentioned in this course.

## What is a Command

A command is a program that tells the operating system to perform specific work. Programs are stored as files in linux; therefore, a command is also a file which is stored somewhere on the disk.

Commands may also take additional arguments as input from the user. These arguments are called command line arguments. Knowing how to use commands is important, and there are many ways to get help in Linux, especially for commands. Almost every command has some form of documentation; most commands have a command-line argument `-h` or `--help` that will display a reasonable amount of documentation. But the most popular documentation system in Linux is called man pages, short for manual pages.

Using `--help` to show the documentation for the ls command:

![](images/linux/commands/image19.png)
## File System Organization

The linux file system has a hierarchical (or tree-like) structure, with its highest-level directory called root (denoted by /). Directories present inside the root directory store files related to the system. These directories in turn can store system files, application files, or user-related files.

![](images/linux/commands/image17.png)

| Directory | Description |
| --- | --- |
| bin | The executable programs of the most commonly used commands reside in the bin directory |
| sbin | This directory contains programs used for system administration |
| home | This directory contains user-related files and directories |
| lib | This directory contains all the library files |
| etc | This directory contains all the system configuration files |
| proc | This directory contains files related to the running processes on the system |
| dev | This directory contains files related to devices on the system |
| mnt | This directory contains files related to mounted devices on the system |
| tmp | This directory is used to store temporary files on the system |
| usr | This directory is used to store application programs on the system |
## Commands for Navigating the File System

There are three basic commands which are used frequently to navigate the file system:

- ls
- pwd
- cd

We will now try to understand what each command does and how to use these commands. You should also practice the given examples on the online bash shell.

### pwd (print working directory)

At any given moment of time, we will be standing in a certain directory. To get the name of the directory in which we are standing, we can use the pwd command in linux.

![](images/linux/commands/image2.png)

We will now use the cd command to move to a different directory and then print the working directory.

![](images/linux/commands/image20.png)
### cd (change directory)

The cd command can be used to change the working directory. Using this command, you can move from one directory to another.

In the below example, we are initially in the root directory. We have then used the cd command to change the directory.

![](images/linux/commands/image3.png)

### ls (list files and directories)

The ls command is used to list the contents of a directory. It will list all the files and folders present in the given directory.

If we just type ls in the shell, it will list all the files and directories present in the current directory.

![](images/linux/commands/image7.png)

We can also provide a directory name as an argument to the ls command. It will then list all the files and directories inside the given directory.

![](images/linux/commands/image4.png)
## Commands for Manipulating Files

There are five basic commands which are used frequently to manipulate files:

- touch
- mkdir
- cp
- mv
- rm

We will now try to understand what each command does and how to use these commands. You should also practice the given examples on the online bash shell.

### touch (create new file)

The touch command can be used to create an empty new file. This command is very useful for many other purposes, but we will discuss the simplest use case of creating a new file.

General syntax of the touch command:

```
touch <file_name>
```

![](images/linux/commands/image9.png)

### mkdir (create new directories)

The mkdir command is used to create directories. You can use the ls command to verify that the new directory is created.

General syntax of the mkdir command:

```
mkdir <directory_name>
```

![](images/linux/commands/image11.png)
### rm (delete files and directories)

The rm command can be used to delete files and directories. It is very important to note that this command permanently deletes the files and directories. It's almost impossible to recover these files and directories once you have successfully executed rm on them, so run this command with care.

General syntax of the rm command:

```
rm <file_name>
```

Let's try to understand the rm command with an example. We will try to delete the file and directory we created using the touch and mkdir commands respectively.

![](images/linux/commands/image18.png)
### cp (copy files and directories)

The cp command is used to copy files and directories from one location to another. Do note that the cp command doesn't make any change to the original files or directories; the original files or directories and their copy both co-exist after running cp successfully.

General syntax of the cp command:

```
cp <source_path> <destination_path>
```

We are currently in the '/home/runner' directory. We will use the mkdir command to create a new directory named "test_directory". We will now try to copy the "\_test_runner.py" file to the directory we just created.

![](images/linux/commands/image23.png)

Do note that nothing happened to the original "\_test_runner.py" file. It's still there in the current directory. A new copy of it got created inside "test_directory".

![](images/linux/commands/image14.png)

We can also use the cp command to copy a whole directory from one location to another. Let's try to understand this with an example.

![](images/linux/commands/image12.png)

We again used the mkdir command to create a new directory called "another_directory". We then used the cp command along with an additional argument '-r' to copy "test_directory".

### mv (move files and directories)

The mv command can either be used to move files or directories from one location to another, or it can be used to rename files or directories. Do note that moving files and copying them are very different: when you move files or directories, the original copy is lost.

General syntax of the mv command:

```
mv <source_path> <destination_path>
```

In this example, we will use the mv command to move the "\_test_runner.py" file to "test_directory". In this case, this file already exists in "test_directory", so the mv command will just replace it. **Do note that the original file doesn't exist in the current directory after the mv command ran successfully.**

![](images/linux/commands/image26.png)

We can also use the mv command to move a directory from one location to another. In this case, we do not need to use the '-r' flag as we did with the cp command. Do note that the original directory will not exist if we use the mv command.

One of the important uses of the mv command is to rename files and directories. Let's see how we can use this command for renaming.

We first changed our location to "test_directory". We then used the mv command to rename the "\_test_runner.py" file to "test.py".

![](images/linux/commands/image29.png)
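The cp and mv behaviour described above can be checked in a scratch directory. The file and directory names below mirror the screenshots, but the file itself is a stand-in created for the demo:

```shell
#!/bin/sh
# Sketch: copy keeps the original, move does not, and mv also renames.
set -e
work=$(mktemp -d)
cd "$work"
echo "runner" > _test_runner.py        # stand-in for the file in the screenshots
mkdir test_directory

cp _test_runner.py test_directory/     # copy: the original stays in place
ls _test_runner.py test_directory/_test_runner.py

mv _test_runner.py test_directory/     # move: the original is gone
cd test_directory
mv _test_runner.py test.py             # mv also renames
ls                                     # prints: test.py
```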
## Commands for Viewing Files

There are five basic commands which are used frequently to view files:

- cat
- head
- tail
- more
- less

We will now try to understand what each command does and how to use these commands. You should also practice the given examples on the online bash shell.

We will create a new file called "numbers.txt" and insert numbers from 1 to 100 in this file, each number on a separate line.

![](images/linux/commands/image21.png)

Do not worry about the above command for now. It's an advanced command which is used to generate numbers. We have then used a redirection operator to push these numbers into the file. We will be discussing I/O redirection in later sections.
### cat

The simplest use of the cat command is to print the contents of a file on your output screen. This command is very useful and can be used for many other purposes; we will study other use cases later.

![](images/linux/commands/image1.png)

You can try to run the above command and you will see numbers from 1 to 100 being printed on your screen. You will need to scroll up to view all the numbers.

### head

The head command displays the first 10 lines of a file by default. We can include additional arguments to display as many lines as we want from the top.

In this example, we are only able to see the first 10 lines of the file when we use the head command.

![](images/linux/commands/image15.png)

By default, the head command will only display the first 10 lines. If we want to specify the number of lines we want to see from the start, use the '-n' argument to provide the input.

![](images/linux/commands/image16.png)

### tail

The tail command displays the last 10 lines of a file by default. We can include additional arguments to display as many lines as we want from the end of the file.

![](images/linux/commands/image22.png)

By default, the tail command will only display the last 10 lines. If we want to specify the number of lines we want to see from the end, use the '-n' argument to provide the input.

![](images/linux/commands/image10.png)

In this example, we are only able to see the last 5 lines of the file when we use the tail command with an explicit -n option.
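The head and tail defaults and the '-n' option can be verified quickly with the same numbers file, generated here with `seq`, the number-generating command used in the screenshot:

```shell
#!/bin/sh
# Sketch: head/tail defaults and the -n option on a 100-line file.
set -e
work=$(mktemp -d)
cd "$work"
seq 1 100 > numbers.txt      # one number per line, 1 to 100

head -n 3 numbers.txt        # first three lines: 1 2 3
tail -n 3 numbers.txt        # last three lines: 98 99 100
head numbers.txt | wc -l     # default is 10 lines
```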
### more

The more command displays the contents of a file or a command output, one screen at a time, in case the file is large (e.g. log files). It also allows forward navigation and limited backward navigation in the file.

![](images/linux/commands/image33.png)

The more command displays as much as can fit on the current screen and waits for user input to advance. Forward navigation can be done by pressing Enter, which advances the output by one line, and Space, which advances the output by one screen.

### less

The less command is an improved version of more. It displays the contents of a file or a command output, one page at a time. It allows backward navigation as well as forward navigation in the file, and also has search options. We can use the arrow keys to move backward or forward by one line. For moving forward by one page, press Space, and for moving backward by one page, press b on your keyboard. You can also go to the beginning and the end of a file instantly.

## Echo Command in Linux

The echo command is one of the simplest commands used in the shell. This command is equivalent to `print` in other programming languages.

The echo command prints the given input string on the screen.

![](images/linux/commands/image24.png)
## Text Processing Commands

In the previous section, we learned how to view the content of a file. In many cases, we will also be interested in performing the below operations:

- Print only the lines which contain a particular word(s)
- Replace a particular word with another word in a file
- Sort the lines in a particular order

There are three basic commands which are used frequently to process text:

- grep
- sed
- sort

We will now try to understand what each command does and how to use these commands. You should also practice the given examples on the online bash shell.

We will create a new file called "numbers.txt" and insert the numbers 1 to 10 in this file, each number on a separate line.

![](images/linux/commands/image8.png)

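The file above can also be created straight from the command line; one way, assuming the file name numbers.txt used throughout this section:

```shell
# Write the numbers 1 to 10 into numbers.txt, one per line
seq 1 10 > numbers.txt
cat numbers.txt
```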
### grep

The grep command in its simplest form searches for particular words in a text file. It displays all the lines in a file that contain the given input. The word we want to search for is provided as an input to the grep command.

General syntax of the grep command:

```
grep <word_to_search> <file_name>
```

In this example, we search for the string "1" in the file. The grep command outputs the lines where it found this string.

![](images/linux/commands/image5.png)

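The screenshot can be reproduced as follows (numbers.txt is regenerated so the block is self-contained):

```shell
seq 1 10 > numbers.txt

# Print every line containing the string "1" — matches the lines 1 and 10
grep "1" numbers.txt
```

Useful variations: `grep -i` for a case-insensitive match, `grep -v` to invert the match, and `grep -c` to count matching lines.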
### sed

The sed command in its simplest form can be used to replace text in a file.

General syntax of the sed command for replacement:

```
sed 's/<text_to_replace>/<replacement_text>/' <file_name>
```

Let's try to replace each occurrence of "1" in the file with "3" using the sed command.

![](images/linux/commands/image31.png)

Note that the content of the file does not change in the above example; the result is only printed to the screen. To write the changes back to the file, we have to use the extra argument '-i'.

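A self-contained sketch of both forms (numbers.txt is regenerated first; the in-place `-i` syntax shown is GNU sed — BSD/macOS sed expects `-i ''`):

```shell
seq 1 10 > numbers.txt

# Replace the first "1" on each line with "3";
# the result is printed, numbers.txt itself is unchanged
sed 's/1/3/' numbers.txt

# With -i the changes are written back to the file (GNU sed)
sed -i 's/1/3/' numbers.txt
head -n 1 numbers.txt   # the first line is now 3
```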
### sort

The sort command can be used to sort the input provided to it as an argument. By default, it sorts in increasing order.

Let's first look at the content of the file before sorting it.

![](images/linux/commands/image27.png)

Now, we sort the file using the sort command. Note that sort arranges the lines in lexicographical (string) order.

![](images/linux/commands/image32.png)

As with sed, the content of the file does not change in the above example.

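The lexicographic behaviour is easy to see with the numbers file, and the commonly used `-n` flag switches to numeric order (a sketch; numbers.txt is regenerated first):

```shell
seq 1 10 > numbers.txt

# Lexicographic (string) sort: "10" sorts before "2"
sort numbers.txt

# Numeric sort: 1, 2, ..., 10
sort -n numbers.txt
```

`sort -r` reverses the order, and `sort -u` drops duplicate lines.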
## I/O Redirection

Each open file gets assigned a file descriptor, a unique identifier for an open file in the system. There are always three default files open: stdin (the keyboard), stdout (the screen), and stderr (error messages output to the screen). These files can be redirected.

Everything is a file in Linux:
[https://unix.stackexchange.com/questions/225537/everything-is-a-file](https://unix.stackexchange.com/questions/225537/everything-is-a-file)

Until now, we have displayed all output on the screen, which is the standard output. We can use special operators to redirect the output of a command to a file, or even to the input of another command. I/O redirection is a very powerful feature.

In the example below, we use the '>' operator to redirect the output of the ls command to the file output.txt.

![](images/linux/commands/image30.png)

In the example below, we redirect the output of the echo command to a file.

![](images/linux/commands/image13.png)
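Both redirections can be reproduced as below (output.txt matches the screenshot; echo_out.txt is a hypothetical name). Note that '>' truncates an existing file, while '>>' appends:

```shell
# Redirect the output of ls to a file instead of the screen
ls > output.txt

# Redirect echo output to a file, then append a second line
echo "first line" > echo_out.txt
echo "second line" >> echo_out.txt

cat echo_out.txt
```

Standard error can be redirected separately with `2>`, e.g. `ls missing_dir 2> errors.txt`.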

We can also redirect the output of one command as input to another command. This is possible with the help of pipes.

In the example below, we pass the output of the cat command as input to the grep command using the pipe (|) operator.

![](images/linux/commands/image6.png)

In the example below, we pass the output of the sort command as input to the uniq command using the pipe (|) operator. The uniq command only prints the unique numbers from the input.

![](images/linux/commands/image28.png)
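A self-contained sketch of both pipelines (dup.txt is a hypothetical file; sorting before uniq matters because uniq only collapses adjacent duplicate lines):

```shell
# A file containing duplicate lines (dup.txt is a hypothetical name)
printf '3\n1\n2\n3\n1\n' > dup.txt

# cat's output becomes grep's input — prints the two lines containing "1"
cat dup.txt | grep "1"

# sort first, then uniq: prints 1, 2, 3
sort dup.txt | uniq
```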

I/O redirection:
[https://tldp.org/LDP/abs/html/io-redirection.html](https://tldp.org/LDP/abs/html/io-redirection.html)

# Conclusion

We have covered the basics of the Linux operating system and the commonly used Linux commands, including the Linux server administration commands.

We hope that this course will make it easier for you to operate on the command line.

## Applications in SRE Role

1. As an SRE, you will be required to perform some general tasks on Linux servers. You will also be using the command line when troubleshooting issues.
2. Moving from one location to another in the filesystem requires the help of the `ls`, `pwd` and `cd` commands.
3. You may need to search for specific information in log files; the `grep` command is very useful here. I/O redirection comes in handy if you want to store the output in a file or pass it as input to another command.
4. The `tail` command is very useful for viewing the latest data in a log file.
5. Different users will have different permissions depending on their roles. For security reasons, we also do not want everyone in the company to be able to access our servers. User permissions can be restricted with the `chown`, `chmod` and `chgrp` commands.
6. `ssh` is one of the most frequently used commands for an SRE. Logging into servers, troubleshooting, and performing basic administration tasks are only possible if we can log in to the server.
7. What if we want to run an Apache or NGINX server? We will first install it using the package manager, so package management commands are important here.
8. Managing services on servers is another critical responsibility of an SRE. Systemd-related commands can help in troubleshooting issues: if a service goes down, we can start it with the `systemctl start` command, and we can also stop a service when it is no longer needed.
9. Monitoring is another core responsibility of an SRE. Memory and CPU are two important system-level metrics that should be monitored. Commands like `top` and `free` are quite helpful here.
10. If a service is throwing an error, how do we find the root cause? We will certainly need to check the logs for the full stack trace of the error. The log file will also tell us how many times the error has occurred, along with when it started.

## Useful Courses and tutorials

* [Edx basic linux commands course](https://courses.edx.org/courses/course-v1:LinuxFoundationX+LFS101x+1T2020/course/)
* [Edx Red Hat Enterprise Linux Course](https://courses.edx.org/courses/course-v1:RedHat+RH066x+2T2017/course/)
* [https://linuxcommand.org/lc3_learning_the_shell.php](https://linuxcommand.org/lc3_learning_the_shell.php)