# Site Reliability Engineer (SRE) Interview Preparation Guide
[![HitCount](http://hits.dwyl.com/mxssl/sre-interview-prep-guide.svg)](http://hits.dwyl.com/mxssl/sre-interview-prep-guide)
This repository consolidates useful resources for Site Reliability Engineer (SRE) interview preparation.
## Basics
* Simple: [What happens when you type in www.cnn.com in your browser?](https://syedali.net/2013/08/18/what-happens-when-you-type-in-www-cnn-com-in-your-browser)
* Detailed: [What happens when you type google.com into your browser's address box and press enter?](https://github.com/alex/what-happens-when)
## Linux
### Boot Process
* [An introduction to the Linux boot and startup processes](https://opensource.com/article/17/2/linux-boot-and-startup)
* [What happens when we turn on computer?](https://www.cdn.geeksforgeeks.org/what-happens-when-we-turn-on-computer)
* [What happens when we turn on computer?](https://leetcode.com/discuss/interview-question/125107/What-happens-when-we-turn-on-computer)
* [From Power up to login prompt](http://www.scott-a-s.com/files/linux_boot.pdf)
### Filesystem
* [Understanding Inodes](https://syedali.net/2015/02/08/understanding-inodes)
* [Understand UNIX / Linux Inodes Basics with Examples](https://www.thegeekstuff.com/2012/01/linux-inodes)
* [Understanding proc filesystem](https://syedali.net/2013/08/20/understanding-proc-filesystem)
* [Common Mount Options](https://syedali.net/2015/01/06/common-mount-options)
* [Understanding Linux filesystems: ext4 and beyond](https://opensource.com/article/18/4/ext4-filesystem)
### Kernel
* [Explain the basics of Linux kernel](http://learnlinuxconcepts.blogspot.com/2014/03/explain-basics-of-linux-kernel.html)
* [Kernel Space and User Space](http://learnlinuxconcepts.blogspot.com/2014/02/kernel-space-and-user-space.html)
* [Linux Kernel Process Management](http://learnlinuxconcepts.blogspot.com/2014/03/process-management.html)
* [Linux Addressing](http://learnlinuxconcepts.blogspot.com/2014/02/linux-addressing.html)
* [Linux Kernel Memory Management](http://learnlinuxconcepts.blogspot.com/2014/02/linux-memory-management.html)
* [STACK AND HEAP](http://learnlinuxconcepts.blogspot.com/2014/02/stack-and-heap.html)
* [Paging and Segmentation](http://learnlinuxconcepts.blogspot.com/2014/02/paging-and-segmentation.html)
* [Linux Kernel System Calls](http://learnlinuxconcepts.blogspot.com/2014/02/system-calls.html)
* [The Virtual Filesystem](http://learnlinuxconcepts.blogspot.com/2014/10/the-virtual-filesystem.html)
* [Concurrency and Race Conditions](http://learnlinuxconcepts.blogspot.com/2014/07/concurrency-and-race-conditions.html)
* [Memory Leak](https://stackoverflow.com/questions/312069/the-best-memory-leak-definition)
* [What is a kernel Panic?](http://learnlinuxconcepts.blogspot.com/2014/07/what-is-kernel-panic.html)
### Troubleshooting
* [Linux troubleshooting tools](https://syedali.net/2013/08/20/linux-troubleshooting-tools)
* [Linux Performance Analysis in 60,000 Milliseconds](https://medium.com/netflix-techblog/linux-performance-analysis-in-60-000-milliseconds-accc10403c55)
## Networking
* [Network protocols for anyone who knows a programming language](https://www.destroyallsoftware.com/compendium/network-protocols?share_key=97d3ba4c24d21147)
* [Introduction to Linux interfaces for virtual networking](https://developers.redhat.com/blog/2018/10/22/introduction-to-linux-interfaces-for-virtual-networking)
* [Multi-tier load-balancing with Linux](https://vincent.bernat.ch/en/blog/2018-multi-tier-loadbalancer)
* [Introduction to modern network load balancing and proxying](https://blog.envoyproxy.io/introduction-to-modern-network-load-balancing-and-proxying-a57f6ff80236)
* [Load Balancing Algorithms](https://syedali.net/2013/08/22/load-balancing-algorithms)
## Containers
* [Introduction to Docker and Containers](http://container.training/intro-selfpaced.yml.html)
* [Containers Patterns](https://l0rd.github.io/containerspatterns)
* [Docker Container Anti Patterns](https://blog.couchbase.com/docker-container-anti-patterns/)
## Kubernetes
* [Deploying and Scaling Microservices with Docker and Kubernetes](http://container.training/kube-selfpaced.yml.html)
* [What happens when ... Kubernetes edition!](https://github.com/jamiehannaford/what-happens-when-k8s/blob/master/README.md)
* [Kubernetes Production Patterns](https://github.com/gravitational/workshop/blob/master/k8sprod.md)
* [Kubernetes production best practices](https://learnk8s.io/production-best-practices)
* [A Guide to the Kubernetes Networking Model](https://sookocheff.com/post/kubernetes/understanding-kubernetes-networking-model)
## Infrastructure as code / Configuration management
* [Terraform](https://learn.hashicorp.com/terraform)
* [Ansible](https://github.com/leucos/ansible-tuto)
## CI/CD
* [7 Pipeline Design Patterns for Continuous Delivery](https://www.singlestoneconsulting.com/blog/7-pipeline-design-patterns-for-continuous-delivery)
* [CI/CD patterns](https://continuousdelivery.com/implementing/patterns)
* [Six Strategies for Application Deployment](https://thenewstack.io/deployment-strategies)
## Cloud
* [The Open Guide to Amazon Web Services](https://github.com/open-guides/og-aws)
## Programming
### Go (Golang)
* [A tour of Go](https://tour.golang.org)
* [Go by Example](https://gobyexample.com)
* [Learn Go with Tests](https://quii.gitbook.io/learn-go-with-tests/)
* [Getting up and running with Go](http://www.golangprograms.com)
* [Effective Go](https://golang.org/doc/effective_go.html)
* [Go Design Patterns](https://github.com/tmrts/go-patterns)
* [Go Memory Management](https://povilasv.me/go-memory-management)
### Big O Notation, Algorithms and Data Structures
* [AlgoExpert](https://www.algoexpert.io)
* [Hacking a Google Interview Handout 1](http://courses.csail.mit.edu/iap/interview/Hacking_a_Google_Interview_Handout_1.pdf)
* [Hacking a Google Interview Handout 2](http://courses.csail.mit.edu/iap/interview/Hacking_a_Google_Interview_Handout_2.pdf)
* [Hacking a Google Interview Handout 3](http://courses.csail.mit.edu/iap/interview/Hacking_a_Google_Interview_Handout_3.pdf)
## System design
* [SystemsExpert course from AlgoExpert](https://www.algoexpert.io/se/product)
* [Grokking the System Design Interview](https://www.educative.io/collection/5668639101419520/5649050225344512)
* [The System Design Primer](https://github.com/donnemartin/system-design-primer)
* [Crack the System Design Interview](https://www.puncsky.com/blog/2016/02/14/crack-the-system-design-interview)
* [System design interview for IT companies](https://github.com/checkcheckzz/system-design-interview)
* [Web Architecture 101](https://engineering.videoblocks.com/web-architecture-101-a3224e126947)
* [What's in a Production Web Application?](https://stephenmann.io/post/whats-in-a-production-web-application)
## Monitoring
* [SLOs & You: A Guide To Service Level Objectives](https://www.circonus.com/2018/07/a-guide-to-service-level-objectives)
## Processes
* [Incident Response](https://response.pagerduty.com)
* [Postmortems](https://postmortems.pagerduty.com)
* [Runbooks](https://www.transposit.com/blog/2019.11.14-what-makes-a-good-runbook)
* [Identifying and tracking toil using SRE principles](https://cloud.google.com/blog/products/management-tools/identifying-and-tracking-toil-using-sre-principles)
* [Building SRE from Scratch](https://medium.com/ibm-garage/building-sre-from-scratch-485e23985bbd)
## Interview
### SRE interview process
* [How to hire talent](https://syedali.net/2014/04/01/how-to-hire-talent)
* [Recruitment process for a Google job (SRE, Site Reliability Engineer)](http://lambda-startup.com/recruitment-process-for-a-google-job-sre-site-reliability-engineer)
### Interview Questions
* [A collection of questions to practice with for SRE interviews](https://github.com/michael-kehoe/sre-interview)
* [SRE Interview Questions](https://syedali.net/engineer-interview-questions)
* [Sysadmin Test Questions](https://github.com/trimstray/test-your-sysadmin-skills)
* [Kubernetes job interview questions](https://enterprisersproject.com/article/2019/2/kubernetes-job-interview-questions-how-prepare)
* [DevOps Guide](https://github.com/Tikam02/DevOps-Guide)
* [Questions I ask in SRE interviews](https://dev.to/logan/questions-i-ask-in-sre-interviews-a9j)
* [DevOps Roadmap: Learn to become a DevOps Engineer or SRE](https://roadmap.sh/devops)
### Blog posts
* [SRE Interviews in Silicon Valley](http://blog.marc-seeger.de/2015/05/01/sre-interviews-in-silicon-valley)
* [Preparing the SRE interview](https://blog.balthazar-rouberol.com/preparing-the-sre-interview)
* [How to Get Into SRE](https://blog.alicegoldfuss.com/how-to-get-into-sre)
* [My Job Interview at Google](https://catonmat.net/my-job-interview-at-google)
* [Path to Site Reliability Management](https://danrl.com/blog/2019/path-to-srm)
* [Becoming a Site Reliability Engineer](https://tik.dev/becoming-an-sre)
## Books
### SRE books
* [Site Reliability Engineering](https://landing.google.com/sre/sre-book/toc/index.html)
* [The Site Reliability Workbook](https://landing.google.com/sre/workbook/toc/)
* [Seeking SRE](https://books.google.ru/books?id=tmhqDwAAQBAJ)
* [Building Secure and Reliable Systems](https://sre.google/static/pdf/building_secure_and_reliable_systems.pdf)
* [Implementing Service Level Objectives](https://learning.oreilly.com/library/view/implementing-service-level/9781492076803)
### Linux
* [Linux Kernel Development (3rd Edition)](https://www.amazon.com/Linux-Kernel-Development-Robert-Love/dp/0672329468)
* [UNIX and Linux System Administration Handbook (5th Edition)](https://www.amazon.com/UNIX-Linux-System-Administration-Handbook/dp/0134277554)
* [Linux Pocket Guide, 3rd Edition](http://shop.oreilly.com/product/0636920040927.do)
### Networking
* [TCP/IP Illustrated, Volume 1](https://www.amazon.com/TCP-Illustrated-Protocols-Addison-Wesley-Professional/dp/0321336313)
### Troubleshooting and Performance
* [Systems Performance: Enterprise and the Cloud](https://www.amazon.com/Systems-Performance-Enterprise-Brendan-Gregg/dp/0133390098)
* [Systems Performance, 2nd Edition](https://www.informit.com/store/systems-performance-9780136820154?ranMID=24808)
## Courses
* [Site Reliability Engineering: Measuring and Managing Reliability](https://www.coursera.org/learn/site-reliability-engineering-slos)
* [School of SRE](https://linkedin.github.io/school-of-sre)