From 69c85007a39d8461e32c830367b8abe05cc54426 Mon Sep 17 00:00:00 2001 From: John Washam Date: Mon, 30 Aug 2021 20:25:24 -0700 Subject: [PATCH] Moved system design to additional info. --- README.md | 226 +++++++++++++++++++++++++++--------------------------- 1 file changed, 113 insertions(+), 113 deletions(-) diff --git a/README.md b/README.md index b058f74..4bed706 100644 --- a/README.md +++ b/README.md @@ -165,7 +165,6 @@ software development/engineering roles. - [Unicode](#unicode) - [Endianness](#endianness) - [Networking](#networking) -- [System Design, Scalability, Data Handling](#system-design-scalability-data-handling) (if you have 4+ years experience) - [Final Review](#final-review) - [Coding Question Practice](#coding-question-practice) - [Coding exercises/challenges](#coding-exerciseschallenges) @@ -182,6 +181,7 @@ software development/engineering roles. ## Additional Resources - [Additional Books](#additional-books) +- [System Design, Scalability, Data Handling](#system-design-scalability-data-handling) (if you have 4+ years experience) - [Additional Learning](#additional-learning) - [Compilers](#compilers) - [Emacs and vi(m)](#emacs-and-vim) @@ -1113,118 +1113,6 @@ Graphs can be used to represent many problems in computer science, so this secti - [ ] [Java - Sockets - Introduction (video)](https://www.youtube.com/watch?v=6G_W54zuadg&t=6s) - [ ] [Socket Programming (video)](https://www.youtube.com/watch?v=G75vN2mnJeQ) -## System Design, Scalability, Data Handling - -**You can expect system design questions if you have 4+ years of experience.** - -- Scalability and System Design are very large topics with many topics and resources, since - there is a lot to consider when designing a software/hardware system that can scale. - Expect to spend quite a bit of time on this -- Considerations: - - Scalability - - Distill large data sets to single values - - Transform one data set to another - - Handling obscenely large amounts of data - - System design - - features sets - - interfaces - - class hierarchies - - designing a system under certain constraints - - simplicity and robustness - - tradeoffs - - performance analysis and optimization -- [ ] **START HERE**: [The System Design Primer](https://github.com/donnemartin/system-design-primer) -- [ ] [System Design from HiredInTech](http://www.hiredintech.com/system-design/) -- [ ] [How Do I Prepare To Answer Design Questions In A Technical Interview?](https://www.quora.com/How-do-I-prepare-to-answer-design-questions-in-a-technical-interview?redirected_qid=1500023) -- [ ] [8 Things You Need to Know Before a System Design Interview](http://blog.gainlo.co/index.php/2015/10/22/8-things-you-need-to-know-before-system-design-interviews/) -- [ ] [Database Normalization - 1NF, 2NF, 3NF and 4NF (video)](https://www.youtube.com/watch?v=UrYLYV7WSHM) -- [ ] [System Design Interview](https://github.com/checkcheckzz/system-design-interview) - There are a lot of resources in this one. Look through the articles and examples. I put some of them below -- [ ] [How to ace a systems design interview](http://www.palantir.com/2011/10/how-to-rock-a-systems-design-interview/) -- [ ] [Numbers Everyone Should Know](http://everythingisdata.wordpress.com/2009/10/17/numbers-everyone-should-know/) -- [ ] [How long does it take to make a context switch?](http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html) -- [ ] [Transactions Across Datacenters (video)](https://www.youtube.com/watch?v=srOgpXECblk) -- [ ] [A plain English introduction to CAP Theorem](http://ksat.me/a-plain-english-introduction-to-cap-theorem) -- [ ] [MIT 6.824: Distributed Systems, Spring 2020 (20 videos)](https://www.youtube.com/watch?v=cQP8WApzIQQ&list=PLrw6a1wE39_tb2fErI4-WkMbsvGQk9_UB) -- [ ] Consensus Algorithms: - - [ ] Paxos - [Paxos Agreement - Computerphile (video)](https://www.youtube.com/watch?v=s8JqcZtvnsM) - - [ ] Raft - [An Introduction to the Raft Distributed Consensus Algorithm (video)](https://www.youtube.com/watch?v=P9Ydif5_qvE) - - [ ] [Easy-to-read paper](https://raft.github.io/) - - [ ] [Infographic](http://thesecretlivesofdata.com/raft/) -- [ ] [Consistent Hashing](http://www.tom-e-white.com/2007/11/consistent-hashing.html) -- [ ] [NoSQL Patterns](http://horicky.blogspot.com/2009/11/nosql-patterns.html) -- [ ] Scalability: - - You don't need all of these. Just pick a few that interest you. - - [ ] [Great overview (video)](https://www.youtube.com/watch?v=-W9F__D3oY4) - - [ ] Short series: - - [Clones](http://www.lecloud.net/post/7295452622/scalability-for-dummies-part-1-clones) - - [Database](http://www.lecloud.net/post/7994751381/scalability-for-dummies-part-2-database) - - [Cache](http://www.lecloud.net/post/9246290032/scalability-for-dummies-part-3-cache) - - [Asynchronism](http://www.lecloud.net/post/9699762917/scalability-for-dummies-part-4-asynchronism) - - [ ] [Scalable Web Architecture and Distributed Systems](http://www.aosabook.org/en/distsys.html) - - [ ] [Fallacies of Distributed Computing Explained](https://pages.cs.wisc.edu/~zuyu/files/fallacies.pdf) - - [ ] [Jeff Dean - Building Software Systems At Google and Lessons Learned (video)](https://www.youtube.com/watch?v=modXC5IWTJI) - - [ ] [Introduction to Architecting Systems for Scale](http://lethain.com/introduction-to-architecting-systems-for-scale/) - - [ ] [Scaling mobile games to a global audience using App Engine and Cloud Datastore (video)](https://www.youtube.com/watch?v=9nWyWwY2Onc) - - [ ] [How Google Does Planet-Scale Engineering for Planet-Scale Infra (video)](https://www.youtube.com/watch?v=H4vMcD7zKM0) - - [ ] [The Importance of Algorithms](https://www.topcoder.com/community/competitive-programming/tutorials/the-importance-of-algorithms/) - - [ ] [Sharding](http://highscalability.com/blog/2009/8/6/an-unorthodox-approach-to-database-design-the-coming-of-the.html) - - [ ] [Engineering for the Long Game - Astrid Atkinson Keynote(video)](https://www.youtube.com/watch?v=p0jGmgIrf_M&list=PLRXxvay_m8gqVlExPC5DG3TGWJTaBgqSA&index=4) - - [ ] [7 Years Of YouTube Scalability Lessons In 30 Minutes](http://highscalability.com/blog/2012/3/26/7-years-of-youtube-scalability-lessons-in-30-minutes.html) - - [video](https://www.youtube.com/watch?v=G-lGCC4KKok) - - [ ] [How PayPal Scaled To Billions Of Transactions Daily Using Just 8VMs](http://highscalability.com/blog/2016/8/15/how-paypal-scaled-to-billions-of-transactions-daily-using-ju.html) - - [ ] [How to Remove Duplicates in Large Datasets](https://blog.clevertap.com/how-to-remove-duplicates-in-large-datasets/) - - [ ] [A look inside Etsy's scale and engineering culture with Jon Cowie (video)](https://www.youtube.com/watch?v=3vV4YiqKm1o) - - [ ] [What Led Amazon to its Own Microservices Architecture](http://thenewstack.io/led-amazon-microservices-architecture/) - - [ ] [To Compress Or Not To Compress, That Was Uber's Question](https://eng.uber.com/trip-data-squeeze/) - - [ ] [When Should Approximate Query Processing Be Used?](http://highscalability.com/blog/2016/2/25/when-should-approximate-query-processing-be-used.html) - - [ ] [Google's Transition From Single Datacenter, To Failover, To A Native Multihomed Architecture]( http://highscalability.com/blog/2016/2/23/googles-transition-from-single-datacenter-to-failover-to-a-n.html) - - [ ] [The Image Optimization Technology That Serves Millions Of Requests Per Day](http://highscalability.com/blog/2016/6/15/the-image-optimization-technology-that-serves-millions-of-re.html) - - [ ] [A Patreon Architecture Short](http://highscalability.com/blog/2016/2/1/a-patreon-architecture-short.html) - - [ ] [Tinder: How Does One Of The Largest Recommendation Engines Decide Who You'll See Next?](http://highscalability.com/blog/2016/1/27/tinder-how-does-one-of-the-largest-recommendation-engines-de.html) - - [ ] [Design Of A Modern Cache](http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html) - - [ ] [Live Video Streaming At Facebook Scale](http://highscalability.com/blog/2016/1/13/live-video-streaming-at-facebook-scale.html) - - [ ] [A Beginner's Guide To Scaling To 11 Million+ Users On Amazon's AWS](http://highscalability.com/blog/2016/1/11/a-beginners-guide-to-scaling-to-11-million-users-on-amazons.html) - - [ ] [A 360 Degree View Of The Entire Netflix Stack](http://highscalability.com/blog/2015/11/9/a-360-degree-view-of-the-entire-netflix-stack.html) - - [ ] [Latency Is Everywhere And It Costs You Sales - How To Crush It](http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it) - - [ ] [What Powers Instagram: Hundreds of Instances, Dozens of Technologies](http://instagram-engineering.tumblr.com/post/13649370142/what-powers-instagram-hundreds-of-instances) - - [ ] [Salesforce Architecture - How They Handle 1.3 Billion Transactions A Day](http://highscalability.com/blog/2013/9/23/salesforce-architecture-how-they-handle-13-billion-transacti.html) - - [ ] [ESPN's Architecture At Scale - Operating At 100,000 Duh Nuh Nuhs Per Second](http://highscalability.com/blog/2013/11/4/espns-architecture-at-scale-operating-at-100000-duh-nuh-nuhs.html) - - [ ] See "Messaging, Serialization, and Queueing Systems" way below for info on some of the technologies that can glue services together - - [ ] Twitter: - - [O'Reilly MySQL CE 2011: Jeremy Cole, "Big and Small Data at @Twitter" (video)](https://www.youtube.com/watch?v=5cKTP36HVgI) - - [Timelines at Scale](https://www.infoq.com/presentations/Twitter-Timeline-Scalability) - - For even more, see "Mining Massive Datasets" video series in the [Video Series](#video-series) section -- [ ] Practicing the system design process: Here are some ideas to try working through on paper, each with some documentation on how it was handled in the real world: - - review: [The System Design Primer](https://github.com/donnemartin/system-design-primer) - - [System Design from HiredInTech](http://www.hiredintech.com/system-design/) - - [cheat sheet](https://github.com/jwasham/coding-interview-university/blob/main/extras/cheat%20sheets/system-design.pdf) - - flow: - 1. Understand the problem and scope: - - Define the use cases, with interviewer's help - - Suggest additional features - - Remove items that interviewer deems out of scope - - Assume high availability is required, add as a use case - 2. Think about constraints: - - Ask how many requests per month - - Ask how many requests per second (they may volunteer it or make you do the math) - - Estimate reads vs. writes percentage - - Keep 80/20 rule in mind when estimating - - How much data written per second - - Total storage required over 5 years - - How much data read per second - 3. Abstract design: - - Layers (service, data, caching) - - Infrastructure: load balancing, messaging - - Rough overview of any key algorithm that drives the service - - Consider bottlenecks and determine solutions - - Exercises: - - [Design a random unique ID generation system](https://blog.twitter.com/2010/announcing-snowflake) - - [Design a key-value database](http://www.slideshare.net/dvirsky/introduction-to-redis) - - [Design a picture sharing system](http://highscalability.com/blog/2011/12/6/instagram-architecture-14-million-users-terabytes-of-photos.html) - - [Design a recommendation system](http://ijcai13.org/files/tutorial_slides/td3.pdf) - - [Design a URL-shortener system: copied from above](http://www.hiredintech.com/system-design/the-system-design-process/) - - [Design a cache system](https://www.adayinthelifeof.nl/2011/02/06/memcache-internals/) - --- ## Final Review @@ -1482,6 +1370,118 @@ You're never really done. - [Computer Architecture, Sixth Edition: A Quantitative Approach](https://www.amazon.com/dp/0128119055) - For a richer, more up-to-date (2017), but longer treatment +## System Design, Scalability, Data Handling + +**You can expect system design questions if you have 4+ years of experience.** + +- Scalability and System Design are very large topics with many topics and resources, since + there is a lot to consider when designing a software/hardware system that can scale. + Expect to spend quite a bit of time on this +- Considerations: + - Scalability + - Distill large data sets to single values + - Transform one data set to another + - Handling obscenely large amounts of data + - System design + - features sets + - interfaces + - class hierarchies + - designing a system under certain constraints + - simplicity and robustness + - tradeoffs + - performance analysis and optimization +- [ ] **START HERE**: [The System Design Primer](https://github.com/donnemartin/system-design-primer) +- [ ] [System Design from HiredInTech](http://www.hiredintech.com/system-design/) +- [ ] [How Do I Prepare To Answer Design Questions In A Technical Interview?](https://www.quora.com/How-do-I-prepare-to-answer-design-questions-in-a-technical-interview?redirected_qid=1500023) +- [ ] [8 Things You Need to Know Before a System Design Interview](http://blog.gainlo.co/index.php/2015/10/22/8-things-you-need-to-know-before-system-design-interviews/) +- [ ] [Database Normalization - 1NF, 2NF, 3NF and 4NF (video)](https://www.youtube.com/watch?v=UrYLYV7WSHM) +- [ ] [System Design Interview](https://github.com/checkcheckzz/system-design-interview) - There are a lot of resources in this one. Look through the articles and examples. I put some of them below +- [ ] [How to ace a systems design interview](http://www.palantir.com/2011/10/how-to-rock-a-systems-design-interview/) +- [ ] [Numbers Everyone Should Know](http://everythingisdata.wordpress.com/2009/10/17/numbers-everyone-should-know/) +- [ ] [How long does it take to make a context switch?](http://blog.tsunanet.net/2010/11/how-long-does-it-take-to-make-context.html) +- [ ] [Transactions Across Datacenters (video)](https://www.youtube.com/watch?v=srOgpXECblk) +- [ ] [A plain English introduction to CAP Theorem](http://ksat.me/a-plain-english-introduction-to-cap-theorem) +- [ ] [MIT 6.824: Distributed Systems, Spring 2020 (20 videos)](https://www.youtube.com/watch?v=cQP8WApzIQQ&list=PLrw6a1wE39_tb2fErI4-WkMbsvGQk9_UB) +- [ ] Consensus Algorithms: + - [ ] Paxos - [Paxos Agreement - Computerphile (video)](https://www.youtube.com/watch?v=s8JqcZtvnsM) + - [ ] Raft - [An Introduction to the Raft Distributed Consensus Algorithm (video)](https://www.youtube.com/watch?v=P9Ydif5_qvE) + - [ ] [Easy-to-read paper](https://raft.github.io/) + - [ ] [Infographic](http://thesecretlivesofdata.com/raft/) +- [ ] [Consistent Hashing](http://www.tom-e-white.com/2007/11/consistent-hashing.html) +- [ ] [NoSQL Patterns](http://horicky.blogspot.com/2009/11/nosql-patterns.html) +- [ ] Scalability: + - You don't need all of these. Just pick a few that interest you. + - [ ] [Great overview (video)](https://www.youtube.com/watch?v=-W9F__D3oY4) + - [ ] Short series: + - [Clones](http://www.lecloud.net/post/7295452622/scalability-for-dummies-part-1-clones) + - [Database](http://www.lecloud.net/post/7994751381/scalability-for-dummies-part-2-database) + - [Cache](http://www.lecloud.net/post/9246290032/scalability-for-dummies-part-3-cache) + - [Asynchronism](http://www.lecloud.net/post/9699762917/scalability-for-dummies-part-4-asynchronism) + - [ ] [Scalable Web Architecture and Distributed Systems](http://www.aosabook.org/en/distsys.html) + - [ ] [Fallacies of Distributed Computing Explained](https://pages.cs.wisc.edu/~zuyu/files/fallacies.pdf) + - [ ] [Jeff Dean - Building Software Systems At Google and Lessons Learned (video)](https://www.youtube.com/watch?v=modXC5IWTJI) + - [ ] [Introduction to Architecting Systems for Scale](http://lethain.com/introduction-to-architecting-systems-for-scale/) + - [ ] [Scaling mobile games to a global audience using App Engine and Cloud Datastore (video)](https://www.youtube.com/watch?v=9nWyWwY2Onc) + - [ ] [How Google Does Planet-Scale Engineering for Planet-Scale Infra (video)](https://www.youtube.com/watch?v=H4vMcD7zKM0) + - [ ] [The Importance of Algorithms](https://www.topcoder.com/community/competitive-programming/tutorials/the-importance-of-algorithms/) + - [ ] [Sharding](http://highscalability.com/blog/2009/8/6/an-unorthodox-approach-to-database-design-the-coming-of-the.html) + - [ ] [Engineering for the Long Game - Astrid Atkinson Keynote(video)](https://www.youtube.com/watch?v=p0jGmgIrf_M&list=PLRXxvay_m8gqVlExPC5DG3TGWJTaBgqSA&index=4) + - [ ] [7 Years Of YouTube Scalability Lessons In 30 Minutes](http://highscalability.com/blog/2012/3/26/7-years-of-youtube-scalability-lessons-in-30-minutes.html) + - [video](https://www.youtube.com/watch?v=G-lGCC4KKok) + - [ ] [How PayPal Scaled To Billions Of Transactions Daily Using Just 8VMs](http://highscalability.com/blog/2016/8/15/how-paypal-scaled-to-billions-of-transactions-daily-using-ju.html) + - [ ] [How to Remove Duplicates in Large Datasets](https://blog.clevertap.com/how-to-remove-duplicates-in-large-datasets/) + - [ ] [A look inside Etsy's scale and engineering culture with Jon Cowie (video)](https://www.youtube.com/watch?v=3vV4YiqKm1o) + - [ ] [What Led Amazon to its Own Microservices Architecture](http://thenewstack.io/led-amazon-microservices-architecture/) + - [ ] [To Compress Or Not To Compress, That Was Uber's Question](https://eng.uber.com/trip-data-squeeze/) + - [ ] [When Should Approximate Query Processing Be Used?](http://highscalability.com/blog/2016/2/25/when-should-approximate-query-processing-be-used.html) + - [ ] [Google's Transition From Single Datacenter, To Failover, To A Native Multihomed Architecture]( http://highscalability.com/blog/2016/2/23/googles-transition-from-single-datacenter-to-failover-to-a-n.html) + - [ ] [The Image Optimization Technology That Serves Millions Of Requests Per Day](http://highscalability.com/blog/2016/6/15/the-image-optimization-technology-that-serves-millions-of-re.html) + - [ ] [A Patreon Architecture Short](http://highscalability.com/blog/2016/2/1/a-patreon-architecture-short.html) + - [ ] [Tinder: How Does One Of The Largest Recommendation Engines Decide Who You'll See Next?](http://highscalability.com/blog/2016/1/27/tinder-how-does-one-of-the-largest-recommendation-engines-de.html) + - [ ] [Design Of A Modern Cache](http://highscalability.com/blog/2016/1/25/design-of-a-modern-cache.html) + - [ ] [Live Video Streaming At Facebook Scale](http://highscalability.com/blog/2016/1/13/live-video-streaming-at-facebook-scale.html) + - [ ] [A Beginner's Guide To Scaling To 11 Million+ Users On Amazon's AWS](http://highscalability.com/blog/2016/1/11/a-beginners-guide-to-scaling-to-11-million-users-on-amazons.html) + - [ ] [A 360 Degree View Of The Entire Netflix Stack](http://highscalability.com/blog/2015/11/9/a-360-degree-view-of-the-entire-netflix-stack.html) + - [ ] [Latency Is Everywhere And It Costs You Sales - How To Crush It](http://highscalability.com/latency-everywhere-and-it-costs-you-sales-how-crush-it) + - [ ] [What Powers Instagram: Hundreds of Instances, Dozens of Technologies](http://instagram-engineering.tumblr.com/post/13649370142/what-powers-instagram-hundreds-of-instances) + - [ ] [Salesforce Architecture - How They Handle 1.3 Billion Transactions A Day](http://highscalability.com/blog/2013/9/23/salesforce-architecture-how-they-handle-13-billion-transacti.html) + - [ ] [ESPN's Architecture At Scale - Operating At 100,000 Duh Nuh Nuhs Per Second](http://highscalability.com/blog/2013/11/4/espns-architecture-at-scale-operating-at-100000-duh-nuh-nuhs.html) + - [ ] See "Messaging, Serialization, and Queueing Systems" way below for info on some of the technologies that can glue services together + - [ ] Twitter: + - [O'Reilly MySQL CE 2011: Jeremy Cole, "Big and Small Data at @Twitter" (video)](https://www.youtube.com/watch?v=5cKTP36HVgI) + - [Timelines at Scale](https://www.infoq.com/presentations/Twitter-Timeline-Scalability) + - For even more, see "Mining Massive Datasets" video series in the [Video Series](#video-series) section +- [ ] Practicing the system design process: Here are some ideas to try working through on paper, each with some documentation on how it was handled in the real world: + - review: [The System Design Primer](https://github.com/donnemartin/system-design-primer) + - [System Design from HiredInTech](http://www.hiredintech.com/system-design/) + - [cheat sheet](https://github.com/jwasham/coding-interview-university/blob/main/extras/cheat%20sheets/system-design.pdf) + - flow: + 1. Understand the problem and scope: + - Define the use cases, with interviewer's help + - Suggest additional features + - Remove items that interviewer deems out of scope + - Assume high availability is required, add as a use case + 2. Think about constraints: + - Ask how many requests per month + - Ask how many requests per second (they may volunteer it or make you do the math) + - Estimate reads vs. writes percentage + - Keep 80/20 rule in mind when estimating + - How much data written per second + - Total storage required over 5 years + - How much data read per second + 3. Abstract design: + - Layers (service, data, caching) + - Infrastructure: load balancing, messaging + - Rough overview of any key algorithm that drives the service + - Consider bottlenecks and determine solutions + - Exercises: + - [Design a random unique ID generation system](https://blog.twitter.com/2010/announcing-snowflake) + - [Design a key-value database](http://www.slideshare.net/dvirsky/introduction-to-redis) + - [Design a picture sharing system](http://highscalability.com/blog/2011/12/6/instagram-architecture-14-million-users-terabytes-of-photos.html) + - [Design a recommendation system](http://ijcai13.org/files/tutorial_slides/td3.pdf) + - [Design a URL-shortener system: copied from above](http://www.hiredintech.com/system-design/the-system-design-process/) + - [Design a cache system](https://www.adayinthelifeof.nl/2011/02/06/memcache-internals/) + ## Additional Learning I added them to help you become a well-rounded software engineer, and to be aware of certain