diff --git a/courses/databases_nosql/key_concepts.md b/courses/databases_nosql/key_concepts.md
index 8034b93..0c95d9c 100644
--- a/courses/databases_nosql/key_concepts.md
+++ b/courses/databases_nosql/key_concepts.md
@@ -30,10 +30,7 @@ When data is distributed across nodes, it can be modified on different nodes at
 
 * **Vector Clocks**
 
 A vector clock is defined as a tuple of clock values, one from each node. In a distributed environment, each node maintains such a tuple, which represents the state of the node itself and of its peers/replicas. A clock value may be a real timestamp derived from the local clock, or a version number.
-
-
-
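The tuple-of-counters idea can be sketched in a few lines of Python. This is a minimal illustration, not part of the course material; the `VectorClock` class, its method names, and the node ids are all assumed for the example:

```python
# Minimal vector-clock sketch: each node keeps a counter per known node.
class VectorClock:
    def __init__(self, clocks=None):
        # Mapping of node id -> clock value (a version counter here,
        # though a real timestamp could be used instead).
        self.clocks = dict(clocks or {})

    def tick(self, node):
        # A node increments its own entry on every local update.
        self.clocks[node] = self.clocks.get(node, 0) + 1

    def merge(self, other):
        # On replica sync, take the element-wise maximum of both tuples.
        for node, value in other.clocks.items():
            self.clocks[node] = max(self.clocks.get(node, 0), value)

    def compare(self, other):
        # "before"/"after" if one clock dominates the other element-wise;
        # otherwise the two updates are concurrent and need conflict
        # resolution.
        nodes = set(self.clocks) | set(other.clocks)
        le = all(self.clocks.get(n, 0) <= other.clocks.get(n, 0) for n in nodes)
        ge = all(self.clocks.get(n, 0) >= other.clocks.get(n, 0) for n in nodes)
        if le and ge:
            return "equal"
        return "before" if le else "after" if ge else "concurrent"
```

Two writes made on different nodes that have not yet synced compare as "concurrent", which is exactly the conflict case the section describes.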

@@ -77,9 +74,6 @@ When the amount of data crosses the capacity of a single node, we need to think
 
 Sharding refers to dividing data in such a way that it is distributed evenly (in terms of both storage & processing power) across a cluster of nodes. It can also imply data locality, meaning that similar & related data is stored together to facilitate faster access.
 
 A shard can in turn be further replicated to meet load balancing or disaster recovery requirements. A single shard replica might take all writes (single leader), or multiple replicas can take writes (multi-leader). Reads can be distributed across multiple replicas. Since data is now distributed across multiple nodes, clients should be able to consistently figure out where data is hosted. We will look at some of the common techniques below.
 
 The downside of sharding is that joins between shards are not possible, so an upstream/downstream application has to aggregate the results from multiple shards.
-
-
-
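The simplest way a client can figure out where a key lives is to hash the key and take it modulo the shard count. The sketch below assumes a fixed cluster size; `NUM_SHARDS` and `shard_for` are illustrative names, not from the course:

```python
import hashlib

NUM_SHARDS = 4  # assumed fixed shard count for this sketch

def shard_for(key: str) -> int:
    # Use a stable hash (not Python's per-process randomized hash())
    # so every client routes the same key to the same shard.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS
```

The drawback of the modulo scheme is that changing `NUM_SHARDS` remaps almost every key, which is the problem consistent hashing (below) is designed to avoid.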

@@ -118,8 +112,6 @@ Consistent hashing is a distributed hashing scheme that operates independently o
 
 Say that our hash function h() generates a 32-bit integer. Then, to determine the server to which we will send a key k, we find the server s whose hash h(s) is the smallest integer larger than h(k). To make the process simpler, we assume the table is circular, which means that if we cannot find a server with a hash larger than h(k), we wrap around and start looking from the beginning of the array.
 
 
 
 
-
-
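That lookup can be sketched with a sorted list and a binary search. The `ConsistentHashRing` class and the 32-bit `h()` below are illustrative stand-ins for whatever hash function a real system uses:

```python
import bisect
import hashlib

def h(value: str) -> int:
    # Stable hash reduced to a 32-bit integer, as in the description above.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16) % (2 ** 32)

class ConsistentHashRing:
    def __init__(self, servers):
        # Sort servers by hash so "next larger hash" is a binary search.
        self.ring = sorted((h(s), s) for s in servers)
        self.hashes = [hv for hv, _ in self.ring]

    def server_for(self, key: str) -> str:
        # Smallest server hash strictly larger than h(key); the modulo
        # wraps around to the first server when we run off the end.
        idx = bisect.bisect_right(self.hashes, h(key)) % len(self.ring)
        return self.ring[idx][1]
```

Because a key's owner depends only on neighbouring positions on the circle, adding or removing one server remaps only the keys that fall in that server's arc, not the whole keyspace.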

@@ -164,10 +156,6 @@ In a 5 node cluster, you need 3 nodes for a majority,
 
 In a 6 node cluster, you need 4 nodes for a majority.
 
 
 
 
-
-
-
-
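The rule behind both numbers is strict majority, i.e. more than half of the nodes. As a one-line sketch (the function name is illustrative):

```python
def majority(cluster_size: int) -> int:
    # Strict majority: the smallest integer greater than half the cluster.
    return cluster_size // 2 + 1
```

This is also why an even cluster size buys no extra fault tolerance: both a 5 node and a 6 node cluster can lose at most 2 nodes and still form a majority.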

@@ -182,8 +170,6 @@ Eg: In a 5 node cluster, consider what happens if nodes 1, 2, and 3 can communic
 
 The diagram below demonstrates quorum selection on a cluster partitioned into two sets.
 
 
 
-
-
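The selection rule such a diagram illustrates can be sketched as: after a partition, a side keeps serving only if it still holds a strict majority of the full cluster (names here are illustrative):

```python
def has_quorum(partition_size: int, cluster_size: int) -> bool:
    # Only the side holding a strict majority of the original cluster keeps
    # the quorum; the minority side must stop accepting writes to avoid
    # split-brain.
    return partition_size > cluster_size // 2

# In the 5 node example: the {1, 2, 3} side keeps quorum, {4, 5} does not.
```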