Database Sharding

Collating some of the resources which talks about Database Sharding.

https://en.wikipedia.org/wiki/Shard_(database_architecture)
[Feb 2019] http://highscalability.com/blog/2019/2/19/intro-to-redis-cluster-sharding-advantages-limitations-deplo.html
Redis Cluster is the Native Sharding implementation available within Redis that allows your to automatically distribute your data across multiple nodes without having to rely on external tools and utilities. Its covers Sharding with Redis Cluster where Redis Clusters is divided in 16384 slots and these slots are assigned to multiple Redis Nodes.

The Redis Cluster Specification is the definitive guide to understanding the internals of the technology, while the Redis Cluster Tutorial provides deployment and administration guidelines.

[ Jan 2019 ] https://scalegrid.io/blog/scalegrid-hosting-adds-support-for-highly-available-redis-clusters-with-automated-sharding/
ScaleGrid : Fully Managed Database as a Service. Fully Managed Native Sharding Implementation available within Redis.
Scale Grid Hosting Adds support for Highly Available Redis Clusters with Automated Sharding.

[October 2010] Problems in Sharding : Lessons from Four Square Incident 17 Hours Downtime.

FourSquare uses MongoDB on 2 EC2 Nodes with 66 GB Ram.
Data Replicates to slaves for redundancy.
MongoDB : JSON Format , Document oriented database, indexes can be made on any field and it supports Auto Sharding.
Data did not got distributed evenly over two Shards. One got 67 GB larger than RAM and other to 50 GB.
Page Faults started occuring due to Demand Paging as RAM is now exhausted.
Whole system got slow as Disks are slow.
Queries became slow, causing backlog of operations and causing more demand paging and thus whole system went down.
Third Shard was added : yet First was still hitting disk.
Reason : Foursquare checkins are small, 300 bytes, MongoDB uses 4K Pages.
When you move a checking, 4K Page would be allocated , Memory would be available only when data is moved off a page. i.e
For Writing 300 Bytes, 4K Memory was getting paged-out.
Compaction Feature of MongoDB was turned-on. But compaction was slow because of the size of the data and becuase EC2's network disk storage.
EBS is relatively slow.
Solution : Was to reload the system from failures from backups so that data was resharded across three shards. All data now fits in physical RAM.

[October 2010] Cache and System Design Blog with Redis : Focusing on Demand Paging and Page Faults.
The Goal of sharding is to continually structure a system to ensure data is chunked in small enough units and spread across enough nodes that data operations are not limited by resource constraints.
Some of the issues which we should be aware of Sharding

Node Exhaustion : Hardware resources exhausted
Shard Exhaustion : Capacity of Shard is exceeded.
Shard Allocation Imbalance : Uneven spread of data across a shard set.
Hot Key : Excessive access for a particular key or set of Keys.
Small Shard Set : Data is stored in too few shards.
Slow Recovery Operations
Cloud Bursting : How quickly can your system respond to cloud bursts ?
Fragmentation :
Naive Disaster Handling , e.g when you add shards are you required to add all the data again?
Excessive Paging
Single Points of Failure
Data Opacity : Data is not observable
Reliance on developer omniscience
Excessive Locking
Slow IO
Transnational Model Mismatch
Poor Operations Model
Insufficient Pool Capacity
Slow bulk import
Object Size and other limits
High Latency and Poor Connection sensitivity

Below 2 are exact copies of one another, Digital Ocean Came first.
{
https://www.digitalocean.com/community/tutorials/understanding-database-sharding
https://medium.com/@varshney.shivam786/database-sharding-system-design-1-63203f494fc0
}
https://www.educative.io/courses/grokking-the-system-design-interview/mEN8lJXV1LA
Articles with Sharding as Tag
[January 2008] http://highscalability.com/blog/2008/1/10/sharding-with-cookie-based-session-storage.html
[April 2008] How PostreSQL was used as Production DB in Skype ?
It has some interesting related resources. Worth reading.

ATul Singh