Skip to main content

Database Sharding

Collating some of the resources which talks about Database Sharding.


  1. https://en.wikipedia.org/wiki/Shard_(database_architecture)
  2. [Feb 2019] http://highscalability.com/blog/2019/2/19/intro-to-redis-cluster-sharding-advantages-limitations-deplo.html
    Redis Cluster is the Native Sharding implementation available within Redis that allows your to automatically distribute your data across multiple nodes without having to rely on external tools and utilities. Its covers Sharding with Redis Cluster  where Redis Clusters is divided in 16384 slots and these slots are assigned to multiple Redis Nodes.

    The Redis Cluster Specification is the definitive guide to understanding the internals of the technology, while the Redis Cluster Tutorial provides deployment and administration guidelines.
    1. [ Jan 2019  ] https://scalegrid.io/blog/scalegrid-hosting-adds-support-for-highly-available-redis-clusters-with-automated-sharding/
      ScaleGrid : Fully Managed Database as a Service. Fully Managed Native Sharding Implementation available within Redis.
      Scale Grid Hosting Adds support for Highly Available Redis Clusters with Automated Sharding.
  3. [October 2010] Problems in Sharding : Lessons from Four Square Incident 17 Hours Downtime.
    1. MongoDB : Database which FourSquare uses contributed to the incident.
      1. FourSquare uses MongoDB on 2 EC2 Nodes with 66 GB Ram.
        Data Replicates to slaves for redundancy.
        MongoDB : JSON Format , Document oriented database, indexes can be made on any field and it supports Auto Sharding.
      2. Data did not got distributed evenly over two Shards. One got 67 GB larger than RAM and other to 50 GB.
      3. Page Faults started occuring due to Demand Paging as RAM is now exhausted.
        Whole system got slow as Disks are slow.
        Queries became slow, causing backlog of operations and causing more demand paging and thus whole system went down.
      4. Third Shard was added : yet First was still hitting disk.
        Reason : Foursquare checkins are small, 300 bytes, MongoDB uses 4K Pages.
        When you move a checking, 4K Page would be allocated , Memory would be available only when data is moved off a page. i.e
        For Writing 300 Bytes, 4K Memory was getting paged-out.
      5. Compaction Feature of MongoDB was turned-on. But compaction was slow because of the size of the data and becuase EC2's network disk storage.
        EBS is relatively slow.
      6. Solution : Was to reload the system from failures from backups so that data was resharded across three shards. All data now fits in physical RAM.
    2. [October 2010] Cache and System Design Blog with Redis : Focusing on Demand Paging and Page Faults.
    3. The Goal of sharding is to continually structure a system to ensure data is chunked in small enough units and spread across enough nodes that data operations are not limited by resource constraints.
    4. Some of the issues which we should be aware of Sharding
      1. Node Exhaustion : Hardware resources exhausted
      2. Shard Exhaustion  : Capacity of Shard is exceeded.
      3. Shard Allocation Imbalance : Uneven spread of data across a shard set.
      4. Hot Key : Excessive access for a particular key or set of Keys. 
      5. Small Shard Set : Data is stored in too few shards.
      6. Slow Recovery Operations
      7. Cloud Bursting : How quickly can your system respond to cloud bursts ?
      8. Fragmentation : 
      9. Naive Disaster Handling  , e.g when you add shards are you required to add all the data again?
      10. Excessive Paging 
      11. Single Points of Failure 
      12. Data Opacity : Data is not observable
      13. Reliance on developer omniscience 
      14. Excessive Locking 
      15. Slow IO
      16. Transnational Model Mismatch
      17. Poor Operations Model
      18. Insufficient Pool Capacity
      19. Slow bulk import
      20. Object Size and other limits
      21. High Latency and Poor Connection sensitivity
    5. Possible Responses to Troubled Shards
      1. Use more powerful nodes.
      2. Use a larger number of nodes.
      3. Move a shard to a different node.
      4. Move keys from a shard to another shard or into its own shard.
      5. Add more shards and move existing data to those shards.
      6. Condition Traffic.
      7. Dark-mode switches
      8. Read-only disaster mode
      9. Reload from a backup
      10. Key Choice
      11. Key Mapping : There are ways to map other than hashing. Example Flickr
      12. Replicate the data and load balance requests across replicas
      13. Monitor
      14. Automation
      15. Background Reindexing and defragmentation
      16. Capacity Planning
      17. Selection
      18. Test
      19. Figure a way to get out of Technical Debt
      20. Better algorithm selection
      21. Separate Historical and Real Time data
      22. Use more compact data structures
      23. Faster IO
      24. Think about SPOF.
      25. Figure out how you will handle downtime.
  4. Below 2 are exact copies of one another, Digital Ocean Came first.
    {
    https://www.digitalocean.com/community/tutorials/understanding-database-sharding
    https://medium.com/@varshney.shivam786/database-sharding-system-design-1-63203f494fc0
    }
  5. https://www.educative.io/courses/grokking-the-system-design-interview/mEN8lJXV1LA
  6. Articles with Sharding as Tag
  7. [January 2008] http://highscalability.com/blog/2008/1/10/sharding-with-cookie-based-session-storage.html
  8. [April 2008] How PostreSQL was used as Production DB in Skype ?
    It has some interesting related resources. Worth reading.

Comments

Popular posts from this blog

Penetration Testing Basics

Penetration testing, often called “pentesting”,“pen testing”, or “security testing”, is the practice of attacking your own or your clients’ IT systems in the same way a hacker would to identify security holes. Of course, you do this without actually harming the network. The person carrying out a penetration test is called a penetration tester or pentester. The difference between Penetration Testing and Hacking is that you have the system owner's permission to do testing and to identfiy security holes. If you want to do penetration testing u should better ask for his/her permission. Basic Security Concepts Vulnerability: It is a security hole in a piece of software, hardware of Operating system that provides a way to attack the system.A vulnerabilty is as simple as weak passwords and as complex as buffer overflows as well as SQL injection. Security Research: Vulnerabilities are typically searched by security researchers who finds the flaws in the system. Security Research can ...