Sharding, eh? There are so many questions about sharding on a daily basis:
- What is sharding?
- How do I shard?
- When do I shard?
- How do I know I need to shard?
- How many shards do I need?
- What shard key should I use?
- Can I change my shard key?
- What's a hotspot?
- Do I have a replica set within a shard?
- etc
Here's the official documentation page on sharding and Kristina's blog, which is simply excellent on so many levels. I recommend reading both (btw, it'll take a while :). Kristina uses some awesome analogies to explain sharding.
This blog post isn't about the technicalities of sharding; there are much more intelligent people than me who can explain that. I wrote a simple script to learn a bit more about sharding and to reproduce issues, and I thought I'd share it. It's written in bash because I didn't want to worry about dependencies :)
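For a rough idea of what a script along these lines might do, here is a minimal sketch. It is not the original script. The shard ports, the "twitter" database, the "tweets" collection and the shard key are taken from the sh.status() output in this post; the scratch directory, the config server port (20000), the DRY_RUN switch and the function names are assumptions made up for illustration. It also assumes 2.0-era binaries, where a single standalone config server is allowed. By default it only prints the commands; set DRY_RUN=0 to actually run them.

```shell
#!/usr/bin/env bash
# Sketch of a local sharded-cluster setup; NOT the original script.
set -euo pipefail

BASE_DIR="${BASE_DIR:-/tmp/sharding-demo}"  # assumed scratch directory
CONFIG_PORT="${CONFIG_PORT:-20000}"          # assumed config server port
MONGOS_PORT="${MONGOS_PORT:-27017}"          # default port the mongo shell connects to
SHARD_PORTS=(10000 10001 10002)              # shard ports seen in sh.status()
DRY_RUN="${DRY_RUN:-1}"                      # 1 = only print the commands

# Print the command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

start_cluster() {
  local port
  # One standalone mongod per shard.
  for port in "${SHARD_PORTS[@]}"; do
    run mkdir -p "$BASE_DIR/shard-$port"
    run mongod --shardsvr --port "$port" --dbpath "$BASE_DIR/shard-$port" \
      --fork --logpath "$BASE_DIR/shard-$port.log"
  done
  # A single config server, then the mongos router pointing at it.
  run mkdir -p "$BASE_DIR/config"
  run mongod --configsvr --port "$CONFIG_PORT" --dbpath "$BASE_DIR/config" \
    --fork --logpath "$BASE_DIR/config.log"
  run mongos --configdb "localhost:$CONFIG_PORT" --port "$MONGOS_PORT" \
    --fork --logpath "$BASE_DIR/mongos.log"
  # Crude pause so mongos is accepting connections before we configure it.
  run sleep 5
}

configure_sharding() {
  local port js=""
  # Register every shard, then shard the database and the collection.
  for port in "${SHARD_PORTS[@]}"; do
    js+="sh.addShard('localhost:$port'); "
  done
  js+="sh.enableSharding('twitter'); "
  js+="sh.shardCollection('twitter.tweets', { query: 1, max_id: 1 });"
  run mongo --port "$MONGOS_PORT" admin --eval "$js"
}

start_cluster
configure_sharding
```

Printing each command before running it is handy when you're reproducing an issue, because you can see exactly which flags every process was started with.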
After you run the script, you should be able to run
$ mongo twitter --eval 'sh.status()'
from your local shell, and you should see something like the following, indicating that you have now created a sharded cluster with a sharded database "twitter" and a sharded collection "tweets":
MongoDB shell version: 2.0.6
connecting to: twitter
--- Sharding Status ---
sharding version: { "_id" : 1, "version" : 3 }
shards:
{ "_id" : "shard0000", "host" : "localhost:10000" }
{ "_id" : "shard0001", "host" : "localhost:10001" }
{ "_id" : "shard0002", "host" : "localhost:10002" }
databases:
{ "_id" : "admin", "partitioned" : false, "primary" : "config" }
{ "_id" : "twitter", "partitioned" : true, "primary" : "shard0000" }
twitter.tweets chunks:
shard0000 1
{ "query" : { $minKey : 1 }, "max_id" : { $minKey : 1 } } -->> { "query" : { $maxKey : 1 }, "max_id" : { $maxKey : 1 } } on : shard0000 { "t" : 1000, "i" : 0 }
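If you'd rather check that output from a script than eyeball it, something like the following might work. It's only a sketch: the two function names are made up, and it assumes the status text looks like the sample above, where each shard entry carries a "host" field and each sharded collection is followed by a "chunks:" line.

```shell
#!/usr/bin/env bash
# Helpers that read sh.status() text on stdin (hypothetical names).

# Print how many shards are listed; assumes one "host" field per shard line.
count_shards() {
  grep -c '"host"' || true   # grep -c exits 1 on zero matches, so ignore that
}

# Exit 0 if the namespace given as $1 is listed with a chunks section.
has_sharded_collection() {
  grep -q -F "$1 chunks:"
}

# Example usage against a running cluster:
#   mongo twitter --quiet --eval 'sh.status()' | count_shards
#   mongo twitter --quiet --eval 'sh.status()' | has_sharded_collection twitter.tweets
```

Both helpers only look at the text, so they'll happily run against a saved copy of the output too.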
Hopefully it's of interest or help to someone :)