Three Nodes and One Cluster: Demo of ElasticSearch's Distributed Features

@karmi, June 5, 2012
.gitignore:

```
.DS_Store
logs/
data/
*.html
```
01_create_index.rb:

```ruby
require 'tire'

Tire.configure { logger STDERR }

# Drop any previous index and create a fresh one
# with 3 primary shards and no replicas.
Tire.index('my_index') do
  delete
  create settings: { number_of_shards: 3, number_of_replicas: 0 }
end
```
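Tire translates this into plain HTTP calls against the REST API; a roughly equivalent raw command (assuming the default `localhost:9200` endpoint) would be:

```sh
curl -X PUT 'http://localhost:9200/my_index' -d '{
  "settings" : { "number_of_shards" : 3, "number_of_replicas" : 0 }
}'
```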
02_import_documents.rb:

```ruby
STDOUT.sync = true

require 'tire'

Tire.configure { logger STDERR }

count = (ENV['COUNT'] || 100_000).to_i
pids  = []

# Fork four worker processes, each generating and
# bulk-importing a quarter of the documents.
4.times do |process|
  pids << Process.fork do
    batch = count / 4
    STDOUT.puts "Generating #{batch} documents in process #{process+1}..."
    documents = (1..batch).map do |i|
      { id: "#{process}-#{i}-#{Time.now.to_i}-#{rand(10_000)}", title: "Document" }
    end
    # Store the whole batch with the Bulk API.
    Tire.index('my_index') { import documents }
    STDOUT.puts "\r" + "Process #{process+1} stored \e[1m#{batch}\e[0m documents.\n"
  end
end

# Wait for all the workers to finish.
pids.each { |pid| Process.waitpid(pid) }
```
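The total number of documents can be overridden with the `COUNT` environment variable read by the script, e.g.:

```sh
ruby 02_import_documents.rb               # import the default 100,000 documents
COUNT=200000 ruby 02_import_documents.rb  # import 200,000 documents
```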
03_store_document.rb:

```ruby
require 'timeout'
require 'tire'

Tire.configure { logger STDERR }

# Index a single document. The cluster is _red_ at this point
# in the demo, so don't let the call block for more than a second.
Timeout::timeout(1) do
  Tire.index('my_index') do
    store title: "Hello Red Cluster!"
  end
end
```
04_change_replicas.rb:

```ruby
require 'tire'

# The number of replicas is passed as an argument; default is 1.
r = (ARGV.pop || 1).to_i

# Print the equivalent curl command for the audience...
puts %Q|curl -X PUT 'http://localhost:9200/my_index/_settings' -d '{"index" : { "number_of_replicas" : #{r} }}'|

# ...and update the live index settings over HTTP.
Tire::Configuration.client.put "#{Tire::Configuration.url}/my_index/_settings",
                               { :index => { :number_of_replicas => r } }.to_json
```
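Usage, e.g.:

```sh
ruby 04_change_replicas.rb 1
```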


The Setup

Terminal:

Tab for each node.
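For example, one node per tab (a sketch, assuming the configuration files at the bottom of this page are saved as node-1.yml, node-2.yml and node-3.yml, and a 0.x-era distribution where `-f` keeps the server in the foreground):

```sh
bin/elasticsearch -f -Des.config=node-1.yml    # tab 1
bin/elasticsearch -f -Des.config=node-2.yml    # tab 2
bin/elasticsearch -f -Des.config=node-3.yml    # tab 3
```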

Browser:

  1. Bigdesk
  2. Paramedic
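Both are site plugins; assuming the standard plugin tool of the era, they can be installed and then opened in the browser:

```sh
bin/plugin -install lukas-vlcek/bigdesk            # http://localhost:9200/_plugin/bigdesk/
bin/plugin -install karmi/elasticsearch-paramedic  # http://localhost:9200/_plugin/paramedic/
```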

Cleanup:

curl -X DELETE localhost:9200

Scripts:

  • 01_create_index.rb
  • 02_import_documents.rb
  • 03_store_document.rb
  • 04_change_replicas.rb

The Script

[KAREL]  Start node 1
         Inspect cluster state in Paramedic
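Cluster health can also be checked from the command line at any point during the demo, e.g.:

```sh
curl 'http://localhost:9200/_cluster/health?pretty=true'
```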

[KAREL]  Set up index (shards=3, replicas=0)
         Inspect state and stats in Paramedic (CPU spike, shard distribution, split evenly across 3 nodes)

[LUKAS]  Start node 2
         Inspect cluster node stats in BigDesk

[KAREL]  Import 100,000 documents
         Paramedic, show CPU spike, HTTP connections increase, ...
         Count documents (90,000)
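For instance, with the Count API (the number lags behind the import until the index refreshes):

```sh
curl 'http://localhost:9200/my_index/_count?pretty=true'
```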
         
[SHAY]   Start node 3
         
[KAREL]  Inspect cluster state in Paramedic

[KAREL]  Import another 200,000 documents
         
[SHAY]   Kill node 3
         Paramedic, health is _red_
         Count documents (~ 60,000)
         (We are missing data, but the cluster is operational)
         
[KAREL]  Index new document, `{"title" : "Hello Red Cluster!"}`
         
[LUKAS]  Search for the newly added document, "hello"
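One way, using the URI request API against any node:

```sh
curl 'http://localhost:9200/my_index/_search?q=hello&pretty=true'
```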
         
[SHAY]   Start node 3
         Paramedic, health is _green_, data are back (300,001 documents)

         * How to add high availability? Add replicas (= copies of the index). *

[LUKAS]  Increase `index.number_of_replicas` to 1

         ... wait for relocation ...
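The Health API can block until the cluster recovers, which is handy here, e.g.:

```sh
curl 'http://localhost:9200/_cluster/health?wait_for_status=green&timeout=60s&pretty=true'
```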

[KAREL]  Paramedic, show shard distribution/allocation across nodes
         (primaries and replicas)

[SHAY]   Kill node 3

[KAREL]  Paramedic, cluster state is _green_

[SHAY]   Launch node 3 again

[KAREL]  Show re-allocation of shards

         * How to force shards off a node, e.g. to free it up for another index? With allocation excludes. *

[LUKAS]  curl -XPUT localhost:9200/my_index/_settings -d '{ "index.routing.allocation.exclude._id" : "NODEID" }'
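The NODEID placeholder is the node's ID, which can be looked up with the nodes info API (mounted at /_cluster/nodes in 0.x releases):

```sh
curl 'http://localhost:9200/_cluster/nodes?pretty=true'
```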

[KAREL]  Show shards missing from node 3 in Paramedic

http://git.io/3n1c

The Node Configurations

Node 1 (elasticsearch.yml):

```yaml
cluster:
  name: berlin_buzzwords
node:
  name: "Node 1"
path:
  logs: ./logs
  data: ./data
```

Node 2 (elasticsearch.yml):

```yaml
cluster:
  name: berlin_buzzwords
node:
  name: "Node 2"
path:
  logs: ./logs
  data: ./data
```

Node 3 (elasticsearch.yml):

```yaml
cluster:
  name: berlin_buzzwords
node:
  name: "Node 3"
path:
  logs: ./logs
  data: ./data
```