MongoDB, Elastic Search, Setup
I've been working on another project recently and have decided to stray from typical relational databases (i.e. SQL) and get involved in the NoSQL revolution, primarily because I was sold the idea by a colleague!
This naturally brought up the question of searching. I looked at solutions such as Lucene, but finally landed on ElasticSearch (which is based on Lucene).
Getting this to work with the latest version of MongoDB (2.4.6), however, proved to be somewhat of a challenge, so I've written this guide for anyone else suffering! We will be installing and configuring the following on Ubuntu Server 12.04 LTS:
- ElasticSearch 0.90.5
- Plugin: MapperAttachments 1.9.0
- Plugin: MongoDB River 1.7.1 (currently the unreleased master branch)
Prerequisites
This guide is not going to show you how to set up MongoDB. You're on your own for that one, though it's rather easy.
Note: As I discovered, the version of each component you use is extremely important. Please stick to the versions I use in order to get it working correctly.
Configuration of MongoDB
ElasticSearch is kept up to date with MongoDB through a "river". For this to work you need to configure a replica set, even if you're running a standalone instance. To do this, follow the instructions at: http://docs.mongodb.org/manual/tutorial/convert-standalone-to-replica-set/
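As a rough sketch of what that tutorial boils down to on a single MongoDB 2.4 box, here is a small helper that adds the replica set option to an ini-style config file. The helper name enable_replset, the set name rs0, the path /etc/mongodb.conf and the service name mongodb are all my assumptions, so adjust them to your install.

```shell
# Hypothetical helper: add a replSet line to an ini-style MongoDB 2.4 config
# file, unless one is already there. Both arguments are assumptions.
enable_replset() {
  conf="$1"            # e.g. /etc/mongodb.conf (assumed path)
  name="${2:-rs0}"     # replica set name; rs0 is an arbitrary choice
  if grep -q '^replSet' "$conf"; then
    echo "replSet already configured in $conf"
  else
    echo "replSet = $name" >> "$conf"
    echo "added replSet = $name to $conf"
  fi
}

# Usage (run as root, left commented out because it changes your system):
#   enable_replset /etc/mongodb.conf rs0
#   service mongodb restart
#   mongo --eval "rs.initiate()"
```

The rs.initiate() call only needs to be run once, after mongod has been restarted with the new setting.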
Installation of Elasticsearch
The following script will download and install ElasticSearch and the required plugins. It will also install ElasticSearch-Head, which is the best GUI I've found to date.
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.5.deb
sudo dpkg -i elasticsearch-0.90.5.deb
sudo /usr/share/elasticsearch/bin/plugin -install elasticsearch/elasticsearch-mapper-attachments/1.9.0
sudo /usr/share/elasticsearch/bin/plugin --url "http://jambr.blob.core.windows.net/articledownloads/elasticsearch-river-mongodb-1.7.1-SNAPSHOT.zip" --install elasticsearch-river-mongod
sudo /usr/share/elasticsearch/bin/plugin -install mobz/elasticsearch-head
sudo service elasticsearch restart
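ElasticSearch can take a few seconds to start answering after a restart, so if you are scripting the steps above it may help to wait for it. This is only a sketch: the function name wait_for_es and the default of 30 attempts are my own choices, and it assumes curl is installed and ElasticSearch is on its default port.

```shell
# Hypothetical helper: poll a URL until it answers or we run out of attempts.
wait_for_es() {
  url="${1:-http://localhost:9200}"   # default ElasticSearch HTTP address
  tries="${2:-30}"                    # number of attempts, one second apart
  i=0
  while [ "$i" -lt "$tries" ]; do
    # -s silences progress output, -f makes curl fail on HTTP errors
    if curl -sf "$url" >/dev/null 2>&1; then
      echo "ElasticSearch is up at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "ElasticSearch did not respond at $url" >&2
  return 1
}

# Usage:
#   wait_for_es http://localhost:9200 30
```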
Once this is done, you should be able to access the GUI at the following URL, where you will see a default index called "_river": http://localhost:9200/_plugin/head/
We don't need this index, so remove it with the following command:
curl -XDELETE localhost:9200/_river
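Given how version-sensitive this setup is, it may be worth confirming which version the node is actually running before you create a river. The sketch below assumes the root endpoint returns JSON containing a "number" field inside "version"; the helper name es_version is hypothetical.

```shell
# Hypothetical helper: pull the first "number" field out of JSON read on stdin.
es_version() {
  sed -n 's/.*"number" *: *"\([^"]*\)".*/\1/p' | head -n 1
}

# Usage (needs a running ElasticSearch, so it is left commented out):
#   curl -s localhost:9200 | es_version
```

If this prints anything other than 0.90.5, the plugin versions in this guide may not match your install.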
Setting up a River
The next thing we need to do is create our "River" and Index. This script is a simple example of how to do this:
curl -XPUT "localhost:9200/_river/artist/_meta" -d'
{
"type": "mongodb",
"mongodb": {
"db": "DatabaseManager",
"collection": "CollectionName"
},
"index": {
"name": "NameForYourIndex",
"type": "NameForYourObjectType"
}
}'
You should get a response like this (the "_type" is the river name from the URL, "artist" in this example):
{"ok":true,"_index":"_river","_type":"artist","_id":"_meta","_version":1}
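If you need more than one river, typing the JSON by hand gets tedious. Here is a minimal sketch that builds the same definition from four arguments. The function name river_meta is hypothetical, and it does no JSON escaping, so it assumes your names contain no quotes or backslashes.

```shell
# Hypothetical helper: print a river definition for the given names.
# Arguments: database, collection, index name, object type.
river_meta() {
  db="$1"; collection="$2"; index="$3"; doc_type="$4"
  printf '{"type":"mongodb","mongodb":{"db":"%s","collection":"%s"},"index":{"name":"%s","type":"%s"}}\n' \
    "$db" "$collection" "$index" "$doc_type"
}

# Usage (needs a running ElasticSearch, so it is left commented out).
# "-d @-" tells curl to read the request body from stdin:
#   river_meta DatabaseManager CollectionName NameForYourIndex NameForYourObjectType \
#     | curl -XPUT "localhost:9200/_river/artist/_meta" -d @-
```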
Done
That should be it. If you go to the Head GUI I linked further up the post, you should see your index being populated. Test it with a query on the "Any Query" page, for example:
{"query":{"match":{"_all":{"query":"YourQuery","operator":"or"}}}}
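The same query can be sent from the command line if you would rather not use the GUI. This is a sketch under a few assumptions: the helper name match_all_fields is made up, the index is called NameForYourIndex as in the river example, and the search term contains no quotes or backslashes (there is no JSON escaping).

```shell
# Hypothetical helper: build the match query from this post for one search term.
match_all_fields() {
  printf '{"query":{"match":{"_all":{"query":"%s","operator":"or"}}}}\n' "$1"
}

# Usage (needs a running ElasticSearch, so it is left commented out):
#   match_all_fields "YourQuery" \
#     | curl -s -XGET "localhost:9200/NameForYourIndex/_search?pretty" -d @-
```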