Elasticsearch is a great search engine, but raw JSON and curl don't feel very Pythonic. Fortunately there are two libraries you can use - and in today's article I'll focus on them :) Check them out!


S0-E21/E30 :)

Elasticsearch python wrappers

How to start with a Python wrapper for the Elasticsearch engine?

That's pretty easy. First, we need to set up our Elasticsearch engine - either by running the jar file with Java, or by letting Docker handle this heavy topic - which you can find in my previous post.

Once that's done, we can start tinkering with it using two Python wrappers.

The first one is the official one, and it's called low-level because it exposes a simple API that just makes requests with JSON data (converted from Python dicts). It's called Elasticsearch-py.

The second one is a bit more complex but also very user-friendly and rather easy to learn. It takes a different approach: building on top of the official library, it exposes a chainable query-building API that is very data-flow friendly.

Let's check them out!

Elasticsearch-py

Dependencies

The only dependency is the library itself - you can install it with:

pip install elasticsearch

Example

This is an example of indexing the data of a blog post:

import datetime
from elasticsearch import Elasticsearch
client = Elasticsearch('localhost')
blog_post = {
    "author": "Anselmos",
    "date": datetime.datetime.now().strftime("%Y-%m-%d"),
    "content": "A new Blog-Post Content!",
}
res = client.index(index="test-index", doc_type='blogpost', id=1, body=blog_post)
print(res)

What do you think the output of this script will be?

And searching:

from elasticsearch import Elasticsearch
client = Elasticsearch('localhost')
res = client.get(index="test-index", doc_type='blogpost', id=1)
print(res)

What do you think - will it output my data?
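If you want to peek without a running cluster: the `get` response is a plain dict, and the stored document fields live under the `_source` key. A minimal offline sketch of that shape (the literal values below are assumptions matching the example above, not live output):

```python
# Sketch of a typical elasticsearch-py `get` response (mocked, not live data)
res = {
    "_index": "test-index",
    "_type": "blogpost",
    "_id": "1",
    "found": True,
    "_source": {
        "author": "Anselmos",
        "date": "2018-02-23",
        "content": "A new Blog-Post Content!",
    },
}

# The original document comes back under "_source"
print(res["_source"]["author"])  # Anselmos
```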

Elasticsearch-dsl-py

Dependencies

The only dependency is the library itself - you can install it with:

pip install elasticsearch-dsl

Search Query!

This library has more advanced features that come in handy, especially for advanced searches with filtering.

To make better use of this library, let's create a list of data to search through:

from elasticsearch import Elasticsearch
import random
import datetime
from elasticsearch import helpers
client = Elasticsearch('localhost')

tag = ['blog', 'anselmos', 'elk', 'elastic', 'elasticsearch', 'elasticstack', 'elasticsearch-py', 'elasticsearch-dsl']
author = ['Anselmos', 'Bartosz Witkowski']
docs = [
    {

        "_index": "blogpost-{}".format(datetime.datetime.now().strftime("%Y-%m-%d")),
        "_type": "blogpost",
        "_id": x,
        "_source": {
            "author": author[random.randrange(0, len(author))],
            "date": datetime.datetime.now().strftime("%Y-%m-%d"),
            "content": "A new Blog-Post Content!",
            "tag": tag[random.randrange(0, len(tag))]
        }
    }
    for x in range(10)
]
helpers.bulk(client, docs)

That's elasticsearch-py code that creates 10 blog posts with the same content, but randomized authors and tags.
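Behind the scenes, `helpers.bulk` serializes those actions into the newline-delimited JSON that Elasticsearch's `_bulk` endpoint expects: an action/metadata line followed by a source line for every document. A stdlib-only sketch of that wire format (the single document here is a placeholder, not the real payload):

```python
import json

# One bulk action in the same shape as the docs list above (placeholder values)
docs = [
    {"_index": "blogpost-2018-02-23", "_type": "blogpost", "_id": 0,
     "_source": {"author": "Anselmos", "tag": "blog"}},
]

lines = []
for doc in docs:
    meta = {k: doc[k] for k in ("_index", "_type", "_id")}
    lines.append(json.dumps({"index": meta}))  # action/metadata line
    lines.append(json.dumps(doc["_source"]))   # document source line

body = "\n".join(lines) + "\n"  # the bulk body must end with a newline
print(body)
```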

Let's say we want to find the documents whose tag is "blog" and whose author is "Anselmos".

With the low-level API this would be:

from elasticsearch import Elasticsearch
import json
client = Elasticsearch('localhost')
queries = []
queries.append(
    {"query": {"bool": {"should": [{"match": {"tag": {"query": "blog"}}}]}}}
)
queries.append(
    {"query": {"bool": {"should": [{"match": {"author": {"query": "Anselmos"}}}]}}}
)
# msearch expects newline-delimited JSON: a header line before every query body
request = ''
for each in queries:
    request += '%s \n' % json.dumps({})    # header line (empty - use defaults)
    request += '%s \n' % json.dumps(each)  # the query itself
res = client.msearch(index="blogpost-2018-02-23", doc_type='blogpost', body=request)
print(res)
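Note that `msearch` runs the two searches independently and returns one result set per submitted query, under the `responses` key. A sketch of unpacking it, using a mocked response shape (the hit counts are made up for illustration):

```python
# Mocked shape of an msearch response: one entry per submitted query
res = {
    "responses": [
        {"hits": {"total": 4, "hits": []}},  # results for the tag query
        {"hits": {"total": 6, "hits": []}},  # results for the author query
    ]
}

# Each query's hits are unpacked separately
for i, response in enumerate(res["responses"]):
    print("query", i, "matched", response["hits"]["total"], "documents")
```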

And how about our high-level one?

Check it out:

from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search
client = Elasticsearch('localhost')

s = Search(using=client).query('match', tag='blog').query('match', author='Anselmos')
response = s.execute()
print(response)

Acknowledgements

Thanks!

That's it :) Comment, share or don't :)

If you have any suggestions what I should blog about in the next articles - please give me a hint :)

See you tomorrow! Cheers!


