Elasticsearch Search Engine on your server Aravind Putrevu Developer | Evangelist @aravindputrevu | aravindputrevu.in elastic.co/community 1
A presentation at DigitalOcean Webinar Series in June 2018 by Aravind Putrevu
 
Agenda 2
1. Terms
2. Talking to Elasticsearch
3. Mappings
4. Analyzers and Aggregations
5. Capacity Planning
 
Elastic Stack 7
No enterprise edition. All new versions with 6.2.
X-Pack: Security, Alerting, Monitoring, Reporting, Machine Learning, Graph
 
Why is it popular? 8
● Speed
● Scale
● Relevance
 
Terms 9
● Cluster: A cluster is a collection of one or more nodes (servers).
● Node: A node is a single server that is part of your cluster, stores your data, and participates in the cluster’s indexing and search capabilities.
● Index: An index is a collection of documents that have somewhat similar characteristics.
● Type (deprecated in 6.0.0): A type used to be a logical category/partition of your index to allow you to store different types of documents in the same index.
● Document: A document is a basic unit of information that can be indexed. It is expressed in JSON (JavaScript Object Notation), a ubiquitous internet data interchange format.
● Shard: Elasticsearch provides the ability to subdivide your index into multiple pieces called shards.
● Replica: Elasticsearch allows you to make one or more copies of your index’s shards, called replica shards, or replicas for short.
https://www.elastic.co/guide/en/elasticsearch/reference/current/glossary.html
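To make the shard and replica terms concrete, here is a minimal sketch of the settings body one might send when creating an index. The helper names and the example values (3 primaries, 1 replica) are illustrative assumptions, not taken from the talk; only `number_of_shards` and `number_of_replicas` are actual Elasticsearch setting names.

```python
import json


def index_settings(primary_shards: int, replicas: int) -> dict:
    """Build a create-index body with explicit shard and replica counts."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Each primary shard exists once, plus `replicas` additional copies."""
    return primary_shards * (1 + replicas)


# Hypothetical example values: 3 primary shards, 1 replica of each.
body = index_settings(primary_shards=3, replicas=1)
print(json.dumps(body, indent=2))
print(total_shard_copies(3, 1))
```

With 3 primaries and 1 replica the cluster holds 6 shard copies in total, which is why the replica count feeds directly into storage planning.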
 
Elasticsearch Node Types 10
Nodes can play one or more roles, for workload isolation and scaling.
• Master Nodes – Control the cluster; a minimum of 3 is required, and one is active at any given time.
• Data Nodes – Hold indexed data and perform data-related operations. Differentiated hot and warm data nodes can be used.
• Ingest Nodes – Use ingest pipelines to transform and enrich data before indexing.
• Coordinating Nodes – Route requests, handle the search reduce phase, and distribute bulk indexing. All nodes function as coordinating nodes.
• Machine Learning Nodes – Run machine learning jobs (X-Pack).
Diagram (Elasticsearch): Master (3), Ingest (X), Data – Hot (X), Data – Warm (X), Coordinating (X), Machine Learning (2+)
All product names, logos, and brands are property of their respective owners and are used only for identification purposes. This is not an endorsement.
 
What powers Elasticsearch? 11
Apache Lucene:
● A Java library
● Great for full-text search
But
● Challenging to use
● Not designed for scale
https://www.elastic.co/blog/found-elasticsearch-top-down
 
                Talking to Elasticsearch 12 https://www.elastic.co/guide/en/elasticsearch/client/index.html
 
                Indexing a document 13 https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
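Indexing a single document is an HTTP request carrying a JSON body. The sketch below only builds that request with the Python standard library and does not send it, so it runs without a cluster. The host, the index name `talks`, the document fields, and the helper name are all assumptions for illustration.

```python
import json
import urllib.request


def build_index_request(base_url: str, index: str, doc_id: str, document: dict):
    """Build (but do not send) a PUT request that would index one document."""
    url = f"{base_url}/{index}/_doc/{doc_id}"
    data = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host, index name, and document.
req = build_index_request(
    "http://localhost:9200",
    "talks",
    "1",
    {"title": "Search Engine on your server", "year": 2018},
)
print(req.get_method(), req.full_url)
# urllib.request.urlopen(req) would actually send it to a running node.
```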
 
                Inserting data _bulk 14
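The `_bulk` endpoint expects newline-delimited JSON: an action line followed by the document source, repeated per document, with a trailing newline. A minimal sketch of building such a payload follows; the index name, ids, and fields are made up for illustration. On 6.x clusters the document type also has to be supplied, either in the URL or as `_type` in the action line, which this sketch leaves out.

```python
import json


def build_bulk_body(index: str, docs) -> str:
    """Build an NDJSON payload for the _bulk API from (id, source) pairs."""
    lines = []
    for doc_id, source in docs:
        # Action/metadata line, then the document itself on the next line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical documents.
bulk_body = build_bulk_body(
    "logs",
    [("1", {"message": "started"}), ("2", {"message": "stopped"})],
)
print(bulk_body)
```

The body would be sent with the `Content-Type: application/x-ndjson` header.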
 
Where will my data go? 15
The default value used for _routing is the document’s _id.
0 ≤ shard ≤ number_of_primary_shards - 1
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-routing-field.html
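The linked routing documentation describes the shard choice as `shard_num = hash(_routing) % num_primary_shards`. The sketch below mimics that formula with a stand-in hash (`zlib.crc32`); Elasticsearch uses its own murmur3-based hash, so the actual shard numbers on a real cluster will differ. The function name is hypothetical.

```python
import zlib


def pick_shard(routing: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document _id) to a shard number."""
    # Stand-in hash for illustration only; not the hash Elasticsearch uses.
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


# Every result falls in the range 0 .. num_primary_shards - 1.
shards = [pick_shard(str(doc_id), 5) for doc_id in range(100)]
print(min(shards), max(shards))
```

Because the modulus is the number of primary shards, changing that number would send existing documents to different shards, which is why it is fixed when the index is created.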
 
                Mappings 16
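A mapping declares the fields of a document and their types. As a rough sketch, the helper below builds a create-index body with a mapping; the field names are invented for illustration. The typeless form is what 7.x and later accept, while 6.x (the version current at the time of this talk) nests `properties` under a type name, which the optional `type_name` argument imitates.

```python
def mapping_body(properties: dict, type_name=None) -> dict:
    """Build a create-index body containing a mapping.

    With type_name=None the typeless (7.x+) shape is produced; passing a
    type name such as "_doc" produces the 6.x shape.
    """
    mappings = {"properties": properties}
    if type_name is not None:
        mappings = {type_name: mappings}
    return {"mappings": mappings}


# Hypothetical fields for a "talks" index.
fields = {
    "title": {"type": "text"},        # analyzed for full-text search
    "speaker": {"type": "keyword"},   # exact-value matching and aggregations
    "date": {"type": "date"},
    "location": {"type": "geo_point"},
}

typeless = mapping_body(fields)
with_type = mapping_body(fields, type_name="_doc")
print(typeless["mappings"]["properties"]["title"])
```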
 
                Full Text Analysis Inverted Index 17
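An inverted index maps each term to the documents that contain it, which is what makes full-text lookups fast. The toy version below is a plain-Python illustration of the idea, using lowercasing and whitespace splitting in place of a real analyzer; it is not how Lucene stores its index.

```python
from collections import defaultdict


def build_inverted_index(docs: dict) -> dict:
    """Map each lowercase token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


# Hypothetical two-document corpus.
docs = {
    1: "Elasticsearch is a search engine",
    2: "Lucene is a search library",
}
inverted = build_inverted_index(docs)
print(sorted(inverted["search"]))  # [1, 2]
print(sorted(inverted["lucene"]))  # [2]
```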
 
Analyzer 18
Helps in converting text into tokens for better search capability.
1. Character filters
2. Tokenizer
3. Token filters
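The three analyzer stages can be imitated in a few lines of plain Python. This is only a sketch of the pipeline shape: the tag-stripping character filter, the word tokenizer, and the small stopword list are simplified stand-ins chosen for illustration, not Elasticsearch's built-in components.

```python
import re

# Tiny illustrative stopword list (an assumption, not Elasticsearch's default).
STOPWORDS = {"a", "an", "the", "is", "on"}


def char_filter(text: str) -> str:
    """Stage 1: clean the raw character stream (here: strip HTML-like tags)."""
    return re.sub(r"<[^>]+>", " ", text)


def tokenizer(text: str) -> list:
    """Stage 2: split the text into individual tokens."""
    return re.findall(r"\w+", text)


def token_filters(tokens: list) -> list:
    """Stage 3: lowercase each token and drop stopwords."""
    return [t.lower() for t in tokens if t.lower() not in STOPWORDS]


def analyze(text: str) -> list:
    return token_filters(tokenizer(char_filter(text)))


print(analyze("<b>Search</b> Engine on the Server"))
# ['search', 'engine', 'server']
```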
 
Aggregations 19
● Metrics
● Bucket
● Pipeline
● and so on...
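One request body can combine the three aggregation families listed above. The following sketch assumes hypothetical `status` and `bytes` fields: a `terms` bucket aggregation groups documents, an `avg` metric aggregation runs inside each bucket, and a `max_bucket` pipeline aggregation picks the bucket with the highest average.

```python
import json

aggs_request = {
    "size": 0,  # return only aggregation results, no search hits
    "aggs": {
        # Bucket aggregation: one bucket per distinct status value.
        "by_status": {
            "terms": {"field": "status"},
            # Metric aggregation computed within each bucket.
            "aggs": {"avg_bytes": {"avg": {"field": "bytes"}}},
        },
        # Pipeline aggregation: works on the output of the aggregations above.
        "busiest_status": {
            "max_bucket": {"buckets_path": "by_status>avg_bytes"}
        },
    },
}

print(json.dumps(aggs_request, indent=2))
```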
 
Querying Data 20
● Full Text Queries
● Term Level Queries
● Compound Queries
● Geo Queries
 
                Query DSL Match Query 21
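A match query analyzes the search text before looking it up, so it suits full-text fields. A minimal sketch of the request body follows; the helper name and the `title` field are assumptions for illustration.

```python
def match_query(field: str, text: str, operator: str = "or") -> dict:
    """Build a search body with a full-text match query on one field."""
    return {"query": {"match": {field: {"query": text, "operator": operator}}}}


# Hypothetical field and search text; "and" requires every term to match.
match_body = match_query("title", "search engine", operator="and")
print(match_body)
```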
 
                Query DSL Term Queries 22
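A term query, by contrast, looks up the exact value without analyzing it, which is why it is normally used on keyword, numeric, or date fields rather than analyzed text. The sketch below uses a hypothetical `status` field.

```python
def term_query(field: str, value) -> dict:
    """Build a search body with an exact-value term query."""
    return {"query": {"term": {field: {"value": value}}}}


# Hypothetical keyword field and value.
term_body = term_query("status", "published")
print(term_body)
```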
 
                Query DSL Nested queries 23
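A nested query searches objects stored in a field mapped as `nested`, matching conditions against each inner object separately. A minimal sketch, assuming a hypothetical `comments` nested field with `author` and `stars` sub-fields:

```python
def nested_query(path: str, inner_query: dict) -> dict:
    """Wrap a query so it runs against each object under a nested field."""
    return {"query": {"nested": {"path": path, "query": inner_query}}}


# Both conditions must hold for the SAME comment object.
nested_body = nested_query(
    "comments",
    {
        "bool": {
            "must": [
                {"match": {"comments.author": "alice"}},
                {"range": {"comments.stars": {"gte": 4}}},
            ]
        }
    },
)
print(nested_body)
```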
 
                Query DSL Geo queries 24
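Geo queries work on fields mapped as `geo_point` (or `geo_shape`). As one example, a `geo_distance` filter keeps documents within a radius of a point. The field name `location`, the coordinates, and the helper name below are illustrative assumptions.

```python
def geo_distance_query(field: str, lat: float, lon: float, distance: str) -> dict:
    """Build a search body filtering documents within `distance` of a point."""
    return {
        "query": {
            "bool": {
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        field: {"lat": lat, "lon": lon},
                    }
                }
            }
        }
    }


# Hypothetical point and radius.
geo_body = geo_distance_query("location", 12.97, 77.59, "10km")
print(geo_body)
```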
 
Elastic Stack architecture diagram 25
● Data sources: log files, metrics, wire data, your{beat}, datastore, web APIs, social, sensors
● Collection: Beats and Logstash nodes (X), optionally through a messaging queue (Kafka, Redis)
● Elasticsearch: master nodes (3), ingest nodes (X), data nodes – hot (X), data nodes – warm (X)
● Presentation: Kibana instances (X), custom UI
● X-Pack: authentication (LDAP, AD, SSO) and notification
● Hadoop ecosystem: ES-Hadoop
 
                Capacity Planning It depends... 26 https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
 
Capacity Planning 27
What is your use case?
● Full text search
● Logging/Metrics
● Complex aggregations with a lot of users
Each use case needs a different cluster configuration.
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
 
Capacity Planning 28
Let us take logging:
● Inflow of data per day
  ○ Per day: 10GB
  ○ Per month: 300GB
  ○ Per year: 3600GB
● Data retention
  ○ 15 days
● High availability (replication factor)
  ○ 1, i.e., 7200GB per year
● Type of queries
Master Node: X, Data Node: X
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
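The numbers on this slide follow from simple arithmetic: daily volume times the number of days, times one plus the replication factor. The sketch below reproduces them; the function name is hypothetical, and the 360-day year matches the slide's 30-day months. It counts raw data only and ignores index overhead or compression.

```python
def storage_gb(daily_gb: float, days: int, replicas: int) -> float:
    """Raw storage needed: primaries plus `replicas` copies of each."""
    return daily_gb * days * (1 + replicas)


print(storage_gb(10, 30, 0))    # per month, primaries only: 300
print(storage_gb(10, 360, 0))   # per year, primaries only: 3600
print(storage_gb(10, 360, 1))   # per year with replication factor 1: 7200
print(storage_gb(10, 15, 1))    # 15-day retention with one replica: 300
```

With a 15-day retention policy, the cluster only ever holds the last line's worth of data at once, which is far less than the yearly inflow.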
 
Capacity Planning 29
Hardware Recommendations
● SSDs are the best
● Local disk is king!
● Prefer medium-size machines over large-size machines
● Give only 50% of your RAM to Elasticsearch
● Don’t cross 32GB of Java heap space
https://www.elastic.co/elasticon/conf/2016/sf/quantitative-cluster-sizing
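The last two recommendations combine into a simple rule of thumb for the JVM heap: half of the machine's RAM, capped below the 32GB mark. In the sketch below the function name is hypothetical and the 31GB default cap is an assumption chosen to stay safely under the slide's limit.

```python
def recommended_heap_gb(ram_gb: float, cap_gb: float = 31) -> float:
    """Half of RAM for the heap, never exceeding the cap."""
    return min(ram_gb / 2, cap_gb)


for ram in (16, 64, 128):
    print(ram, "GB RAM ->", recommended_heap_gb(ram), "GB heap")
```

The other half of the RAM is left to the operating system's file system cache, which Lucene relies on heavily.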
 
Elastic Stack architecture diagram (repeated from slide 25) 30
https://www.elastic.co/blog/hot-warm-architecture-in-elasticsearch-5-x
 
                training.elastic.co 31
 
Resources 32
• https://www.elastic.co/learn
• https://www.elastic.co/blog/category/engineering
• https://discuss.elastic.co/
• https://fb.com/groups/ElasticIndiaUserGroup
• https://elastic.co/community
 
                Fin! discuss.elastic.co | aravind@elastic.co | @aravindputrevu 33
