Elasticsearch is a distributed search and analytics engine built on Apache Lucene, a full-text search library written in Java. Elasticsearch was first released in 2010, and it has since become one of the most popular search engines in the world.
It provides scalable search and data analytics capabilities. It stores large amounts of data in schema-free JSON documents that are optimised for search operations.
Elasticsearch provides RESTful APIs that support multiple data formats for data visualisation and analysis.
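As a rough illustration of the RESTful style, the sketch below builds (but does not send) two HTTP requests with Python's standard library: one that stores a JSON document and one that searches for it. The `_doc` and `_search` endpoints are part of the Elasticsearch REST API. The index name `articles`, the field names, and an unsecured node at `localhost:9200` are assumptions made for the example.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local, unsecured test node


def index_request(index: str, doc_id: str, document: dict) -> urllib.request.Request:
    """Build (but do not send) a request that stores one JSON document."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def search_request(index: str, query: dict) -> urllib.request.Request:
    """Build (but do not send) a search request carrying a JSON query body."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_search",
        data=json.dumps(query).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical index name and fields, for illustration only.
put = index_request("articles", "1", {"title": "Intro to Lucene", "views": 120})
search = search_request(
    "articles",
    {
        "query": {
            "bool": {
                # Full-text condition on the title field.
                "must": [{"match": {"title": "lucene"}}],
                # Exact filter: only documents with at least 100 views.
                "filter": [{"range": {"views": {"gte": 100}}}],
            }
        }
    },
)

print(put.get_method(), put.full_url)
print(search.get_method(), search.full_url)
# To actually send a request: urllib.request.urlopen(put) against a running node.
```

In practice, one of the official client libraries would usually replace the hand-built requests. A secured cluster would also need HTTPS and credentials, which this sketch leaves out.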
HOW IT WORKS
Lucene breaks each document down into keywords (terms) and stores them in index data structures so that documents sharing the same terms can be found together. It does this with inverted indexing:
"Lucene uses an 'inverted indexing' of data – instead of mapping pages to keywords, it maps keywords to pages just like a glossary at the end of any book.
This allows for faster search responses, as it searches through an index, instead of searching through text directly." Introduction to Apache Lucene | Baeldung
In other words, an inverted index is a data structure that maps each keyword to the one or more documents in which it occurs.
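The idea can be sketched in a few lines of Python. This is a toy model, not Lucene's actual on-disk format: it splits text on whitespace only, and the function names and sample documents are made up for illustration.

```python
from collections import defaultdict


def build_inverted_index(documents: dict[int, str]) -> dict[str, set[int]]:
    """Map each lowercased word to the set of document IDs containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index: dict[str, set[int]], *words: str) -> set[int]:
    """Return the IDs of documents that contain every given word."""
    postings = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*postings) if postings else set()


docs = {
    1: "Lucene is a search library",
    2: "Elasticsearch is built on Lucene",
    3: "Kibana visualizes data",
}
inverted = build_inverted_index(docs)

print(sorted(search(inverted, "lucene")))            # [1, 2]
print(sorted(search(inverted, "lucene", "built")))   # [2]
print(sorted(search(inverted, "kibana", "lucene")))  # []
```

A lookup only touches the entries for the queried words, so the cost does not depend on scanning the full text of every document.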
FEATURES
Elasticsearch is a powerful open-source technology designed to handle large datasets, which it distributes across multiple nodes to ensure scalability. It supports cross-cluster replication (CCR), which replicates changes from an index in one cluster (the leader) to an index in a different cluster (the follower). This helps with high availability and lets reads be served from a cluster closer to the user, which lowers latency.
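To give a feel for how data can be spread across nodes, here is a minimal sketch of hash-based routing, where each document ID is hashed and assigned to one of a fixed number of shards. Elasticsearch's documented routing has this general shape, but it uses a different hash function and additional factors. The use of `zlib.crc32`, the shard count, and the document IDs below are stand-ins chosen for the example.

```python
import zlib


def route_to_shard(doc_id: str, num_primary_shards: int) -> int:
    """Pick a shard by hashing the document ID (crc32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


NUM_SHARDS = 3  # hypothetical number of primary shards
shards: dict[int, list[str]] = {n: [] for n in range(NUM_SHARDS)}

for doc_id in ("user-1", "user-2", "user-3", "user-4", "user-5", "user-6"):
    shards[route_to_shard(doc_id, NUM_SHARDS)].append(doc_id)

for shard, ids in shards.items():
    print(shard, ids)
```

Because the shard is derived from the ID alone, any node can work out where a document lives without consulting a central lookup table.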
It provides client libraries to work with multiple programming languages, including Java, .NET, Python, Node.js, PHP and Ruby. It's also compatible with various plugins and integrations, such as API extensions, security plugins, data recovery integrations, etc.
"Elasticsearch indexing supports near real-time monitoring and its powerful search capabilities help IT administrators maintain complete transparency across an entire network to quickly uncover and address potential threats as they arise." What is Elasticsearch? | IBM
The Elastic Stack is a set of open-source tools for real-time data search, analysis, and visualization. Alongside Elasticsearch, it includes Kibana, a data visualization dashboard, and Logstash, a data ingestion and processing pipeline.
Elastic Stack products can be deployed on-premises and are also available as Software as a Service (SaaS) on Amazon Web Services, Microsoft Azure and Google Cloud.
USE CASES
Elasticsearch is a feature-rich technology used by a variety of companies, including Netflix, Uber, eBay, Udemy and Walmart.
Search-based solutions
Typical examples are applications that need fast, accurate search to retrieve data and generate reports; websites with large volumes of content that need performant and accurate results; and enterprise search systems in areas such as marketing, finance, e-commerce and accounting.
Analytics solutions
Within the Elastic Stack, Logstash is particularly useful for ingesting and transforming data in real time so that it can be analyzed in Elasticsearch. Security Information and Event Management (SIEM) and security analytics capabilities support cloud security and threat detection.
Monitoring metrics
The Elastic Stack provides monitoring metrics that let users view health and performance data in real time.
BENEFITS
Schema-Free JSON Documents: Elasticsearch stores data in JSON-based documents. This makes it easier to build applications and provides flexibility to handle documents dynamically.
High-performance: A distributed search engine capable of processing large amounts of data quickly and effectively.
Near Real-time Search: Elasticsearch provides near real-time indexing and search capabilities, suitable for systems where data needs to be displayed almost immediately.
Powerful Query Language: It's possible to perform complex search, filtering and aggregation operations on your data.
Text analysis: Elasticsearch offers text-analysis capabilities such as tokenization and stemming. It also supports language analysers.
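To make tokenization and stemming concrete, here is a toy analysis pipeline in Python. It is only a sketch of the idea: the stop-word list and the suffix-stripping rule are simplified assumptions, and Elasticsearch's built-in analysers use far more careful, language-aware algorithms.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "is", "are", "of", "to"}
SUFFIXES = ("ing", "ed", "es", "s")  # checked in this order


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def stem(token: str) -> str:
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text: str) -> list[str]:
    """Tokenize, drop stop words, then stem what remains."""
    return [stem(token) for token in tokenize(text) if token not in STOP_WORDS]


print(analyze("Searching the indexed documents"))  # ['search', 'index', 'document']
print(analyze("search"))                           # ['search']
```

Because the document text and the query are reduced to the same stems, a search for "search" can match a document that only contains "Searching".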
DRAWBACKS
Documentation: Elasticsearch offers a range of powerful features, but its documentation can be confusing and unintuitive. It's also crucial to check that the documentation you are reading matches the version you are running.
Learning Curve: Elasticsearch is a complex and powerful tool, so it can be challenging to work with and takes time to master.
No transactions and rollbacks: Elasticsearch doesn't support these features, so it isn't ACID compliant, which makes it less suitable as a primary persistence store.