We're often asked "How big a cluster do I need?", and it's usually hard to be more specific than "Well, it depends!". There are so many variables, where knowledge about your application's specific workload and your performance expectations are just as important as the number of documents and their average size. In this article we won't offer a specific answer or a formula, instead we will equip you with a set of questions you'll want to ask yourself, and some tips on finding their answers.
Learn and play with Elasticsearch
ArticlesGet the RSS
Elasticsearch's Fuzzy query is a powerful tool for a multitude of situations. Username searches, misspellings, and other funky problems can oftentimes be solved with this unconventional query. In this article we clarify the sometimes confusing options for fuzzy searches, as well as dive into the internals of Lucene's FuzzyQuery.
Elasticsearch does not perform authentication or authorization, leaving that as an exercise for the developer. This article gives an overview of things to keep in mind when you configure the security settings for your Elasticsearch cluster, providing users with (limited) access to your cluster when you cannot necessarily (entirely) trust them.
One of the trickiest parts of integrating Elasticsearch into an existing app is figuring out how to manage the flow of data from an authoritative data source, such as an SQL database, into Elasticsearch. In most cases this means utilizing Elasticsearch's bulk API. It also implies designing an application to effectively make data available in an efficient, robust, on-time manner. This usually requires modifying an application's workflow to replicate data in batches. This article is a survey of various patterns for accomplishing this task.
Elasticsearch easily lets you develop amazing things, and it has gone to great lengths to make Lucene's features readily available in a distributed setting. However, when it comes to running Elasticsearch in production, you still have a fairly complicated system on your hands: a system with high demands on network stability, a huge appetite for memory, and a system that assumes all users are trustworthy. These articles cover some of the lessons we've learned from securing and herding hundreds of Elasticsearch clusters.
Can Elasticsearch be used as a "NoSQL"-database? NoSQL means different things in different contexts, and interestingly it's not really about SQL. We will start out with a "Maybe!", and look into the various properties of Elasticsearch as well as those it has sacrificed, in order to become one of the most flexible, scalable and performant search and analytics engines yet.
I am not a programmer, so I endeavoured to understand what Found's developers were talking about by starting to read up on the basics of search engine indexing. Finding documents that describe search engine indexing was easy. Understanding them at a beginner's level was anything but. I therefore decided to set off writing articles that explain search engine indexing from a beginner's perspective.