
Elasticsearch beta

NOTE: This module is currently in beta and the documentation is still being written

This page explains how you can install and run the Elasticsearch module on your thirty bees store. It only covers the basics of installing Elasticsearch. If you want to know more about how to install or use your Elasticsearch server directly, please refer to the comprehensive Elasticsearch documentation.

Compatibility

Module requirements

This module deviates a bit from the basic thirty bees system requirements. The additional requirements are:
- PHP 5.6 or higher instead of PHP 5.5
- PHP cURL extension: not required for thirty bees itself, but needed in this case in order to communicate with the Elasticsearch server

Beta NOTE: For beta 1 and beta 2, make sure you have disabled the CCC JavaScript option. It is not supported by these versions.

Elasticsearch requirements

To provide an extremely fast and easy search, we stick with the latest versions of Elasticsearch, so that you get all the state-of-the-art search functionality Elasticsearch has to offer.

Therefore, at the moment of writing, you will need a version of Elasticsearch that is:
- Version 5.4 or higher, including 6.0

Anything lower is not supported and will likely not work.
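As a quick sanity check, the version requirement above can be expressed in a few lines of Python. This is only a sketch: the function name `is_supported` is made up for this example, and it assumes you pass in the version string your server reports (Elasticsearch returns it as `version.number` when you request the root URL of the server).

```python
import re

# Lowest Elasticsearch version this page lists as supported.
MINIMUM_VERSION = (5, 4)


def is_supported(version: str) -> bool:
    """Return True when a version string such as "5.6.3" is 5.4 or higher."""
    match = re.match(r"^(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuples compare element by element, so (6, 0) >= (5, 4) is True.
    return (major, minor) >= MINIMUM_VERSION
```

For example, `is_supported("5.3.2")` is false, while `is_supported("6.0.0")` is true.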

Please double check if your system satisfies these requirements before proceeding.

Installing Elasticsearch

thirty bees cloud (Cloudways)

An easy way to use Elasticsearch + thirty bees is by hosting both with Cloudways. If you already have an instance on Cloudways, you can simply navigate to your Cloudways dashboard and select Server mode (top-left) > Server Management > Settings & Packages > Packages. On this tab you can choose to enable Elasticsearch 5.4.

Cloudways servers

Plesk + docker

This is a basic configuration example to show you how you can configure your server for Elasticsearch. Elasticsearch is a two-part system. You will need to run an Elasticsearch server (separate Java application) and install the Elasticsearch module in thirty bees.

We will be proxying Elasticsearch via nginx in this example. The module itself connects directly to the Elasticsearch server, while the frontend JavaScript widget connects via the nginx proxy. This ensures that you will get the fastest possible search. To make things easier we will be setting up this server via Docker and Plesk. If you would rather work from the command line, you might also be able to follow this documentation, but you will have to figure out the command line alternatives to what Plesk does.

First, start by searching for and launching an Elasticsearch Docker image in Plesk: docker

Plesk will make this instance run on a random port number. Remember this number, you will need it later.

Add a search subdomain of your choice via Plesk. Make sure it does not run in proxy mode (Proxy mode unticked). These are the additional nginx directives you will need to secure and expose Elasticsearch:

add_header Access-Control-Allow-Origin * always;
add_header Access-Control-Allow-Methods GET,HEAD,OPTIONS,POST,PUT always;
add_header Access-Control-Allow-Headers Access-Control-Allow-Headers,Origin,Accept,X-Requested-With,Content-Type,Access-Control-Request-Method,Access-Control-Request-Headers,Authorization always;

location ~* ^/thirtybees_\d+_\d+(/_search|/_analyze) {
  limit_except HEAD OPTIONS {
    auth_basic           "Protected Elasticsearch";
    auth_basic_user_file /var/www/vhosts/example.com/elasticread;
  }

  proxy_pass http://localhost:9200;
  proxy_http_version 1.1;
  proxy_set_header Connection "Keep-Alive";
  proxy_set_header Proxy-Connection "Keep-Alive";
}

location ~ / {
  limit_except HEAD OPTIONS {
    auth_basic           "Protected Elasticsearch";
    auth_basic_user_file /var/www/vhosts/example.com/elasticadmins;
  }

  proxy_pass http://localhost:9200;
  proxy_http_version 1.1;
  proxy_set_header Connection "Keep-Alive";
  proxy_set_header Proxy-Connection "Keep-Alive";
}

We have added two .htpasswd files in the directory /var/www/vhosts/example.com. On this page you can create an .htpasswd file. The first location directive contains the index name, followed by two integers. Those two numbers are, respectively, the shop ID and the language ID. Make sure the name corresponds with your index name, otherwise there will be no access from the frontend. Save these settings and go!
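To see which request paths end up in the read-only location, you can try the same regular expression outside nginx. The sketch below copies the pattern from the first location block; the helper name `hits_read_only_location` is made up for this example, and Python's regex engine is assumed to behave like nginx's for a pattern this simple.

```python
import re

# Same pattern as the first nginx location block.
# The "~*" modifier in nginx makes the match case-insensitive.
READ_ONLY_LOCATION = re.compile(
    r"^/thirtybees_\d+_\d+(/_search|/_analyze)", re.IGNORECASE
)


def hits_read_only_location(path: str) -> bool:
    """Return True when the path would be handled by the read-only location."""
    return READ_ONLY_LOCATION.search(path) is not None


for path in ("/thirtybees_1_1/_search", "/myshop_1_1/_search", "/thirtybees_1_1/_bulk"):
    print(path, hits_read_only_location(path))
```

Paths that do not match fall through to the second location block, which asks for the credentials in the elasticadmins file. If you use a different index prefix, the pattern in the nginx configuration has to change with it.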

CLI

We do not provide a tutorial for raw installation, since it is basically expert mode and experts generally know what they are doing. To give you a few guidelines, here are your possibilities:

Directly

You can grab a copy from the Elasticsearch download page and run that one directly. It will by default run on port 9200.

Docker

You can grab a Docker image directly; we recommend elasticsearch/elasticsearch. More instructions about using Elasticsearch with Docker can be found at: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html. You can control the ports of this instance and optionally place an nginx proxy in front.

After launching an Elasticsearch instance, feel free to continue with (a part of) the instructions from the Plesk section. It contains an nginx configuration that allows you to expose a read-only URL of the Elasticsearch instance directly, so there is no need for a proxy via your thirty bees instance. In practice this decreases the search delay by 50-200 ms per query.

Installing the module

Install the Elasticsearch module and configure the servers. By default the module picks one server located at http://localhost:9200. This is Elasticsearch's default port, and the connection is proxied by default, so it is ready to be used in the frontend right from the start if you just installed Elasticsearch with the default settings on the same system. This also works out of the box when you use the cloud version of thirty bees (Cloudways).

After installing the module some conflicting default modules might still be active. They also cause the Elasticsearch module to be displaced from the start. In order to fix this, disable both the blocksearch and blocklayered modules.

To make the search block show in the previous position, navigate to Modules and Services > Positions and search for the displayTop hook. Make sure the Elasticsearch module is at the top.

Search facets will be shown on the left and right column. By default thirty bees will add the new blocks at the bottom. Search for displayLeftColumn & displayRightColumn and reposition the module however you like.

Also make sure you have disabled the default search options on the page "Preferences > Search". They should look like this: search settings

This should be enough to make the module show properly and in the right spot with supported themes. If your theme is not supported by default, or does not take the Elasticsearch module into account, you might have to grab one or more of the templates (.tpl files) and adjust them to your liking.

Upgrading to beta 2 from beta 1

Note that there is no upgrade script from beta 1 to beta 2, and resetting the module or replacing its files will not work. The only way to upgrade is to reinstall the module. Even if you have previously uninstalled and/or removed beta 1, follow these steps:
- Uninstall beta 1
- Upload beta 2 via your back office (not FTP!)
- Install beta 2
- Uninstall beta 2
- Install it again (do not use the reset button)

Configuring the module

After installing the module you should be presented with the module's configuration page that has several options already configured for you by default. In this section we will navigate through every tab and show what you can configure.

Settings tab

On the settings tab you will find generic Elasticsearch settings that do not belong on the other tabs. You generally only touch these if you want to run multiple Elasticsearch instances, optimize them for performance or configure stop words.

Number of shards

The number of shards can be configured right from the module. It will be passed to Elasticsearch when creating the index. Shards are part of the basic concepts of Elasticsearch, and having an optimum value is vital to providing a fast search experience.

We recommend never setting this number below 2.

Number of replicas

Like the number of shards, this can be changed right from the module as well and will be applied when creating the new index. Optimize this number for the best experience. Due to the faceted search this module provides (and its internal post-filter), we recommend never setting this number below 2.
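For reference, this is roughly what these two numbers look like in the settings body that Elasticsearch accepts when an index is created. The sketch only builds that body; the function name is made up for this example and the request the module actually sends is not shown on this page. `number_of_shards` and `number_of_replicas` are the standard Elasticsearch index settings.

```python
import json


def index_settings(shards: int, replicas: int) -> dict:
    """Build the settings part of an Elasticsearch create-index request body."""
    if shards < 1:
        raise ValueError("An index needs at least one shard")
    if replicas < 0:
        raise ValueError("The number of replicas cannot be negative")
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
            }
        }
    }


# The values recommended on this page as a lower bound.
print(json.dumps(index_settings(shards=2, replicas=2), indent=2))
```

In Elasticsearch itself the number of shards is fixed once an index exists, which is one reason a changed value only takes effect on a newly created index.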

Index prefix

You can override the index prefix used. It is not necessary to assign a different one for every language or store you have. thirty bees will automatically attach those numbers, meaning that if your basic prefix is thirtybees, for a shop with ID = 2 and language with ID = 1 it will automatically index at thirtybees_2_1. This feature is useful if you run multiple instances of thirty bees on the same host that are not part of the same multistore instance.
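The naming rule can be written down as a one-liner, which is handy when you need to know every index name up front (for example to adjust the nginx location pattern). This is a sketch of the rule as described above, not the module's own code; the function names are hypothetical.

```python
def index_name(prefix: str, shop_id: int, lang_id: int) -> str:
    """Combine prefix, shop ID and language ID the way this page describes."""
    return f"{prefix}_{shop_id}_{lang_id}"


def all_index_names(prefix: str, shop_ids, lang_ids) -> list:
    """List the index names for every shop and language combination."""
    return [index_name(prefix, shop, lang) for shop in shop_ids for lang in lang_ids]


print(index_name("thirtybees", 2, 1))
print(all_index_names("thirtybees", [1, 2], [1, 3]))
```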

Stop words

For every language you can configure the stop words used by Elasticsearch. This is updated on every full reindex of Elasticsearch (click Erase index first). You can extend or replace the default libraries used by Elasticsearch. For example, the code _english_ refers to the default list of stop words for English.

Enable logging

Beta 2 remark: This option was supposed to provide more logging capabilities. As of beta 2 this feature does nothing and might get removed in a future release.

Connection

The connections tab allows you to configure the connection with the Elasticsearch server. This module also supports clusters (from both the frontend and backend). From both the backend and frontend it uses round-robin for load-balancing.
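Round-robin simply means that each new request goes to the next server in the list, wrapping around at the end. The following sketch illustrates the idea with placeholder URLs; it is not the module's implementation.

```python
from itertools import cycle

# Placeholder URLs, used only for this illustration.
servers = [
    "http://es1.example.com:9200",
    "http://es2.example.com:9200",
    "http://es3.example.com:9200",
]


def round_robin(server_urls):
    """Return an endless iterator that walks through the servers in order."""
    return cycle(server_urls)


picker = round_robin(servers)
# Four consecutive requests: the fourth one wraps around to the first server.
first_four = [next(picker) for _ in range(4)]
print(first_four)
```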

Ajax proxy

The proxy button proxies all the servers to your frontend via the module's endpoint (/module/elasticsearch/proxy by default), otherwise the frontend search is sent to your servers directly.

Servers

If you want to update the URLs, for example with the read-only URLs based on the .htpasswd username/password combo you configured, you can update them on the Connections tab. Example of a configuration: server config

The write and read settings tell the module which servers should be used for read operations and which should be used with write permissions. DO NOT MIX THE TWO: when a server/URL has write permission, DO NOT add the read permission!
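If you keep your server list somewhere outside the back office, a small check like the one below can catch a mixed entry before you enter it. The `Server` structure and the URLs are hypothetical and only mirror the read/write toggles described above.

```python
from typing import NamedTuple


class Server(NamedTuple):
    url: str
    read: bool
    write: bool


def mixed_permission_servers(servers):
    """Return the URLs that have both the read and the write flag set."""
    return [server.url for server in servers if server.read and server.write]


servers = [
    Server("https://reader:secret@search.example.com", read=True, write=False),
    Server("http://localhost:9200", read=False, write=True),
    Server("http://localhost:9201", read=True, write=True),  # should be split up
]
print(mixed_permission_servers(servers))
```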

Indexing

This section allows you to configure how the module indexes your products.

Note that there is a limitation on the features and attributes that can be indexed. If a feature or attribute, after running its name through Tools::rewrite (URL rewriting), generates the same code as an already existing property, feature or attribute, it will be ignored by thirty bees. Every code has to be unique because of the way this module was built: for example, the data structures used by this module and direct URLs to searches all require unique codes. If you want to avoid collisions, rename your feature or attribute so that it gets a unique code and can be used by the module.

Beta 2 note: This limitation is probably the number one cause why the module does not index your products. If you cannot index your products, start searching for duplicates! We will improve error reporting with subsequent updates to make this process easier.
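When hunting for duplicates it helps to list the names that collapse to the same code. The sketch below uses a rough stand-in for the URL rewriting step (lowercase, accents stripped, everything else turned into hyphens). That stand-in is an assumption and will not match thirty bees' own routine in every detail, but it is usually enough to spot the obvious clashes.

```python
import re
import unicodedata
from collections import defaultdict


def rough_code(name: str) -> str:
    """Rough stand-in for URL rewriting: lowercase ASCII words joined by hyphens."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def find_collisions(names):
    """Group names by generated code and keep only codes shared by several names."""
    by_code = defaultdict(list)
    for name in names:
        by_code[rough_code(name)].append(name)
    return {code: group for code, group in by_code.items() if len(group) > 1}


feature_and_attribute_names = ["Color", "color", "Colör", "Size", "Frame size"]
print(find_collisions(feature_and_attribute_names))
```

Here the three spellings of "Color" end up with the same code, so two of them would need to be renamed.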

The module comes with a few default settings already prepared for optimal results. In case you need an example, this is what has been configured for a miniature car store:

basic fields

On this tab you can select which fields should be used for search (YES / NO). Note that all these fields need to have the data type text if you are using fuzzy search; otherwise you can use both text and keyword. It will generally be hard to use this with other field types, so if Elasticsearch generates errors after changing this setting, double-check that your fields all support search.

To make the search as solid as possible, sometimes the module decides to silently ignore fields. For instance, if you have configured reference to become a long (number) and you enable fuzzy search, then the module knows that this will not work with Elasticsearch and will not request the Elasticsearch server to use this field for search.

By default search suggestions (the autocomplete dropdown beneath the search bar) are sorted by relevance. Elasticsearch has a smart scoring system that determines which results are more important. The raw query editor on this search page will allow you to even further optimize this, but do note that the weights are very important. Choose them wisely. More information on the scoring system of Elasticsearch can be found on this page.

An example for the miniature car store:
searchable fields

You can further tweak the raw search query. Do note that the placeholders ||QUERY|| and ||FIELDS|| will always have to be added, so if you want to adjust the query, find a way to place these essential variables first. The default query can be found in the file /modules/elasticsearch/data/defaultquery.json:

{
  "bool": {
    "must": [
      {
        "multi_match": {
          "query": ||QUERY||,
          "fields": ||FIELDS||,
          "type": "best_fields",
          "operator": "OR",
          "fuzziness": "auto"
        }
      }
    ]
  }
}

Be aware that this file does not contain valid JSON, but rather is a template for the module to use. The default one is suitable for most cases. The module shows what the placeholders will be replaced with:

  • ||QUERY|| will be replaced with a string containing the search query: "search"
  • ||FIELDS|| will be replaced with the selected fields and their weights in an array, i.e.: ["name^4", "description_short^2"]
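To make the substitution concrete, here is a sketch that fills the default template and checks that the result is valid JSON. It illustrates the placeholder idea only and is not the module's own code; the function name and the sample weights are assumptions.

```python
import json

# Copy of the default query template shown above.
TEMPLATE = """
{
  "bool": {
    "must": [
      {
        "multi_match": {
          "query": ||QUERY||,
          "fields": ||FIELDS||,
          "type": "best_fields",
          "operator": "OR",
          "fuzziness": "auto"
        }
      }
    ]
  }
}
"""


def render_query(template: str, search_text: str, weighted_fields: dict) -> dict:
    """Fill both placeholders and parse the result to prove it is valid JSON."""
    fields = [f"{name}^{weight}" for name, weight in weighted_fields.items()]
    # Fields go in first, so text typed by a visitor can never be mistaken
    # for a placeholder in a later replacement step.
    filled = template.replace("||FIELDS||", json.dumps(fields))
    # json.dumps adds the surrounding quotes and escapes quotes inside the text.
    filled = filled.replace("||QUERY||", json.dumps(search_text))
    return json.loads(filled)


query = render_query(TEMPLATE, 'red "sports" car', {"name": 4, "description_short": 2})
print(json.dumps(query, indent=2))
```

If you edit the raw query, running your template through a check like this is a quick way to find a missing comma or bracket before Elasticsearch rejects it.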

Filter

Since this module provides faceted search you can also configure the facets or filters used on this page. Depending on the data type this page presents you with several configuration options.

You can enable/disable a filter with the YES/NO toggle. By default all filters are shown in a scrollable facet, meaning a scrollbar will show when there are a lot of options to choose from. By default Elasticsearch returns an estimate of the number of products that belong to a filter, and not all filters are always shown. This is for performance reasons and is often the case if you have to show thousands of filters. If you have a facet type that can return thousands of filters, it is recommended to limit the number of results shown, given that a human brain cannot process that much information at once. Don't overwhelm your visitor!
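In a raw Elasticsearch request, this kind of limit corresponds to the `size` parameter of a `terms` aggregation, which caps how many values come back for one facet. The sketch below builds such a request fragment. The field name is a placeholder, and whether the module builds its facets exactly this way is an assumption.

```python
import json


def facet_aggregation(field: str, max_values: int) -> dict:
    """Build a terms aggregation that returns at most max_values facet values."""
    if max_values < 1:
        raise ValueError("max_values must be at least 1")
    return {"aggs": {field: {"terms": {"field": field, "size": max_values}}}}


# "manufacturer" is a placeholder field name for this example.
print(json.dumps(facet_aggregation("manufacturer", 20), indent=2))
```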

As of beta 2 there are several filter styles available:
- Checkbox: the most common one, a simple checkbox
- Slider: a range slider, which is especially useful for prices
- Color: shows a color checkbox, can only be used with color attributes

A filter can be either conjunctive (AND) or disjunctive (OR). If you want the visitor to select multiple values, you can use a disjunctive filter.
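The difference is easiest to see on a tiny data set. In the sketch below (plain Python over made-up products, not an Elasticsearch query) a conjunctive filter keeps products that have every selected value, while a disjunctive filter keeps products that have at least one of them.

```python
# Made-up products for illustration.
products = [
    {"name": "Model A", "colors": {"red", "black"}},
    {"name": "Model B", "colors": {"red"}},
    {"name": "Model C", "colors": {"blue"}},
]


def apply_filter(items, field, selected, mode):
    """Return the names of the items that pass an AND or OR filter."""
    wanted = set(selected)
    if mode == "AND":
        # Conjunctive: every selected value must be present.
        return [item["name"] for item in items if wanted <= item[field]]
    if mode == "OR":
        # Disjunctive: one matching value is enough.
        return [item["name"] for item in items if wanted & item[field]]
    raise ValueError("mode must be 'AND' or 'OR'")


print(apply_filter(products, "colors", ["red", "black"], "AND"))
print(apply_filter(products, "colors", ["red", "black"], "OR"))
```

With a conjunctive filter each extra selection narrows the result, which is why a disjunctive filter tends to suit values a visitor wants to combine, such as several colors.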

Every filter can be dragged and dropped on the Indexing, Search and Filter page. This changes the filter's position on the result columns on the front office when searching.

An example from the model car store:

filter example

Display

This section is used to configure what features are activated and how they are displayed on the front office.

Use a list as default product layout

By default, the first time a visitor uses the search, products are shown in a grid. If the visitor switches to a list, this preference is stored locally in the browser, and on every subsequent search/site visit the module will show the list instead. If you want to display a list to visitors right from the start, you can enable this option.

Replace native pages

This option replaces the native category, manufacturer and supplier pages, allowing you to show the blazing fast Elasticsearch results instead of the previous old-fashioned search options.

Search in subcategories on replaced pages

By default subcategories are shown but not searched in. If you want to show results from subcategories on the category pages, be sure to enable this option.

Autocomplete

The module can show search suggestions beneath the search bar. To show the 5 most popular options while searching, enable this option.

The option Instant search will directly replace the current page's content with instant search results. Every layout with a main column is supported.

Infinite scroll

If you would rather use infinite scroll instead of regular pagination, you can enable this option.

Price slider tax rules group

There is a limitation to which taxes can be calculated real-time for the current user. The module can only index prices for one tax rules group at a time. Therefore this option becomes important when you use a price slider. It will apply the selected tax rules group. Note that products which belong to a different tax rules group might get incorrectly shown in the search result. In order to avoid confusing your visitors it is recommended to disable price sliders if you have mixed tax rates in your store.
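A small calculation shows how a mismatch can arise. The rates and prices below are made up, and the sketch assumes the indexed price is the net price plus the tax of the selected tax rules group.

```python
from decimal import Decimal, ROUND_HALF_UP


def price_with_tax(net: Decimal, rate_percent: Decimal) -> Decimal:
    """Add tax to a net price and round to two decimals."""
    gross = net * (Decimal(100) + rate_percent) / Decimal(100)
    return gross.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


net_price = Decimal("100.00")
# Tax rules group selected for the slider: 21% (hypothetical).
indexed_price = price_with_tax(net_price, Decimal("21"))
# Tax that actually applies to this particular product: 9% (hypothetical).
real_price = price_with_tax(net_price, Decimal("9"))

slider_maximum = Decimal("115.00")
print(indexed_price, real_price)
print("found by slider:", indexed_price <= slider_maximum)
print("should be found:", real_price <= slider_maximum)
```

In this example a visitor who sets the slider to a maximum of 115.00 would not see the product, although its real price of 109.00 falls inside the selected range.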