Apache Solr is an open source search platform that enables fast search and filtering on data sets. This tutorial is a very basic introduction to Apache Solr and demonstrates its capabilities by running a simple Python Flask web application that enables the user to perform a search against a Solr back-end and return the result to the web page. It is intended to serve as a first stepping-stone introduction to Apache Solr that can lay the groundwork for future work and improvements.

Disclaimer

Per the usual pattern, do not attmept to use this functionality as-is in a production setting. There are many aspects (security, code formatting, edge condition testing, etc.) of this setup that are sub-optimal but are done in a rapid development fashion to enable focusing on the various pieces of functioanlity that are being learned.

Prerequisites

This tutorial primarily relies on the Docker engine and docker-compose. Before proceeding, ensure you have a Docker engine running and docker-compose installed on the device you wish to use for this tutorial.

Project Folder

First, let’s get the root project folder set up for our experiment:

# construct root project folder
$ mkdir solr-demo
$ cd solr-demo/

Flask Application

The search app will be a Python Flask web-based application with a simple search field that enables running a query against the Solr back-end. We will create the Docker image for this application first, which will then be used in the docker-compose.yaml file (specified later) to create and orchestrate the interaction between the web application and the Solr search platform. Let’s create the main project directory for flask and populate it with a requirements.txt which will specify the required libraries we need:

# create flask app folder
$ mkdir flask-app
$ cd flask-app/

# create the templates folder to be used
# later in this tutorial
$ mkdir templates

# specify libraries required for app
$ vim requirements.txt
# ensure contains the following:
#   flask
#   simplejson

Next, we’ll create the main Flask application functionality. Create a file named app.py containing the following code, which handles incoming requests and attempts to query the Solr back-end (if a query is submitted):

from flask import Flask, render_template, request
from urllib.request import urlopen
import simplejson

app = Flask(__name__)

BASE_PATH='http://solr:8983/solr/demo/select?wt=json&df=name&rows=250&q='

@app.route('/', methods=["GET","POST"])
def index():
    query = None
    numresults = None
    results = None

    # get the search term if entered, and attempt
    # to gather results to be displayed
    if request.method == "POST":
        query = request.form["searchTerm"]

        # return all results if no data was provided
        if query is None or query == "":
            query = "*:*"

        # query for information and return results
        connection = urlopen("{}{}".format(BASE_PATH, query))
        response = simplejson.load(connection)
        numresults = response['response']['numFound']
        results = response['response']['docs']

    return render_template('index.html', query=query, numresults=numresults, results=results)

if __name__ == '__main__':
    app.run(host='0.0.0.0')

As you’ll see in the above code, the main route handles both GET requests (simple request for the web page) as well as POST requests where it is expected that the request contains the query string that the user wishes to search the Solr search platform for. Once records are retrieved, the results are returned and rendered in the template index.html.

One other important thing to note is the construction of the BASE_PATH, which uses a hostname of solr. This is possible because the service name for Solr within the context of our Docker network will be automatically resolvable by the flask-app once deployed using docker-compose.

Let’s create the template file to be rendered as templates/index.html containing the following:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">

    <link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.5.0/css/bootstrap.min.css"
                           integrity="sha384-9aIt2nRpC12Uk9gS9baDl411NQApFmC26EwAOH8WgZl5MYYxFfc+NcPb1dKGj7Sk" crossorigin="anonymous">

    <title>Flask Solr Tutorial</title>
  </head>
  <body>
    <div class="container">
      <h1>Flask Solr Tutorial!</h1>

      <form class="form-inline" action="/" method="post">
        <div class="form-group mx-sm-3 mb-2">
          <input type="text" class="form-control" name="searchTerm" value="" placeholder="Enter search term(s)">
        </div>
        <button type="submit" class="btn btn-primary mb-2">Search</button>
      </form>

      <div class="numresults" style="font-weight: bold;">
        {% if numresults is not none %}
        Number of Results:
        <span style="margin-left: 12px;">{{ numresults }}</span>
        {% endif %}
      </div>

      {% if results and results|length > 0 %}
        <table class="table">
          <thead>
            <tr>
              <th>ID</th>
              <th>Name</th>
              <th>In Stock?</th>
              <th>Price</th>
            </tr>
          </thead>
          <tbody>
            {% for document in results %}
              <tr>
                <td>{{ document['id'] }}</td>
                <td>{% if document['name'] %}{{ document['name'][0] }}{% endif %}</td>
                <td>{% if document['inStock']%}{{ document['inStock'][0] }}{% endif %}</td>
                <td>{% if document['price']%}${{ document['price'][0] }}{% endif %}</td>
              </tr>
            {% endfor %}
          </tbody>
        </table>
      {% endif %}
    </div>
  </body>
</html>

You’ll notice in the above that certain specific fields are pulled out of the result set. These fields are present in the example core that will be instantiated and populated as part of the section dealing with linking the Flask app and the Solr search platform later in this tutorial.

Create Docker Image

Now that we have a working Flask application (you can test it locally by running python app.py and attempting to visit the page, but the search functionality will not work because we do not yet have a Solr search platform stood up and configured), we will package the application as a Docker image. Create a Dockerfile with the following contents in the Flask application directory flask-app:

FROM python:3-alpine

WORKDIR /app

COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

CMD [ "python", "./app.py" ]

Next, create the Docker image via the following command:

$ docker build -t flask-app:latest .

You can inspect that the image was created by running the following command, which should print out the image details:

$ docker image ls flask-app

Linking Solr and Flask Application

We’ll now configure a docker-compose.yaml file that will enable us to launch and interact with our application, which will communicate with a back-end Solr search platform running in a container and seeded with demo data. Create a file in the root project directory solr-demo/docker-compose.yaml with the following contents:

version: '3.7'

services:
  flask-app:
    image: flask-app:latest
    container_name: flask-app
    ports:
      - 5000:5000
    networks:
      - solr
    depends_on:
      - solr

  solr:
    image: solr:8.5
    container_name: solr
    ports:
      - "8983:8983"
    networks:
      - solr
    command:
      - solr-demo

networks:
  solr:

The above file creates a network (for connectivity between the applications), launches the Solr search platform, and launches the Flask application. Let’s go ahead and create this ecosystem:

$ docker-compose up

It may take a minute or two for Solr to come up cleanly - once it does, you can visit the flask-app web page by visiting http://localhost:5000 in a browser. Perform a search (something like *:*, which should return all results) and you should see results populated on the page when you click “Search”, indicating that your application is working correctly with the Solr search platform on the back end!

As a note, you can also visit http://localhost:8983 to see the Apache Solr web interface directly, including configuration information and other information about the demo core that was automatically created as part of the bootstrap process.

Conclusion

There is still a lot of content and capabilities to discuss around Apache Solr as a search platform, as well as many improvements in the code and flow demonstrated in this post. Feel free to use this test layout to expand your knowledge and learning of the Apache Solr platform, including experimenting with Solr Cloud and clustering using ZooKeeper for high availability and other beneficial features.

Credit

The above tutorial was pieced together with some information from the following sites/resources: