Solr Fundamentals

Whilst this book is focussed on the Panl server and its configuration, a high-level understanding of the Solr server configuration is required to understand how Solr influences the Panl server and its configuration options.  (The Understanding Solr Configuration And Panl Integration chapter has additional details.)

IMPORTANT: If you do not configure the Solr fields correctly, then, irrespective of the Panl configuration, you will not be able to surface the correct functionality.


Solr Configuration File Changes

With each Solr server release, the solrconfig.xml and managed-schema.xml configuration files may change, and the files that you deploy need to match the version of the Solr server that you use.  Beyond the version values described below, there may be other changes within the files which should be inspected and merged into your configuration if necessary.

Tip: Keep the configuration files in a source code control system (e.g. Git) so that you can diff a new version of the files against your own and merge the changes.  Always test any changes.

solrconfig.xml File

The Lucene match version XML element in this file changes frequently as the underlying Lucene search indexer is updated.  Look for the XML element below and confirm that its value is correct for your Solr server.

  <luceneMatchVersion>9.11</luceneMatchVersion>

managed-schema.xml File

This schema file does not change as frequently as the Solr config file above; however, the version attribute does get updated and should be checked.

<schema name="mechanical-pencils" version="1.7">

IMPORTANT: Whilst Solr version 9.8.0 uses a schema version of 1.7, the examples in this book use a schema version of 1.6 for maximum backwards compatibility.

Summary

Below is a table that summarises the various configuration settings for each of the Solr versions.

Solr Version     solrconfig.xml            managed-schema.xml
                 <luceneMatchVersion />    <schema /> version attribute

Solr Version 9

9.8.*, 9.7.*     9.11                      version="1.7" [17]
9.6.*            9.10                      version="1.6"
9.5.*            9.9                       version="1.6"
9.4.*            9.8                       version="1.6"
9.3.0            9.7                       version="1.6"

Solr Version 8 (Last updated version at time of writing this book)

8.11.4           8.11.4                    version="1.6"

Solr Version 7 (Last updated version at time of writing this book)

7.7.3            7.7.3                     version="1.6"

IMPORTANT: The schema version and luceneMatchVersion values differ between Solr 9.x.x releases, so you may need to edit the solrconfig.xml and managed-schema.xml configuration files for your specific version.
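As an illustration, taking the Solr 9.6.* values from the summary table, the two settings would look something like the fragments below.  This is only a sketch: the surrounding elements are abbreviated, and the mechanical-pencils schema name is reused from the earlier example.

```xml
<!-- solrconfig.xml (abbreviated), assuming a Solr 9.6.* server -->
<config>
  <luceneMatchVersion>9.10</luceneMatchVersion>
  <!-- remainder of the configuration omitted -->
</config>
```

```xml
<!-- managed-schema.xml (abbreviated), assuming a Solr 9.6.* server -->
<schema name="mechanical-pencils" version="1.6">
  <!-- field and fieldType definitions omitted -->
</schema>
```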

This book uses a managed schema version of 1.6, NOT 1.7, even though Solr version 9.8.0 uses version 1.7 of the schema.

The solrconfig.xml and managed-schema.xml configuration files are generally forward compatible, but not always backwards compatible, i.e. files from a previous version will probably work on newer versions of Solr (especially where the major version does not change), whereas files from a newer version may not work on older versions of Solr.

Fields and Field Types

For any Solr field that is defined through the <field /> XML element, the following attributes need to be understood.

  • name - this is the name of the Solr field
  • type - this will determine how the data is stored and whether it is analysed (which means that a keyword search may be performed on it)
  • indexed - a boolean value to configure whether this field is indexed and able to be faceted on
  • stored - a boolean value to configure whether this field is stored and able to be retrieved verbatim
  • multiValued - a boolean value to configure whether this field accepts and holds more than one value
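
The attributes above might be combined as in the following sketch.  The field names (brand, description, colours) are hypothetical and chosen purely for illustration, and the sketch assumes that the string and text_general field types are defined in the schema, as they are in the default Solr schema files.

```xml
<!-- Hypothetical field definitions, for illustration only -->

<!-- Single-valued, non-analysed field: indexed (can be faceted on)
     and stored (can be retrieved verbatim) -->
<field name="brand" type="string" indexed="true" stored="true" multiValued="false" />

<!-- Analysed field type, so a keyword search may be performed on it -->
<field name="description" type="text_general" indexed="true" stored="true" multiValued="false" />

<!-- Field that accepts and holds more than one value per document -->
<field name="colours" type="string" indexed="true" stored="true" multiValued="true" />
```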

Determining If A Solr Field Type Is Analysed

Within the managed schema, search for the <fieldType /> XML element with a name attribute that matches the type attribute of the <field /> XML element.  If the <fieldType /> XML element has <analyzer /> child elements, then it is analysed.

The diagram below shows the relationship between the type attribute of the <field /> XML element and the name attribute of the <fieldType /> XML element, including the <analyzer /> XML child elements.



Image: The lookup of a fieldType to determine whether it is analysed.
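
The same lookup can be sketched as a schema fragment.  The field name is hypothetical and the text_general definition is deliberately simplified: the version shipped with Solr contains additional filters, and newer Solr versions may reference tokenizers and filters by a name attribute rather than the class attribute used here.

```xml
<!-- The type attribute of the field ... -->
<field name="description" type="text_general" indexed="true" stored="true" />

<!-- ... matches the name attribute of the fieldType -->
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <!-- analyzer child elements are present, so this field type is analysed -->
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>

<!-- By contrast: no analyzer child elements, so this field type is not analysed -->
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
```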


There are different field types with different analysers for both indexing and querying.  Choosing the analyser that best suits your requirements is left up to you.  In this book, the only analysed field type is text_general.

Examples of analysed fields that are within the included managed schema files are:

  • text_ws
  • managed_en
  • text_general (this is the only field type that this book uses)
  • text_gen_sort
  • text_en
  • text_en_splitting
  • text_en_splitting_tight
  • text_general_rev
  • alphaOnlySort
  • phonetic
  • payloads
  • lowercase
  • descendent_path
  • ancestor_path
  • preanalyzed

Overview of Indexed / Stored / Analysed Fields

When designing the managed schema you need to account for ALL uses of the Solr fields across CaFUPs - remember that this allows you to present the Solr field in different ways (i.e. it will be a field in one CaFUP and a facet in another).

These are the things to remember:

If your field will be used in a CaFUP to be...

  • Displayed in the returned documents - then it MUST BE stored (i.e. stored="true")
  • Used as a Facet - then it MUST BE indexed (i.e. indexed="true")
  • Used in the default search - then it MUST BE analysed (i.e. a Solr field type that has an analyzer child element)
  • Used as a Specific Solr Search Field - then it MUST BE analysed and stored (i.e. stored="true", and a Solr field type that has an analyzer child element)
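
As a sketch of how these rules translate into the managed schema, the hypothetical fields below each satisfy one rule.  The field names are invented for illustration, and the sketch assumes a schema version of 1.6, where the indexed and stored attributes behave as written.

```xml
<!-- Hypothetical fields, one per rule above -->

<!-- Displayed in the returned documents: stored -->
<field name="name" type="string" indexed="false" stored="true" />

<!-- Used as a facet: indexed -->
<field name="brand" type="string" indexed="true" stored="false" />

<!-- Used in the default search: an analysed field type -->
<field name="text" type="text_general" indexed="true" stored="false" multiValued="true" />

<!-- Used as a specific Solr search field: analysed and stored -->
<field name="description" type="text_general" indexed="true" stored="true" />
```

A field that is displayed in one CaFUP and used as a facet in another would need both indexed="true" and stored="true".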

IMPORTANT: From Solr version 9.7.* upwards AND when using a schema version of 1.7 or greater, document values are set to true (i.e. docValues="true") by default on a number of field types.  This affects the rules above, because any field that has this applied automatically is AUTOMATICALLY INDEXED AND STORED, overriding the attributes in the field definition.

THIS WILL DEFINITELY IMPACT THE WAY THE SOLR FIELDS ARE PRESENTED BY THE PANL SERVER

See the section on The Impact Of docValues (Schema Version 1.7+) for more detailed information and how to edit a 1.7 version schema to revert the changes.
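
As a rough sketch of the kind of change involved, docValues is a standard Solr attribute that can be set explicitly on a <field /> (or <fieldType />) element instead of relying on the schema version default.  The field names below are hypothetical, the pint field type is assumed to be defined as in the default Solr schema files, and this is not necessarily the approach described in the referenced section.

```xml
<!-- Hypothetical field in a version 1.7 schema: docValues switched off explicitly,
     so that the indexed and stored attributes behave as written -->
<field name="brand" type="string" indexed="true" stored="false" docValues="false" />

<!-- Alternative sketch: keep docValues, but do not return them
     as though the field were stored -->
<field name="weight" type="pint" indexed="true" stored="false" docValues="true" useDocValuesAsStored="false" />
```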

Highlighting

For the moment, you can ignore the highlighting configuration for Solr, as it is only useful for a limited range of use cases.  There is a more in-depth section on configuring highlighting should you be interested or have specific requirements.  Rest assured that the highlighting in this book will just work as expected (provided the Solr fields are configured correctly).

~ ~ ~ * ~ ~ ~