Appendices

Definitions

#

42: The meaning of life

404: An HTTP status code returned when a resource could not be located by the server.

500: An HTTP status code returned by the server when an error occurred that stopped the processing of a request.

A

Apache: Refers to the Apache Software Foundation, the producers of an HTTP server and a fine collection of software and libraries, including the Solr search server.

B

Boolean facet: A Solr field which can only be one of two values, 'true' or 'false'.

Boolean value replacement: A Solr field with a 'true' or 'false' value can have that value replaced with more meaningful text, which will be converted by the Panl server.

C

CaFUPs: An acronym for Collections and FieldSet URL Paths.  This is the URL path that the Panl server will handle to return results from the Solr collection with the configured field names.

Collection: A collection is either a single logical search index that uses a single Solr configuration file (solrconfig.xml), or a collection of properties and field sets for the Panl server. (See also: Solr collection, and Panl collection).

D

DATE range facet: A Solr facet of type solr.DatePointField that can be used to filter results from the time period NOW to some arbitrary number of hours, days, months, or years, or from some arbitrary number of hours, days, months, or years to NOW.

Document: A single result returned from the Solr search server.

E

F

FieldSet(s): A list of Solr field names that will be returned by the Panl server in the result documents.

Facet: Also referred to as a regular facet, a defined field in the Solr and Panl configuration that allows a user to filter the results by selecting a particular value for this field that the document must have.  The returned facets have a Solr field name, a Panl field name, a value, and a count. (See also: Regular facets, OR facets, RANGE facets, DATE range facets, and Boolean facets).

Facet count: The number of documents which have a particular facet value assigned to them.

Facet index: Solr nomenclature for the value of a facet.

Facet value: The value of the Solr facet field as indexed by the Solr search server.

G

H

Hierarchical facet: A facet that will only appear if another facet or parameter has already been selected.  This is defined by the panl.when.<lpse_code> property.

Highlighting: When configured, highlighting will return the matching text surrounded by markup (e.g. HTML) so that it may be rendered in a different fashion to bring attention to the text.

I

Infix: Text that is placed between the range values. Note: this is only available for RANGE facets.

J

JSON: JavaScript Object Notation, an open standard file and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute-value pairs.

K

L

LPSE code: The Last Path Segment Encoded string that the Panl server uses to parse the preceding URL path.

M

Mechanical Pencil: (or a clutch pencil) is a pencil with a replaceable and mechanically extendable graphite core (or "lead"). The lead is not attached to the outer casing, and can be  mechanically extended as its point is worn away from use.

N

Number per page parameter: Defines the number of results to return per page from the Solr server.  This may be set to any integer value greater than 0 (zero).

O

OR facet: A configured facet that will allow returning multiple values from a facet that normally would allow the user to select only one.

P

Page parameter: A parameter which defines the page number of results that the user will be shown.

Panl collection: A collection of properties and field sets that control how the facets are configured and how the search results are returned.

Panl collection URL: Unlike a Solr collection of documents, a Panl collection is a collection of field sets and configuration.  There can be multiple field sets per configuration file, and multiple Collection URLs can be connected to a Solr search collection.  The Panl collection URL is made up of the selected Panl collection and selected FieldSet.

Panl field: A one-to-one mapping from a Solr field to a Panl field, which is referenced throughout the configuration.

Panl generator: A utility function within the Panl server that will generate a starting panl.properties file and a panl_collection_url.panl.properties file from a managed-schema.xml Solr collection configuration file.

Panl server: The HTTP server that acts as the proxy between LPSE encoded URL parts and the underlying Solr server.

Panl release package: A binary release as a .zip or .tgz file that can be downloaded from the GitHub releases page for the Synapticloop Panl project.

Parameters: Parameters are specialised forms of LPSE codes that do not directly filter the results, but alter the way in which the results are returned.

Prefix: Text that is prepended to a facet value for the URL path, and optionally for the display of the value.

Properties: A text file format where each line defines a parameter which is stored as a pair of strings, one storing the name of the parameter (called the key), and the other storing the value.

Q

q: The query URL parameter that the Solr server responds to when searching text within the collection.

q.op: The query operator URL parameter that the Solr server uses as the default operator for a query: whether any of the search terms may match (OR), or all of them must match (AND).

Query term: Text that is either a word or a phrase that is used to search across the Solr collection index.  Unlike facets, this is not limited to the values of the Solr facets.

Query parameter: The parameter that is passed through as a URL parameter from a text input field by an HTML form.

R

RANGE facet: A specialised form of facet which allows a search to be performed on a range of values.  If the Solr document value matches between the minimum and maximum range values, then it will be included with the results.

Replacement value: A value that replaces another value, used in BOOLEAN facets to replace a "true" or "false" value with another more human-readable value, and in RANGE facets for minimum and maximum values.

Regular facet: The default type of facet that allows filtering on a particular value.

S

SEO: An acronym for Search Engine Optimisation which is a set of practices designed to improve the appearance, positioning, and usefulness of content in the search results for search engines.

Solr collection: A collection of documents that Solr has indexed and is searched upon to return results. (See also: Panl collection)

Solr field: The name of a field that is defined within the managed-schema.xml file, which may be indexed, stored, and/or analysed.

Solr server: The Apache Solr search server, which indexes the documents and responds to search queries.

Solr schema: The XML file that defines the fields, their types, what

Sort order: The field, or fields, of a document by which the results will be ordered, either ascending or descending.

Suffix: Text that is appended to a facet value for the URL path, and optionally for the display of the value.  This also applies to a RANGE facet when appending the text to the end of a range query.

Synapticloop: The people behind the Panl server and generator.

T

.tar: A bundle of files placed together into the Tape ARchive format.

Token: Panl parses and decodes the incoming request URL path and turns each of the path parts into a token. If there was an error in either the parsing or decoding, then this token will be marked as invalid and will not be passed through to the Solr server.

U

URI: Uniform Resource Identifier which is a unique sequence of characters that identifies an abstract or physical resource.

URL: Uniform Resource Locator, a type of URI which specifies where a resource (such as a webpage) is located and how it can be retrieved.

URL path part: A part of the path of a URL.  In the examples in the book, the path part is taken as everything after the hostname.  A URL is of the form:

scheme ":" ["//" authority] path ["?" query] ["#" fragment]

V

W

Wildcard: Designates that a value that is used will match all values - used by the Panl server within a range query to indicate that it should be any number either below or above the selected minimum/maximum value.

X

x: x marks the spot.

XML: eXtensible Markup Language, which is the format that Solr uses for the managed schema and configuration files.

Y

y: Why not?

Z

Zip: A file compression format allowing multiple files and directories to be easily packaged into a single file.

Command Line Options

The Panl release package has two modes:

  1. Panl server - serving up the search results, and
  2. Panl generator - to generate configuration files for an existing collection.

For both modes, the usage describes the invocation through the Java JRE rather than the supplied executable files (i.e. bin/panl and bin\panl.bat); however, the command line options are the same.

IMPORTANT: Throughout this section, the filesystem paths are described using the *NIX nomenclature of a forward slash '/' between the directories, rather than the backslash of Microsoft Windows systems '\'.  So please take care when copying the commands.

Panl Server

The Panl server may be started by

bin/panl server

Which will start the server with default values:

  • Looking for a panl.properties file in the directory in which the command was executed.
  • Binding to the default port of 8181.

To set the panl.properties file that will be referenced, use the -properties command line option.

To set the port that the server will bind to, use the -port command line option.

Panl Generator

The Panl generator can be invoked by

bin/panl generate -schema path/to/solr/managed-schema.xml

You MUST, at a minimum, pass through the path to the Solr managed schema file with the -schema command line option.

The default output directory is the directory where this command was invoked, and the default filename is panl.properties.  Should you wish to change this, use the -properties command line option with the filename.  If the option points to a directory, the default filename of panl.properties will be used.  Note: This is also the directory that the <panl_collection_url>.panl.properties files will be written to.

The Panl generator will fail if the files already exist; however, you may pass through the command line option -overwrite with a value of true to overwrite the files (the default is false).
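As a rough illustration of how the options for the two modes fit together, the following minimal sketch assembles the command lines in Python.  The function names, the port 8282, and the way the script is called are assumptions made for the example - only the bin/panl commands and the -properties, -port, -overwrite, and -schema options come from this section and the usage text.

```python
def panl_server_command(properties=None, port=None):
    """Assemble the 'server' invocation; omitted options use the defaults."""
    command = ["bin/panl", "server"]
    if properties is not None:
        command += ["-properties", properties]
    if port is not None:
        command += ["-port", str(port)]
    return command


def panl_generate_command(schemas, properties=None, overwrite=False):
    """Assemble the 'generate' invocation; at least one schema is mandatory."""
    if not schemas:
        raise ValueError("the -schema option is mandatory")
    command = ["bin/panl", "generate"]
    if properties is not None:
        command += ["-properties", properties]
    if overwrite:
        command += ["-overwrite", "true"]
    # Multiple schema files are passed as comma separated values.
    command += ["-schema", ",".join(schemas)]
    return command


# 8282 is a hypothetical port, chosen only to differ from the default of 8181.
print(" ".join(panl_server_command(port=8282)))
print(" ".join(panl_generate_command(["path/to/solr/managed-schema.xml"], overwrite=True)))
```

The lists returned are in the form accepted by Python's subprocess.run, should you wish to script the start-up, although running the commands by hand works just as well.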

Usage Text

You can see the complete (and possibly updated) usage instructions on the GitHub repository:

https://github.com/synapticloop/panl/blob/main/src/main/resources/usage.txt

                             __
         .-----.---.-.-----.|  |
         |  _  |  _  |     ||  |
         |   __|___._|__|__||__|
         |__|     ... .-..

                ~ ~ ~ * ~ ~ ~

Usage:
  java -jar panl.jar \
    [command] \
    [-properties properties_file_location] \
    [-overwrite true_or_false] \
    [-config solr_config_location] \
    [-port port_number]

Where:
  [command] is one of:
       server - start the server, or
     generate - generate an example panl.properties file from an existing Solr
                managed-schema.xml file

  To start the Panl server
    java -jar panl.jar \
      server \
      [-properties properties_file_location] \
      [-port port_number]

  To generate the panl.properties file for each collection:
    java -jar panl.jar \
      generate \
      [-properties properties_file_location] \
      [-overwrite true_or_false] \
      -schema solr_schema_location

If you choose the 'server' command, the following command line options are
available:

  [-properties properties_file_location] (optional - default panl.properties)
    the properties file to load, if this property is not included, the default
    panl.properties file will be used which __MUST__ reside in the same
    directory as the server start command

  [-port port_number] (optional) the port number to start the server on.  The
    default port number is 8181

If you choose the 'generate' command, the following command line options are
available:

  [-properties properties_file_location] (optional - default panl.properties)
    the base properties file to write the generated configuration out to. If
    this property is not included, the default panl.properties filename will
    be used with each collection file that is generated named:
      <panl_collection_url>.panl.properties

    NOTE: If the files exist, the generation will TERMINATE, you will need to
    remove those files before the generation - unless you have the
    -overwrite true command line option present

  [-overwrite true_or_false] (optional - default false)

  -schema solr_schema_location(s) (mandatory) the managed-schema.xml
    configuration file(s) to read and generate the panl.properties file from.
    NOTE: For multiple files, use comma separated values

Sample .properties Files

The Sample panl.properties Files

There are multiple sample properties files that are included in the download package and can also be viewed online:

Mechanical pencils

The default collection for running the examples through the book.

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/mechanical-pencils/panl.properties 

All

This file will start the Panl server with all included examples; however, for it to display results correctly, all of the datasets must be indexed.

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/all/panl.properties

Book store

Just the book store demo dataset with hierarchical facets and facet ordering.

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/book-store/panl.properties

Mechanical pencils OR

The mechanical pencils collection with the manufacturer as an OR facet.

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/mechanical-pencils-or/panl.properties

Simple date

The simple date collection with DATE range facets.

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/simple-date/panl.properties

The contents of these files are unlikely to change between publishing this book and updates to the codebase.  To see the most up-to-date comments and instructions for usage, the template file that the Panl generator utility uses as a base can be found here:

https://github.com/synapticloop/panl/blob/main/src/main/resources/panl.properties.template

The Sample <panl_collection_url>.panl.properties Files

The sample file for the mechanical pencils example used in this book can be viewed online:

Mechanical pencils

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/mechanical-pencils/mechanical-pencils.panl.properties 

Additional properties files are also included in the download package:

Book store (with Hierarchical facets and facet ordering)

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/book-store/book-store.panl.properties 

Mechanical pencils OR faceting

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/mechanical-pencils-or/mechanical-pencils-or.panl.properties 

Simple date (with DATE range facets)

https://github.com/synapticloop/panl/blob/main/src/dist/sample/panl/simple-date/simple-date.panl.properties 

The contents of these files are unlikely to change between publishing this book and updates to the codebase.  To see the most up-to-date comments and instructions for usage, the template file that the Panl generator utility uses as a base can be found here:

https://github.com/synapticloop/panl/blob/main/src/main/resources/panl_collection_url.panl.properties.template


Solr Version 8 & 7 Integration Notes

SolrJ

The SolrJ connectors have changed between versions.  The available SolrJ clients for version 8 are:

  • HttpSolrClient
  • Http2SolrClient
  • LBHttpSolrClient
  • LBHttp2SolrClient
  • CloudSolrClient
  • CloudHttp2SolrClient

And for version 7:

  • HttpSolrClient
  • LBHttpSolrClient
  • CloudSolrClient

Solr Configuration files

In Solr 7 and 8, the managed-schema.xml file is named simply managed-schema.

The Solr configuration file (solrconfig.xml) also changes between versions.

JSON Response Object

The JSON response object has also changed.

IMPORTANT: The returned Solr JSON object has changed, however the returned Panl response JSON object has not.  This will only ever affect your integration when dealing with the Solr JSON.  The Panl Results Viewer web app has taken this into account to support all versions.


In Solr version 7 and 8:

{
  "facet_counts": { ... },
  "response": [ ... ],
  "responseHeader": { ... }
}

Note: The response key is a JSON array of results.

Whilst in Solr 9:

{
  "facet_counts": { ... },
  "response": {
    "docs": [ ... ],
    "numFound": 55,
    "start": 10,
    "maxScore": 1,
    "numFoundExact": true
  },
  "responseHeader": { ... }
}

Note: The response key is a JSON object of results, with additional details.
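If your integration does read the Solr JSON directly, one way to cope with both shapes is to check the type of the response key.  The following is a minimal sketch, assuming only the two shapes shown above; the function name and the sample documents are made up for the example, and this is not how the Panl Results Viewer itself is implemented.

```python
def extract_docs(solr_json):
    """Return the list of result documents from either response shape."""
    response = solr_json["response"]
    if isinstance(response, list):
        # Solr 7 and 8 shape as shown above: the results are the array itself.
        return response
    # Solr 9 shape: the results are held under the "docs" key.
    return response["docs"]


# Hypothetical sample data, cut down to the keys that matter here.
solr_7_or_8 = {"facet_counts": {}, "response": [{"id": "1"}], "responseHeader": {}}
solr_9 = {
    "facet_counts": {},
    "response": {"docs": [{"id": "1"}], "numFound": 1, "start": 0, "numFoundExact": True},
    "responseHeader": {},
}

print(extract_docs(solr_7_or_8) == extract_docs(solr_9))
```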

Setting up a Solr 7 or 8 server

The main difference between setting up Solr 7 or 8 and setting up Solr 9 is that, for versions 7 and 8, the configuration MUST be uploaded to the ZooKeeper instance first, before the collection is created.

Additionally, there is no bin\post command for Windows machines so this is done through a Java command.

Windows Commands

IMPORTANT: Each of the commands - either Windows or *NIX - must be run on a single line, so watch out for line continuations (marked with ↩).

  1. Create an example cloud instance

This requires no interaction and will use the default setup, with two replicas and two shards under the 'example' cloud node.

Command(s)

cd SOLR_INSTALL_DIRECTORY

bin\solr start -e cloud -noprompt

  2. Create the configuration for the mechanical pencils

This will create and set up the mechanical pencil schema so that a collection can be created and the data can be indexed.

Command(s)

cd SOLR_INSTALL_DIRECTORY

bin\solr zk upconfig -d ↩
PANL_INSTALL_DIRECTORY\sample\solr\mechanical-pencils -n mechanical-pencils ↩
-z localhost:9983

  3. Create the mechanical pencils collection

This will create and set up the mechanical pencil collection and schema so that the data can be indexed.

Command(s)

cd SOLR_INSTALL_DIRECTORY

bin\solr create -c mechanical-pencils -n mechanical-pencils ↩
-s 2 -rf 2

  4. Index the mechanical pencils data

This will index the included sample mechanical pencil data into the Solr instance in the mechanical-pencils collection.

Command(s)

cd SOLR_INSTALL_DIRECTORY

java -Dc=mechanical-pencils -Dtype=application/json -jar example\exampledocs\post.jar ↩
PANL_INSTALL_DIRECTORY\sample\data\mechanical-pencils.json

  5. Start the Panl Server

This will start the server, ready to accept requests.

Command(s)

cd PANL_INSTALL_DIRECTORY

bin\panl.bat -properties ↩
PANL_INSTALL_DIRECTORY\sample\panl\mechanical-pencils\panl.properties

  6. Start searching and faceting

Open the link http://localhost:8181/panl-results-viewer/ in your favourite browser, choose a collection/FieldSet and search, facet, sort, paginate and view the results.

For the simple-date dataset, the commands are almost identical.  Assuming that the Solr cloud is already set up:

Command(s)

cd SOLR_INSTALL_DIRECTORY

bin\solr zk upconfig -d ↩
PANL_INSTALL_DIRECTORY\sample\solr\simple-date -n simple-date ↩
-z localhost:9983

bin\solr create -c simple-date -n simple-date -s 2 -rf 2

java -Dc=simple-date -Dtype=application/json -jar example\exampledocs\post.jar ↩
PANL_INSTALL_DIRECTORY\sample\data\simple-date.json

*NIX Commands

The *NIX commands are as per the Windows section above, with the file path delimiter changed from '\' to '/'.

IMPORTANT: Each of the commands - either Windows or *NIX - must be run on a single line, so watch out for line continuations (marked with ↩).

  1. Create an example cloud instance

This requires no interaction and will use the default setup, with two replicas and two shards under the 'example' cloud node.

Command(s)

cd SOLR_INSTALL_DIRECTORY

bin/solr start -e cloud -noprompt

  2. Create the configuration for the mechanical pencils

This will create and set up the mechanical pencil schema so that a collection can be created and the data can be indexed.

Command(s)

cd SOLR_INSTALL_DIRECTORY

bin/solr zk upconfig -d ↩
PANL_INSTALL_DIRECTORY/sample/solr/mechanical-pencils -n mechanical-pencils ↩
-z localhost:9983

  3. Create the mechanical pencils collection

This will create and set up the mechanical pencil collection and schema so that the data can be indexed.

Command(s)

cd SOLR_INSTALL_DIRECTORY

bin/solr create -c mechanical-pencils -n mechanical-pencils ↩
-s 2 -rf 2

  4. Index the mechanical pencils data

This will index the included sample mechanical pencil data into the Solr instance in the mechanical-pencils collection.

Command(s)

cd SOLR_INSTALL_DIRECTORY

java -Dc=mechanical-pencils -Dtype=application/json -jar example/exampledocs/post.jar ↩
PANL_INSTALL_DIRECTORY/sample/data/mechanical-pencils.json

  5. Start the Panl Server

This will start the server, ready to accept requests.

Command(s)

cd PANL_INSTALL_DIRECTORY

bin/panl -properties ↩
PANL_INSTALL_DIRECTORY/sample/panl/mechanical-pencils/panl.properties

  6. Start searching and faceting

Open the link http://localhost:8181/panl-results-viewer/ in your favourite browser, choose a collection/FieldSet and search, facet, sort, paginate and view the results.

For the simple-date dataset, the commands are almost identical.  Assuming that the Solr cloud is already set up:

Command(s)

cd SOLR_INSTALL_DIRECTORY

bin/solr zk upconfig -d PANL_INSTALL_DIRECTORY/sample/solr/simple-date -n simple-date ↩
-z localhost:9983

bin/solr create -c simple-date -n simple-date -s 2 -rf 2

java -Dc=simple-date -Dtype=application/json -jar example/exampledocs/post.jar ↩
PANL_INSTALL_DIRECTORY/sample/data/simple-date.json
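Since the commands for the two datasets differ only in the dataset name, they can be generated rather than retyped.  This is a minimal sketch for *NIX paths; the function name is an assumption for the example, whilst the commands, options, and the localhost:9983 ZooKeeper address are taken from the steps above.  It only prints the commands, it does not run them.

```python
def solr_setup_commands(dataset, panl_dir="PANL_INSTALL_DIRECTORY", zk_host="localhost:9983"):
    """Return the upconfig, create, and index commands for one sample dataset."""
    return [
        # Upload the configuration to ZooKeeper first (required for Solr 7 and 8).
        ["bin/solr", "zk", "upconfig", "-d", f"{panl_dir}/sample/solr/{dataset}",
         "-n", dataset, "-z", zk_host],
        # Create the collection with two shards and a replication factor of two.
        ["bin/solr", "create", "-c", dataset, "-n", dataset, "-s", "2", "-rf", "2"],
        # Index the sample data with the post.jar tool.
        ["java", f"-Dc={dataset}", "-Dtype=application/json", "-jar",
         "example/exampledocs/post.jar", f"{panl_dir}/sample/data/{dataset}.json"],
    ]


for dataset in ("mechanical-pencils", "simple-date"):
    for command in solr_setup_commands(dataset):
        # Each command is printed on a single line, ready to be run
        # from the SOLR_INSTALL_DIRECTORY.
        print(" ".join(command))
```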

Additional Solr Version 7 Integration Notes

In the Panl response object, the num_results_exact value will ALWAYS be true, as this version of Solr does not have this data available.

{
  "num_pages": 6,
  "num_results_exact": true,
  "page_uris": {
    "next": "/page-2/p/",
    "before": "/page-",
    "after": "/p/"
  },
  "page_num": 1,
  "num_results": 55,
  "num_per_page": 10,
  "num_per_page_uris": {
    "before": "/",
    "after": "-per-page/n/"
  }
}
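To show how an integration might read this object, here is a minimal sketch in Python.  The function names and the wording of the summary are made up for the example, and it assumes that the before and after values are simply placed either side of the number, which is consistent with the next value shown above.

```python
def results_summary(panl_response):
    """Describe the result count, hedging it when the count is not exact."""
    qualifier = "" if panl_response["num_results_exact"] else "about "
    return (f'{qualifier}{panl_response["num_results"]} results, '
            f'page {panl_response["page_num"]} of {panl_response["num_pages"]}')


def build_uri(uris, number):
    """Place the number between the 'before' and 'after' URI parts (assumption)."""
    return f'{uris["before"]}{number}{uris["after"]}'


# The values below are copied from the response object shown above.
panl_response = {
    "num_pages": 6,
    "num_results_exact": True,
    "page_uris": {"next": "/page-2/p/", "before": "/page-", "after": "/p/"},
    "page_num": 1,
    "num_results": 55,
    "num_per_page": 10,
    "num_per_page_uris": {"before": "/", "after": "-per-page/n/"},
}

print(results_summary(panl_response))
print(build_uri(panl_response["page_uris"], 2))
print(build_uri(panl_response["num_per_page_uris"], 20))
```

With Solr 7 the qualifier will never be shown, as num_results_exact is always true for this version.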

Solr Versions 6 and below

Unfortunately, there are no release packages for version 6 and earlier; the only way to have these built is to edit and compile the code from source.

https://github.com/synapticloop/panl/

Starting with the branch solr-panl-7 is probably the wisest - these are the integration points that will need to be looked at.

  1. Create a new branch with your version of Panl - e.g. Version 6 would be branched as solr-panl-6
  2. Edit the /build.gradle file
    1. Change the distributionBaseName - e.g. solr-panl-6

distributions {
    main {
      distributionBaseName = 'solr-panl-6'
    }
}

    2. Edit the SolrJ dependencies, the dependency for SolrJ can be found on any Maven repository:

    https://mvnrepository.com/artifact/org.apache.solr/solr-solrj

dependencies {
  ...
  // solrj
  implementation 'org.apache.solr:solr-solrj:?.?.?'
  ...
}

  3. Edit the Solr schema configuration and managed-schema
    1. Copy over the schema from the Solr version that you are using (this can be found in the SOLR_INSTALL_DIRECTORY/example/files/conf directory, or the equivalent for your Solr version).
    2. Edit the file to include the relevant fields that you want to index and search on
  4. Look at the com.synapticloop.panl.server.client package, adding in the Solr Clients that are applicable to your version
  5. Update the com.synapticloop.panl.server.client.PanlClient class to reference the correct clients.
  6. Update the com.synapticloop.panl.server.handler.helper.CollectionHelper#getPanlClient() factory method to return the correct client

IMPORTANT: The returned JSON object may have changed between versions which may cause problems with generation and returning the Panl response.

There may be other integration points for version 6.x.x and lower; however, the instructions above were the changed integration points from version 9 to versions 8 and 7.

~ ~ ~ * ~ ~ ~


Getting Started With Synapticloop Panl
A rather pleasing companion to the Apache® Solr® Faceted Search Engine.


[1] 'Sensibly' is a bit of a vague term... Panl strips out any unexpected characters and ensures that it is valid.  For example, if Solr (and therefore Panl) is expecting an integer parameter and the value 5gs6 is passed through, Panl will remove any non-numeric characters and parse the number - returning 56.  For values that cannot be converted, the value will be ignored and not passed through to the Solr server.
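The behaviour described in footnote [1] can be sketched in a few lines of Python.  This is only an illustration of the idea, assuming unsigned integers; the function name is made up and it is not the Panl implementation.

```python
def sanitise_integer(raw_value):
    """Strip non-numeric characters and parse what is left, or return None."""
    digits = "".join(ch for ch in raw_value if ch in "0123456789")
    # A value with no digits at all cannot be converted, so it is ignored.
    return int(digits) if digits else None


print(sanitise_integer("5gs6"))   # 56, as in the footnote
print(sanitise_integer("pencil")) # None - nothing is passed through to Solr
```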

[2] Thanks for checking out this footnote.

[3] An LPSE length of 3 with the five mandatory codes would provide 185,193 facets, a length of 4 would provide 10,556,001.

[4] An LPSE length of 3 with the five mandatory codes and one optional code would provide 175,616 facets, a length of 4 would provide 9,834,496.
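The figures in footnotes [3] and [4] can be reproduced with a little arithmetic.  The sketch below assumes that LPSE codes are drawn from the 62 alphanumeric characters (a-z, A-Z, and 0-9) and that each mandatory or optional code removes one character from the pool; this is not stated in the footnotes, but it is consistent with the numbers given there.

```python
import string

# Assumption: 62 characters are available for LPSE codes.
ALPHANUMERIC = string.ascii_letters + string.digits


def available_facet_codes(lpse_length, reserved_codes):
    """Number of distinct LPSE codes of the given length left for facets."""
    return (len(ALPHANUMERIC) - reserved_codes) ** lpse_length


for reserved_codes in (5, 6):
    counts = [available_facet_codes(length, reserved_codes) for length in (3, 4)]
    print(reserved_codes, counts)
```

With five codes reserved this gives 57 to the power of 3 and 4, and with six codes reserved it gives 56 to the power of 3 and 4, matching the values quoted in the two footnotes.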

[5] Examples with specific dates are notoriously hard to put into examples as by the time you read this book, the example dates will be well out of range.  There is an example data set (simple-date) which is included within the release package which has random dates spanning 2014 to 2032 which can be used to test out the features, however you will need to index the data set with separate commands.

[6] This is probably not the fairest of comparisons, as a lot of the underlying Solr query implementation could be hidden behind the scenes anyhow.  However, what Panl can do is automatically have CaFUPs for multiple FieldSets, facets, and queries which will automatically build the query, the returned facets, the fields, and more.

[7] The exceptions to this rule are any defined OR facets, which will increase the number of results that are returned.

[8] The example data 'techproducts' included with the Apache Solr instance is a reasonable test dataset, however, the way the schema and collections are designed places an emphasis more on testing ingestion and searching, rather than on a functional search set.

[9] I am using an Apple Mac system, but it is the same for Linux.

[10] Commands weren't included in this as a recursive force (i.e. rm -rf or rmdir /S /Q) deletion of directories can be a very dangerous thing.

[11] Historically, Java based examples for servers seem to have been based on the ubiquitous Pet Store, time for something new.

[12] Or, the mistakes that were made with the implementation.

[13] At the time of writing no useful results were returned by any of the large search engines for 'Solr Panl'.

[14] In some instances, the properties file layout would have been better suited to JSON, however, comments are not allowed in JSON files, which makes explaining the file a lot harder.  Admittedly, HJSON could have been used, and parsed on the way into Panl, but this would reduce portability - sigh - these are the decisions which can reverberate through time and code.

[15] This started off as a simple way to test the Panl configuration and how it interacted with the Solr search server; over time, it became a little more complex.  It also became an incredibly useful tool when adding features to the returned JSON object so that integration and implementation became a more developer-friendly experience.

[16] Once again, a simple explainer turned into a more complex application as time and requirements became more involved.

[17] Admittedly this is rather annoying having to know the value ranges ahead of time, however there are some niceties built into Panl to use the minimum and maximum values.

[18] Once again, annoying to have to know the values.

[19] Although German users should be fine with the default implementation from Panl.