README.1ST

Book Versioning

The Synapticloop Panl project uses semantic versioning - i.e. major.minor.micro, with the following rules:

  • major - the major version will increment when there is a BREAKING CHANGE to the Panl LPSE URL generation or a BREAKING CHANGE to the Panl response JSON Object.  Upon increment of the major version, both the minor and micro version number will be reset to 0 (zero).
  • minor - the minor version will increment when there is additional functionality added to the release.  Upon increment of the minor version, the micro number will be reset to 0 (zero).
  • micro - the micro version will increment for bug fixes only.
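
The reset rules above can be sketched in a few lines of Python.  This is an illustration only: the bump function and its argument names are hypothetical and are not part of the Panl codebase or its release tooling.

```python
def bump(version: str, part: str) -> str:
    """Return the next version after incrementing the given part."""
    major, minor, micro = (int(n) for n in version.split("."))
    if part == "major":
        # Breaking change to the LPSE URL generation or the response JSON:
        # minor and micro are both reset to 0.
        return f"{major + 1}.0.0"
    if part == "minor":
        # Additional functionality: micro is reset to 0.
        return f"{major}.{minor + 1}.0"
    if part == "micro":
        # Bug fixes only: nothing is reset.
        return f"{major}.{minor}.{micro + 1}"
    raise ValueError(f"unknown version part: {part!r}")


print(bump("2.3.4", "major"))
print(bump("2.3.4", "minor"))
print(bump("2.3.4", "micro"))
```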

The book version always matches the release version of the Synapticloop Panl server code.  Any changes to the book that are made without changes to the underlying codebase will be committed to the main branch, and the website (which is based on the ghpages GitHub branch) will be updated.

The book release number will be updated on change, e.g. Version 2.0.0 (Release 1) will become Version 2.0.0 (Release 2).

Any Panl release packages on the GitHub release page will only include the version of the book that was available at the time of the code release.

Solr Schema Versions

This book uses a Solr schema version of 1.6 (i.e. set in the managed-schema.xml file), which covers Solr server versions from 7.7.3 (and possibly earlier versions) up to 9.6.* to provide greater coverage of Solr versions.  The examples in this book, especially around faceting, will behave differently on Solr version 9.7.0 upwards if the schema version is set to 1.7.

There are two options:

  1. Use a Solr managed-schema.xml file from a previous version, or
  2. Use the updated schema version, but be aware of the differences in the return of faceted information - see the section on the differences between the schema versions: The Impact Of docValues (Schema Version 1.7+).

IMPORTANT: The examples in this book will still work as expected, just be aware that the schema version used in the examples is from a previous Solr version.

This was deliberately done to ensure compatibility with the widest range of Solr implementations.


About This Book

This book describes and explains the functionality of the Panl server, how to configure the server, and how this impacts the generated URL paths.

To start with, this book will take you through setting up and running a new Solr instance in cloud mode, creating a new collection, and indexing the included sample data.  Then the Panl server will be started with the sample configuration and the functionality of the results and facets can be seen with the in-built Panl Results Viewer web application.

The book then continues to explain the details of configuring the Panl server, with the assumption that there is already a running Solr instance behind it. This will take you through all the various configuration options so that you can implement your faceted search pages and associated URLs, that are tailored to your specific requirements.

This book is not designed to be an introduction to Solr configuration, administration, or schema design best-practices; however, there are hints and tips throughout the book which may be of interest.  These hints and tips relate to items that will affect the results that you retrieve from the Panl server, Solr configuration, and the integration and implementation.

Nomenclature Used Throughout This Book

When implementing any faceted search interface the following terms are the foundation and are used throughout the book:

Attributes

Attributes are information that is attached to each document.  For example if you were searching on mechanical pencils (as the example shows in this book), the attributes of a mechanical pencil include the brand, model name, colour, whether this pencil has an in-built sharpener, length, weight, and many others.  

Documents

Solr nomenclature for the individual result that is indexed and that can be returned by the Solr search.  You can also think of these as the rows of results that will be returned.

Facets

Facets are specific attributes that are extracted from the data and are attached to the index. Each of these attributes can then be used to filter the results such that only the documents that contain those attributes are returned.  The image below shows the different parts of the facets (including information that Panl includes in the returned object).



Image: A screenshot of the Panl Simple Results Viewer showing the 'Mechanism Type' facet and describing the parts of the returned facet information

In the above image:

  • The Solr field name (which is not shown in the image above) is mechanism_type.  The Panl name that is rendered to the page is 'Mechanism Type', the (m) after the name is the Panl LPSE code, and the [REGULAR] is the type of facet that is configured within Panl.  This information is output for reference in the Panl Results Viewer web app and can be omitted (or kept) in your implementation.
  • The facet values represent the indexed attribute values that are attached to the document, they are:
      • Clutch
      • Click
      • Magnetic
      • None
  • The facet counts represent the number of documents that contain these values, respectively they are:
      • 30
      • 23
      • 1
      • 1
  • The 'add' link is generated by the Panl Simple Results Viewer web app from the returned JSON results object.  This link is in the Panl LPSE form.

Keyword Search term

The text (either a word or phrase) that is submitted from a form on the web page through to the Panl server, which is passed to the Solr search server to query against the collections' indices (also known as 'search query', 'search term', 'search phrase', or some other combination of words).

Additional introductions to common words and phrases used throughout the book are below.  Terms and names are generally defined where they first appear, for a full list - see the Appendices - Definitions at the end of the book.

CaFUPs

An acronym for Collection and FieldSets URL Paths - Panl allows many different groups of fields (the FieldSet) to be bound to a specific Solr Collection.  Each CaFUP has a unique URL that is served by Panl.

CaFUPs allow you to configure multiple ways in which the search results and fields are returned for any specific Solr Collection.

Collection(s)

There are two types of collections referenced in this book, the Solr Collection, and the Panl Collection.  

Solr collections are an index of documents that can be filtered or searched upon.

Panl collections are collections of URL paths and FieldSets - that are configured to connect to a Solr collection with defined fields.

LPSE codes

The foundation of how the Panl server decodes and parses a URL path to convert it to a form that the underlying Solr server can understand.  A LPSE code is either a number, or an uppercase or lowercase letter of the alphabet (i.e. a-z, A-Z, 0-9).  These codes are placed in the last path part of the URL.

LPSE path

The LPSE path is a string of URL path values, which, in conjunction with the LPSE codes above is how the Panl server decodes the URL into a Solr server query.

Panl field

This is the field definition that contains the configuration that determines how this field is mapped to the Solr field and what translations should be done on the incoming value before passing it through to the Solr server.

Panl generator

A stand-alone utility to quickly generate a panl.properties file and <panl_collection_url>.panl.properties file from an existing Solr managed schema file that can be used as a starting point for configuring the Panl server.  

NOTE: The generator does not interact or interfere with the Panl server and the generator codebase is not used when serving production content.

Panl name

This is the nicer, human-readable field name that the UI can display in preference to the Solr field name.

Panl server

The server that responds to specific URLs, parses the URL path, builds the Solr request object, connects to the Solr search server, executes the query, parses the results, builds the JSON response and passes it back to the caller.

Solr field

The definition of the field in the managed-schema.xml Solr configuration file which determines how Solr indexes, searches, and presents the information.  The Solr field type determines what features can be configured to be used by the Panl server.

Solr query

The HTTP query string that is sent to the Solr search server, an example of this is:

q=*:*&q.op=OR&facet.limit=100&fl=brand,name&facet.mincount=1&rows=10&
facet.field=lead_size_indicator&facet.field=colours&facet.field=brand&
facet.field=mechanism_type&facet.field=hardness_indicator&
facet.field=lead_grade_indicator&facet.field=in_built_sharpener&
facet.field=disassemble&facet.field=category&facet.field=lead_length&
facet.field=in_built_eraser&facet.field=grip_shape&facet.field=weight&
facet=true&fq=id:"53"&start=0

(Panl's entire raison d'être is to abstract this away from the implementers and end-users.)

Solr search server

The Apache Solr search server that is queried for results.

Tokens

The incoming LPSE code and any associated URL path values for each of the codes.  Tokens will be parsed, prefixes and suffixes removed, and validation performed on the incoming value.  If any parsing or validation fails, then the token will be marked as invalid, ignored, and not passed through to the Solr server.
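
As a rough sketch of the token handling described above (and not the Panl server's actual implementation), the Python below strips a configured prefix and suffix from an incoming value and then validates it as an integer.  The function names are hypothetical, and the ' grams' suffix is assumed purely for illustration.  Note that this sketch is strict: the Panl server may be more lenient in how it parses values.

```python
from typing import Optional


def strip_affixes(value: str, prefix: str = "", suffix: str = "") -> Optional[str]:
    """Remove the prefix and suffix, or return None if either is missing."""
    if len(value) < len(prefix) + len(suffix):
        return None
    if not (value.startswith(prefix) and value.endswith(suffix)):
        return None
    return value[len(prefix):len(value) - len(suffix)]


def validate_integer_token(value: str, prefix: str = "", suffix: str = "") -> Optional[int]:
    """Return the integer value of a token, or None if the token is invalid."""
    stripped = strip_affixes(value, prefix, suffix)
    if stripped is None:
        return None  # invalid token: would be ignored and not passed to Solr
    try:
        return int(stripped)
    except ValueError:
        return None  # could not be parsed as an integer: also ignored


print(validate_integer_token("17 grams", suffix=" grams"))
print(validate_integer_token("heavy grams", suffix=" grams"))
```

A token that returns None here corresponds to one that is marked as invalid and dropped, rather than one that causes an error.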

Book Format Conventions

Normal Text

Normal paragraph text is Libre Baskerville, 11pt, other formatting conventions are detailed below.

Sidebars

IMPORTANT: Important notes are within a red side-bordered box, with an exclamation icon, and red background.  Careful note should be made of the information contained within these boxes as this will affect the running of the Panl server, and there may also be non-obvious side-effects.

Notes: Notes are within a black side-bordered box, with a pencil icon, and grey background.  This is something to look out for when you are reading the book, executing a task, or looking at an image or URL.

Tips: Tips are within a black bordered box, with a lightbulb icon, and white background.  This is something which is a handy idea to know for the functioning or configuration of the Panl or Solr servers.

Code / Output Related Snippets

Inline code, or text related snippets are in monospaced text (Inconsolata Normal 11pt), and highlighted in grey, for example: /Caran d'Ache/true/Black/bDW/. This indicates that the text is exact and should be used as a reference.

For multi-line code or text related snippets (including console and logging output) the text appears in a black-bordered grey box prefixed by a line number so that they can be referenced within the description text.

Note: Within any line, there may be a line continuation character (↩) which should not be included in the command.  Unfortunately, for electronic viewers this means that it is a little more difficult to simply copy and paste the text - apologies, however readability and explanation of the text were prioritised over cut-and-paste-ability.

01  # The text file that may be included, with some information or processing ↩
    directives which will not fit on one line, so it includes the continuation marker
02  # a line of commented text
03  # this is another line of text


Commands

Any commands that should be run in your terminal or command line prompt will appear in a formatted table with a white header on a dark grey background.  Note the '↩' character which means a line continuation and should not be included in the command.  (Reasoning is as per above with the copy and paste-ability of the lines).

Command(s)

\the\command\that\needs\to\be\run -with "a parameter" -and-another ↩
"long parameter that may span many lines"


Links

Links are designated by underlining the text of the link.  If the text is underlined, it is either a link to an external website, or another section of this book.

External links to websites (either local or remote) are in the standard blue underline, indented, and will ALWAYS match the URL that will open in your browser (i.e. the URLs are never truncated, even if they take multiple lines):

http://localhost:8181/panl-results-viewer/mechanical-pencils/firstfive/Manufactured by Koh-i-Noor Company/Green/Cylindrical Grip/120mm/17 grams/bWGLw/ 

Links to other sections or chapters within this documentation are bold, underlined, and in black text and always match the heading text:

Integrating An Existing Solr Schema

Images

Images are bounded by a dotted border inside a bordered section with a caption.



Image: The Panl header for the in-built Simple Results Viewer web app.

Footnotes[1]

Footnotes aren't used very often, but when they appear they can be safely ignored - these are more background thoughts as to why things were implemented the way they were.  This will not impact the running of the Panl Server.

Welcome To Faceted Search

Initially, search engines only utilised keyword searches, i.e. you typed in the words that you wanted to find, and then the search engine would return results that contained those keywords within the indexed documents.  

Keyword searches are still widely used and a good use-case for this is to search within a range of documents and documentation (web search engines are a very good example).  Keyword searching is still an important part of any search engine implementation; faceting adds a mechanism that enhances the keyword search function.

Faceted search engines include functionality that allows users to further narrow down their search results by applying one or more filters (or facets) to a set of data. These facets can be based on various attributes or characteristics of the data, such as categories, tags, authors, dates, or other metadata.

In a faceted search system, the search results are typically displayed along with a set of facets or filters that can be applied to the results. The user can then select one or more facets to refine their search, and the system will update the results in real-time to reflect the new filters.

One of the most well known and popular faceted search engines is the open source Apache Solr™ faceted search engine, powered by the Apache Lucene™ search engine.

Synapticloop Panl is designed to integrate with the Apache Solr faceted search engine and remove the complex URL parameters and paths that a Solr search engine requires.

The Solr Server uses URL parameters to interrogate the search index and return the results.  For example, to search a mechanical pencils collection for all pencil brands that have a blue and black pencil within their range, the query that is sent to Solr is:

q=*:*&q.op=OR&facet.limit=100&fl=brand,name&facet.mincount=1&rows=10&facet.field=le
ad_size_indicator&facet.field=colours&facet.field=brand&facet.field=mechanism_type&
facet.field=hardness_indicator&facet.field=in_built_sharpener&facet.field=disassemb
le&facet.field=category&facet.field=lead_length&facet.field=in_built_eraser&facet.f
ield=grip_shape&facet.field=weight&facet=true&fq=colours:"Black"&fq=colours:"Blue"&
stats.field=weight&stats=true&start=0

Decoding the URL parameters gives:

  • q=*:* - The keyword search query, in field:value form - in this case, any value in any field, which matches all documents
  • &q.op=OR - Use the query operand of OR between the individual keywords
  • &facet.limit=100 - Return only the first 100 facet values for each facet
  • &fl=brand,name - Return the Brand and Name of the pencil in the results documents
  • &facet.mincount=1 - A facet value must have a count of at least 1 for it to be returned by Solr
  • &rows=10 - Return 10 documents (results) at a time
  • &facet.field=lead_size_indicator - Facet on the Lead Size Indicator field
  • &facet.field=colours - Facet on the Colours field
  • &facet.field=brand - Facet on the Brand field
  • &facet.field=mechanism_type - Facet on the Mechanism Type field
  • &facet.field=hardness_indicator - Facet on the Lead Hardness Indicator field
  • &facet.field=in_built_sharpener - Facet on the In-built Sharpener field
  • &facet.field=disassemble - Facet on the Disassembly field
  • &facet.field=category - Facet on the Category field
  • &facet.field=lead_length - Facet on the Lead Length field
  • &facet.field=in_built_eraser - Facet on the In-built Eraser field
  • &facet.field=grip_shape - Facet on the Grip Shape field
  • &facet.field=weight - Facet on the Weight field
  • &facet=true - Turn on faceting
  • &fq=colours:"Black" - Only select results with a value of "Black" in the Colours field
  • &fq=colours:"Blue" - Only select results with a value of "Blue" in the Colours field
  • &stats.field=weight - Return statistics for the Weight field
  • &stats=true - Turn statistics reporting on
  • &start=0 - Start at the first (i.e. index=0) result
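
The same decoding can be reproduced with Python's standard library, which is a handy way to inspect any Solr query string.  This is a minimal sketch that only splits the parameters apart; it does not contact a Solr server, and the variable names are chosen for illustration.

```python
from urllib.parse import parse_qs

# The Solr query from the example above, joined back into a single string.
solr_query = (
    'q=*:*&q.op=OR&facet.limit=100&fl=brand,name&facet.mincount=1&rows=10'
    '&facet.field=lead_size_indicator&facet.field=colours&facet.field=brand'
    '&facet.field=mechanism_type&facet.field=hardness_indicator'
    '&facet.field=in_built_sharpener&facet.field=disassemble'
    '&facet.field=category&facet.field=lead_length'
    '&facet.field=in_built_eraser&facet.field=grip_shape&facet.field=weight'
    '&facet=true&fq=colours:"Black"&fq=colours:"Blue"'
    '&stats.field=weight&stats=true&start=0'
)

# parse_qs groups repeated parameters (facet.field, fq) into lists.
params = parse_qs(solr_query)

for name, values in params.items():
    print(name, values)
```

Repeated parameters such as facet.field and fq are collected into a single list per name, which makes it easy to see which fields are faceted on and which filters are applied.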

Compare this to the Panl URL:

http://localhost:8181/panl-results-viewer/mechanical-pencils/brandandname/Black/Blue/WW/

Which is much cleaner and shorter (note that the only part of the URL which is important to Panl is /mechanical-pencils/brandandname/Black/Blue/WW/).

For additional help see Decoding the Solr Query Parameters.

A Simple Faceted Search Example

Below is an image of the DuckDuckGo search engine which allows both a keyword search and some very simple faceting options.



Image: A DuckDuckGo search with the keyword of 'Solr Panl' which will show relevant results and some simple faceting options.[2]


Large scale web search engines attempt to index and make sense of the huge amount of data that is available on the myriad of websites.  Each of these websites has different taxonomies and types of information; consequently, being able to extract attributes from each of the pages (and thus adding them as facets) becomes far more complex.

In the above image, apart from the keyword search of 'solr panl', the following can be thought of as facets:

  • Country of origin (set to 'United Kingdom'),
  • The safety of the search results (set to 'moderate'), and
  • The time that the content was published (set to 'Any time')

The facets will help to guide the search results, but they are broad facets which are useful for this interface. Note: there are advanced options to only search one site, or to search the file type as well.

For our examples, we will be using a smaller and more targeted search index; consequently, there is greater potential to add facets that help the user get to the correct details more quickly.

A More Complex Faceted Search Example

Most online shopping sites use a faceted search engine for their results.  Arguably the most well-known online store is Amazon.com. Below is a screenshot of the landing page:

https://www.amazon.com/



Image: The amazon.com landing page with a keyword search box


Whilst the second level navigation can be thought of as facets in that they do narrow down the search (i.e. by selecting a shopping 'Department'), performing a keyword search - in this case 'mechanical pencils' - will provide the search results page with a lot more facets:

https://www.amazon.com/s?k=mechanical+pencils&crid=30FR4CP8REGL9&sprefix=mechanical+pencil%2Caps%2C360&ref=nb_sb_noss_1



Image: The amazon.com search page with a keyword search of 'mechanical pencils'

On the left hand side of the screen, the facets are displayed and selectable, allowing the user to refine their search.

The Amazon URL for Staedtler branded pencils in black or blue is:

https://www.amazon.com/s?k=mechanical+pencils&i=office-products&rh=n%3A1064954%2Cp_123%3A312915%2Cp_n_feature_thirteen_browse-bin%3A23895069011%257C23895080011&dc&crid=30FR4CP8REGL9&qid=1737957243&rnid=23895064011&sprefix=mechanical+pencil%2Caps%2C360&ref=sr_nr_p_n_feature_thirteen_browse-bin_2&ds=v1%3Aa5ndrDc%2BY5GtuLwrNYKbz%2Bh0w2isRh0xRRfsHDxWqCU

The URL is 352 characters long; with Panl this might become:

https://www.amazon.com/Manufactured%20by%20Staedtler/Black/Blue/mechanical-pencils/bWWpnq/

The Panl URL is only 90 characters, and far more human readable.

From the Amazon URL, there is a query parameter and value of i=office-products (meaning the department), which does not exist in the Panl example search engine.  Should an implementation be created with the departments, then the Panl URL might have the form of:

https://www.amazon.com/office-products/Manufactured%20by%20Staedtler/Black/Blue/mechanical-pencils/ibWWpnq/

Which is 107 characters long.

Note: There are a variety of configuration options that determine how Panl will generate a URL, the above is just an example.



About Apache Solr

From the Apache Solr website (https://solr.apache.org/):


Solr is the blazing-fast, open source, multi-modal search platform built on the full-text, vector, and geospatial search capabilities of Apache Lucene™.


And:


Solr is highly reliable, scalable and fault tolerant, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration and more. Solr powers the search and navigation features of many of the world's largest internet sites.


By implementing the Panl server you can abstract away the complex Solr query options and focus on delivering URLs to your implementation that are more user and SEO friendly.

Welcome To Synapticloop Panl

Synapticloop Panl is a light-weight application server that is designed to sit between your web application and Solr search server instance(s) seamlessly converting human-readable, SEO friendly URLs into complex Solr search queries, and returning an enhanced JSON object for ease of integration and implementation.

It abstracts away the complexities of the Solr search parameters and the building/translating of URLs so that you get the benefit of human readable (and SEO friendlier) URLs without having to have a deep understanding of the mechanics behind it.

Some examples contained in this book also detail the conversions that Panl performs, and the Solr query that is executed, should you wish to delve deeper into the inner-workings of the Panl server's integration with Solr.

Apart from the default implementation that is expected of any search engine - i.e. keyword searching and faceting, Panl also provides some additional niceties for more human readable and SEO friendlier URLs.

Additional Panl Niceties

  1. MULTIPLE ways to 'SLICE and DICE' - From one Solr collection, the Panl server can present the results and facets in multiple different ways, providing individual use cases for specific needs.
  2. PREFIXES and SUFFIXES - Panl can be configured to add prefixes and suffixes to the values within the URL path to increase readability, for example,

    The LPSE URL path of

    /Caran d'Ache/true/Black/bDW/

    could also have the brand Solr field prefixed with 'Manufactured By ' and suffixed by ' Company' to produce the URL path

    /Manufactured By Caran d'Ache Company/true/Black/bDW/

    Which will be parsed by the Panl server and converted to the correct value for the Solr server.
  3. BOOLEAN value translations - For any Solr field that is defined as a solr.BoolField, then, in addition to prefixes and suffixes, the 'true' and 'false' values can be replaced with any arbitrary text, which will then be transparently converted between Panl and Solr.

    For the LPSE URL path of

    /Caran d'Ache/true/Black/bDW/

    the true value (which is defined as whether the mechanical pencil can be disassembled or not) could be replaced with 'Able to be disassembled' for true values, and 'Cannot be disassembled' for false values.  The above URL path for 'true' values would then become

    /Caran d'Ache/Able to be disassembled/Black/bDW/
  4. BOOLEAN checkboxes - Whilst this may seem obvious to have a checkbox for a true/false value, the checkboxes work in a subtly different way.  By selecting the checkbox, one of either the 'true' or 'false' facet values will be selected. When deselected, the BOOLEAN facet is in a "don't care" state - i.e. the facet value can be either of the values.
  5. CONDENSE multiple field values - Rather than having a forward slash URL path separator for multiple values of the same facet (used in OR Facets and Multivalued REGULAR facets), Panl can be configured to condense these values into a single path part, reducing the URL length and making the URL far more human readable.

    For example, selecting pencils manufactured by Faber-Castell OR Koh-i-Noor would have the URL path of

    /Manufactured by Koh-i-Noor Co./Manufactured by Faber-Castell Co./bb/

    with condensed multiple field values - this could be configured (with a value separator of ', or ') to become

    /Manufactured by Koh-i-Noor, or Faber-Castell Co./b/

    Saving 15 characters in the URL.  The more multivalued field values that are selected, the more URL space is saved (with 3 values selected, the saving becomes 30 characters).
  6. SEARCH ALL OR SPECIFIC SOLR FIELDS - Any Solr field that is analysed can be selected to be searched on, for example, in the Bookstore Walkthrough, the user can select to search within the title, the author, the description, none (i.e. the default search) or all of them.
  7. FIELD VALUE validation - By default, Solr will error (or give an erroneous result) when an invalid value is passed through - for example, if Solr is expecting a numeric value and it could not parse the passed in value, it will throw an exception.  Panl protects against this by attempting to parse the value as best it can, and silently dropping the parameter if it cannot be sensibly[3] parsed. This is done for numeric types (integer, long, float, and double) and boolean values.
  8. HIERARCHICAL facets - Only show facets if a separate facet is first selected, allowing you to lead users through the search journey, by only displaying facets that help the user narrow down their results.

    For example, you may be presenting a search page for Cars and you may only want to show the car models once the brand of cars has been selected first.
  9. UNLESS facets - Continue to show a facet unless another specified facet is selected.  This can be thought of as the inverse of a hierarchical facet.
  10. SORTED facets - Each facet can be individually configured to order the facet results by either the facet count (which is the default), or the facet value (e.g. alphabetic/numeric based on the value of the facet).  
  11. MORE facets - Solr (and Panl) configures a limit for the maximum number of facet values that are returned, this functionality enables you to dynamically load additional facet values if they are available but weren't returned with the results by default.
  12. RESULTS SORTING options - Sort by any of the Solr fields that are configured in Panl, either ascending, or descending and with multiple sub-sorting available - e.g. sorting by a brand name, then the model number. Additionally Panl generates URLs for the inverse of the sorting without impacting any sub-sorting.
  13. INTEGRATED TYPEAHEAD/LOOKAHEAD - Retrieve results suggestions as you type in the query search box.  In the implementation included within this book the typeahead functionality is enabled after typing 3 or more characters.
  14. PAGINATION - Panl will return all of the variables required to easily generate pagination URL paths giving you options and control over your own implementation.  

    The returned variables are the number of pages of results, the number of results per page, the total number of results, the current page number and whether the returned results are an exact number.
  15. STATIC SITE GENERATION - With the exception of a query parameter, all available links for every conceivable URL path can be statically generated ahead of time, with canonical URLs.  Additionally, for search results which do not change frequently, the Panl JSON Response Object can be cached for faster lookups.

    Be warned that the number of possible pages that can be generated can quickly become incredibly large.
  16. STATELESS - No state is stored in the Panl server. All state is held within the requested URL path part.  No sessions, no memory, nothing to backup, easy to update, and quick to start and restart.
  17. CACHE-ABLE - Unless the underlying Solr search document index changes, each Solr request is able to be cached.
  18. 100% TEXT CONFIGURATION - All configuration for Panl is based on text files (Java .properties files) so they can be stored in a source code management system.  Additionally, upgrades to the Panl server are easy, just drop in the new Panl release package, and with quick restart times, the configuration changes will be seen almost instantly.

About Panl Server

The Panl server is an interface between your web app and the Solr search server, converting human-readable, SEO friendly URL paths into complex Solr search queries. Rather than adding query parameters to the URL, Panl automatically generates and returns complete URL path links that can be rendered by your web application.

The Panl server uses last path segment encoding (LPSE) to parse and decode the full URL path, converting a URL path from

                                                |Last Path|
                                                | Segment |
/Manufactured by Koh-i-Noor Company/Clutch/Green/bmWsb-sN-/
|----------------LPSE PATH----------------------|LPSE code|

To a search query that will return a list of mechanical pencils that

  • Are manufactured (the brand) by Koh-i-Noor (b) - Note that the configured prefix of 'Manufactured by ' and suffix of ' Company' will be removed by the Panl server before being passed through to the Solr server.
  • Have a Clutch mechanism (m), and
  • Are Green (W) in colour

And, of the 8 results that are returned, the results will be sorted

  • By brand name descending (sb-), then by
  • Pencil Model (sN-)

Which is then passed through to the Solr search server as the following query:

q=*:*&q.op=OR&facet.limit=100&fl=brand,name&facet.mincount=1&rows=10&
facet.field=lead_size_indicator&facet.field=colours&facet.field=brand&
facet.field=mechanism_type&facet.field=hardness_indicator&
facet.field=lead_grade_indicator&facet.field=in_built_sharpener&
facet.field=disassemble&facet.field=category&facet.field=lead_length&
facet.field=in_built_eraser&facet.field=grip_shape&facet.field=weight&
facet=true&fq=brand:"Koh-i-Noor"&fq=mechanism_type:"Clutch"&fq=colours:"Green"&
sort=brand+desc,name+desc&start=0
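
To make the decoding more concrete, below is a deliberately simplified Python sketch of how the example path could be turned into the filter and sort parts of the Solr query.  It is not the Panl server's implementation: the FACETS and SORT_FIELDS mappings are hypothetical stand-ins for the real configuration, every LPSE code is assumed to have a length of 1, each facet code is assumed to consume exactly one path value, and treating any character other than '-' after a sort field as ascending is also an assumption.

```python
from typing import Dict, List, Tuple

# Hypothetical configuration, written to match the example in the text:
# LPSE code -> (Solr field, prefix, suffix).
FACETS: Dict[str, Tuple[str, str, str]] = {
    "b": ("brand", "Manufactured by ", " Company"),
    "m": ("mechanism_type", "", ""),
    "W": ("colours", "", ""),
}
# Hypothetical codes that may follow the sort code.
SORT_FIELDS: Dict[str, str] = {"b": "brand", "N": "name"}
SORT_CODE = "s"


def decode(url_path: str) -> Tuple[List[str], List[str]]:
    """Return the (filter queries, sort clauses) for a LPSE URL path."""
    parts = [part for part in url_path.split("/") if part]
    values, codes = parts[:-1], parts[-1]  # the codes are the last path part
    filters: List[str] = []
    sorts: List[str] = []
    code_index = 0
    value_index = 0
    while code_index < len(codes):
        code = codes[code_index]
        if code == SORT_CODE:
            # Sort tokens take no path value: 's', a field code, a direction.
            field = SORT_FIELDS[codes[code_index + 1]]
            direction = "desc" if codes[code_index + 2] == "-" else "asc"
            sorts.append(f"{field} {direction}")
            code_index += 3
        else:
            # Facet tokens consume the next path value, minus prefix/suffix.
            field, prefix, suffix = FACETS[code]
            raw = values[value_index]
            value_index += 1
            value = raw.removeprefix(prefix).removesuffix(suffix)
            filters.append(f'{field}:"{value}"')
            code_index += 1
    return filters, sorts


filters, sorts = decode("/Manufactured by Koh-i-Noor Company/Clutch/Green/bmWsb-sN-/")
print(filters)
print(sorts)
```

The filters correspond to the fq parameters and the sort clauses to the sort parameter in the Solr query above.  The str.removeprefix and str.removesuffix methods require Python 3.9 or later.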

When extracted, the Panl release package contains everything required for running the server including in-built web applications to view all functionality.


How Many Facets Does Panl Support?

The number of supported facets depends on the LPSE code length (which, if not set, defaults to 1). A LPSE code is a letter (either uppercase or lowercase) or number which maps to a parameter, operand, field, or filter.  There are five mandatory (and one optional) LPSE codes:

  1. The query parameter,
  2. The page number,
  3. The number of results per page,
  4. The query operand,
  5. The sort order operand(s), and
  6. (optionally) The pass through parameter

The above LPSE codes can be configured to your specifications and cannot be registered as facet LPSE codes.

Note: The above LPSE codes always have a length of 1; Solr field definitions have the configured LPSE code length.

With a LPSE length of 1:

  • With the five mandatory codes, Panl will support up to 57 facets.
  • With the five mandatory codes and the one optional code, Panl will support up to 56 facets.

With a LPSE length of 2:

  • With the five mandatory codes, Panl will support up to 3,249 facets.
  • With the five mandatory codes and the one optional code, Panl will support up to 3,136 facets.

The maximum number of supported facets is the number of available LPSE codes raised to the power of the LPSE length:

  • With the five mandatory codes[4]:
    (62 - 5)^lpse_length = 57^lpse_length
  • With the five mandatory codes and the one optional code[5]:
    (62 - 6)^lpse_length = 56^lpse_length
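
The formula above can be checked with a few lines of code.  This is a minimal sketch rather than anything shipped with Panl; the function name max_facets and its parameters are made up for illustration.

```python
# Single-character LPSE codes: a-z, A-Z, and 0-9.
AVAILABLE_CODES = 26 + 26 + 10  # 62

def max_facets(lpse_length: int, pass_through: bool = False) -> int:
    """Return the maximum number of facets for a given LPSE length.

    Five codes are always reserved (query, page number, results per page,
    query operand, sort order); a sixth is reserved when the optional
    pass through parameter is configured.
    """
    reserved = 6 if pass_through else 5
    return (AVAILABLE_CODES - reserved) ** lpse_length

for length in (1, 2, 3):
    print(length, max_facets(length), max_facets(length, pass_through=True))
```

For LPSE lengths of 1 and 2 this reproduces the figures listed above (57 and 56, then 3,249 and 3,136), and shows how quickly the number grows with a length of 3.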

A LPSE length of 2 should provide more than enough facets for the majority of implementations.  Once the LPSE length gets above 2, the LPSE URL path grows longer much more quickly, which undermines the aim of keeping the encoded URL compact and readable.

Remember that you can define multiple Panl collections with CaFUPs for a Solr collection, and each of the CaFUPs can have different LPSE codes.  You may have over 56 fields for your documents in your indexed Solr collection, but you may wish to have a LPSE length of 1 and just use a subset of the fields for each of the CaFUPs.

Panl URL Structure

Collection Request Handlers

The URLs that the collection request handler responds to are determined by the Panl server configuration.  The handler returns an HTTP response containing a JSON object with the results of the search query, all available facets, and the documents with the fields defined by the FieldSet.

The defined CaFUPs will respond to URLs in the form

/<panl_collection_url>/<fieldset>/<lpse_path>/<lpse_codes>/

Where:

  • <panl_collection_url> Is the Panl collection (which is configured to map to a Solr collection)
  • <fieldset> Is the fields that are returned with the documents
  • <lpse_path> Is the encoded values for the facets, queries, parameters, and operands, and
  • <lpse_codes> Is the LPSE codes which determine how the LPSE path will be decoded by Panl

And will uniquely return facets and documents with the configured FieldSets.  These URLs are configured through the <panl_collection_url>.panl.properties files and will be able to serve a virtually unlimited[6] number of URLs.
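
To illustrate the URL form above, the following sketch splits a request path into its four parts.  It assumes that the LPSE path may span several path segments and that the LPSE codes are always the final segment; the function name, the example collection URL, FieldSet, and LPSE codes are all hypothetical and are not taken from the Panl source code.

```python
from typing import NamedTuple

class PanlRequestPath(NamedTuple):
    collection_url: str
    fieldset: str
    lpse_path: list[str]   # encoded values, one per path segment (assumption)
    lpse_codes: str        # final segment, decodes the values above

def split_request_path(path: str) -> PanlRequestPath:
    """Split /<panl_collection_url>/<fieldset>/<lpse_path>/<lpse_codes>/."""
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) < 2:
        raise ValueError("expected at least a collection URL and a FieldSet")
    collection_url, fieldset, *rest = segments
    # With no facets or queries selected there is no LPSE path or codes.
    lpse_codes = rest[-1] if rest else ""
    lpse_path = rest[:-1]
    return PanlRequestPath(collection_url, fieldset, lpse_path, lpse_codes)

# Hypothetical example values.
print(split_request_path("/mechanical-pencils/brandandname/Koh-i-Noor/Clutch/bm/"))
```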

Additionally, there are in-built Panl server URLs which provide further functionality.  The predefined URLs are as follows:

Panl Single Page Handler URLs

For each collection that is registered with Panl, the single page handler URLs return a JSON object that will allow building a single page search interface.  They are bound to the following URLs:

/panl-single-page/<panl_collection>/

Where <panl_collection> is the Panl Collection that the single page search UI should be built from.

Note: Do not confuse this handler and URL with the in-built Panl Single Page Search UI example bound to the /panl-single-page-search/<panl_collection>/ URL - this returns a simple sample implementation of a Single Page Search user interface.

Panl More Facets Handler URLs

For each collection that is registered with Panl, the more facets handler URLs return a JSON object that will provide additional facet values for a specific facet and are bound to the following URLs in the form of:

/panl-more-facets/<panl_collection>/<fieldset>/<lpse_path>/<lpse_codes>/?code=<lpse_code>&limit=<limit>

Where:

  • panl-more-facets is the start of the URL that the Panl server has bound the 'More Facets' handler to
  • <panl_collection> is the Panl collection
  • <fieldset> is the FieldSet for the fields that will be returned with the Solr documents.  Note: This value is ignored and replaced by the More Facets handler with the 'empty' FieldSet, as there is no need to return any document fields
  • <lpse_path> is the encoded values for the facets
  • <lpse_codes> are the LPSE codes for the <lpse_path> above
  • code=<lpse_code> is the query parameter LPSE code for the facet that the additional facet values are requested
  • limit=<limit> is the query parameter for the maximum number of facet values to return.  Note: If this is set to -1, then all facet values will be returned.

This will perform the search and only return the additional facet values for the Solr facet designated by the LPSE code <lpse_code> up to the limit designated by <limit> (or all if the limit is set to -1).
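
As a sketch of how a client might assemble such a URL, the helper below joins the path parts and appends the code and limit query parameters.  The function name and the example values are hypothetical; only the URL form and the two query parameter names come from the description above.

```python
from urllib.parse import quote, urlencode

def more_facets_url(collection, fieldset, lpse_values, lpse_codes, code, limit=-1):
    """Build a More Facets handler URL; limit=-1 requests all facet values."""
    segments = ["panl-more-facets", collection, fieldset]
    # Each encoded value becomes its own path segment (assumption).
    segments += [quote(value, safe="") for value in lpse_values]
    segments.append(lpse_codes)
    path = "/" + "/".join(segment for segment in segments if segment) + "/"
    return f"{path}?{urlencode({'code': code, 'limit': limit})}"

# Hypothetical collection, FieldSet, and LPSE codes.
print(more_facets_url("mechanical-pencils", "brandandname", ["Koh-i-Noor"], "b", code="m", limit=50))
```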

Panl Lookahead Handler URLs

For each collection that is registered with Panl, the lookahead handler URLs return a JSON object that provides a snippet of search results:

/panl-lookahead/<panl_collection>/<fieldset>/?search=<keywords>

Where:

  • panl-lookahead is the start of the URL that the Panl server has bound the 'Lookahead' handler to
  • <panl_collection> is the Panl collection
  • <fieldset> is the FieldSet for the fields that will be returned with the Solr documents
  • search is the URL parameter that the Panl server will respond to.  Note: this URL parameter name is configurable, however throughout the book it has been configured to be 'search'.
  • <keywords> are the keywords that the user is searching for.

This will perform a search on the documents and return any results that it finds.  It will not return any facets, only documents.
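
A lookahead URL can be assembled in a similar way.  The sketch below is an illustration only: the function name and example values are hypothetical, and the parameter name defaults to 'search' because that is how it has been configured throughout this book.

```python
from urllib.parse import urlencode

def lookahead_url(collection, fieldset, keywords, param_name="search"):
    """Build a Lookahead handler URL, URL-encoding the user's keywords."""
    query = urlencode({param_name: keywords})
    return f"/panl-lookahead/{collection}/{fieldset}/?{query}"

# Hypothetical collection and FieldSet; spaces are encoded as '+'.
print(lookahead_url("mechanical-pencils", "brandandname", "clutch pencil"))
```

If the Panl server has been configured to respond to a different parameter name (for example the Solr default of 'q'), pass it as param_name.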

IMPORTANT: You __CANNOT__ configure a Panl collection that starts with the prefix 'panl'; it is reserved for internal use.  If you attempt to register one, it will be rejected and the server __WILL_NOT__ start.
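
If you generate or manage configuration with your own tooling, a check along the following lines can catch the reserved prefix before the server is started.  This is only a sketch of the rule described above, not the check that the Panl server performs; the function name is hypothetical and the case-sensitive comparison is an assumption.

```python
RESERVED_PREFIX = "panl"

def validate_collection_url(collection_url: str) -> None:
    """Raise ValueError if the collection URL uses the reserved prefix."""
    # Assumption: the comparison is case-sensitive.
    if collection_url.startswith(RESERVED_PREFIX):
        raise ValueError(
            f"'{collection_url}' starts with the reserved prefix '{RESERVED_PREFIX}'"
        )

validate_collection_url("mechanical-pencils")   # passes silently
try:
    validate_collection_url("panl-pencils")
except ValueError as error:
    print(error)
```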

In-Built Web Apps (Viewer / Explainer / Single Search Page)

For testing and debugging of the configured properties, the Panl Results Viewer, Panl Results Explainer, and Panl Single Search Page web apps are included in the Panl release package.  These surface all Panl functionality and allow integrators and implementers to understand and test the Panl configuration without having to integrate with a separate web application.

Tip: The recommendation is to either turn off the Panl Results Viewer / Explainer / Single Page Search web apps, or to not allow public access to these URLs.  They can be turned off by setting the panl.results.testing.urls=false property in the panl.properties file.


'Simple' Panl Results Viewer Web App

What started as a relatively simple page for testing and debugging turned into a page that has a fully functional faceted search interface, able to highlight all of the functionality of the Panl server and surface most of the Solr search server functionality.  It still remains an excellent way to test configuration options.

Below is a screenshot of the in-built Panl Results Viewer web app, demonstrating the features and functionality that you would expect from a search page implementation along with some additional features to make searching easier for you and your users.

When the Solr and Panl configuration is set up, the server is up and running, and the testing web app URLs are enabled, it is accessible at:

http://localhost:8181/panl-results-viewer/ 



Image: The In-Built Panl Results Viewer web app, highlighting all of the functionality.

  1. A list of available Collections and FieldSet URL Paths (CaFUPs) that Panl is configured to serve.  CaFUPs enable different Solr fields and facets to be returned from the same Solr collection.
  2. A textual representation of the CaFUP that the Panl Results Viewer web app currently is using.
  3. The canonical URL path (which is returned with the Panl results JSON object) - An important part for search engines to de-duplicate URLs that return exactly the same information.  Multiple Panl LPSE URL paths WILL return exactly the same results.  You SHOULD use this link as either
  • The rel="canonical" link element in the HTML, or
  • The rel="canonical" link HTTP header

There is also an [explain] link that will take you to the in-built Panl Results Explainer web app for this particular canonical URL.

  4. The search query box, by default, Panl responds to the same URL parameter name as the Solr server - i.e. 'q'.  This can be configured to be a different value through the Panl properties file.  In this book, it has been configured to respond to the 'search' query parameter.

    Specific Solr Search Field (not shown[7]) - Panl can be configured to search on specific Solr fields, rather than using the default search field.
  5. Active filters, either queries or any of the selected facets that have been used to refine the search results - the accompanying link is the URL path that will remove this query or facet.
  6. Active BOOLEAN filters, if the selected facet is a BOOLEAN facet (i.e. either true/false) then a link can be included to invert this selection (i.e. change the value to true if currently false and vice versa).
  7. BOOLEAN Checkboxes - any facets that have been defined as BOOLEAN checkboxes, which allow the integrator to emphasise one of the values (either true or false).
  8. Active Sorting, sorting options that are currently ordering the results - the accompanying link is the URI path that will remove this query, facet, or sorting option from the results.  If it is an active sorting filter, the Change to DESC or Change to ASC links will invert the sorting order without affecting any further sub-ordering.
  9. RANGE filters, for facets that are defined as ranges - allowing users to select a range of values - the values are inclusive (i.e. include the minimum and maximum values).  RANGE filters also include dynamic maximum and minimum values so that the range that is rendered can be automatically updated.

    DATE Range filters (not shown[8]), enabling searching over a range of dates (but not a specific date) in the form of:

    <next/previous> <any_integer> <hours/days/months/years>

    For example:

    Last 30 days
    Previous 24 hours
    Next 3 years

  10. Available filters, additional facets with links that can further refine and limit the Solr search results.  This may also display a link to load more facets if the returned number of facets is not the complete set.
  11. Number of results found, and whether this is an exact match.
  12. Query operand, whether the Solr search term query is OR, or AND - this affects the search query for Specific Solr Search Field functionality, not the faceting - i.e. the Solr server q.op parameter.  The eDisMax query parser implementation (which is not included in this book) will also use this query operand to override the default query operand in the Solr configuration.
  13. Page information, the number of pages, how many results are shown per page, and how many results are shown on this page.
  14. Sorting options, whether to sort by relevance (the default) or by other configured sorting options with ascending and descending options available.  Any Solr field can be configured to be used as a sorting option.  Multi-sort orders are also available, allowing progressive sorting on more than one field.
  15. Pagination options, the Panl server returns all information needed to build a pagination system: number of results, number of results shown per page, and the current page number.
  16. Number of results per page, able to dynamically set the number of results to return for the query.  Note: In the above image, the values 3, 5, and 10 are just examples that are hard-coded into the Panl Results Viewer and can be implemented with any positive integer number.
  17. Timing information, about how long the Panl server took to build and return the results (including how much time the Solr server took to find and return the results).
  18. The results, the fields that are returned with the documents and are shown in the results sections which are configured by the CaFUPs.  Multiple field sets can be configured for the collection, allowing different groups of fields to be returned for different URL paths.  In the image, only two fields are configured for this CaFUP, namely Brand, and Pencil Model.
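
The two ways of publishing the canonical URL path mentioned in item 3 of the list above can be sketched as follows.  The helper names, the site root, and the example path are hypothetical, and the path is assumed to be already URL-encoded; how you read the canonical URL path out of the Panl results JSON object depends on your integration.

```python
from html import escape

def canonical_link_element(site_root: str, canonical_path: str) -> str:
    """Return the HTML <link rel="canonical"> element for the page head."""
    href = site_root.rstrip("/") + canonical_path
    return f'<link rel="canonical" href="{escape(href, quote=True)}" />'

def canonical_link_header(site_root: str, canonical_path: str) -> tuple[str, str]:
    """Return the (name, value) pair for the HTTP Link header."""
    href = site_root.rstrip("/") + canonical_path
    return ("Link", f'<{href}>; rel="canonical"')

# Hypothetical site root and canonical URL path.
root = "https://www.example.com"
path = "/mechanical-pencils/brandandname/Koh-i-Noor/b/"
print(canonical_link_element(root, path))
print(canonical_link_header(root, path))
```

Use one approach or the other; both tell a search engine which of the equivalent LPSE URL paths is the preferred one.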

IMPORTANT: The Panl Results Viewer Web App is configured to work with the default parameters in the configuration files.  The following values are hardcoded within the JavaScript files, and the web app will probably not work if they are changed.

  • The host __MUST__ be localhost
  • The port number __MUST__ be 8181
  • The panl.form.query.respondto parameter __MUST__ be search

Of course, the source code is available so you can easily see where this occurs and change it to your needs.

'Simple' Panl Results Explainer

Again, this started as a relatively simple page for testing and debugging of the startup configuration options, rather than trawling through properties files and logs.

Below is a (cut-down) screenshot of the in-built Panl Results Explainer web app with explanations for canonical URL paths, the configuration of the Panl collection URL, and the individually configured properties for each of the fields and how this alters the Solr query.  It is a useful page for seeing, at a glance, everything that a CaFUP is configured to do.

When the Solr and Panl configuration is set up, the server is up and running, and the testing web app URLs are enabled, it is accessible at:

http://localhost:8181/panl-results-explainer/ 



Image: The In-Built Panl Results Explainer web app

  1. A list of available Collections and FieldSet URL Paths (CaFUPs) that Panl is configured to serve.  CaFUPs enable different Solr fields to be returned in the documents with the same search parameters.  Clicking on these links will populate the 'Configuration Parameters' and 'Field Configuration Explainer' sections.
  2. A textual representation of the CaFUP that the Panl Results Explainer web app is currently using.
  3. The canonical URL path entry field allows you to enter any canonical URL path and have the parsing and tokenising explained to you, including whether the parsed token was valid, the LPSE code found and the original value that Panl attempted to decode.  Note:  The CaFUP that the canonical URL path came from MUST match the CaFUP on the results viewer.
  4. The request token explainer - for any canonical URL entered, this will list the parsing and decoding steps, with the following details:
      1. Whether the token is valid (if it is invalid, it will be ignored and not passed through to the Solr search server),
      2. The type of Panl token that was found,
      3. The LPSE code,
      4. The parsed value,
      5. The original value, and
      6. Where pertinent, additional information pertaining to the specific code.
  5. Configuration parameters - parameters that are not fields or facets with information about the value, a description, and the property that set the value.
  6. Field configuration explainer - for each of the fields or facets that are configured in the LPSE order an explanation of their configuration including:
      1. The Java field type,
      2. The LPSE code,
      3. The Solr field name,
      4. The Solr field type, the Panl field name, and
      5. Additional configuration items which may include:
          1. Prefixes,
          2. Suffixes,
          3. Ranges,
          4. Facet type, or
          5. Minimum/maximum values
  7. Any configuration warning messages that were found whilst parsing the properties files.

'Simple' Panl Single Search Page Web App

Panl also binds a URL path to enable the building of a single search page interface, and binds a URL path to view a working example of what the single search page could look like.

When the Solr and Panl configuration is set up, the server is up and running, and the testing web app URLs are enabled, it is accessible at:

http://localhost:8181/panl-single-page-search/



Image: The In-Built Panl Single Search Page interface web app for the mechanical pencils collection (Note: image split for sizing reasons)

  1. A list of available example Single Search page interfaces that Panl is configured to serve.  CaFUPs enable different Solr fields to be returned in the documents with the same search parameters.  Clicking on these links will generate a working sample single search page.
  2. The generated LPSE path that the selections from the search interface will apply.
  3. All facets that can be selected, presented for the different types of facets, namely OR, RANGE, DATE Range, BOOLEAN, and REGULAR facets.
  4. The generated LPSE path that the selections from the search interface will apply.
  5. The search button that will take you to the in-built Panl Results Viewer web app so that you can view the results instantly.

IMPORTANT: The Panl Single Search Page Web App is configured to work with the default parameters in the configuration files.  The following values are hardcoded within the JavaScript files, and the web app will probably not work if they are changed.

  • The host __MUST__ be localhost
  • The port number __MUST__ be 8181

Of course, the source code is available so you can easily see where this occurs and change it to your needs.

Note: The implementation included in this book always links to the 'default' FieldSet.  In your implementation, you may change this to any FieldSet you wish.

About Panl Generator

The Panl generator is a quick and interactive command line utility built into the Panl release package that, from a Solr managed schema file, generates a default panl.properties file and the <panl_collection_url>.panl.properties files.  This quickly gets things up and running for your existing Solr schema, giving you a starting point from which you can iterate on a solution.

If you have an existing Solr schema and want to start testing the Panl server integration, then skip to the Integrating An Existing Solr Schema section.  If you are skipping ahead and diving straight into the Panl configuration generator, the remaining sections of the book explain how to configure the Panl server to suit the requirements of the search page implementation.

Tip: See the section on Panl Generator command line options for full details on the options available.

~ ~ ~ * ~ ~ ~