Afterword

Firstly, my sincerest thanks for getting this far through the book. I found it quite pleasurable (most of the time) to write, and I hope that you have had the same experience reading it.

Secondly, you do not need to read any of this chapter; it delves deeper into the process and project behind the Panl server.

That being said, my most fervent wish is that you find the Panl project to be useful.

On The Tag-Line


A rather pleasing companion to the Apache® Solr® Faceted Search Engine.


I wrote this line as a first draft with a definitely understated tone, although as I progressed through writing the book it resonated more with me.  Firstly, I do not like to emphasise my skills or work, believing that the work should speak for itself, so it fits nicely with my personal views on life.  Secondly, the Panl project implementation should be understated: something which sits in between a web app and the Solr search server and just happily does its job, hidden between the layers.

I also like the name Solr Panl; it has a certain ring to it.  The original tagline at project inception in 2008 was "Panl - soaking up the Solr goodness".

On Documentation

My view on documentation is that it is something that everyone loves to consume, but few people like to produce.  And when I use the word 'documentation', I mean that it comes in many forms: from the actual source code and tests of a project, to the generated documentation, the StackOverflow posts, the search engine results, the blog posts, videos, and books, and, of course, the munged version that is AI.

The thing about writing documentation of any type is that it makes you a better engineer/developer/coder and leads you back to questioning your design principles and architecture.  I have never had any hard and fast rules around architecture, and some of the decisions made in coding the Panl project were made to ease implementation and integration for the engineer when it came to parsing the JSON results and debugging/testing through the Panl viewer and explainer web apps.  This led to some design decisions which coupled the code far too tightly - which, in some cases, was a deliberate decision.

I am a firm believer in ease of understanding and implementation over architectural purity.

Documentation can be time consuming and difficult to do - perhaps this is why it can be such a low priority for people.  It can be a grind to get through: constantly writing and updating text and formatting, adding in new features, and having to revise the entire book to ensure that everything gets updated and referenced properly.

Just by writing this book, by having to explain how it all fits together, new ideas and ways of doing things have come to my mind, which leads to even more edits of the book.

When you have to explain a decision to someone, or how something works, you are given a second chance to review what you have created, and have to put yourself in the shoes of the reader and ask yourself the same questions about what you are doing. Questions such as:

  • Why did you decide to do things this way?
  • Would I be able to change the way it is done?
  • How can I configure it to do this?
  • What about if I want my search results page to work like this?

It also means that you are not as easily able to hide functionality that should be there but isn't.  By not including something you are publicly saying that you:

  1. Didn't know about it, or
  2. Couldn't be bothered to implement it, or
  3. Hoped that people wouldn't notice.

From this perspective I have changed the underlying code, features, and functionality present within the Panl server.  Some of the changes include:

  • Not having a configurable /panl-results-viewer/ and /panl-results-explainer/ URL path; instead you may either turn this functionality on or off.  Making the paths configurable would have been a straightforward change, but the benefits were slight, new properties would have had to be added, and a clash only occurs if there is a collection named exactly panl-results-viewer or panl-results-explainer.
  • Implementation of pass-through (or ignored) URL paths - and the additional property for keeping this token in the canonical URL generation.
  • Less verbose 404 and 500 error messages
  • Translation of boolean fields from true or false to something more human readable
  • Added in DATE Range facets
  • Highlighting, although in this instance I deliberately chose a subset of functionality to implement
  • Hierarchical facets, only displaying a facet and its value if another facet has already been selected.
  • A better implementation of the way that OR facets work.
  • Sorting of facets by (to use Solr nomenclature) index or count.

Plenty of additions and suggestions for the codebase arose as the project went along, and these are detailed in the 'Additional Functionality in the Pipeline' section.

All of this has made the product a better one, and if nothing else, I thoroughly recommend writing a book, or at the very least a HOW-TO on whatever project you are working on.

On This Book

From a book perspective, what started as a 281 page (or 52,000 word) book has now ballooned (or perhaps blossomed is a better word) to over 500 pages and close to 100,000 words.  Knowing every line of code and configuration is one thing; explaining and documenting it is another.  Setting the right level for both the novice and the seasoned professional is difficult; however, I chose to go down the 'rather detailed' path rather than leave anything out.

It is still a moving target.  As I write the book (especially with version 2.0.0 on the release path), new bits of functionality come to mind, and I want the version 2.0.0 release to contain almost everything doable on my wish-list.  That being said, the time it takes to write the book has given me excellent thinking time about future features and about how to implement them whilst keeping them documented.

The challenge is that updating the book, to ensure that every new feature and piece of functionality is correctly added and documented in the myriad of places that it may appear, is very time consuming, and I am reluctant to release a new code package without the associated book version being up to date.

Keeping the code, properties, JSON objects, and anything related to the source or output in sync with the book is still very, very annoying, with a lot of copying, pasting, and reformatting having to be done.  This is especially true as the book was started with Panl version 1.0.0 and has gone through multiple changes to version 2.0.0.  All the while, the Solr versions have changed from 9.6.1 to 9.7.0 and finally to 9.8.0, changing command line parameters and schema versions along the way.  Having an underlying moving target (Solr) with an overlay of a moving target (Panl) makes it a little difficult.

All in all, the book is the one thing that I know keeps the code well documented.  And that is a good thing.

Additional Functionality in the Pipeline

The codebase for this project started in 2008, with updates and some usages.  Now, 16 years later and after languishing for a long time, it has come a long way, and a release was produced.  Not all features and functionality made it into the code base: some were left out because of the time and effort of implementation, some because of documentation, and finally, something just had to be produced and put out into the wide world.

In general it came down to drawing a line under the current functionality, after all...


Code is never complete, it is just abandoned.


Not all features will make it into the next release, and some may be de-prioritised.  The list is not in any particular order:

  1. Additional support for Solr field types (Low priority)
    Some will be implemented, some will probably be ignored (anything geospatial is probably not going to be included).
  2. Single search page 
    [Released in version 9-1.1.0]
    Being able to have a search landing page with all options available, with the ability to implement a single search page (the example below is a screenshot of the search page for https://www.realestate.com.au).



Image: A single search page showing all options - from https://www.realestate.com.au

  1. Hierarchical facets based on value (Low priority)
    Being able to only show facets if another facet has been selected with a specific value.  This is probably not the best feature to include as it ties the configuration with the data values, which may change over time.
  2. Facet value replacement (Low priority)
    Being able to replace values for specific values of any facet, although this feature would tie the dataset and the Panl configuration together more tightly than I would like.  This will probably be a simple lookup table for word replacement; however, the values would also need to be parsed on the way out of the Solr results.
  3. 'More Like This' functionality (Medium priority)
    The ability to return 'more like this' results on a certain field, or FieldSets.
  4. Dynamic range functionality 
    [Released in version 9-1.1.0]
    Dynamically generate the minimum and maximum value for a range for a facet value.
  5. Suppress range values 
    [Released in version 9-1.1.0]
    For a RANGE facet, provide a configuration option to suppress the values that appear in the range as separate values.
  6. Returning more facets for a specific facet field 
    [Released in version 9-1.2.0]

    By default, the facet limit is set to 100 facet values per facet; where the number of facet values is greater than this, the returned values will be truncated.  The Panl server should be able to return the remaining facet values with a simple query, without returning any documents with it.  This should be done on an individual facet and possibly have pagination.
  7. Default empty FieldSet 
    [Released in version 9-1.1.0]

    In addition to the 'default' FieldSet, add another FieldSet always named 'empty' which will return no fields (this links in with the 'Returning more facets for a specific facet field' and 'Single search page' items).
  8. Internationalisation 
    [Released in version 9-1.1.0]
    Floating point numbers in particular suffer from the assumption that a full stop/period is the decimal separator.  For example, in the UK a number would be formatted as 12,345,678.90, whilst most other European countries use a comma as the decimal separator, e.g. 12.345.678,90.
  9. DATE Range facet update[37] (Medium priority)
    As an extension to internationalisation, the DATE Range facet could do with an update to ensure that the SEO URL is better suited to international use.  For example, this facet will respond to <range_identifier><value><range_type> - e.g. 'previous 30 days' - however for other languages this ordering is not the most suitable - e.g. in French it might be '30 jours précédents', as in <value><range_type><range_identifier>.[38]
  10. Integrated typeahead 
    [Released in version 9-2.0.0]
    An example and implementation of type-ahead in the search results.  This is not the suggester feature of the Solr server, rather a way to return documents with no facet information - just the documents.  In effect this is the opposite of the 'empty' CaFUP: documents are returned but no facets, relying solely on the Solr index.
  11. Specific Solr field search query options (Medium priority)
    [Released in version 9-2.0.0]
    Add an LPSE code to be able to search on a specific field, several fields, or all fields, rather than only the default search field.
  12. Panl configuration editor (Low priority)
    A GUI to edit (and validate) the Panl configuration files making it easier for a developer to get the configuration correct.
  13. Arbitrary Solr query addition (Low priority)
    Being able to add arbitrary Solr query parameters to individual collections - there is some Solr functionality which does not require any configuration parameters to be surfaced through the Panl server.  This could just be a hard-coded property in the properties file, although there may be some logic about when the Solr query parameters need to be added.
  14. Update RANGE LPSE URL encoding 
    [Released in version 9-2.0.0]
    Changing the RANGE LPSE URL from <lpse_code>(+/-)<lpse_code> to just <lpse_code>(+/-).  I am not really sure why this wasn't implemented in the initial release; it will require backwards compatibility checking.
  15. Always on OR facets 
    [Released in version 9-1.2.0]
    OR facets will not be presented if another facet has been selected.  This option will force the facet to always be returned, which will allow the results to continue to grow.
  16. Arbitrary DATE Ranges  (Low priority)
    Being able to have a way of having arbitrary ranges - e.g. from 3 to 6 months, or 3 to 6 months before - this almost ties in with the DATE Range facet update element.
  17. Arbitrary Separated Values for OR facets 
    [Released in version 9-2.0.0]
    Rather than having a prefix and/or suffix added to the Solr field value for each OR facet, being able to have a separator character (or characters) between the values.  E.g. for the following example URL:

    /Manufactured By The Caran d'Ache Company/Manufactured By The BIC Company/bb/

    Should be able to be configured to be displayed as (or something along the lines of it):

    /Manufactured By The Caran d'Ache,or BIC Company/b/
  18. Update command line help text (Low priority)
    Currently, the full help is displayed, which should only be done if no command is given.  If a command is given, then only the help for that command should be printed out.
  19. Arbitrary Separated Values for REGULAR multi-valued facets
    [Released in version 9-2.0.0]
    Rather than having a prefix and/or suffix added to the Solr field value for each REGULAR multi-valued facet, being able to have a separator character (or characters) between the values.  E.g. for the following example URL:

    /Black/Blue/WW/

    Should be able to be configured to be displayed as (or something along the lines of it):

    /Colours:Black,Blue/W/
  20. BOOLEAN Facet set checkbox value (Low priority)
    [Released in version 9-2.0.0]
    Have an additional option for whether a BOOLEAN facet should be displayed as a checkbox.  The property either doesn't exist or is set to true or false, depending on which of the boolean values should be highlighted.

    Thought is required as to whether this becomes an additional key in the available facets, and whether it should be permanent.
  21. Multiple lookaheads based on query fields (Low priority)
    This has a dependency on the 'Specific Solr field search query options' and would allow a lookahead to work only on specific fields.  Although this is probably better as database driven...
  22. JSON configuration file (Very low priority)
    Change the panl.properties file to be JSON based; this makes the parsing a little more difficult in some respects, but easier in others.

    Step one would be to generate the properties files in memory from the JSON configuration, then completely replace them.  Whilst there are some niceties with JSON files, there are also some parts which make it worse (especially around commenting).
  23. Cached OR Facet Values (Medium Priority)
    For the case where the user wants to display an OR facet and also wants to display the original counts (which will display as zero if any of the OR facets are displayed).
  24. Panl generator - attempt to keep existing field LPSE codes (Low Priority)
    For any LPSE codes which are randomly assigned by the generator, keep a lookup map of the codes and attempt to keep this assignment on re-generation.
  25. Update the Explainer (Low Priority)
    The explainer has languished and only returns simple lists of strings - this should return JSON and be able to be inspected.
  26. Specific Search Field Boosting (Low Priority)
    [Released in version 9-2.0.0]
    Being able to boost specific fields when searching on Specific Solr fields - this is done on a Panl collection basis.
  27. DATE Range and RANGE value replacement (Low Priority)
    For both of the range facets, it would be nice to be able to configure a replacement in the URL for a Panl query.  For example, the DATE Range facet could have a replacement of 'Coming Soon' which would translate to 'Next 3 months', and the RANGE facet could have a replacement of 'Inexpensive' which would translate to 'From 5 to 10 dollars'.
  28. Unless property (Medium Priority)
    [Released in version 9-2.0.0]
    Display this facet 'unless' another facet within the list of 'unless' facets is selected.  This is the opposite of the panl.when.<lpse_code>.
  29. Remove unneeded Solr JSON Response keys (Low Priority)
    Solr returns JSON keys in the Panl response (think facet_counts) whose contents are duplicated in the values array for Panl.

    Provide a property which removes the unneeded/duplicate keys - need to decide whether this is server level, or CaFUP level.
  30. Add in ordering of facets (Low Priority)
    Panl splits the available facets into facets, range_facets, and date_facets, which is fine for the Panl Web App Results Viewer; however, users may want to display the facets in the order in which they are defined.

    Add a new key with the LPSE order (same as Single Page Search Implementation).
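
As a small illustration of the 'Internationalisation' item in the list above, the following is a minimal sketch of locale-aware number formatting using only the standard Java library.  It is not the Panl implementation; the class and method names are hypothetical and exist purely for this example.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical sketch - not part of the Panl codebase.
class LocaleNumberFormatSketch {

    // Formats a value with grouping separators and exactly two decimal
    // places, using the conventions of the supplied locale.
    static String format(double value, Locale locale) {
        NumberFormat formatter = NumberFormat.getNumberInstance(locale);
        formatter.setMinimumFractionDigits(2);
        formatter.setMaximumFractionDigits(2);
        return formatter.format(value);
    }

    public static void main(String[] args) {
        double value = 12345678.90;
        System.out.println(format(value, Locale.UK));      // 12,345,678.90
        System.out.println(format(value, Locale.GERMANY)); // 12.345.678,90
    }
}
```

The same value produces two different strings depending on the locale, which is why any value placed in a URL or parsed back out of it needs an agreed locale on both sides.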

~ ~ ~ * ~ ~ ~

In any case, I hope that you enjoy - and more importantly, implement and use - the Panl server for your Solr search server implementation.

~ ~ ~ * ~ ~ ~