Working With Any Dataset

This section dives deeper into the supported Panl field configurations so that you are able to integrate almost any dataset.

The main drivers of the available Panl configuration options are the Solr fieldType, whether the field is configured in Panl to be a facet or a field, and whether it is to be a Specific Solr Search Field.  Finding the Solr fieldType is a two-step process.  In the Solr managed schema file:

  1. Look at the field XML element and its type attribute.
  2. Use the value of the type attribute from step 1 to find the fieldType XML element whose name attribute has the same value.

Whilst this may sound confusing, this is the way in which Solr is configured, and it is straightforward once you have been through the exercise a couple of times.

The image below shows the relationships between the field XML element's type attribute and the fieldType XML element's name attribute, along with how the Panl generator surfaces the configuration.



Image: The relationships between the field type, the fieldType class and the Panl properties

As an example, the variants Solr field in the mechanical pencils Solr managed schema has the following definition:

<field indexed="true" stored="true" name="variants" type="string"
      multiValued="true" />

The type attribute with the value string above refers to the following fieldType definition in the managed schema:

<fieldType name="string" class="solr.StrField" sortMissingLast="true" />

It is the class attribute above which drives the configuration that Panl can respond to.  The Panl generator looks up this class value from the fieldType element in the managed schema file.
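The two-step lookup can be sketched in a few lines of Python.  This is only an illustration of the lookup itself, not how the Panl generator is implemented; the inline schema fragment and the function name solr_class_for_field are made up for this example.

```python
import xml.etree.ElementTree as ET

# A cut-down, hypothetical managed schema holding only the two elements
# discussed above.
SCHEMA = """
<schema name="mechanical-pencils" version="1.6">
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
  <field indexed="true" stored="true" name="variants" type="string"
         multiValued="true" />
</schema>
"""


def solr_class_for_field(schema_xml, field_name):
    """Return the Solr class for a field, or None if it cannot be resolved."""
    root = ET.fromstring(schema_xml.strip())

    # Step 1: find the field element and read its type attribute.
    field = next(
        (f for f in root.iter("field") if f.get("name") == field_name), None
    )
    if field is None:
        return None
    type_name = field.get("type")

    # Step 2: find the fieldType element whose name matches that type.
    field_type = next(
        (t for t in root.iter("fieldType") if t.get("name") == type_name), None
    )
    return None if field_type is None else field_type.get("class")


print(solr_class_for_field(SCHEMA, "variants"))  # solr.StrField
```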

To be helpful, Panl will output the field definition as a comment in the property file and set a property which references the Solr field type that is looked up (see the property panl.type.v=solr.StrField below).

# <field "indexed"="true" "stored"="true" "name"="variants" "type"="string"
        "multiValued"="true" />
panl.field.v=variants
panl.name.v=Variants
panl.type.v=solr.StrField

The panl.type.v=solr.StrField property's value is used by Panl to determine the available configuration options.

Supported Solr Data Types

The following Solr field types are explicitly supported in Panl.  The table lists the configuration options that are available for each.

Additionally, all facets may be hierarchical (i.e. only appear when another facet, operand, or parameter has been selected first), and can be sorted by either their count, or their value.

Solr Field Type       | Prefix / Suffix | RANGE   | DATE Range | OR  | BOOL Value Replace
----------------------+-----------------+---------+------------+-----+-------------------
solr.BoolField        | YES             | NO      | NO         | NO  | YES
solr.DatePointField   | NO              | NO      | YES        | NO  | NO
solr.DoublePointField | YES             | YES     | NO         | YES | NO
solr.FloatPointField  | YES             | YES     | NO         | YES | NO
solr.IntPointField    | YES             | YES     | NO         | YES | NO
solr.LongPointField   | YES             | YES     | NO         | YES | NO
solr.StrField         | YES             | NO [23] | NO         | YES | NO
solr.TextField        | See Notes Below [24]
solr.UUIDField        | YES             | NO      | NO         | YES | NO


Notes: The solr.TextField above should only be used as a Panl field, not a facet (unless you have a specific need for it - see the Panl Cookbook for an example).  If it is configured as a facet, then every word in this field will be included in the facet values.


Unsupported/Partially Supported Solr Field Types

Whilst the following field types aren't officially supported by Panl, they can still be returned within the results documents (i.e. configured to be fields).  If they are configured to be facets, then the operation of Panl is undefined; they may work, but they are untested.

  • solr.BBoxField
  • solr.BinaryField
  • solr.CollationField
  • solr.CurrencyFieldType
  • solr.DateRangeField
  • solr.ExternalFileField
  • solr.ICUCollationField
  • solr.LatLonPointSpatialField
  • solr.NestPathField
  • solr.PointType
  • solr.PreAnalyzedField
  • solr.RankField
  • solr.RptWithGeometrySpatialField
  • solr.SortableTextField
  • solr.SpatialRecursivePrefixTreeFieldType

IMPORTANT: The Panl generator will not generate any configuration for the above field types; you will have to manually configure them as fields or facets.


A Note on Prefixes, Suffixes, and Infixes

The prefix, suffix, or infix that you include in your facet values will affect the display in the browser URL, the way in which it is encoded, and consequently its length.

Any prefix/suffix/infix with a space character in it will have the space encoded as %20 in the URL path.  For example, take the mechanical pencils collection brand facet with the following configuration:

panl.facet.b=brand
panl.or.facet.b=false
panl.range.facet.b=false
panl.name.b=Brand
panl.type.b=solr.StrField
panl.prefix.b=Manufactured by
panl.suffix.b=\ Company


The addition URL for the Koh-i-Noor brand of pencil is then:

/Manufactured by Koh-i-Noor Company/b/

The prefix of 'Manufactured by ' and the suffix of ' Company' will have the space character URL encoded to %20, and the URL will display as:

http://localhost:8181/panl-results-viewer/mechanical-pencils/empty/Manufactured%20by%20Koh-i-Noor%20Company/b/

I.e. the URL-encoded path becomes:

/Manufactured%20by%20Koh-i-Noor%20Company/b/
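The encoding can be reproduced with Python's standard urllib.parse module.  This is a minimal sketch of the encoding step only; the helper name lpse_value_path is hypothetical and is not part of Panl.

```python
from urllib.parse import quote, unquote


def lpse_value_path(prefix, value, suffix, lpse_code):
    """Join prefix, value and suffix into one path segment and percent-encode it."""
    # safe="" means every reserved character is encoded; a space becomes %20.
    segment = quote(prefix + value + suffix, safe="")
    return f"/{segment}/{lpse_code}/"


encoded = lpse_value_path("Manufactured by ", "Koh-i-Noor", " Company", "b")
print(encoded)           # /Manufactured%20by%20Koh-i-Noor%20Company/b/
print(unquote(encoded))  # /Manufactured by Koh-i-Noor Company/b/
```

The unquote call shows the decoded form of the path, which is what a browser displays when it decodes the address bar.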

In Firefox, the address bar will display the URL with the %20 replaced by a space character - i.e. the URL path is decoded before it is displayed to the user.



Image: Mozilla Firefox browser showing the %20 URL encoding replaced with a space character.


For Chrome, the address bar will display the URL with the %20 intact.



Image: Google Chrome browser showing the %20 URL encoding intact.

Safari on Mac OS doesn't even show the URL unless you really look for it - it will just show the hostname.

Notes: The %20 encoding does make the URL less human readable, but it is still SEO friendlier.  You may wish to use a different character, for example the dash character ('-').  Be aware that a dash may interfere with infixes and ranges (especially around negative values), so an underscore ('_') may be a better choice.


Facet and Field Types

Each of the defined fields in the file can be defined as either a 'Field' or a 'Facet Field'.  If the Panl field is configured to be a 'Field', then it will be returned with the documents, and no other configuration options are applicable.  If it is configured as a 'Facet', then it can be faceted upon and, depending on the Solr field type, additional configuration properties are available.

Either a Field or facet Field may be configured to be a Search Field as well - although only if the underlying Solr field is analysed.

Search Fields

Search fields are available on either Fields or Facet Fields, and this is an additional property that is added.  To configure any Solr field as a Panl Search field, add a panl.search.<lpse_code> property, along with the panl.facet.<lpse_code> or panl.field.<lpse_code> property.

# <field "indexed"="true" "stored"="false" "name"="text_author" ↩
        "type"="text_general" "multiValued"="true" />
panl.field.T=text_author
panl.search.T=text_author
panl.name.T=Author
panl.type.T=solr.TextField

Hints/Recommendations:

  • The setup of a Specific Solr Search Field crosses both the <panl_collection_url>.panl.properties file and the Solr managed-schema.xml file, and is used as an addition to a Field or Facet Field.
  • The Solr field MUST be analysed for the facet or field to be configured as a Specific Solr Search Field in the Panl server.  If the Solr field is not analysed and is configured as a Specific Solr Search Field in Panl, then the results will not be as expected.
  • The recommendation is to only use fields as Specific Solr Search Fields, rather than facets, unless you have a specific use case.

Fields

Fields are returned with the documents so that they may be rendered to the results page.  They CAN be sorted on, but they CANNOT be faceted on.  Any Solr field that is stored (i.e. stored="true" in the managed schema) may be configured as a field.  Additionally, any Solr field that is also indexed (i.e. indexed="true" in the managed schema) may be set as either a facet or a field.

Multiple <panl_collection_url>.panl.properties files can be defined, each with a different configuration of facets and fields, all connecting to a single Solr search collection.

To configure any Solr field as a Panl field, use the panl.field.<lpse_code> property, rather than the panl.facet.<lpse_code> property.

# <field "indexed"="true" "stored"="true" "name"="diameter" "type"="pint" ↩
        "multiValued"="false" />
panl.field.d=diameter
panl.name.d=Diameter
panl.type.d=solr.IntPointField

The only configuration option for a field is the Panl field name - i.e. panl.name.<lpse_code> - which is a 'nicer' display name for the field.
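To make the field/facet distinction concrete, the sketch below groups a handful of properties by their LPSE code and reports whether each code is configured as a field or a facet.  It is a simplified reader written for this example: it only understands keys of the form panl.<property>.<lpse_code>, ignores Java properties escaping, and is not how the Panl server loads its configuration.

```python
from collections import defaultdict

# Sample properties taken from the listings in this section.
PROPERTIES = """\
panl.field.d=diameter
panl.name.d=Diameter
panl.type.d=solr.IntPointField
panl.facet.b=brand
panl.name.b=Brand
panl.type.b=solr.StrField
"""


def group_by_lpse_code(text):
    """Group simple panl.<property>.<lpse_code> entries by LPSE code."""
    grouped = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        parts = key.split(".")
        if len(parts) != 3 or parts[0] != "panl":
            continue  # keys with another shape are outside this sketch
        grouped[parts[2]][parts[1]] = value
    return dict(grouped)


config = group_by_lpse_code(PROPERTIES)
for code, props in config.items():
    kind = "facet" if "facet" in props else "field"
    print(code, kind, props[kind], props["type"])
# d field diameter solr.IntPointField
# b facet brand solr.StrField
```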

Hints/Recommendations:

  • Generally, fields that contain a lot of text are better configured as a Panl Field, unless you wish to have the words in the text as individual facets (similar to a word cloud).
  • Use fields for any Solr field that you want to be able to sort on, or be returned with the documents.
  • Any facet can be configured to be a field - remember that you may have multiple CaFUPs configured using it as a field or a facet in different places.
  • Any field can be configured to be returned or ignored with different Panl FieldSets.

Facet Fields

Facet fields can have involved configuration depending on the Solr field type and the Panl field type.  The different types of Facet fields and their configuration are explained in the following sections.

REGULAR Facets

If you are going to facet on a Solr field, then the mapped field type should be at least indexed and it is a good idea to have it stored as well, but ensure that the type is not mapped to a Solr field type that is analysed.  Multi valued fields are also good to use as facets as they will allow multiple choices for faceting the results, without the need for an OR facet.

Note: The reason behind not analysing the Solr field is that if the field is also analysed, then the facets that are returned will be broken up into their word forms.[25]


A REGULAR facet definition is straightforward:

# <field "indexed"="true" "stored"="true" "name"="colours" "type"="string" ↩
        "multiValued"="true" />
panl.facet.W=colours
panl.name.W=Colours
panl.prefix.W=Colours:
panl.multivalue.W=true
panl.type.W=solr.StrField


Hints/Recommendations:

  • REGULAR facets
      • are easy to set up, use, and implement,
      • offer prefixes and suffixes,
      • can be used as a sort order, and
      • do not have to be returned in the result documents.
  • If they are multivalued, the end user will be able to select more than one.
  • They can be configured to be an OR facet if they are single valued, which will allow users to select more than one value.

REGULAR Facets - Multivalue

Building on the REGULAR facets above, an additional Panl configuration item is available for facets that are set as multivalued in the Solr managed schema for the collection (i.e. in the managed-schema.xml file the XML field definition element has a multiValued attribute set to true - multiValued="true").

This Solr XML field configuration will flow through the Panl generator, which will add a property of panl.multivalue.<lpse_code>=true to the specified Panl field.

If this property exists and the value is 'true', then an additional property of panl.multivalue.separator.<lpse_code> can be added, which will be picked up by the Panl server.

A snippet of the configuration for the Colours Facet field from the mechanical-pencils-multi-separator.panl.properties file:

# <field "indexed"="true" "stored"="true" "name"="colours" "type"="string" ↩
        "multiValued"="true" />
panl.facet.W=colours
panl.name.W=Colours
panl.prefix.W=Colours:
panl.multivalue.W=true
panl.type.W=solr.StrField
panl.multivalue.separator.W=,


When Panl LPSE URLs are generated with this configuration, the colours facet values are separated by a comma.

For the first Colour facet selected (Black):

http://localhost:8181/panl-results-viewer/mechanical-pencils-multi-separator/brandandname/Colours:Black/W/

And the subsequent Colour facets (Silver, White):

http://localhost:8181/panl-results-viewer/mechanical-pencils-multi-separator/brandandname/Colours:Black,Silver,White/W/

There is only one LPSE code, 'W', which contains the three values.  Without this multivalue separator, the three colours selected above will generate the following URL:

http://localhost:8181/panl-results-viewer/mechanical-pencils/brandandname/Black/Silver/White/WWW/

Here the three colours are separate LPSE path segments, with three LPSE codes.
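The difference between the two URL shapes can be sketched as follows.  The function multivalue_lpse_path is a hypothetical helper that only mirrors the two example URLs above (URL encoding is left out); it is not Panl's URL generator.

```python
def multivalue_lpse_path(values, lpse_code, prefix="", separator=None):
    """Build the LPSE part of the URL for the selected multivalue facet values."""
    if separator is None:
        # No separator: one path segment and one LPSE code per selected value.
        segments = "/".join(prefix + value for value in values)
        return f"/{segments}/{lpse_code * len(values)}/"
    # With a separator: all values share one path segment and one LPSE code.
    return f"/{prefix}{separator.join(values)}/{lpse_code}/"


colours = ["Black", "Silver", "White"]
print(multivalue_lpse_path(colours, "W", prefix="Colours:", separator=","))
# /Colours:Black,Silver,White/W/
print(multivalue_lpse_path(colours, "W"))
# /Black/Silver/White/WWW/
```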

Hints/Recommendations:

  • When there are multiple combinations of multivalued Solr facet values, this shortens the URL considerably.  This is especially true when a prefix or suffix is defined for the facet.

BOOLEAN Facets

BOOLEAN facets may only have one of two values, namely true or false.  Panl can replace these values in the URL with more SEO friendly strings, which it maps back to the underlying true or false value.

The only Solr field type that allows true/false value replacement is solr.BoolField.  The replacement values can be set with the panl.bool.<lpse_code>.true and panl.bool.<lpse_code>.false properties.

You may still assign a prefix and suffix to the BOOLEAN facet.  As an example, the disassemble Solr field has the following properties in the mechanical pencils configuration:

# <field "indexed"="true" "stored"="true" "name"="disassemble" "type"="boolean" ↩
        "multiValued"="false" />
panl.facet.D=disassemble
panl.name.D=Disassemble
panl.type.D=solr.BoolField
panl.bool.D.true=able to be
panl.bool.D.false=cannot be
panl.suffix.D=\ disassembled


Without using a suffix, this field definition would be:

# <field "indexed"="true" "stored"="true" "name"="disassemble" "type"="boolean" ↩
        "multiValued"="false" />
panl.facet.D=disassemble
panl.name.D=Disassemble
panl.type.D=solr.BoolField
panl.bool.D.true=able to be disassembled
panl.bool.D.false=cannot be disassembled


There is no difference between the two definitions with respect to implementation or the size of the JSON response object; it is just a matter of preference for the implementor.
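The equivalence of the two definitions, and the mapping back to the underlying value, can be illustrated with a short sketch.  The function names are hypothetical, and the code assumes that the suffix is simply appended to the replacement text; in the properties file the suffix is written as '\ disassembled' because the backslash preserves the leading space.

```python
def bool_url_text(value, true_text, false_text, suffix=""):
    """Text that stands in for a boolean value in the URL path."""
    return (true_text if value else false_text) + suffix


def parse_bool_url_text(text, true_text, false_text, suffix=""):
    """Map URL text back to True/False, or None when it is not recognised."""
    if not text.endswith(suffix):
        return None
    core = text[: len(text) - len(suffix)]
    if core == true_text:  # comparison is case-sensitive
        return True
    if core == false_text:
        return False
    return None


# First definition: replacement values plus a suffix.
with_suffix = bool_url_text(True, "able to be", "cannot be", " disassembled")
# Second definition: the suffix folded into the replacement values.
without_suffix = bool_url_text(
    True, "able to be disassembled", "cannot be disassembled"
)
print(with_suffix == without_suffix)  # True

print(parse_bool_url_text(
    "cannot be disassembled", "able to be", "cannot be", " disassembled"
))  # False
print(parse_bool_url_text(
    "Cannot be disassembled", "able to be", "cannot be", " disassembled"
))  # None
```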


Hints/Recommendations:

  • For BOOLEAN facets, use the true and false value replacements where it makes sense.
  • Not all BOOLEAN facets need a value replacement, for example if the field is not used often or has no value from an SEO perspective.
  • If you want to shorten the URL path part further, replace the true/false values with single characters - e.g. 1/0 or y/n.
  • BOOLEAN replacement values are case-sensitive.

BOOLEAN Facets - Checkbox

This is only useful when you want to select only one of the true/false values or no value at all.  Even though a BOOLEAN facet has only two values, there are actually three states that a BOOLEAN facet can have:

  1. 'True' selected - only those results with a true value will be returned
  2. 'False' selected - only those results with a false value will be returned
  3. Not selected - all results are returned which have either a true or false value.

Using a checkbox is different from using the in-built Panl functionality.  With the in-built functionality you may select 'true', select 'false', or remove the selection.  With a checkbox, you will only be able to select the one configured value (either true or false), or select nothing at all.

This is a good use case if you want to emphasise only one of the values.  In either of the below cases, any BOOLEAN facet can be turned into a checkbox, provided that you understand how this impacts the facet selection.

True Value BOOLEAN Facet Checkbox Example

As an example, shopping sites may have a 'Speedy Delivery' checkbox, which will filter those results which are available for speedy delivery.  However, the shopping site does not wish to highlight the results that do not qualify for speedy delivery.  Hence the facet may be selected as either 'True' or no facet value at all.

This is implemented in the Bookstore example (book-store.panl.properties file)

# <field "indexed"="true" "stored"="true" "name"="speedy_delivery" ↩
        "type"="boolean" "multiValued"="false" />
panl.facet.V=speedy_delivery
panl.name.V=Speedy Delivery
panl.type.V=solr.BoolField
panl.bool.V.true=Speedy Delivery
panl.bool.V.false=Regular Delivery
panl.bool.checkbox.V=true


The panl.bool.checkbox.V=true property will enable Panl to pass through additional JSON keys on this facet so that the user interface (and the implementor) can automatically generate the checkbox with the correct links.

False Value BOOLEAN Facet Checkbox Example

As an example, shopping sites may have items which are on backorder, thus making them unavailable for immediate delivery, but still deliverable once the backorder has been fulfilled.  For the user experience, you may wish to present an 'Exclude items on backorder' checkbox.  When checked, this will set the boolean value for 'backorder' to 'false'.  If unchecked, then all items will be shown, both those on backorder and those not on backorder.

For the Bookstore implementation, the configuration is highlighting the False value (i.e. those that are NOT on backorder):

# <field "indexed"="true" "stored"="true" "name"="on_backorder" ↩
        "type"="boolean" "multiValued"="false" />
panl.facet.O=on_backorder
panl.name.O=On Backorder
panl.type.O=solr.BoolField
panl.bool.O.true=On Backorder
panl.bool.O.false=In Stock
panl.bool.checkbox.O=false


The panl.bool.checkbox.O=false property will enable Panl to pass through additional JSON keys on this facet so that the user interface can automatically generate the checkbox with the correct links.
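The two checkbox examples reduce to the same rule: a checked box filters on the one configured value, and an unchecked box applies no filter.  The sketch below expresses that rule as a Solr field:value filter string.  The function name is hypothetical, and the exact query that Panl sends to Solr is not shown in this section, so treat the output format as an assumption.

```python
def checkbox_filter(solr_field, checkbox_value, checked):
    """Return a Solr-style filter for a BOOLEAN checkbox, or None if unchecked.

    checkbox_value is the value from panl.bool.checkbox.<lpse_code>, i.e. the
    single boolean value that the checkbox is able to select.
    """
    if not checked:
        return None  # nothing selected: both true and false documents match
    return f"{solr_field}:{str(checkbox_value).lower()}"


# 'Speedy Delivery' checkbox (panl.bool.checkbox.V=true)
print(checkbox_filter("speedy_delivery", True, checked=True))   # speedy_delivery:true
# 'Exclude items on backorder' checkbox (panl.bool.checkbox.O=false)
print(checkbox_filter("on_backorder", False, checked=True))     # on_backorder:false
print(checkbox_filter("on_backorder", False, checked=False))    # None
```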

Hints/Recommendations:

  • This is a good way to emphasise the positives (or exclude the negatives) within a faceted search.
  • You may still render both the true/false values (including prefixes and suffixes) and ignore the BOOLEAN checkbox altogether. 

RANGE Facets

RANGE facets allow the end user to filter the results of the facet by a range of values and have the most Panl configuration options available.  Whilst ranges are available on String types of data in Solr, the only usage in Panl is with integer or floating point numbers.

RANGE facets will also return the individual values for each of the ranges as a REGULAR facet.  If you do not want the REGULAR facet values to be returned as well then set the property panl.range.suppress.<lpse_code>=true.

# <field "indexed"="true" "stored"="true" "name"="weight" "type"="pint" ↩
        "multiValued"="false" />
panl.facet.w=weight
panl.name.w=Weight
panl.type.w=solr.IntPointField
panl.suffix.w=\ grams
panl.range.facet.w=true
panl.range.min.w=10
panl.range.max.w=50
panl.range.prefix.w=weighing from
panl.range.infix.w=\ to
panl.range.suffix.w=\ grams
panl.range.min.value.w=from light
panl.range.max.value.w=heavy pencils
panl.range.min.wildcard.w=true
panl.range.max.wildcard.w=true
panl.range.suppress.w=false
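As a rough illustration of how the range prefix, infix, suffix, and the minimum/maximum value replacements could combine into readable text, consider the sketch below.  It rests on two assumptions that this section does not spell out: that the prefix and infix values carry the surrounding spaces ('weighing from ' and ' to '), and that a replacement text stands in for a bound only when the selected bound equals the configured minimum or maximum.  The function and dictionary names are hypothetical.

```python
# Values copied from the weight facet above, with the spaces assumed.
WEIGHT_RANGE = {
    "min": 10,
    "max": 50,
    "prefix": "weighing from ",
    "infix": " to ",
    "suffix": " grams",
    "min_value": "from light",
    "max_value": "heavy pencils",
}


def range_text(low, high, cfg):
    """Compose display text for a selected range (illustrative only)."""
    # Assumption: the replacement applies only when the bound matches exactly.
    if low == cfg["min"] and cfg.get("min_value"):
        start = cfg["min_value"]
    else:
        start = f"{cfg['prefix']}{low}"
    if high == cfg["max"] and cfg.get("max_value"):
        end = cfg["max_value"]
    else:
        end = f"{high}{cfg['suffix']}"
    return start + cfg["infix"] + end


print(range_text(15, 40, WEIGHT_RANGE))  # weighing from 15 to 40 grams
print(range_text(10, 50, WEIGHT_RANGE))  # from light to heavy pencils
```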



Hints/Recommendations:

  • You MUST set a minimum and maximum value, which can be useful when you know the possible values ahead of time, however, by using the dynamically generated minimum and maximum you can also present an accurate result.
  • Use sparingly, and where it makes sense. Ranges can filter the results down to zero documents if the range of values in the documents falls slightly outside the provided values.
  • If there is a large number of disparate values then a range facet may be useful, if there are only a few values, then a REGULAR facet may suffice.
  • Derived fields and ranges can also be another option for a range facet, with the dataset being used to generate static ranges and then stored in the Solr field.
  • The minimum value replacement will only work if the range value matches those values - i.e. it will not work with dynamic range values unless the dynamic range value matches the minimum value.

DATE Range Facets

Solr date fields (of fieldType DatePointField) represent dates as strings expressed in Coordinated Universal Time (UTC - i.e. YYYY-MM-DDThh:mm:ssZ). An example value: 1972-05-20T17:33:18Z.

When you choose this for a facet, each of the values will be returned to the exact second without being able to be rolled up to a day, month, or year.  This leads to a very long list of facet values, potentially one for each of the returned result documents.  Consequently, Panl will not return any faceting information from Solr for this field; however, it will add information for the configured date range to the returned JSON object.  This will allow the date range to be implemented on the front end.

# <field "indexed"="true" "stored"="true" "name"="solr_date" "type"="pdate" ↩
        "multiValued"="false" />
panl.facet.S=solr_date
panl.name.S=Solr Date
panl.type.S=solr.DatePointField
panl.date.S.previous=previous
panl.date.S.next=next
panl.date.S.years=\ years
panl.date.S.months=\ months
panl.date.S.days=\ days
panl.date.S.hours=\ hours


Both the panl.date.<lpse_code>.previous and panl.date.<lpse_code>.next properties must be set for the DATE Range facet to be active; however, they do not have to be implemented on the front-end.

IMPORTANT: Panl will NOT request faceting on any Date field types, which means that they will not be returned in the base Solr response object.  However, they can be returned in the field list of Solr document results.

Date field types that are defined as facets within the properties file can be used to return RANGE facet values from NOW +/- a specific period.
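A range relative to NOW maps naturally onto Solr's date math syntax (for example NOW-3MONTHS).  The sketch below builds such a range filter from a direction, an amount, and a unit.  The function name and the mapping of unit words are assumptions for illustration; the exact query that Panl generates is not shown in this section.

```python
# Map the human-readable unit words to Solr date math units.
SOLR_UNITS = {"years": "YEARS", "months": "MONTHS", "days": "DAYS", "hours": "HOURS"}


def date_range_filter(solr_field, direction, amount, unit):
    """Build a Solr range filter for NOW +/- a period using date math."""
    solr_unit = SOLR_UNITS[unit]
    if direction == "previous":
        return f"{solr_field}:[NOW-{amount}{solr_unit} TO NOW]"
    if direction == "next":
        return f"{solr_field}:[NOW TO NOW+{amount}{solr_unit}]"
    raise ValueError(f"unknown direction: {direction}")


print(date_range_filter("solr_date", "previous", 3, "months"))
# solr_date:[NOW-3MONTHS TO NOW]
print(date_range_filter("solr_date", "next", 10, "days"))
# solr_date:[NOW TO NOW+10DAYS]
```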


Hints/Recommendations:

  • If a field has a Solr fieldType of solr.DatePointField, then it will ALWAYS be configured to be a DATE Range facet.
  • If you want to have a date range of an arbitrary timeframe - say 3 months to 6 months ago - then you will need to derive a field based on that value and then set a range for that field.  For example, if you wanted to range on months, then derive a field of month and set it to year * 12 + month - i.e. 1 would be January of year 0, whilst 24295 is July 2024.
  • Derived fields based on the dates can be very effective, especially when used as a hierarchy.
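The derived month value from the hints above can be worked through in code.  This is a minimal sketch of the arithmetic only; the function names are hypothetical, and the derived value would be computed when the documents are indexed into Solr.

```python
def month_index(year, month):
    """Derive a single sortable integer from a year and a month (1-12)."""
    return year * 12 + month


def from_month_index(index):
    """Recover (year, month) from a derived month index."""
    year, zero_based_month = divmod(index - 1, 12)
    return year, zero_based_month + 1


print(month_index(0, 1))        # 1
print(month_index(2024, 7))     # 24295
print(from_month_index(24295))  # (2024, 7)

# '3 months to 6 months ago' relative to July 2024 becomes a plain
# integer range on the derived field.
current = month_index(2024, 7)
print(current - 6, current - 3)  # 24289 24292
```

The last two values are January 2024 and April 2024, so an ordinary RANGE facet on the derived field covers the arbitrary timeframe.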

OR Facets

OR facets allow an end user to increase the number of results by choosing a single facet value OR another facet value for the same facet.  If other values are available within this facet for the current selection, then they will appear.

# <field "indexed"="true" "stored"="true" "name"="brand" "type"="string" ↩
        "multiValued"="false" />
panl.facet.b=brand
panl.or.facet.b=true
panl.name.b=Brand
panl.type.b=solr.StrField


OR facets work in conjunction with each other.  If you have multiple OR facets configured for a Panl collection, then each works within its specific facet, not across facets.

Example:

If two facets are set as OR facets, for example Manufacturer and Mechanism Type, then multiple values for either of the two facets can be selected, provided that the Manufacturers that are selected also have documents with the Mechanism Type.

If you were to choose 'BIC' OR 'OHTO' as pencil manufacturers, and 'Click' OR 'None' for mechanism types, then the query would be of the form:

Select all pencils that are manufactured by BIC OR OHTO, AND that have a mechanism type of Click OR None.
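In Solr query syntax, this form can be written as one OR group per facet, with the groups joined by AND.  The sketch below builds such a query string; the field name mechanism_type and the function name are assumptions for illustration, and Panl may construct its Solr query differently.

```python
def or_filter(solr_field, values):
    """OR together the selected values of a single facet."""
    quoted = " OR ".join(f'"{value}"' for value in values)
    return f"{solr_field}:({quoted})"


selected = {
    "brand": ["BIC", "OHTO"],
    "mechanism_type": ["Click", "None"],
}

# Values are ORed within a facet; the facets themselves are ANDed together.
query = " AND ".join(or_filter(field, values) for field, values in selected.items())
print(query)
# brand:("BIC" OR "OHTO") AND mechanism_type:("Click" OR "None")
```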

Hints/Recommendations:

  • Use OR facets to increase the number of results that are returned.
  • Remember that OR facets only return more facets if there are additional values within the dataset.
  • OR facets will not return additional facets if any other separate facet is selected.

OR Facets - Separator

OR Facets can change the way that they are represented in the URL by setting text to be used as a separator as opposed to using path segments. If you had the following configuration (based on the mechanical pencils Panl configuration), the brand facet is configured with both a prefix and a suffix:

# <field "indexed"="true" "stored"="true" "name"="brand" "type"="string" ↩
        "multiValued"="false" />
panl.facet.b=brand
panl.or.facet.b=true
panl.name.b=Brand
panl.type.b=solr.StrField
panl.prefix.b=Manufactured by
panl.suffix.b=\ Company

Example:

If you were to then select all Brands with Kaweco OR Rotring, the PANL LPSE URL would look like the following:

/Manufactured by Kaweco Company/Manufactured by Rotring Company/bb/

Using the above configuration, if you were to configure the separator as a comma followed by 'or' (', or ') as below:

# <field "indexed"="true" "stored"="true" "name"="brand" "type"="string" ↩
        "multiValued"="false" />
panl.facet.b=brand
panl.or.facet.b=true
panl.name.b=Brand
panl.type.b=solr.StrField
panl.prefix.b=Manufactured by
panl.suffix.b=\ Company
panl.or.separator.b=, or

Then the URL generated for the exact same search on the brand of Kaweco OR Rotring would become:

/Manufactured by Kaweco, or Rotring Company/b/

Not only is there now only one LPSE code, the URL is much shorter.
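Reading the shortened path segment back into its individual values is a matter of stripping the prefix and suffix once and splitting on the separator.  The sketch below shows this for the example above; the function name is hypothetical, the prefix and separator are assumed to include their spaces ('Manufactured by ' and ', or '), and this is not Panl's own parser.

```python
def split_or_segment(segment, prefix, suffix, separator):
    """Split an OR facet path segment into values, or None if it does not match."""
    if not (segment.startswith(prefix) and segment.endswith(suffix)):
        return None
    # The prefix and suffix appear only once, around the whole list of values.
    core = segment[len(prefix): len(segment) - len(suffix)]
    return core.split(separator)


values = split_or_segment(
    "Manufactured by Kaweco, or Rotring Company",
    prefix="Manufactured by ",
    suffix=" Company",
    separator=", or ",
)
print(values)  # ['Kaweco', 'Rotring']
```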

Hints/Recommendations:

  • This can be a good option if a prefix and/or suffix is set for a facet as it will drastically reduce the length of the URL.

Other Facet Options

Hierarchical Facets

Hierarchical facets allow one facet to only appear if another facet, parameter, or operand has already been selected. In the Bookstore example, the book series that an author has published will only appear if an author facet has been selected.

See the panl.when.<lpse_code> section for configuration options.

Hints/Recommendations:

  • Any facet can be made hierarchical.
  • This is useful when
      • you have a lot of facets and not all of them need to appear on the search page, or
      • you want to guide a user through the search results (for example let the user select a year, then a month, then a day).

Unless Facets

Unless facets allow a facet to appear until another specified facet, parameter, or operand is selected. In the Bookstore example, the book series that an author has published will only appear if an author facet has been selected.

See the panl.unless.<lpse_code> section for configuration options.

Hints/Recommendations:

  • Any facet can be made an Unless facet.
  • This is useful when you want to display a facet up to a certain point in the user's journey.
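The visibility rules for Hierarchical and Unless facets can be summarised in a small sketch.  It assumes that each rule is a list of LPSE codes and that any one of the listed codes being selected triggers the rule; the function name and the LPSE codes used are hypothetical, and the panl.when.<lpse_code> and panl.unless.<lpse_code> sections remain the reference for the real configuration.

```python
def facet_is_visible(selected_codes, when=None, unless=None):
    """Decide whether a facet should be shown for the current selection.

    when:   codes of which at least one must be selected (Hierarchical facet)
    unless: codes which hide the facet once any is selected (Unless facet)
    """
    selected = set(selected_codes)
    if when and not selected & set(when):
        return False  # none of the required codes has been selected yet
    if unless and selected & set(unless):
        return False  # a hiding code has been selected
    return True


# Hierarchical: a series facet that needs an author ('A') to be selected.
print(facet_is_visible([], when=["A"]))       # False
print(facet_is_visible(["A"], when=["A"]))    # True

# Unless: a facet shown only until 'A' is selected.
print(facet_is_visible([], unless=["A"]))     # True
print(facet_is_visible(["A"], unless=["A"]))  # False
```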

Facet Sorting

By default Solr sorts the facet results by 'count' - i.e. the number of documents that have this facet value.  This can be set to 'index' which will sort on the facet value.

This is a distinct property from the facets or fields that you would want to be able to sort the result documents on.  This sorts a specific facet, not the documents.

For example, in the Bookstore Panl configuration the brand facet is configured with panl.facetsort.A=index which will sort the returned facets by their facet values (i.e. index).  Below is an image showing the difference between the facet sorting options.



Image: Images showing the difference between sorting on index (left), and count (right).


See the panl.facetsort.<lpse_code> property for configuration options.
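The difference between the two orderings can be mimicked on a small list of facet values.  The values and counts below are made-up sample data, and the sorting is done in Python purely for illustration; in practice Solr performs the ordering.

```python
# Hypothetical facet values with their document counts.
facets = [("BIC", 5), ("Kaweco", 30), ("Rotring", 12)]

# 'count': highest document count first (ties broken here by value).
by_count = sorted(facets, key=lambda facet: (-facet[1], facet[0]))

# 'index': ordered by the facet value itself.
by_index = sorted(facets, key=lambda facet: facet[0])

print([name for name, _ in by_count])  # ['Kaweco', 'Rotring', 'BIC']
print([name for name, _ in by_index])  # ['BIC', 'Kaweco', 'Rotring']
```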

~ ~ ~ * ~ ~ ~