Working With Any Dataset
This section dives deeper into the supported Panl field configurations so that you are able to integrate almost any dataset.
The main driver of the available Panl configuration options is the Solr fieldType, together with whether the field is configured in Panl as a facet or a field, and whether it is also a Specific Solr Search Field. Finding the Solr fieldType is a two-step process in the Solr managed schema file:
- Look at the field XML element and note the value of its type attribute.
- Look up the fieldType XML element whose name attribute matches that type value.
Whilst this may sound confusing, this is the way in which Solr is configured and it is straightforward once you have been through the exercise a couple of times.
The image below shows the relationships between the field XML element's type attribute and the fieldType XML element's name attribute, along with how the Panl generator surfaces the configuration.
Image: The relationships between the field type, the fieldType class and the Panl properties
As an example, the variants Solr field in the mechanical pencils Solr managed schema has the following definition:
<field indexed="true" stored="true" name="variants" type="string" multiValued="true" />
The type attribute value of string is then matched in the managed schema to the fieldType element with the same name:
<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
It is the class attribute above that drives the configuration options that Panl offers. The Panl generator looks up this class value from the matching fieldType element in the managed schema file.
To be helpful, Panl will output the field definition as a comment in the property file and set a property which references the Solr field type that is looked up (see the property panl.type.v=solr.StrField below).
# <field "indexed"="true" "stored"="true" "name"="variants" "type"="string" "multiValued"="true" />
panl.field.v=variants
panl.name.v=Variants
panl.type.v=solr.StrField
The panl.type.v=solr.StrField property's value is used by Panl to determine the available configuration options.
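The two-step lookup described above can be sketched in a few lines of Python. This is only an illustration of the field to fieldType relationship, not the Panl generator's code; the SCHEMA string is a trimmed-down stand-in for a real managed schema, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# A trimmed-down stand-in for a managed schema, holding only the two
# elements quoted above (a real file contains many more).
SCHEMA = """
<schema>
  <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
  <field name="variants" type="string" indexed="true" stored="true" multiValued="true" />
</schema>
"""


def solr_class_for_field(schema_xml: str, field_name: str) -> str:
    """Follow field -> type attribute -> fieldType name -> class attribute."""
    root = ET.fromstring(schema_xml)
    # Step 1: find the field element and read its type attribute.
    field = next(f for f in root.iter("field") if f.get("name") == field_name)
    type_name = field.get("type")
    # Step 2: find the fieldType whose name matches that type value.
    field_type = next(t for t in root.iter("fieldType") if t.get("name") == type_name)
    return field_type.get("class")


print(solr_class_for_field(SCHEMA, "variants"))  # solr.StrField
```

The returned class is the same value that appears in the panl.type.v property.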
Supported Solr Data Types
The following Solr field types are explicitly supported in Panl. The table below lists the configuration options that are available for each.
Additionally, all facets may be hierarchical (i.e. only appear when another facet, operand, or parameter has been selected first), and can be sorted by either their count, or their value.
| Solr Field Type | Prefix / Suffix | RANGE | DATE Range | OR | BOOL Value Replace |
|---|---|---|---|---|---|
| solr.BoolField | YES | NO | NO | NO | YES |
| solr.DatePointField | NO | NO | YES | NO | NO |
| solr.DoublePointField | YES | YES | NO | YES | NO |
| solr.FloatPointField | YES | YES | NO | YES | NO |
| solr.IntPointField | YES | YES | NO | YES | NO |
| solr.LongPointField | YES | YES | NO | YES | NO |
| solr.StrField | YES | NO [23] | NO | YES | NO |
| solr.TextField | See Notes Below [24] | | | | |
| solr.UUIDField | YES | NO | NO | YES | NO |

Notes: The solr.TextField above should only be used as a Panl field, not a facet (unless you have a specific need for it - see the Panl Cookbook for an example). If it is configured as a facet, then every word in this field will be included in the facet values.
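For implementors who want to check a configuration programmatically, the table above can be transcribed into a small lookup. This is a hedged sketch: the dictionary simply mirrors the table, while the names SUPPORTED, OPTIONS and allows are hypothetical and not part of Panl. solr.TextField is left out because the table defers to the notes for it.

```python
# Column order, transcribed from the table above.
OPTIONS = ("prefix_suffix", "range", "date_range", "or", "bool_replace")

# One row per explicitly supported Solr field type.
SUPPORTED = {
    "solr.BoolField":        (True,  False, False, False, True),
    "solr.DatePointField":   (False, False, True,  False, False),
    "solr.DoublePointField": (True,  True,  False, True,  False),
    "solr.FloatPointField":  (True,  True,  False, True,  False),
    "solr.IntPointField":    (True,  True,  False, True,  False),
    "solr.LongPointField":   (True,  True,  False, True,  False),
    "solr.StrField":         (True,  False, False, True,  False),
    "solr.UUIDField":        (True,  False, False, True,  False),
}


def allows(solr_type: str, option: str) -> bool:
    """Return True if the table marks the option as YES for the field type."""
    row = SUPPORTED.get(solr_type)
    if row is None:
        # Not in the table: treat as unsupported in this sketch.
        return False
    return row[OPTIONS.index(option)]


print(allows("solr.IntPointField", "range"))  # True
print(allows("solr.StrField", "range"))       # False
```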
Unsupported/Partially Supported Solr Field Types
Whilst the following field types aren't officially supported by Panl, they can still be returned within the results documents (i.e. configured to be fields). If they are configured to be facets, then the operation of Panl is undefined. They may work, but they are untested.
- solr.BBoxField
- solr.BinaryField
- solr.CollationField
- solr.CurrencyFieldType
- solr.DateRangeField
- solr.ExternalFileField
- solr.ICUCollationField
- solr.LatLonPointSpatialField
- solr.NestPathField
- solr.PointType
- solr.PreAnalyzedField
- solr.RankField
- solr.RptWithGeometrySpatialField
- solr.SortableTextField
- solr.SpatialRecursivePrefixTreeFieldType
IMPORTANT: The Panl generator will not generate any configuration for the above field types; you will have to configure them manually as fields or facets.
A Note on Prefixes, Suffixes, and Infixes
The prefix, suffix, or infix that you include in your facet values affects how the URL is displayed in the browser, the way in which it is encoded, and consequently its length.
Any prefix/suffix/infix with a space character in it will have the space URL path encoded as %20. For example, take the mechanical pencils collection brand facet with the following configuration:
panl.facet.b=brand
panl.or.facet.b=false
panl.range.facet.b=false
panl.name.b=Brand
panl.type.b=solr.StrField
panl.prefix.b=Manufactured by 
panl.suffix.b=\ Company
The addition URL generated for the Koh-i-Noor brand of pencil is:
/Manufactured by Koh-i-Noor Company/b/
The spaces in the prefix 'Manufactured by ' and the suffix ' Company' are URL encoded to %20, so the encoded URL becomes:
/Manufactured%20by%20Koh-i-Noor%20Company/b/
In Firefox, the address bar will display the URL with the %20 replaced by a space character - i.e. the URL path is decoded before displaying to the users.
Image: Mozilla Firefox browser showing the %20 URL encoding replaced with a space character.
For Chrome, the address bar will display the URL with the %20 intact.
Image: Google Chrome browser showing the %20 URL encoding intact.
Safari on macOS doesn't show the full URL unless you really look for it - it will just show the hostname.
Notes: The %20 encoding does make the URL less human readable, but it is still SEO friendly. You may wish to use a different character instead of a space, for example the dash character ('-'). Be aware that a dash may interfere with infixes and ranges (especially around negative values), so an underscore ('_') may be a safer choice.
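The effect of the encoding on a path segment can be reproduced with Python's standard library. This is a small sketch for experimenting with prefix and suffix choices, not the encoder that Panl itself uses; the helper name encode_segment is hypothetical.

```python
from urllib.parse import quote, unquote


def encode_segment(text: str) -> str:
    """Percent-encode a single URL path segment (spaces become %20)."""
    # safe="" so that a "/" inside a value would be encoded as well.
    return quote(text, safe="")


segment = "Manufactured by Koh-i-Noor Company"
encoded = encode_segment(segment)
print(f"/{encoded}/b/")   # /Manufactured%20by%20Koh-i-Noor%20Company/b/
print(unquote(encoded))   # Manufactured by Koh-i-Noor Company

# Dashes and underscores pass through unchanged, so an underscore
# separator adds no extra characters to the URL.
print(encode_segment("Manufactured_by_Koh-i-Noor_Company"))
```

Each space costs two extra characters once encoded, which is where the additional URL length comes from.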
Facet and Field Types
Each of the defined fields in the file can be defined as either a 'Field' or a 'Facet Field'. If the Panl field is configured to be a 'Field', then it will be returned with the documents, and no other configuration options are applicable. If it is configured as a 'Facet Field', then it can be faceted upon and, depending on the Solr field type, additional configuration properties are available.
Either a Field or facet Field may be configured to be a Search Field as well - although only if the underlying Solr field is analysed.
Search Fields
Search fields are available on either Fields or Facet Fields, and this is an additional property that is added. To configure any Solr field as a Panl Search field, add a panl.search.<lpse_code> property, along with the panl.facet.<lpse_code> or panl.field.<lpse_code> property.
# <field "indexed"="true" "stored"="false" "name"="text_author" ↩
panl.field.T=text_author
panl.search.T=text_author
panl.name.T=Author
panl.type.T=solr.TextField
Hints/Recommendations:
- The setup of a Specific Solr Search Field crosses both the <panl_collection_url>.panl.properties file and the Solr managed-schema.xml file and is used as an addition to a Field or Facet Field.
- The Solr field MUST be analysed for the facet or field to be configured as a Specific Solr Search Field in the Panl server. If the Solr field is not analysed and is configured as a Specific Solr Search Field in Panl, then the results will not be as expected.
- The recommendation is to only use fields as Specific Solr Search Fields, rather than facets unless you have a specific use case.
Fields
Fields are returned with the documents so that they may be rendered to the results page. They CAN be sorted on, but they CANNOT be faceted on. Any Solr field that is stored (i.e. stored="true" in the managed schema) may be configured as a field. Additionally, any Solr field that is also indexed (i.e. indexed="true" in the managed schema) may be set as either a facet or a field.
Multiple <panl_collection_url>.panl.properties files can be defined, each with a different configuration of facets and fields, all connecting to a single Solr search collection.
To configure any Solr field as a Panl field, use the panl.field.<lpse_code> property, rather than the panl.facet.<lpse_code> property.
# <field "indexed"="true" "stored"="true" "name"="diameter" "type"="pint" ↩
panl.field.d=diameter
panl.name.d=Diameter
panl.type.d=solr.IntPointField
The only configuration option for a field is the Panl field name - i.e. panl.name.<lpse_code> - which is a 'nicer' display name for the field.
Hints/Recommendations:
- Generally, fields that contain a lot of text are better configured as a Panl Field, unless you wish to have the words in the text as individual facets (similar to a word cloud).
- Use fields for any Solr field that you want to be able to sort on, or be returned with the documents.
- Any facet can be configured to be a field - remember that you may have multiple CaFUPs configured using it as a field or a facet in different places.
- Any field can be configured to be returned or ignored with different Panl FieldSets.
Facet Fields
Facet fields can have involved configuration depending on the Solr field type and the Panl field type. The different types of Facet fields and their configuration are explained in the following sections.
REGULAR Facets
If you are going to facet on a Solr field, then the mapped field type should be at least indexed and it is a good idea to have it stored as well, but ensure that the type is not mapped to a Solr field type that is analysed. Multi valued fields are also good to use as facets as they will allow multiple choices for faceting the results, without the need for an OR facet.
Note: The reason for not using an analysed Solr field is that, if the field is analysed, the facet values that are returned will be broken up into their word forms.[25]
A REGULAR facet definition is straightforward:
# <field "indexed"="true" "stored"="true" "name"="colours" "type"="string" "multiValued"="true" />
panl.facet.W=colours
panl.name.W=Colours
panl.prefix.W=Colours:
panl.multivalue.W=true
panl.type.W=solr.StrField
Hints/Recommendations:
- REGULAR facets
  - are easy to set up, use, and implement,
  - offer prefixes and suffixes,
  - can be used as a sort order, and
  - do not have to be returned in the result documents.
- If they are multivalued, the end user will be able to select more than one value.
- If they are single valued, they can be configured to be an OR facet, which will allow users to select more than one value.
REGULAR Facets - Multivalue
Building on the REGULAR facets above, an additional Panl configuration item is available for facets that are set as multivalued in the Solr managed schema for the collection (i.e. in the managed-schema.xml file the XML field definition element has the multiValued attribute set to true - multiValued="true").
This Solr XML field configuration flows through the Panl generator, which adds the property panl.multivalue.<lpse_code>=true to the Panl field.
If this property exists and its value is 'true', then an additional property of panl.multivalue.separator.<lpse_code> can be added, which will be picked up by the Panl server.
A snippet of the configuration for the Colours Facet field from the mechanical-pencils-multi-separator.panl.properties file:
# <field "indexed"="true" "stored"="true" "name"="colours" "type"="string" "multiValued"="true" />
panl.facet.W=colours
panl.name.W=Colours
panl.prefix.W=Colours:
panl.multivalue.W=true
panl.type.W=solr.StrField
panl.multivalue.separator.W=,
With this configuration, when Panl LPSE URLs are generated, the colours facet values are separated by a comma.
For the first Colour facet selected (Black):
And the subsequent Colour facets (Silver, White):
There is only one LPSE code - 'W' - which contains the three values. Without this multivalue separator, the three colours selected above are added as three separate LPSE path segments with three LPSE codes, generating the following URL:
http://localhost:8181/panl-results-viewer/mechanical-pencils/brandandname/Black/Silver/White/WWW/
Hints/Recommendations:
- When there are multiple combinations of multivalued Solr facet values, this shortens the URL considerably. This is especially true when a prefix or suffix is defined for the facet.
BOOLEAN Facets
BOOLEAN facets may only have one of two values, namely true or false. Panl can replace these underlying values with a more SEO friendly string.
The only Solr field type that allows true/false value replacement is solr.BoolField. The replacement values can be set with the panl.bool.<lpse_code>.true and panl.bool.<lpse_code>.false properties.
You may still assign a prefix and suffix to the BOOLEAN facet. As an example, the disassemble Solr field has the following properties in the mechanical pencils configuration:
# <field "indexed"="true" "stored"="true" "name"="disassemble" "type"="boolean" ↩
panl.facet.D=disassemble
panl.name.D=Disassemble
panl.type.D=solr.BoolField
panl.bool.D.true=able to be
panl.bool.D.false=cannot be
panl.suffix.D=\ disassembled
Without using a suffix, this field definition would be:
# <field "indexed"="true" "stored"="true" "name"="disassemble" "type"="boolean" ↩
panl.facet.D=disassemble
panl.name.D=Disassemble
panl.type.D=solr.BoolField
panl.bool.D.true=able to be disassembled
panl.bool.D.false=cannot be disassembled
There is no difference between the two definitions with respect to implementation or the size of the JSON response object, it is just a matter of preference for the implementor.
Hints/Recommendations:
- For BOOLEAN facets, use the true and false value replacements where it makes sense.
- Not all BOOLEAN facets need a value replacement, for example if the field is not used often or has no value from an SEO perspective.
- If you want to shorten the URL path part further, replace the true/false values with single characters - e.g. 1/0 or y/n
- BOOLEAN replacement values are case-sensitive
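To make the equivalence of the two definitions above concrete, the following sketch builds and parses the URL text for a BOOLEAN facet. It is an illustration under stated assumptions, not Panl's implementation: the function names are hypothetical, and the leading space in the suffix comes from the "\ " escape in the properties file.

```python
from typing import Optional


def encode_bool(value: bool, true_text: str, false_text: str,
                prefix: str = "", suffix: str = "") -> str:
    """Build the URL text for a boolean value using replacement strings."""
    return prefix + (true_text if value else false_text) + suffix


def decode_bool(text: str, true_text: str, false_text: str,
                prefix: str = "", suffix: str = "") -> Optional[bool]:
    """Reverse encode_bool; returns None when the text does not match."""
    if not (text.startswith(prefix) and text.endswith(suffix)):
        return None
    core = text[len(prefix):len(text) - len(suffix)]
    if core == true_text:      # comparison is case-sensitive
        return True
    if core == false_text:
        return False
    return None


# First definition: replacement values plus a suffix.
with_suffix = encode_bool(True, "able to be", "cannot be", suffix=" disassembled")
# Second definition: the suffix folded into the replacement values.
without_suffix = encode_bool(True, "able to be disassembled", "cannot be disassembled")

print(with_suffix)                     # able to be disassembled
print(with_suffix == without_suffix)   # True
```

Because the comparison is an exact string match, a value such as 'Able to be disassembled' would not decode in this sketch, which mirrors the case-sensitivity hint above.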
BOOLEAN Facets - Checkbox
This is only useful when you want to select only one of the true/false values or no value at all. Even though a BOOLEAN facet has only two values, there are actually three states that a BOOLEAN facet can have:
- 'True' selected - only those results with a true value will be returned
- 'False' selected - only those results with a false value will be returned
- Not selected - all results are returned which have either a true or false value.
Using a checkbox is different from using the in-built Panl functionality, where you may select 'true', select 'false', or remove the selection. With a checkbox, only one of the two values (either true or false, as configured) can be selected, or nothing at all.
This is a good use case if you want to emphasise only one of the values. In either of the below cases, any BOOLEAN facet can be turned into a checkbox, provided that you understand how this impacts the facet selection.
True Value BOOLEAN Facet Checkbox Example
As an example, shopping sites may have a 'Speedy Delivery' checkbox, which will filter those results which are available for speedy delivery; however, the shopping site does not wish to highlight the results that do not qualify for speedy delivery. Hence the facet may be selected as either 'True' or no facet value at all.
This is implemented in the Bookstore example (book-store.panl.properties file)
# <field "indexed"="true" "stored"="true" "name"="speedy_delivery" "type"="boolean" "multiValued"="false" />
panl.facet.V=speedy_delivery
panl.name.V=Speedy Delivery
panl.type.V=solr.BoolField
panl.bool.V.true=Speedy Delivery
panl.bool.V.false=Regular Delivery
panl.bool.checkbox.V=true
The panl.bool.checkbox.V=true will enable Panl to pass through additional JSON keys on this facet so that the user interface (and the implementor) can automatically generate the checkbox with the correct links.
False Value BOOLEAN Facet Checkbox Example
As an example, shopping sites may have items which are on backorder, thus making them unavailable for immediate delivery, but still deliverable once the backorder has been fulfilled. For the user experience, you may wish to present an 'Exclude items on backorder' checkbox. When checked, this will set the boolean value for 'backorder' to 'false'. If unchecked, then all items will be shown, both those on backorder and those not on backorder.
For the Bookstore implementation, the configuration is highlighting the False value (i.e. those that are NOT on backorder):
# <field "indexed"="true" "stored"="true" "name"="on_backorder" "type"="boolean" "multiValued"="false" />
panl.facet.O=on_backorder
panl.name.O=On Backorder
panl.type.O=solr.BoolField
panl.bool.O.true=On Backorder
panl.bool.O.false=In Stock
panl.bool.checkbox.O=false
The panl.bool.checkbox.O=false will enable Panl to pass through additional JSON keys on this facet so that the user interface can automatically generate the checkbox with the correct links.
Hints/Recommendations:
- This is a good way to emphasise the positives (or exclude the negatives) within a faceted search.
- You may still render both the true/false values (including prefixes and suffixes) and ignore the BOOLEAN checkbox altogether.
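The difference between the two checkbox examples can be sketched as a function from the checkbox state to a filter. This is a simplified illustration: the field:value form follows standard Solr query syntax, but how Panl builds its queries internally is an assumption here, and the function name checkbox_filter is hypothetical.

```python
from typing import Optional


def checkbox_filter(field: str, checkbox_value: bool, checked: bool) -> Optional[str]:
    """Return a Solr-style filter for a checkbox facet, or None if unchecked.

    checkbox_value mirrors the panl.bool.checkbox.<lpse_code> setting:
    the single value that the checkbox selects when it is ticked.
    """
    if not checked:
        # Unchecked: no filter at all, so both true and false documents match.
        return None
    return f"{field}:{'true' if checkbox_value else 'false'}"


# 'Speedy Delivery' checkbox (panl.bool.checkbox.V=true)
print(checkbox_filter("speedy_delivery", True, checked=True))   # speedy_delivery:true
# 'Exclude items on backorder' checkbox (panl.bool.checkbox.O=false)
print(checkbox_filter("on_backorder", False, checked=True))     # on_backorder:false
# Either checkbox left unticked
print(checkbox_filter("on_backorder", False, checked=False))    # None
```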
RANGE Facets
RANGE facets allow the end user to filter the results of the facet by a range of values and have the most Panl configuration options available. Whilst ranges are available on String types of data in Solr, the only usage in Panl is with integer or floating point numbers.
RANGE facets will also return the individual values for each of the ranges as a REGULAR facet. If you do not want the REGULAR facet values to be returned as well then set the property panl.range.suppress.<lpse_code>=true.
# <field "indexed"="true" "stored"="true" "name"="weight" "type"="pint" ↩
panl.facet.w=weight
panl.name.w=Weight
panl.type.w=solr.IntPointField
panl.suffix.w=\ grams
panl.range.facet.w=true
panl.range.min.w=10
panl.range.max.w=50
panl.range.prefix.w=weighing from
panl.range.infix.w=\ to
panl.range.suffix.w=\ grams
panl.range.min.value.w=from light
panl.range.max.value.w=heavy pencils
panl.range.min.wildcard.w=true
panl.range.max.wildcard.w=true
panl.range.suppress.w=false
Hints/Recommendations:
- You MUST set a minimum and maximum value. Fixed values are useful when you know the possible values ahead of time; however, by using the dynamically generated minimum and maximum you can also present an accurate result.
- Use sparingly, and where it makes sense. Ranges can filter the results down to zero documents if the range of values in the documents falls slightly outside the provided values.
- If there is a large number of disparate values then a range facet may be useful, if there are only a few values, then a REGULAR facet may suffice.
- Derived fields and ranges can also be another option for a range facet, with the dataset being used to generate static ranges and then stored in the Solr field.
- The minimum value replacement will only work if the range value matches those values - i.e. it will not work with dynamic range values unless the dynamic range value matches the minimum value.
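The interplay of the range prefix, infix, suffix and the minimum/maximum value replacements can be sketched as below. This rests on assumptions: the way the replacement text substitutes for the prefix-plus-value (or value-plus-suffix) is inferred from the property values above rather than confirmed, the spacing around the prefix and infix is assumed, and the dictionary keys and function name are hypothetical.

```python
# Hypothetical in-memory form of the weight facet's range properties.
# The trailing spaces on "prefix" and "infix" are an assumption.
WEIGHT = {
    "min": 10,
    "max": 50,
    "prefix": "weighing from ",
    "infix": " to ",
    "suffix": " grams",
    "min_value": "from light",      # panl.range.min.value.w
    "max_value": "heavy pencils",   # panl.range.max.value.w
}


def range_text(low: int, high: int, cfg: dict) -> str:
    """Build the URL text for a selected range (assumed behaviour)."""
    # The replacement only applies when the value equals the configured bound.
    if low == cfg["min"] and cfg.get("min_value"):
        left = cfg["min_value"]
    else:
        left = f'{cfg["prefix"]}{low}'
    if high == cfg["max"] and cfg.get("max_value"):
        right = cfg["max_value"]
    else:
        right = f'{high}{cfg["suffix"]}'
    return left + cfg["infix"] + right


print(range_text(18, 35, WEIGHT))  # weighing from 18 to 35 grams
print(range_text(10, 50, WEIGHT))  # from light to heavy pencils
print(range_text(10, 35, WEIGHT))  # from light to 35 grams
```

The last hint above follows from the equality check: a dynamically generated minimum of, say, 12 would not match the configured minimum of 10, so the replacement text would not be used.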
DATE Range Facets
Solr represents the values of a date field (of fieldType DatePointField) as strings expressed in Coordinated Universal Time (UTC - i.e. YYYY-MM-DDThh:mm:ssZ). An example value: 1972-05-20T17:33:18Z.
When you choose this for a facet, each of the values will be returned to the exact second without being able to be rolled up to a day, month, or year. This leads to a very long list of facet values, potentially one for each of the returned result documents. Consequently Panl will not return any faceting information from Solr for this field; however, it will add information for the configured date range to the returned JSON object. This will allow the date range to be implemented on the front end.
# <field "indexed"="true" "stored"="true" "name"="solr_date" "type"="pdate" ↩
panl.facet.S=solr_date
panl.name.S=Solr Date
panl.type.S=solr.DatePointField
panl.date.S.previous=previous
panl.date.S.next=next
panl.date.S.years=\ years
panl.date.S.months=\ months
panl.date.S.days=\ days
panl.date.S.hours=\ hours
Both the panl.date.<lpse_code>.previous and panl.date.<lpse_code>.next properties must be set for the DATE Range facet to be active, however they do not have to be implemented on the front-end.
IMPORTANT: Panl will NOT request faceting on any Date field types, which means that they will not be returned in the base Solr response object; however, they can be returned in the field list of Solr document results. Date field types that are defined as facets within the properties file can be used to return RANGE facet values from NOW +/- a specific period.
Hints/Recommendations:
- If a field has a Solr fieldType of solr.DatePointField, it will ALWAYS be configured to be a DATE Range facet.
- If you want to have a date range of an arbitrary timeframe - say 3 months to 6 months ago, then you will need to derive a field based on that value and then set a range for that field. For example, if you wanted to range on months then derive a field of month and set it to year * 12 + month - i.e. 1 would be January, year 0 whilst 24295 is July 2024.
- Derived fields based on the dates can be very effective, especially when used as a hierarchy.
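The derived month field suggested above (year * 12 + month) can be computed at indexing time. The following is a minimal sketch, assuming the derivation happens in your own indexing code before the document is sent to Solr; the function names are hypothetical.

```python
from datetime import datetime, timezone


def month_index(year: int, month: int) -> int:
    """Derive a single sortable integer for a year and month."""
    return year * 12 + month


def month_index_from_solr_date(value: str) -> int:
    """Derive the month index from a Solr UTC date string."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return month_index(parsed.year, parsed.month)


def from_month_index(index: int) -> tuple:
    """Reverse the derivation, returning (year, month)."""
    year, month_zero_based = divmod(index - 1, 12)
    return year, month_zero_based + 1


print(month_index(2024, 7))                                # 24295
print(month_index_from_solr_date("1972-05-20T17:33:18Z"))  # 23669
print(from_month_index(24295))                             # (2024, 7)
```

Because consecutive months map to consecutive integers, a range of '3 months to 6 months ago' becomes a plain integer range on the derived field.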
OR Facets
OR facets allow an end user to increase the number of results by choosing a single facet value OR another facet value for the same facet. If there are other values available for selection within this facet, then they will appear.
# <field "indexed"="true" "stored"="true" "name"="brand" "type"="string" ↩
panl.facet.b=brand
panl.or.facet.b=true
panl.name.b=Brand
panl.type.b=solr.StrField
OR facets work in conjunction with each other: if you have multiple OR facets configured for a Panl collection, then the OR applies within each specific facet, not across facets.
Example:
Suppose two facets are set as OR facets, for example Manufacturer and Mechanism Type. Multiple values for either of the two facets can be selected, provided that the Manufacturers that are selected also have documents with the Mechanism Type.
If you were to choose 'BIC' OR 'OHTO' as pencil manufacturers, and 'Click' OR 'None' for mechanism types, then the query would be of the form:
Select all pencils that are manufactured by BIC or OHTO AND that have a Click or None mechanism type.
Hints/Recommendations:
- Use OR facets to increase the number of results that are returned.
- Remember that OR facets only return more facets if there are additional values within the dataset.
- OR facets will not return additional facets if any other separate facet is selected.
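The 'OR within a facet, AND across facets' behaviour from the example above can be written out as a query string. This is a sketch of the logic only: the boolean syntax follows the standard Solr query parser, the mechanism_type field name is hypothetical, and the query that Panl actually sends to Solr may be structured differently.

```python
def or_clause(field: str, values: list) -> str:
    """Join the selected values of one facet with OR."""
    return "(" + " OR ".join(f'{field}:"{value}"' for value in values) + ")"


def combined_query(selections: dict) -> str:
    """Join the per-facet OR clauses with AND."""
    return " AND ".join(or_clause(field, values) for field, values in selections.items())


query = combined_query({
    "brand": ["BIC", "OHTO"],
    "mechanism_type": ["Click", "None"],   # hypothetical Solr field name
})
print(query)
# (brand:"BIC" OR brand:"OHTO") AND (mechanism_type:"Click" OR mechanism_type:"None")
```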
OR Facets - Separator
OR Facets can change the way that they are represented in the URL by setting text to be used as a separator as opposed to using path segments. If you had the following configuration (based on the mechanical pencils Panl configuration), the brand facet is configured with both a prefix and a suffix:
# <field "indexed"="true" "stored"="true" "name"="brand" "type"="string" "multiValued"="false" />
panl.facet.b=brand
panl.or.facet.b=true
panl.name.b=Brand
panl.type.b=solr.StrField
panl.prefix.b=Manufactured by 
panl.suffix.b=\ Company
Example:
If you were to then select all Brands with Kaweco OR Rotring, the Panl LPSE URL would look like the following:
/Manufactured by Kaweco Company/Manufactured by Rotring Company/bb/
Using the above configuration, if you were to configure the separator as a comma followed by the word 'or' (', or ') as below:
# <field "indexed"="true" "stored"="true" "name"="brand" "type"="string" "multiValued"="false" />
panl.facet.b=brand
panl.or.facet.b=true
panl.name.b=Brand
panl.type.b=solr.StrField
panl.prefix.b=Manufactured by 
panl.suffix.b=\ Company
panl.or.separator.b=, or 
Then the URL generated for the exact same search on the brand of Kaweco OR Rotring would become:
/Manufactured by Kaweco, or Rotring Company/b/
Not only is there now only one LPSE code, but the URL is also much shorter.
Hints/Recommendations:
- This can be a good option if a prefix and/or suffix is set for a facet as it will drastically reduce the length of the URL.
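The two URL shapes shown above can be reproduced with a short sketch. It assumes, based on the example URLs, that the prefix and suffix are applied once around the joined values when a separator is configured; the function name or_path is hypothetical and the output is shown before URL encoding, as in the examples.

```python
from typing import Optional


def or_path(values: list, code: str, prefix: str = "", suffix: str = "",
            separator: Optional[str] = None) -> str:
    """Build the (unencoded) URL path part for an OR facet selection."""
    if separator is None:
        # One path segment and one LPSE code per selected value.
        segments = [f"{prefix}{value}{suffix}" for value in values]
        return "/" + "/".join(segments) + "/" + code * len(values) + "/"
    # One path segment and one LPSE code for all values.
    return "/" + prefix + separator.join(values) + suffix + "/" + code + "/"


brands = ["Kaweco", "Rotring"]
print(or_path(brands, "b", "Manufactured by ", " Company"))
# /Manufactured by Kaweco Company/Manufactured by Rotring Company/bb/
print(or_path(brands, "b", "Manufactured by ", " Company", separator=", or "))
# /Manufactured by Kaweco, or Rotring Company/b/
```

The saving grows with every additional value, because the prefix and suffix are no longer repeated per value.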
Other Facet Options
Hierarchical Facets
Hierarchical facets allow one facet to only appear if another facet, parameter, or operand has already been selected. In the Bookstore example, the book series that an author has published will only appear if an author facet has been selected.
See the panl.when.<lpse_code> section for configuration options.
Hints/Recommendations:
- Any facet can be made hierarchical.
- This is useful when
- you have a lot of facets and not all of them need to appear on the search page, or
- you want to guide a user through the search results (for example let the user select a year, then a month, then a day).
Unless Facets
Unless facets allow a facet to appear until another specified facet, parameter, or operand is selected, at which point it is no longer shown.
See the panl.unless.<lpse_code> section for configuration options.
Hints/Recommendations:
- Any facet can be made an Unless facet.
- This is useful when you want to display a facet up to a certain point in the user's journey.
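The visibility rules for Hierarchical ('when') and Unless facets can be summarised in one small function. This is a conceptual sketch only: it assumes that each rule is a list of LPSE codes checked against the codes currently selected in the URL, and the codes 'A' (author) and 'S' (series) are hypothetical. See the panl.when.<lpse_code> and panl.unless.<lpse_code> sections for the actual property formats.

```python
from typing import Iterable, Optional


def is_visible(selected_codes: Iterable[str],
               when: Optional[Iterable[str]] = None,
               unless: Optional[Iterable[str]] = None) -> bool:
    """Decide whether a facet should be shown for the current selection."""
    selected = set(selected_codes)
    # Hierarchical: hidden until at least one of the 'when' codes is selected.
    if when and not (set(when) & selected):
        return False
    # Unless: shown until one of the 'unless' codes is selected.
    if unless and (set(unless) & selected):
        return False
    return True


# A series facet that only appears once an author ('A') has been selected.
print(is_visible([], when=["A"]))         # False
print(is_visible(["A"], when=["A"]))      # True
# A facet that disappears once a series ('S') has been selected.
print(is_visible(["A"], unless=["S"]))    # True
print(is_visible(["S"], unless=["S"]))    # False
```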
Facet Sorting
By default Solr sorts the facet results by 'count' - i.e. the number of documents that have this facet value. This can be set to 'index' which will sort on the facet value.
This is a distinct property from the facets or fields that you would want to be able to sort the result documents on. This sorts a specific facet, not the documents.
For example, in the Bookstore Panl configuration the brand facet is configured with panl.facetsort.A=index which will sort the returned facets by their facet values (i.e. index). Below is an image showing the difference between the facet sorting options.
Image: Images showing the difference between sorting on index (left), and count (right).
See the panl.facetsort.<lpse_code> property for configuration options.
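The difference between the two sort options can be illustrated with a few facet values. The counts below are made up for the illustration, and breaking ties on the facet value is a choice made for this sketch; the actual ordering is performed by Solr, not by client code.

```python
# Hypothetical facet values and document counts.
FACETS = [("Rotring", 12), ("BIC", 30), ("Kaweco", 12), ("OHTO", 5)]


def sort_facets(facet_counts: list, mode: str = "count") -> list:
    """Order facet values by 'count' (highest first) or by 'index' (value)."""
    if mode == "index":
        return sorted(facet_counts, key=lambda item: item[0])
    # Highest count first; equal counts fall back to the facet value.
    return sorted(facet_counts, key=lambda item: (-item[1], item[0]))


print([value for value, _ in sort_facets(FACETS, "count")])
# ['BIC', 'Kaweco', 'Rotring', 'OHTO']
print([value for value, _ in sort_facets(FACETS, "index")])
# ['BIC', 'Kaweco', 'OHTO', 'Rotring']
```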
~ ~ ~ * ~ ~ ~