SOLR Documentation

Thomas Scherz edited this page Mar 15, 2023 · 40 revisions

SOLR solr.xml

solr.xml file prior to starting your servlet container.

$SOLR_HOME/lib

SOLR Cores, schema.xml, and solrconfig.xml

conf.properties

The solrconfig.xml file is the configuration file with the most parameters affecting Solr itself.

While configuring Solr, you’ll work with solrconfig.xml often, either directly or via the Config API to create "configuration overlays" (configoverlay.json) to override the values in solrconfig.xml.
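As a rough illustration of the Config API route, the Ruby sketch below builds (but does not send) a request that sets one property in the overlay. The host, port, collection name "collection1", and the chosen property are assumptions for illustration; check the Config API reference for the properties your Solr version allows.

```ruby
require "json"
require "net/http"
require "uri"

# Build a Config API request that sets a single property in configoverlay.json.
# Assumptions: Solr at localhost:8983 and a collection named "collection1".
def config_overlay_request(property, value, base_url: "http://localhost:8983/solr/collection1")
  uri = URI.join(base_url + "/", "config")
  request = Net::HTTP::Post.new(uri)
  request["Content-Type"] = "application/json"
  request.body = JSON.generate("set-property" => { property => value })
  request
end

request = config_overlay_request("updateHandler.autoCommit.maxTime", 15_000)
puts request.path
puts request.body

# To actually send it (requires a running Solr):
# Net::HTTP.start(request.uri.hostname, request.uri.port) { |http| http.request(request) }
```

Because the value lands in configoverlay.json, solrconfig.xml itself stays untouched and the override can be removed later without editing the file.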

In solrconfig.xml, you configure important features such as:

  • request handlers, which process the requests to Solr, such as requests to add documents to the index or requests to return results for a query

      • search:

          • qf and pf: used by default if the client does not specify otherwise. The default blacklight_config uses these for the "keywords" search.

          • author_qf/author_pf and title_qf: the default blacklight_config specifies these for the author and title searches.

          • defType

              • disMax

          • facet

          • spellcheck

  • updateHandler: determines how index updates are committed. Documents are not searchable until they are committed.

  • dataDir and directoryFactory: where and how Solr stores its indexes.

  • listeners, processes that "listen" for particular query-related events; listeners can be used to trigger the execution of special code, such as invoking some common queries to warm up caches

  • the Request Dispatcher for managing HTTP communications
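Since qf and pf are only defaults, a client can override them per request. The sketch below builds the query string for such a request; the field names, boosts, and the choice of dismax are hypothetical examples, not values taken from this installation.

```ruby
require "uri"

# Build the query string for a Solr /select request that overrides the
# handler's default qf/pf. Field names and boosts are hypothetical.
def dismax_params(query, qf:, pf: nil, def_type: "dismax")
  params = { "q" => query, "defType" => def_type, "qf" => qf.join(" ") }
  params["pf"] = pf.join(" ") if pf
  URI.encode_www_form(params)
end

puts dismax_params("moby dick", qf: ["title_tesim^10", "author_tesim"], pf: ["title_tesim^20"])
```

The resulting string would be appended to a URL such as http://localhost:8983/solr/collection1/select? (assumed host and collection). When pf is omitted, the handler's own default applies.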

The solrconfig.xml file is located in the conf/ directory for each collection. Several well-commented example files can be found in the server/solr/configsets/ directories demonstrating best practices for many different types of installations.

/var/solr

Solr Uses Managed Schema by Default

When a <schemaFactory/> is not explicitly declared in a solrconfig.xml file, Solr implicitly uses a ManagedIndexSchemaFactory, which is by default "mutable" and keeps schema information in a managed-schema file.

  <schemaFactory class="ManagedIndexSchemaFactory">
    <bool name="mutable">true</bool>
    <str name="managedSchemaResourceName">managed-schema</str>
  </schemaFactory>

If you wish to explicitly configure ManagedIndexSchemaFactory, the following options are available:

mutable - controls whether changes may be made to the Schema data. This must be set to true to allow edits to be made with the Schema API.

managedSchemaResourceName is an optional parameter that defaults to "managed-schema", and defines a new name for the schema file that can be anything other than “schema.xml”.

With the default configuration shown above, you can use the Schema API to modify the schema as much as you want, and then later change the value of mutable to false if you wish to "lock" the schema in place and prevent future changes.
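To make the Schema API concrete, here is a minimal sketch that builds the JSON body for an "add-field" command. The field name, field type, host, and collection are hypothetical; the request would only be accepted while mutable is true.

```ruby
require "json"

# Build the JSON body for a Schema API "add-field" command.
# The field name and type are hypothetical examples.
def add_field_command(name, type, stored: true, indexed: true, multi_valued: false)
  JSON.generate(
    "add-field" => {
      "name" => name,
      "type" => type,
      "stored" => stored,
      "indexed" => indexed,
      "multiValued" => multi_valued
    }
  )
end

puts add_field_command("subject_text", "text_general", multi_valued: true)
# POST this body with Content-Type: application/json to
# http://localhost:8983/solr/collection1/schema (assumed host and collection).
```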

Classic schema.xml

An alternative to using a managed schema is to explicitly configure a ClassicIndexSchemaFactory. ClassicIndexSchemaFactory requires the use of a schema.xml configuration file, and disallows any programmatic changes to the Schema at run time. The schema.xml file must be edited manually and is loaded only when the collection is loaded.

Switching from schema.xml to Managed Schema

If you have an existing Solr collection that uses ClassicIndexSchemaFactory, and you wish to convert to use a managed schema, you can simply modify the solrconfig.xml to specify the use of the ManagedIndexSchemaFactory.

Once Solr is restarted and it detects that a schema.xml file exists, but the managedSchemaResourceName file (i.e., “managed-schema”) does not exist, the existing schema.xml file will be renamed to schema.xml.bak and the contents are re-written to the managed schema file. If you look at the resulting file, you’ll see this at the top of the file:

  <!-- Solr managed schema - automatically generated - DO NOT EDIT -->

You are now free to use the Schema API as much as you want to make changes, and remove the schema.xml.bak.

Admin Web interface

SOLR Binaries and server configuration

/opt/solr

GUI : /admin/info/

Logger, Core Admin, Query

Blacklight Integration

https://github.com/projectblacklight/blacklight/wiki/Indexing-your-data-into-solr

class MyModel < ActiveRecord::Base
  after_save :index_record
  before_destroy :remove_from_index

  attr_accessible :field_i_want_to_index

  def to_solr
    # *_texts here is a dynamic field; dynamic fields are declared in the Solr schema (schema.xml or managed-schema)
    {'id' => id,
     'field_i_want_to_index_texts' => field_i_want_to_index}
  end

  def index_record
    SolrService.add(self.to_solr)
    SolrService.commit
  end

  def remove_from_index
    SolrService.delete_by_id(self.id)
    SolrService.commit
  end
end
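
The model above relies on a SolrService object that is not defined on this page. The following is a hypothetical stand-in, not Blacklight's own code, showing one way those three calls could map onto Solr's JSON update handler. The host, port, and core name "collection1" are assumptions.

```ruby
require "json"
require "net/http"
require "uri"

# Hypothetical stand-in for the SolrService used above; not Blacklight's own code.
# Assumes Solr at localhost:8983 with a core named "collection1".
module SolrService
  UPDATE_URI = URI("http://localhost:8983/solr/collection1/update")

  # Each *_body builder returns a JSON body for Solr's /update handler.
  def self.add_body(doc)
    JSON.generate([doc])
  end

  def self.delete_by_id_body(id)
    JSON.generate("delete" => { "id" => id.to_s })
  end

  def self.commit_body
    JSON.generate("commit" => {})
  end

  def self.add(doc)
    post(add_body(doc))
  end

  def self.delete_by_id(id)
    post(delete_by_id_body(id))
  end

  def self.commit
    post(commit_body)
  end

  # Send a JSON body to the update handler and return the HTTP response.
  def self.post(body)
    Net::HTTP.post(UPDATE_URI, body, "Content-Type" => "application/json")
  end
end

# Only builds the body; nothing is sent to Solr here.
puts SolrService.add_body("id" => 1, "field_i_want_to_index_texts" => "hello")
```

Committing after every save keeps the example simple, but it may be slow under heavy writes; Solr's commitWithin parameter or autoCommit settings are possible alternatives.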

/app/controllers/catalog_controller.rb (results per page, fields to search, sort fields)
https://github.com/projectblacklight/blacklight/wiki/Configuration---Solr-fields

Steps to Update

  1. Set a FREEZE on Scholar content.
  2. Deploy with cap deploy prod
  3. Log in to the web server
  4. Change the login route from /login to /error (on both web servers)
  5. Run rake hyrax:count RAILS_ENV=production to get a count of all works in each work type.
  6. Run rake scholar:pidsout RAILS_ENV=production
  7. Log in to the Solr server

  8. As the root user, stop solr with service solr stop

  9. Back up /var/solr to ... tar -cvf solr.tar solr
  10. cd /opt
  11. Delete the existing /opt/solr symlink... rm solr

  12. Download solr-8.11.1.tgz from https://solr.apache.org/downloads.html

  13. Uncompress solr-8.11.1.tgz to /opt/solr-8.11.1

  14. Create a new symlink for /opt/solr-8.11.1 to /opt/solr.... ln -s solr-8.11.1 solr
  15. As the root user, start solr with service solr start
  16. Delete everything from Solr using the following curl command:
  17. curl "http://localhost:8983/solr/collection1/update?commit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'

  18. Log in to the web server

  19. Run bundle exec rails console production on the web server

  20. Run the Solr reindex (ActiveFedora::Base.reindex_everything)
  21. If the reindex doesn’t run, then you need to create a new core on Solr with the existing name, pointing to the correct XML files
  22. If there is anything wrong with the results, then run rake scholar:resave
  23. Run rake hyrax:count to get a count of all works in each work type and compare results.
  24. Confirm upgrade was successful and release the FREEZE.
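
The comparison of counts before and after the reindex can be sketched as below. The output format of rake hyrax:count is not shown on this page, so the sketch assumes the counts have been copied by hand into hashes of work type to count; the work type names and numbers are hypothetical.

```ruby
# Compare per-work-type counts taken before and after the reindex.
# Returns a hash of work types whose counts differ, mapped to [before, after].
def count_differences(before, after)
  (before.keys | after.keys).each_with_object({}) do |work_type, diffs|
    old_count = before.fetch(work_type, 0)
    new_count = after.fetch(work_type, 0)
    diffs[work_type] = [old_count, new_count] unless old_count == new_count
  end
end

# Hypothetical counts, typed in by hand from the two rake hyrax:count runs.
before = { "Article" => 120, "Dataset" => 45, "Image" => 300 }
after  = { "Article" => 120, "Dataset" => 44, "Image" => 300 }

differences = count_differences(before, after)
if differences.empty?
  puts "Counts match; safe to release the freeze."
else
  differences.each { |type, (was, now)| puts "#{type}: #{was} before, #{now} after" }
end
```

A work type missing from one of the runs is treated as a count of zero, so it shows up as a difference rather than being silently skipped.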

Steps to Maintain

Steps to Configure

Tutorials

https://www.nopio.com/blog/create-search-application-blacklight/

https://solr.apache.org/guide/solr/latest/getting-started/solr-tutorial.html

https://www.solrtutorial.com/solr-in-5-minutes.html