Enterprise Search in
Plone using Solr
Calvin Hendryx-Parker
Plone Conference 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Java Based
• Full-Text Search
• Web Services API
• Standards Based Interfaces
• Scalable
• XML Configuration
• Extensible
What is Solr?
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Indexing
• Query
Playing with Solr
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Data Schema
• Faceted Search
• Administrative
Interface
• Incremental Updates
• Supports Sharding
• Index Databases, Local
Files and Web Pages
• Supports Multiple
Indexes
Solr Features
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Stopwords
• Synonyms
• Highlighted Context
Snippets
• Spelling Suggestions
• More Like This
Suggestions
• Supports Rich
Documents
Solr Features
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010Solr Performance
• Wiktionary Dataset
• 49.5 Millions lines of XML
• 1.3 GB of data
• 1.7 Million Pages Indexed in 5.5 hours
• ZODB Size after import 1.1GB
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
collective.solr
Integration Options
with Plone
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Monkey Patching
• Relies on collective.indexing
• Duplicates all indexes
• Sub-Optimal Integration with Zope Transactions
• Relies on Thread Locals
collective.solr Issues
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
What to do?
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Reevaluate
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• No Monkey Patching
• Simpler Code
Solr Integration as a
Catalog Index
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• ZCatalog Index
• Doesn't depend on
Plone
• Utilizes new
foreign_connections
Connection Method
• Pass through Solr
Queries
• Direct access to the
Solr Response
Enter alm.solrindex
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Still handled by the ZCatalog
• Could change in the future
Sorting
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Handle Parsing Attributes for Indexing
• Translate field-specific queries to Solr
• Registered as Zope Utilities
alm.solrindex Field
Handlers
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
<html>
<body>
<h3>Code Sample</h3>
<p>Replace this text!</p>
</body>
</html>
Example Handler
class TextFieldHandler(DefaultFieldHandler):
def parse_query(self, field, field_query):
name = field.name
request = {name: field_query}
record = parseIndexRequest(request, name, ('query',))
if not record.keys:
return None
query_str = ' '.join(record.keys)
if not query_str:
return None
return {'q': u'+%s:%s' % (name, quote_query(query_str))}
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• GenericSetup Profile
• Tests
• Uses solrpy instead of
the unsupported
solr.py
Other alm.solrindex
Features
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
• Can replace several ZCatalog indexes
• Remove any indexes you have replaced
• Use it for all Text Indexes
• Still Utilize the ZCatalog Indexes for Everything Else
Tips
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Demo
Project Gutenburg Data
Wednesday, October 27, 2010
PLONE CONFERENCE 2010
Questions?
Wednesday, October 27, 2010
Check out
sixfeetup.com/demos
Wednesday, October 27, 2010

Enterprise search in plone using solr