Many of our customers have been using WebCenter Content for a long time, so their content repositories now hold a large number of documents. According to IDC, the unstructured data growth rate is 24%, which, compounded annually, means the data doubles roughly every three to four years. With that in mind, we have introduced a new option for searching in WebCenter Content: the Elasticsearch engine.
WebCenter Content supports a variety of search engines, including DATABASE.METADATA, DATABASE.FULLTEXT, and ORACLETEXTSEARCH. Of these, ORACLETEXTSEARCH provides rich search capabilities, including full-text search with relevancy ranking, complex query structures, and better performance than DATABASE.FULLTEXT. However, in a large enterprise setup where content items run into the millions and the ingestion rate is high, customers find that rebuilding the ORACLETEXTSEARCH index is a multi-day activity. Adding a new metadata field is also a challenge, because it requires a fast rebuild, which is itself a time-consuming process.
Integrating WebCenter Content with Elasticsearch gives customers many more benefits from a search and performance perspective. The main benefits of Elasticsearch are:
- Elasticsearch is an open-source, broadly distributable, readily scalable, enterprise-grade search engine. It is one of the most popular enterprise search engines on the market today.
- It scales out: you can add resources and balance the load across the nodes in a cluster, so it can handle more documents easily.
- Many customers have faced indexing challenges because indexing a large document set is a multi-day activity. Elasticsearch speeds up both indexing and index rebuilds.
- It helps prevent index corruption. If an index does become corrupted, rebuilding it takes less time and does not affect the main content server.
- Because the Elasticsearch engine runs on a separate machine or cluster, indexing and searching do not affect your content server's performance.
- And many more…
For more information please visit: https://www.elastic.co/
WebCenter Content communicates with Elasticsearch through the REST APIs that Elasticsearch provides. The WebCenter Content APIs and services exposed to users remain the same. While the APIs and user interfaces remain mostly untouched with Elasticsearch, rebuild time is reduced significantly. Users also get improved, near real-time search responses.
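To give a feel for what talking to Elasticsearch over REST looks like, here is a minimal Python sketch that builds a full-text search request for the standard Elasticsearch `_search` endpoint. It is not the code WebCenter Content runs internally. The server address, the index name `wcc-documents`, and the field name `content` are all assumptions made up for illustration.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of the Elasticsearch server
INDEX = "wcc-documents"           # hypothetical index name, for illustration only


def build_search_request(text, size=10):
    """Build (but do not send) a full-text search request for Elasticsearch."""
    body = {
        "size": size,
        # "content" is a hypothetical field holding the extracted document text.
        "query": {"match": {"content": text}},
    }
    return urllib.request.Request(
        url=f"{ES_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_search_request("quarterly report")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To send it to a running cluster, you could do:
    # with urllib.request.urlopen(req) as resp:
    #     hits = json.load(resp)["hits"]["hits"]
```

The request is only constructed here, so the sketch runs without a live cluster. The commented lines show how you might send it once a server is available.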
With Elasticsearch, the Indexer Rebuild dialog has two check boxes: Use fast rebuild and Full rebuild with content extraction. You can access this dialog box through Repository Manager by selecting Indexer, then Collection Rebuild Cycle, and then Start.
The Fast Rebuild feature allows the search engine to add new information to the search collection without requiring a full collection rebuild. A fast rebuild is required when you add or remove searchable fields. To run a fast rebuild, open the Collection Rebuild Cycle window, select the Use fast rebuild check box, and click OK.
The Full rebuild with content extraction option fully rebuilds the search index. It extracts content and pushes it to the new index on the Elasticsearch server using the metadata. This is a time-consuming task, so use it with extreme caution. To run a full rebuild, open the Collection Rebuild Cycle window, select the Full rebuild with content extraction check box, and click OK.
The third option, an Elasticsearch reindex, uses the Elasticsearch API to reindex an existing collection into a new collection. For reindexing, it reuses the already extracted content and metadata available in the active collection. Since this option doesn’t need to extract content, it’s a faster alternative to a full rebuild. To run an Elasticsearch reindex, open the Collection Rebuild Cycle window, leave both check boxes cleared, and click OK.
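For readers curious about what a reindex looks like at the Elasticsearch level, the sketch below builds a request for the standard Elasticsearch `_reindex` endpoint, which copies documents from a source index to a destination index. How WebCenter Content invokes it internally is not described here, so treat this as a generic illustration. The server address and the index names `wcc-active` and `wcc-new` are hypothetical.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of the Elasticsearch server


def build_reindex_request(source_index, dest_index):
    """Build (but do not send) a request that copies one index into another."""
    body = {
        "source": {"index": source_index},  # index that already holds extracted content
        "dest": {"index": dest_index},      # new index to populate
    }
    return urllib.request.Request(
        url=f"{ES_URL}/_reindex",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Both index names are made up for this example.
    req = build_reindex_request("wcc-active", "wcc-new")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Because the source index already contains the extracted text, no content extraction happens in this step, which is why a reindex can be faster than a full rebuild.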
To learn more about configuring Elasticsearch with WebCenter Content, see the “Configuring Elasticsearch” section.
From a user-experience perspective, nothing changes except for a couple of new search operators. The default search operators are: Contains, Matches, Has Word Prefix, Starts, Ends, Substring, and Not Matches.
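To make the operators more concrete, here is one plausible way they could map onto Elasticsearch Query DSL clauses. This mapping is my own assumption for illustration, not WebCenter Content's actual translation. The query types themselves (`match`, `term`, `match_phrase_prefix`, `prefix`, `wildcard`, and `bool` with `must_not`) are standard Elasticsearch queries, and the field name used in the example is hypothetical.

```python
def escape_wildcard(value):
    """Escape characters that have special meaning in a wildcard pattern."""
    return value.replace("\\", "\\\\").replace("*", "\\*").replace("?", "\\?")


# Assumed mapping from operator name to a Query DSL clause builder.
OPERATORS = {
    "Contains": lambda f, v: {"match": {f: v}},
    "Matches": lambda f, v: {"term": {f: v}},
    "Has Word Prefix": lambda f, v: {"match_phrase_prefix": {f: v}},
    "Starts": lambda f, v: {"prefix": {f: v}},
    "Ends": lambda f, v: {"wildcard": {f: {"value": "*" + escape_wildcard(v)}}},
    "Substring": lambda f, v: {
        "wildcard": {f: {"value": "*" + escape_wildcard(v) + "*"}}
    },
    "Not Matches": lambda f, v: {"bool": {"must_not": [{"term": {f: v}}]}},
}


def to_query(operator, field, value):
    """Return a Query DSL clause for the given operator, field, and value."""
    try:
        build = OPERATORS[operator]
    except KeyError:
        raise ValueError(f"Unsupported operator: {operator}") from None
    return build(field, value)


if __name__ == "__main__":
    # "dDocTitle" is used here as a hypothetical indexed field name.
    for name in OPERATORS:
        print(name, "->", to_query(name, "dDocTitle", "invoice"))
```

Leading-wildcard patterns such as the ones sketched for Ends and Substring tend to be expensive in Elasticsearch, so a real implementation might well choose a different strategy.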
If you are already using the ORACLETEXTSEARCH engine and would like to migrate to the ELASTICSEARCH engine, you do not need to do a full rebuild of your repository, because the extracted content is already in the system. A successful migration changes the active search index from ots1/ots2 to es1 on the Elasticsearch server. The process is easy and fast. For more information, see the “Migrating Existing Search Indexes to Elasticsearch Server” section.
Elasticsearch also gives you the option of using Kibana, an open-source data visualization and exploration tool used for log and time-series analytics, application monitoring, and operational intelligence use cases. Kibana is designed to use Elasticsearch as a data source. Kibana makes your data actionable by providing three key functions. Kibana is:
- An open-source analytics and visualization platform. Use Kibana to explore your Elasticsearch data, and then build beautiful visualizations and dashboards.
- A UI for managing the Elastic Stack. Manage your security settings, assign user roles, take snapshots, roll up your data, and more — all from the convenience of a Kibana UI.
- A centralized hub for Elastic’s solutions. From log analytics to document discovery to SIEM, Kibana is the portal for accessing these and other capabilities.
To learn more about Kibana, see the Kibana Guide.
We have already completed a beta of Elasticsearch with a few enterprise-level customers and have seen a huge improvement in indexing and performance. If you are interested in using Elasticsearch with WebCenter Content, see the Admin Guide or raise a support ticket today.
We are continuing to work on this and to add more functionality for better manageability and performance. I will cover those enhancements in my next blog post and keep you posted.