This issue has been created
There are 3 updates.
 
 
XWiki Platform / XWIKI-23239 Open

Improve Solr indexing speed through parallelization and batch processing

 
 

Issue created

 
Michael Hamann created this issue on 23/May/25 14:45
 
Summary: Improve Solr indexing speed through parallelization and batch processing
Issue Type: Improvement
Affects Versions: 17.3.0
Assignee: Unassigned
Components: Search - Solr
Created: 23/May/25 14:45
Priority: Major
Reporter: Michael Hamann
Description:

This issue aims to improve Solr indexing speed, primarily through parallelization and batching. Re-indexing the whole wiki after an upgrade and large imports can both require indexing many pages at once, and indexing operations are currently quite slow.

Since current systems usually have enough parallel computing power, the first step is to parallelize Solr indexing, at least to the point where preparing the data to index and calling Solr happen in separate threads. A second step is to explore whether submitting documents to Solr in batches speeds up indexing. Similarly, we could reduce context setup costs by exploiting the fact that several objects, properties, etc. of a single document are frequently indexed together.
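
To make the proposal concrete, here is a minimal sketch of the two ideas combined: documents are prepared on the calling thread while complete batches are handed to a separate sender thread. It is not XWiki or SolrJ code. The class `BatchingIndexerSketch`, the method `index`, and the `prepare` and `sendBatch` callbacks are hypothetical names, and `sendBatch` only stands in for whatever call would actually submit a batch to Solr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical sketch; none of these names come from XWiki or SolrJ. */
final class BatchingIndexerSketch
{
    private BatchingIndexerSketch()
    {
    }

    /**
     * Prepares documents on the calling thread and sends them in batches on a second thread.
     *
     * @param prepare assumed stand-in for building an indexable document from a reference
     * @param sendBatch assumed stand-in for the call that submits one batch to Solr
     * @return the number of batches that were sent
     */
    public static <R, D> int index(List<R> references, Function<R, D> prepare,
        Consumer<List<D>> sendBatch, int batchSize)
        throws InterruptedException, ExecutionException
    {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be at least 1");
        }
        // One sender thread keeps batches in submission order. Its work queue is
        // unbounded, so a real implementation would need some form of backpressure.
        ExecutorService sender = Executors.newSingleThreadExecutor();
        List<Future<?>> pending = new ArrayList<>();
        try {
            List<D> batch = new ArrayList<>(batchSize);
            for (R reference : references) {
                // Preparation overlaps with the sending of earlier batches.
                batch.add(prepare.apply(reference));
                if (batch.size() == batchSize) {
                    List<D> full = batch;
                    pending.add(sender.submit(() -> sendBatch.accept(full)));
                    batch = new ArrayList<>(batchSize);
                }
            }
            if (!batch.isEmpty()) {
                // Send the last, partially filled batch.
                List<D> rest = batch;
                pending.add(sender.submit(() -> sendBatch.accept(rest)));
            }
            // Wait for all batches; a failure in sendBatch surfaces here.
            for (Future<?> result : pending) {
                result.get();
            }
        } finally {
            sender.shutdown();
        }
        return pending.size();
    }

    public static void main(String[] args) throws Exception
    {
        Function<String, String> prepare = reference -> "doc:" + reference;
        Consumer<List<String>> sendBatch = batch -> System.out.println("sending " + batch);
        int batches = index(List.of("a", "b", "c", "d", "e"), prepare, sendBatch, 2);
        System.out.println(batches + " batches");
    }
}
```

With five references and a batch size of 2, the sketch sends three batches, the last one holding a single document. Whether this split actually helps depends on how expensive preparation is relative to the Solr call, which is what this issue proposes to explore.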

 
 

3 updates

 
Changes by Michael Hamann on 23/May/25 14:46
 
Fix Version: 17.5.0-rc-1
Assignee: Michael Hamann
Labels: performance