This issue has been created
 
 
LLM AI Integration / LLMAI-115 Open

Improve the indexing speed

 
 

Issue created

 
Michael Hamann created this issue on 26/Nov/24 15:06
 
Summary: Improve the indexing speed
Issue Type: Improvement
Affects Versions: 0.6.2
Assignee: Unassigned
Created: 26/Nov/24 15:06
Priority: Major
Reporter: Michael Hamann
Description:

Indexing is quite slow. This is particularly noticeable when, e.g., the actual embedding fails immediately, so that the remaining indexing steps dominate the runtime. I found two reasons for this:

  1. The regular expression for finding the last heading at the end of a chunk is surprisingly slow.
  2. The Solr index is committed very frequently (after every document).

Both can easily be improved.
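
As a rough illustration of both points, here is a minimal Python sketch, not the actual implementation. It assumes that a heading is a line starting with "#" (the real heading syntax and the current regular expression are not stated in this issue), and it uses a hypothetical Index interface with add() and commit() in place of the real Solr client. last_heading() walks the chunk backwards line by line instead of running a regular expression over the whole chunk, and index_documents() commits once per batch rather than after every document. The names last_heading, index_documents, Index and batch_size are all made up for this example.

```python
from typing import Iterable, Optional, Protocol


def last_heading(chunk: str) -> Optional[str]:
    """Return the last heading line in chunk, scanning backwards line by line.

    Assumption: a heading is a line starting with '#'. The syntax used by the
    real indexer is not stated in the issue.
    """
    end = len(chunk)
    while end > 0:
        # rfind returns -1 when no newline is left, so start becomes 0.
        start = chunk.rfind("\n", 0, end) + 1
        line = chunk[start:end]
        if line.startswith("#"):
            return line.strip()
        # Step over the newline to continue with the previous line.
        end = start - 1
    return None


class Index(Protocol):
    """Hypothetical stand-in for the search index client (not the Solr API)."""

    def add(self, doc: dict) -> None: ...

    def commit(self) -> None: ...


def index_documents(index: Index, docs: Iterable[dict], batch_size: int = 100) -> int:
    """Add all docs, committing once per batch; return the number of commits."""
    commits = 0
    pending = 0
    for doc in docs:
        index.add(doc)
        pending += 1
        if pending >= batch_size:
            index.commit()
            commits += 1
            pending = 0
    # Commit whatever is left over from the last, incomplete batch.
    if pending:
        index.commit()
        commits += 1
    return commits
```

With a batch size of 100, indexing 250 documents would trigger 3 commits instead of 250. Whether a fixed batch size or a time-based commit fits better depends on how quickly indexed documents need to become searchable.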