This issue has been created
There are 2 updates.
 
 
LLM AI Integration / LLMAI-73 Open

Implement a separate indexing queue with a UI for its status

 
 

Issue created

 
Michael Hamann created this issue on 27/May/24 14:27
 
Summary: Implement a separate indexing queue with a UI for its status
Issue Type: Improvement
Affects Versions: 0.3.1
Assignee: Unassigned
Created: 27/May/24 14:27
Priority: Major
Reporter: Michael Hamann
Description:

At the moment, document indexing relies on the index API to chunk, embed and index documents in Solr. Unfortunately, this API gives no insight into how many indexing tasks are pending or which documents have already been indexed. Furthermore, if the provider of the embedding model is temporarily down, documents that failed to index as a result aren't retried when the provider comes back. It therefore seems unavoidable to implement a custom indexing queue for the LLM application, together with a custom worker for indexing the documents. Features to implement:

  • Wait between indexing requests when there are failures of the embedding model
  • Keep track of and display which documents had an error during indexing
  • Display how many outstanding index requests there are for each collection and each embedding model
  • Support queuing whole collections for re-indexing when the collection's configuration changes (asking the user for confirmation)
  • Allow re-triggering the indexing of documents that failed to be indexed
  • Store the queue in the database such that it persists across instance restarts
  • Submit multiple chunks in parallel to the embedding model, including chunks from different documents, up to a configurable limit per embedding model
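
To make the list above more concrete, here is a minimal sketch of a database-backed queue that covers persistence, per-collection and per-model pending counts, error tracking, delayed retries and re-triggering of failed documents. Everything in it is an assumption for illustration: the table layout, the class and method names, the use of SQLite, and the exponential backoff policy are hypothetical and not part of the LLM application. Parallel chunk submission is left out.

```python
import sqlite3
import time

# Hypothetical schema: one row per (document, collection, embedding model).
# status is either 'pending' or 'failed'.
SCHEMA = """
CREATE TABLE IF NOT EXISTS index_queue (
    document     TEXT NOT NULL,
    collection   TEXT NOT NULL,
    model        TEXT NOT NULL,
    status       TEXT NOT NULL DEFAULT 'pending',
    attempts     INTEGER NOT NULL DEFAULT 0,
    next_attempt REAL NOT NULL DEFAULT 0,
    last_error   TEXT,
    PRIMARY KEY (document, collection, model)
)
"""
KEY = "document = ? AND collection = ? AND model = ?"


class IndexQueue:
    """Illustrative persistent indexing queue (assumed design, not the real one)."""

    def __init__(self, path=":memory:", max_attempts=3, base_delay=30.0):
        # Passing a file path instead of ":memory:" keeps the queue across restarts.
        self.db = sqlite3.connect(path)
        self.db.execute(SCHEMA)
        self.max_attempts = max_attempts
        self.base_delay = base_delay

    def enqueue(self, document, collection, model):
        # Re-queuing an existing entry resets its status and attempt counter.
        self.db.execute(
            "INSERT OR REPLACE INTO index_queue (document, collection, model) "
            "VALUES (?, ?, ?)", (document, collection, model))
        self.db.commit()

    def next_due(self, now=None):
        # Return the next pending entry whose retry delay has elapsed, or None.
        now = time.time() if now is None else now
        return self.db.execute(
            "SELECT document, collection, model FROM index_queue "
            "WHERE status = 'pending' AND next_attempt <= ? "
            "ORDER BY next_attempt LIMIT 1", (now,)).fetchone()

    def mark_done(self, document, collection, model):
        self.db.execute(f"DELETE FROM index_queue WHERE {KEY}",
                        (document, collection, model))
        self.db.commit()

    def mark_error(self, document, collection, model, error, now=None):
        # Record the error; wait longer after each failure (exponential backoff),
        # and mark the entry as failed once max_attempts is reached.
        now = time.time() if now is None else now
        key = (document, collection, model)
        attempts = self.db.execute(
            f"SELECT attempts FROM index_queue WHERE {KEY}", key).fetchone()[0] + 1
        status = "failed" if attempts >= self.max_attempts else "pending"
        delay = self.base_delay * 2 ** (attempts - 1)
        self.db.execute(
            "UPDATE index_queue SET status = ?, attempts = ?, next_attempt = ?, "
            f"last_error = ? WHERE {KEY}",
            (status, attempts, now + delay, error) + key)
        self.db.commit()

    def retry_failed(self, collection):
        # Re-trigger indexing of all failed documents in a collection.
        self.db.execute(
            "UPDATE index_queue SET status = 'pending', attempts = 0, "
            "next_attempt = 0 WHERE status = 'failed' AND collection = ?",
            (collection,))
        self.db.commit()

    def pending_counts(self):
        # Outstanding requests per collection and embedding model, for a status UI.
        return self.db.execute(
            "SELECT collection, model, COUNT(*) FROM index_queue "
            "WHERE status = 'pending' GROUP BY collection, model "
            "ORDER BY collection, model").fetchall()

    def failed_documents(self):
        return self.db.execute(
            "SELECT document, last_error FROM index_queue "
            "WHERE status = 'failed' ORDER BY document").fetchall()
```

A worker would repeatedly call `next_due()`, attempt chunking and embedding, and then call `mark_done()` or `mark_error()`. Note that this sketch delays each failed entry individually; pausing all requests for an embedding model while its provider is down would be an equally plausible reading of the first feature.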
 
 

2 updates

 
Changes by Michael Hamann on 27/May/24 14:27
 
Fix Version: 0.4
Assignee: Michael Hamann