Branch: refs/heads/master
Home: https://github.com/xwiki-contrib/ai-llm-benchmark
Commit: 000fbaaaebcb1a31488369f87a453aafea98e269
https://github.com/xwiki-contrib/ai-llm-benchmark/commit/000fbaaaebcb1a3148…
Author: Paul Pantiru <paul.pantiru(a)xwiki.com>
Date: 2024-11-21 (Thu, 21 Nov 2024)
Changed paths:
A evaluation_results_graphics/en_only/RAG-qa_AnswerRelevancy_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_AnswerRelevancy_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_ContextualPrecision_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_ContextualPrecision_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_ContextualRecall_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_ContextualRecall_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_Correctness_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_Correctness_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_CustomContextualRelevancy_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_CustomContextualRelevancy_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_Faithfulness_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_Faithfulness_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_grouped_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_overall_score_box_plot.png
A evaluation_results_graphics/en_only/average_average_power_draw_grouped_chart.png
A evaluation_results_graphics/en_only/average_energy_consumption_grouped_chart.png
A evaluation_results_graphics/en_only/average_energy_per_input_token_grouped_chart.png
A evaluation_results_graphics/en_only/average_energy_per_output_token_grouped_chart.png
A evaluation_results_graphics/en_only/average_energy_per_total_token_grouped_chart.png
A evaluation_results_graphics/en_only/average_power_draw_chart.png
A evaluation_results_graphics/en_only/correctness_comparison_bar_chart.png
A evaluation_results_graphics/en_only/model_average_power_chart.png
A evaluation_results_graphics/en_only/summarization_Alignment_bar_chart.png
A evaluation_results_graphics/en_only/summarization_Alignment_box_plot.png
A evaluation_results_graphics/en_only/summarization_Coverage_bar_chart.png
A evaluation_results_graphics/en_only/summarization_Coverage_box_plot.png
A evaluation_results_graphics/en_only/summarization_grouped_bar_chart.png
A evaluation_results_graphics/en_only/text_generation_grouped_bar_chart.png
A evaluation_results_graphics/en_only/text_generation_score_bar_chart.png
A evaluation_results_graphics/en_only/text_generation_score_box_plot.png
A reports/report_20241121_172428_en_only/evaluation_report_20241121_172428.pdf
A reports/report_20241121_172428_en_only/model_outputs_20241121_172429.pdf
Log Message:
-----------
New benchmark execution with an 8k context window for the Ollama models and a new
correctness metric [English results only]
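
For context, a minimal sketch of how an 8k context window can be requested from an Ollama model over its HTTP chat API. This assumes the benchmark talks to a local Ollama server; the model name, prompt, and endpoint defaults are illustrative and not taken from this commit:

# Sketch: ask a local Ollama model a question with an enlarged context window.
# The "num_ctx" option controls the context size Ollama allocates for the run.
import requests

def ask_ollama(prompt: str, model: str = "mistral", num_ctx: int = 8192) -> str:
    """Send a single chat turn to Ollama, requesting an 8k-token context."""
    response = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "options": {"num_ctx": num_ctx},  # 8192 tokens instead of the default
            "stream": False,
        },
        timeout=300,
    )
    response.raise_for_status()
    return response.json()["message"]["content"]

if __name__ == "__main__":
    print(ask_ollama("Summarize the XWiki admin guide in one paragraph."))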