Branch: refs/heads/master
Home: https://github.com/xwiki-contrib/ai-llm-benchmark
Commit: 000fbaaaebcb1a31488369f87a453aafea98e269
https://github.com/xwiki-contrib/ai-llm-benchmark/commit/000fbaaaebcb1a3148…
Author: Paul Pantiru <paul.pantiru(a)xwiki.com>
Date: 2024-11-21 (Thu, 21 Nov 2024)
Changed paths:
A evaluation_results_graphics/en_only/RAG-qa_AnswerRelevancy_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_AnswerRelevancy_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_ContextualPrecision_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_ContextualPrecision_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_ContextualRecall_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_ContextualRecall_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_Correctness_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_Correctness_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_CustomContextualRelevancy_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_CustomContextualRelevancy_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_Faithfulness_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_Faithfulness_box_plot.png
A evaluation_results_graphics/en_only/RAG-qa_grouped_bar_chart.png
A evaluation_results_graphics/en_only/RAG-qa_overall_score_box_plot.png
A evaluation_results_graphics/en_only/average_average_power_draw_grouped_chart.png
A evaluation_results_graphics/en_only/average_energy_consumption_grouped_chart.png
A evaluation_results_graphics/en_only/average_energy_per_input_token_grouped_chart.png
A evaluation_results_graphics/en_only/average_energy_per_output_token_grouped_chart.png
A evaluation_results_graphics/en_only/average_energy_per_total_token_grouped_chart.png
A evaluation_results_graphics/en_only/average_power_draw_chart.png
A evaluation_results_graphics/en_only/correctness_comparison_bar_chart.png
A evaluation_results_graphics/en_only/model_average_power_chart.png
A evaluation_results_graphics/en_only/summarization_Alignment_bar_chart.png
A evaluation_results_graphics/en_only/summarization_Alignment_box_plot.png
A evaluation_results_graphics/en_only/summarization_Coverage_bar_chart.png
A evaluation_results_graphics/en_only/summarization_Coverage_box_plot.png
A evaluation_results_graphics/en_only/summarization_grouped_bar_chart.png
A evaluation_results_graphics/en_only/text_generation_grouped_bar_chart.png
A evaluation_results_graphics/en_only/text_generation_score_bar_chart.png
A evaluation_results_graphics/en_only/text_generation_score_box_plot.png
A reports/report_20241121_172428_en_only/evaluation_report_20241121_172428.pdf
A reports/report_20241121_172428_en_only/model_outputs_20241121_172429.pdf
Log Message:
-----------
New benchmark execution with an 8k context window for the Ollama models and a new
correctness metric [English results only]
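
For context, a minimal sketch of how an 8k context window can be requested from an Ollama model over its HTTP chat API. This assumes the benchmark talks to a local Ollama server; the model name, prompt, and endpoint defaults are illustrative and not taken from this commit:

# Sketch: ask a local Ollama model a question with an enlarged context window.
# The "num_ctx" option controls the context size Ollama allocates for the run.
import requests

def ask_ollama(prompt: str, model: str = "mistral", num_ctx: int = 8192) -> str:
    """Send a single chat turn to Ollama, requesting an 8k-token context."""
    response = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "options": {"num_ctx": num_ctx},  # 8192 tokens instead of the default
            "stream": False,
        },
        timeout=300,
    )
    response.raise_for_status()
    return response.json()["message"]["content"]

if __name__ == "__main__":
    print(ask_ollama("Summarize the XWiki admin guide in one paragraph."))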