Merge dev into main for release by ofermend · Pull Request #148 · vectara/open-rag-eval

ofermend · 2025-11-18T01:01:05Z

Merge dev into main for version 0.2.4 release

Merge dev to main for v0.1.4 (#82)

* upgraded libs
* updated to be compliant with PEP 625
* update MANIFEST.in
* updated versions to remove security vulnerabilities
* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)
* Update publish_release.yml (#80): added OPENAI key (from secrets) for publish script
* Update test.yml (#79)
* Llama index connector (#78)
* initial llama_index_connector
* refactored connector to be a true base class with fetch_data; CSVConnector (and unit test) removed since it's really just a results loader and not a true connector
* fixed lint issues
* updated copilot recommendation
* updated after fixing tests
* added llama_index in requirements
* updated
* fixed connector tests and moved to use Pandas instead of CSV
* moved configs to separate folder
* folder re-arranged
* fixed unit test
* more updates to README
* updated per Suleman's comments
* added test_rag_results_loader
* updated LI connector to include citations
* upgraded transformers version
* updated
* updated llama_index connector
* updates to config file comments
* Update _version.py (#81)

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>

fixed lint issues and bumped version to 0.1.7

* Merge dev to main for v0.1.4 (#82): changes as listed above
* Update publish_release.yml (#95): fix issue with ONNX install
* merge dev into main for release 0.1.6 (#103)
* Clean up merge conflicts from dev -> main (#90), which repeats the v0.1.4 merge (#82)
* Fix issue with release process and ONNX (#96)
* initial
* Enable base_url.
* Update .gitignore
* Remove print statement.
* added fixed seed for umbrela
* Update README.md with the new UI: removed "visualize" step for the "run on Vectara vs with a connector" and condensed everything into "Visualization" section
* initial
* Added evaluation screenshots to ReadMe
* fixed issues from copilot review
* fixed lint issues
* updated per copilot suggestion
* added print of no answer in vectara connector
* added seed=42 to boost consistency
* bump version (#104)
* bump version
* bugfix with gemini to catch genai.exceptions
* bugfix (#105)
* fixed lint issue (#106)

Co-authored-by: Suleman <108358100+sulekz@users.noreply.github.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
Co-authored-by: Suleman Kazi <suleman@vectara.com>
Co-authored-by: Renyi Qu <mikustokes@gmail.com>
Co-authored-by: Stokes Q <33497497+toastedqu@users.noreply.github.com>
Co-authored-by: Donna <yu.donna.dong@gmail.com>

* Merge dev to main for v0.1.4 (#82), Update publish_release.yml (#95), and merge dev into main for release 0.1.6 (#103): changes as listed above
* Clean up merge conflicts from dev -> main (#90), which repeats the v0.1.4 merge (#82)
* initial
* fixed issues from copilot review
* Added support to process queries in parallel across all connectors.
* Updated CitationMetric
* Version bumped version to 0.1.7 (#110): fixed lint issues and bumped version to 0.1.7
* Merge conflict 1 (#112), which repeats the merges listed above (#82, #95, #103)
* bugfix

Co-authored-by: Suleman <108358100+sulekz@users.noreply.github.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
Co-authored-by: Suleman Kazi <suleman@vectara.com>
Co-authored-by: Renyi Qu <mikustokes@gmail.com>
Co-authored-by: Stokes Q <33497497+toastedqu@users.noreply.github.com>
Co-authored-by: Donna <yu.donna.dong@gmail.com>
Co-authored-by: Vishal Naik <naik.vishalishwar@gmail.com>
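The commit messages above mention processing queries in parallel across all connectors but do not show how. As a rough illustration only, a thread pool is one common way to do this for I/O-bound RAG calls. The names `fetch_all`, `fetch_one`, and `max_workers` are hypothetical and are not taken from the open-rag-eval code base.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def fetch_all(
    queries: List[str],
    fetch_one: Callable[[str], Dict],
    max_workers: int = 4,
) -> List[Dict]:
    """Run `fetch_one` on every query concurrently.

    `fetch_one` stands in for whatever a connector does for a single
    query (hypothetical signature). Results come back in input order,
    because Executor.map preserves the order of its input iterable.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_one, queries))


def fake_fetch(query: str) -> Dict:
    # Placeholder for a real network call to a RAG system.
    return {"query": query, "answer": query.upper()}


results = fetch_all(["what is rag?", "what is hhem?"], fake_fetch)
```

Threads suit this case because each query mostly waits on a remote service; a CPU-bound workload would call for a different design.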

Add llm support

* some improvements, esp. for Together.AI models, version bump etc.
* updated
* bugfix in unit tests
* minor updates
* Add anthropic and together requirements, config example for llama_index
* added TRANSFORMER VERBOSITY override

Co-authored-by: david-oplatka <david.oplatka@vectara.com>

* added CLI options so that users don't have to clone the repo * version bump * removing unused imports

* added METRICS guide * updates based on Vish suggestions * updated * fixed metrics.md

* fix(eval): omit empty consistency field in results.json output
  - Fixed issue where empty `consistency` metrics were still written to results.json
  - Ensured that `results.json` output only includes non-empty consistency fields
  - Added unit test `test_results_json_consistency_field.py` to validate the fix
  - Introduced `requirements-dev.txt` with a full developer toolchain (pytest, linting, pre-commit)
* chore(eval): make omit-empty-consistency non-mutating; de-flake test metric name
  - Return a filtered copy of the report (no in-place mutation)
  - Make test use a generic non-empty payload instead of a specific metric key
* docs(eval): clarify docstring for _omit_empty_consistency to reflect true behavior
  - Updated docstring to specify that 'consistency' is removed only when present and falsy

* added query generation capability
* fixed lint issues
* added progress bar
* updated to work with HF_TOKEN for gated HHEM
* updated test action
* updated per Tallat suggestions
* updated output formatter
* version bump
* updated for lint
* fixed typos

* now query generation can be configured to control the % of questions per category (roughly)
* updated to read env from .env file
* Update open_rag_eval/query_generation/llm_generator.py
* robust handling when no assignment scores exist
* upgraded transformers to remove security vulnerability
* updated

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* initial support for Vectara HHEM via API
* minor updates
* fixed bug with using Anthropic model
* a few bug fixes, esp. with Anthropic model usage
* added tests
* added langchain to requirements
* fixed issue with langchain
* fixed issue with torch meta-score and BertScore incompatibility
* issue with bert score in consistency score around meta-device incompatibility to resolve
* reverted back to v 4.50.2 of transformers
* moved from bert_score to torchmetrics, which is more frequently maintained
* added max_length to BERT score to avoid going over the model sequence length (truncate if that happens)
* fixed unit test

sulekz and others added 25 commits July 8, 2025 08:52

Added support to process queries in parallel across all connectors.

437a299

Version bumped version to 0.1.7 (#110)

7141f59

fixed lint issues and bumped version to 0.1.7

Fixed lint issues (#127)

071c63c

Add LLM Judges for Anthropic and Together

047557d

Remove Bad kwargs for Anthropic Models

d237d0a

Update structured output method for TogetherModel

94d4496

Add keyword handling for gemini models

132e399

Fix PR Comments

14ca9c7

Merge pull request #131 from vectara/add-llm-support

2684daa

Add llm support

Merge branch 'main' into dev

f34a5bd

added CLI options so that users don't have to clone the repo (#134)

8963a39

* added CLI options so that users don't have to clone the repo * version bump * removing unused imports

fixed lint issues (#135)

d135b65

added METRICS guide (#141)

2e95f16

* added METRICS guide * updates based on Vish suggestions * updated * fixed metrics.md

update requirements to avoid test failure (#142)

3dca5ec

Minimal fix to .dockerignore (#143)

acf19c5

Merge branch 'main' into dev

7b134e7

Merge branch 'main' into dev

574f67b

ofermend requested a review from vish119 November 18, 2025 01:01

vish119 approved these changes Nov 18, 2025


ofermend merged commit 9bf89bf into main Nov 18, 2025
1 check passed

ofermend merged 25 commits into main from dev

6 participants
