Merge dev into main for release #148

Merged
ofermend merged 25 commits into main from dev
Nov 18, 2025

@ofermend
Collaborator

Merge dev into main for version 0.2.4 release

sulekz and others added 25 commits July 8, 2025 08:52
Merge dev to main for v0.1.4 (#82)

* upgraded libs

* updated to be compliant with PEP 625

* update MANIFEST.in

* updated versions to remove security vulnerabilities

* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)

* Update publish_release.yml (#80)

Added OPENAI key (from secrets) for publish script

* Update test.yml (#79)

* Llama index connector (#78)

* initial llama_index_connector

* refactored connector to be a true base class with fetch_data
CSVConnector (and unit test) removed since it's really just a results loader and not a true connector

* fixed lint issues

* updated copilot recommendation

* updated after fixing tests

* added llama_index in requirements

* updated

* fixed connector tests and moved to use Pandas instead of CSV

* moved configs to separate folder

* folder re-arranged

* fixed unit test

* more updated on README

* updated per Suleman's comments

* added test_rag_results_loader

* updated LI connector to include citations

* upgraded transformers version

* updated

* updated llama_index connector

* updates to config file comments

* Update _version.py (#81)

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
fixed lint issues and bumped version to 0.1.7
* Merge dev to main for v0.1.4 (#82)

* upgraded libs

* updated to be compliant with PEP 625

* update MANIFEST.in

* updated versions to remove security vulnerabilities

* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)

* Update publish_release.yml (#80)

Added OPENAI key (from secrets) for publish script

* Update test.yml (#79)

* Llama index connector (#78)

* initial llama_index_connector

* refactored connector to be a true base class with fetch_data
CSVConnector (and unit test) removed since it's really just a results loader and not a true connector

* fixed lint issues

* updated copilot recommendation

* updated after fixing tests

* added llama_index in requirements

* updated

* fixed connector tests and moved to use Pandas instead of CSV

* moved configs to separate folder

* folder re-arranged

* fixed unit test

* more updated on README

* updated per Suleman's comments

* added test_rag_results_loader

* updated LI connector to include citations

* upgraded transformers version

* updated

* updated llama_index connector

* updates to config file comments

* Update _version.py (#81)

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>

* Update publish_release.yml (#95)

Fix issue with ONNX install

* merge dev into main for release 0.1.6 (#103)

* Clean up merge conflicts from dev -> main (#90)

Merge dev to main for v0.1.4 (#82)

* upgraded libs

* updated to be compliant with PEP 625

* update MANIFEST.in

* updated versions to remove security vulnerabilities

* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)

* Update publish_release.yml (#80)

Added OPENAI key (from secrets) for publish script

* Update test.yml (#79)

* Llama index connector (#78)

* initial llama_index_connector

* refactored connector to be a true base class with fetch_data
CSVConnector (and unit test) removed since it's really just a results loader and not a true connector

* fixed lint issues

* updated copilot recommendation

* updated after fixing tests

* added llama_index in requirements

* updated

* fixed connector tests and moved to use Pandas instead of CSV

* moved configs to separate folder

* folder re-arranged

* fixed unit test

* more updated on README

* updated per Suleman's comments

* added test_rag_results_loader

* updated LI connector to include citations

* upgraded transformers version

* updated

* updated llama_index connector

* updates to config file comments

* Update _version.py (#81)

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>

* Fix issue with release process and ONNX (#96)

initial

* Enable base_url.

* Update .gitignore

* Remove print statement.

* added fixed seed for umbrela

* Update README.md with the new UI

Removed "visualize" step for the "run on Vectara vs with a connector" and condensed everything into "Visualization" section

* initial

* Added evaluation screenshots to ReadMe

* fixed issues from copilot review

* fixed lint issues

* updated per copilot suggestion

* added print of no answer in vectara connector

* added seed=42 to boost consistency

* bump version (#104)

* bump version

* bugfix with gemini to catch genai.exceptions

* bugfix (#105)

* fixed lint issue (#106)

---------

Co-authored-by: Suleman <108358100+sulekz@users.noreply.github.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
Co-authored-by: Suleman Kazi <suleman@vectara.com>
Co-authored-by: Renyi Qu <mikustokes@gmail.com>
Co-authored-by: Stokes Q <33497497+toastedqu@users.noreply.github.com>
Co-authored-by: Donna <yu.donna.dong@gmail.com>

---------

Co-authored-by: Suleman <108358100+sulekz@users.noreply.github.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
Co-authored-by: Suleman Kazi <suleman@vectara.com>
Co-authored-by: Renyi Qu <mikustokes@gmail.com>
Co-authored-by: Stokes Q <33497497+toastedqu@users.noreply.github.com>
Co-authored-by: Donna <yu.donna.dong@gmail.com>
* Merge dev to main for v0.1.4 (#82)

* upgraded libs

* updated to be compliant with PEP 625

* update MANIFEST.in

* updated versions to remove security vulnerabilities

* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)

* Update publish_release.yml (#80)

Added OPENAI key (from secrets) for publish script

* Update test.yml (#79)

* Llama index connector (#78)

* initial llama_index_connector

* refactored connector to be a true base class with fetch_data
CSVConnector (and unit test) removed since it's really just a results loader and not a true connector

* fixed lint issues

* updated copilot recommendation

* updated after fixing tests

* added llama_index in requirements

* updated

* fixed connector tests and moved to use Pandas instead of CSV

* moved configs to separate folder

* folder re-arranged

* fixed unit test

* more updated on README

* updated per Suleman's comments

* added test_rag_results_loader

* updated LI connector to include citations

* upgraded transformers version

* updated

* updated llama_index connector

* updates to config file comments

* Update _version.py (#81)

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>

* Update publish_release.yml (#95)

Fix issue with ONNX install

* merge dev into main for release 0.1.6 (#103)

* Clean up merge conflicts from dev -> main (#90)

Merge dev to main for v0.1.4 (#82)

* upgraded libs

* updated to be compliant with PEP 625

* update MANIFEST.in

* updated versions to remove security vulnerabilities

* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)

* Update publish_release.yml (#80)

Added OPENAI key (from secrets) for publish script

* Update test.yml (#79)

* Llama index connector (#78)

* initial llama_index_connector

* refactored connector to be a true base class with fetch_data
CSVConnector (and unit test) removed since it's really just a results loader and not a true connector

* fixed lint issues

* updated copilot recommendation

* updated after fixing tests

* added llama_index in requirements

* updated

* fixed connector tests and moved to use Pandas instead of CSV

* moved configs to separate folder

* folder re-arranged

* fixed unit test

* more updated on README

* updated per Suleman's comments

* added test_rag_results_loader

* updated LI connector to include citations

* upgraded transformers version

* updated

* updated llama_index connector

* updates to config file comments

* Update _version.py (#81)

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>

* Fix issue with release process and ONNX (#96)

initial

* Enable base_url.

* Update .gitignore

* Remove print statement.

* added fixed seed for umbrela

* Update README.md with the new UI

Removed "visualize" step for the "run on Vectara vs with a connector" and condensed everything into "Visualization" section

* initial

* Added evaluation screenshots to ReadMe

* fixed issues from copilot review

* fixed lint issues

* updated per copilot suggestion

* added print of no answer in vectara connector

* added seed=42 to boost consistency

* bump version (#104)

* bump version

* bugfix with gemini to catch genai.exceptions

* bugfix (#105)

* fixed lint issue (#106)

---------

Co-authored-by: Suleman <108358100+sulekz@users.noreply.github.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
Co-authored-by: Suleman Kazi <suleman@vectara.com>
Co-authored-by: Renyi Qu <mikustokes@gmail.com>
Co-authored-by: Stokes Q <33497497+toastedqu@users.noreply.github.com>
Co-authored-by: Donna <yu.donna.dong@gmail.com>

* Clean up merge conflicts from dev -> main (#90)

Merge dev to main for v0.1.4 (#82)

* upgraded libs

* updated to be compliant with PEP 625

* update MANIFEST.in

* updated versions to remove security vulnerabilities

* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)

* Update publish_release.yml (#80)

Added OPENAI key (from secrets) for publish script

* Update test.yml (#79)

* Llama index connector (#78)

* initial llama_index_connector

* refactored connector to be a true base class with fetch_data
CSVConnector (and unit test) removed since it's really just a results loader and not a true connector

* fixed lint issues

* updated copilot recommendation

* updated after fixing tests

* added llama_index in requirements

* updated

* fixed connector tests and moved to use Pandas instead of CSV

* moved configs to separate folder

* folder re-arranged

* fixed unit test

* more updated on README

* updated per Suleman's comments

* added test_rag_results_loader

* updated LI connector to include citations

* upgraded transformers version

* updated

* updated llama_index connector

* updates to config file comments

* Update _version.py (#81)

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>

* initial

* fixed issues from copilot review

* Added support to process queries in parallel across all connectors.

* Updated CitationMetric

* Version bumped version to 0.1.7 (#110)

fixed lint issues and bumped version to 0.1.7

* Merge conflict 1 (#112)

* Merge dev to main for v0.1.4 (#82)

* upgraded libs

* updated to be compliant with PEP 625

* update MANIFEST.in

* updated versions to remove security vulnerabilities

* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)

* Update publish_release.yml (#80)

Added OPENAI key (from secrets) for publish script

* Update test.yml (#79)

* Llama index connector (#78)

* initial llama_index_connector

* refactored connector to be a true base class with fetch_data
CSVConnector (and unit test) removed since it's really just a results loader and not a true connector

* fixed lint issues

* updated copilot recommendation

* updated after fixing tests

* added llama_index in requirements

* updated

* fixed connector tests and moved to use Pandas instead of CSV

* moved configs to separate folder

* folder re-arranged

* fixed unit test

* more updated on README

* updated per Suleman's comments

* added test_rag_results_loader

* updated LI connector to include citations

* upgraded transformers version

* updated

* updated llama_index connector

* updates to config file comments

* Update _version.py (#81)

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>

* Update publish_release.yml (#95)

Fix issue with ONNX install

* merge dev into main for release 0.1.6 (#103)

* Clean up merge conflicts from dev -> main (#90)

Merge dev to main for v0.1.4 (#82)

* upgraded libs

* updated to be compliant with PEP 625

* update MANIFEST.in

* updated versions to remove security vulnerabilities

* Reformat Open-RAG-Eval -> Open RAG Eval. (#76)

* Update publish_release.yml (#80)

Added OPENAI key (from secrets) for publish script

* Update test.yml (#79)

* Llama index connector (#78)

* initial llama_index_connector

* refactored connector to be a true base class with fetch_data
CSVConnector (and unit test) removed since it's really just a results loader and not a true connector

* fixed lint issues

* updated copilot recommendation

* updated after fixing tests

* added llama_index in requirements

* updated

* fixed connector tests and moved to use Pandas instead of CSV

* moved configs to separate folder

* folder re-arranged

* fixed unit test

* more updated on README

* updated per Suleman's comments

* added test_rag_results_loader

* updated LI connector to include citations

* upgraded transformers version

* updated

* updated llama_index connector

* updates to config file comments

* Update _version.py (#81)

---------

Co-authored-by: Ofer Mendelevitch <ofermend@gmail.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>

* Fix issue with release process and ONNX (#96)

initial

* Enable base_url.

* Update .gitignore

* Remove print statement.

* added fixed seed for umbrela

* Update README.md with the new UI

Removed "visualize" step for the "run on Vectara vs with a connector" and condensed everything into "Visualization" section

* initial

* Added evaluation screenshots to ReadMe

* fixed issues from copilot review

* fixed lint issues

* updated per copilot suggestion

* added print of no answer in vectara connector

* added seed=42 to boost consistency

* bump version (#104)

* bump version

* bugfix with gemini to catch genai.exceptions

* bugfix (#105)

* fixed lint issue (#106)

---------

Co-authored-by: Suleman <108358100+sulekz@users.noreply.github.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
Co-authored-by: Suleman Kazi <suleman@vectara.com>
Co-authored-by: Renyi Qu <mikustokes@gmail.com>
Co-authored-by: Stokes Q <33497497+toastedqu@users.noreply.github.com>
Co-authored-by: Donna <yu.donna.dong@gmail.com>

---------

Co-authored-by: Suleman <108358100+sulekz@users.noreply.github.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
Co-authored-by: Suleman Kazi <suleman@vectara.com>
Co-authored-by: Renyi Qu <mikustokes@gmail.com>
Co-authored-by: Stokes Q <33497497+toastedqu@users.noreply.github.com>
Co-authored-by: Donna <yu.donna.dong@gmail.com>

* bugfix

---------

Co-authored-by: Suleman <108358100+sulekz@users.noreply.github.com>
Co-authored-by: CJ Cenizal <cj@cenizal.com>
Co-authored-by: Suleman Kazi <suleman@vectara.com>
Co-authored-by: Renyi Qu <mikustokes@gmail.com>
Co-authored-by: Stokes Q <33497497+toastedqu@users.noreply.github.com>
Co-authored-by: Donna <yu.donna.dong@gmail.com>
Co-authored-by: Vishal Naik <naik.vishalishwar@gmail.com>
* some improvements, esp for Together.AI models, version bump etc

* updated

* bugfix in unit tests

* minor updates

* Add anthropic and together requirements, config example for llama_index

* added TRANSFORMER VERBOSITY override

---------

Co-authored-by: david-oplatka <david.oplatka@vectara.com>
* added CLI options so that users don't have to clone the repo

* version bump

* removing unused imports
* added METRICS guide

* updates based on Vish suggestions

* updated

* fixed metrics.md
* fix(eval): omit empty consistency field in results.json output

- Fixed issue where empty `consistency` metrics were still written to results.json
- Ensured that `results.json` output only includes non-empty consistency fields
- Added unit test `test_results_json_consistency_field.py` to validate the fix
- Introduced `requirements-dev.txt` with a full developer toolchain
 (pytest, linting, pre-commit)

* chore(eval): make omit-empty-consistency non-mutating; de-flake test metric name

- Return a filtered copy of the report (no in-place mutation)
- Make test use a generic non-empty payload instead of a specific metric key

* docs(eval): clarify docstring for _omit_empty_consistency to reflect true behavior

- Updated docstring to specify that 'consistency' is removed only when present and falsy
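
The behavior described in the three commits above (drop `consistency` only when it is present and falsy, and return a filtered copy rather than mutating the input) can be sketched roughly as follows. This is an illustration, not the project's code: the function body and the report shape (a plain dict with an optional top-level `consistency` key) are assumptions, and the sample entries are hypothetical.

```python
from typing import Any, Dict


def omit_empty_consistency(report: Dict[str, Any]) -> Dict[str, Any]:
    """Return a shallow copy of `report` without an empty 'consistency' field.

    The key is dropped only when it is present and falsy (None, {}, [], "").
    The input dict is never modified.
    """
    filtered = dict(report)  # shallow copy, so the caller's dict stays intact
    if "consistency" in filtered and not filtered["consistency"]:
        del filtered["consistency"]
    return filtered


# Hypothetical report entries, for illustration only.
empty = {"query": "q1", "consistency": {}}
full = {"query": "q2", "consistency": {"some_metric": 0.9}}

cleaned_empty = omit_empty_consistency(empty)
cleaned_full = omit_empty_consistency(full)
```

Returning a copy keeps the in-memory report usable by any later step that still expects the key, which is presumably why the follow-up commit made the helper non-mutating.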
* added query generation capability

* fixed lint issues

* added progress bar

* updated to work with HF_TOKEN for gated HHEM

* updated test action

* updated per Tallat suggestions

* updated output formatter

* version bump

* updated for lint

* fixed typos
* now query generation can be configured to control the % of questions per category (roughly)

* updated to read env from .env file

* Update open_rag_eval/query_generation/llm_generator.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* robust handling when no assignment scores exist

* upgraded transformers to remove security vulnerability

* updated

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
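
One way per-category percentages could be turned into target question counts is sketched below. It is an assumption about how such a split might be computed, not the logic in `llm_generator.py`: the function name, the category names, and the largest-remainder rounding are all hypothetical, and the commit itself notes that the resulting proportions are only rough.

```python
from typing import Dict


def allocate_counts(total: int, percentages: Dict[str, float]) -> Dict[str, int]:
    """Split `total` questions across categories in proportion to `percentages`.

    Uses the largest-remainder method so the counts always sum to `total`.
    """
    weight_sum = sum(percentages.values())
    if total < 0 or weight_sum <= 0:
        raise ValueError("total must be >= 0 and percentages must sum to > 0")

    # Ideal (fractional) share for each category.
    exact = {name: total * pct / weight_sum for name, pct in percentages.items()}
    counts = {name: int(share) for name, share in exact.items()}

    # Hand out the leftover questions to the largest fractional remainders.
    leftover = total - sum(counts.values())
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - counts[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


# Hypothetical category names and percentages.
counts = allocate_counts(10, {"factual": 50, "reasoning": 30, "comparison": 20})
```

Normalizing by the sum of the weights means the configured percentages do not have to add up to exactly 100.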
* initial support for Vectara HHEM via API

* minor updates

* fixed bug with using Anthropic model

* a few bug fixes, esp. with Anthropic model usage

* added tests

* added langchain to requirements

* fixed issue with langchain

* fixed issue with torch meta-score and BertScore incompatibility

* issue with bert score in consistency score around meta-device incompatibility
to resolve
* reverted back to v 4.50.2 of transformers
* moved from bert_score to torchmetrics which is more frequently maintained

* added max_length to BERT score to avoid going over the model sequence length (truncate if that happens)

* fixed unit test
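
The `max_length` change above amounts to truncating inputs that would exceed the model's sequence length. A library-free sketch of that idea follows; it does not call torchmetrics, and the helper name, the token ids, and the limit of 512 are assumptions made for illustration only.

```python
from typing import List


def truncate_tokens(token_ids: List[int], max_length: int) -> List[int]:
    """Keep at most `max_length` token ids, dropping the tail of the sequence."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    return token_ids[:max_length]


# Hypothetical sequence that is longer than the assumed model limit of 512.
long_sequence = list(range(600))
short_sequence = [101, 2023, 102]

truncated = truncate_tokens(long_sequence, max_length=512)
unchanged = truncate_tokens(short_sequence, max_length=512)
```

A real tokenizer would normally also account for special tokens when truncating, which this sketch leaves out.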
@ofermend ofermend requested a review from vish119 November 18, 2025 01:01
@ofermend ofermend merged commit 9bf89bf into main Nov 18, 2025
1 check passed

6 participants