Conversation

@htahir1 (Contributor) commented Jul 9, 2025

Describe changes

  • New Features

    • Introduced a comprehensive Agent Architecture Comparison example, showcasing three customer service agent designs with interactive visualizations and unified evaluation.
    • Added new prompt management, materialization, and visualization tools for versioned prompt artifacts.
    • Integrated LiteLLM for real LLM calls (with fallback; see the sketch after this list) and optional Langfuse observability.
    • Provided modular pipeline steps for data loading, intent classification, agent comparison, and HTML report generation.
    • Included detailed documentation and quickstart guides for running and customizing the example.
  • Documentation

    • Updated and expanded core documentation to reflect support for both classical ML and AI agent workflows.
    • Added new and revised example READMEs, usage guides, and best practices for secrets, artifacts, models, workspaces, and projects.
  • Chores

    • Added requirements files and .gitignore for clean environment setup and dependency management.
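
For reference, a minimal sketch of the "real LLM with fallback" pattern this example uses, assuming LiteLLM and an `OPENAI_API_KEY` gate; the helper name is illustrative, not the example's actual API:

```python
import os


def call_llm_with_fallback(prompt: str, model: str = "gpt-3.5-turbo") -> str:
    """Call a real LLM via LiteLLM when credentials exist, else return a mock."""
    if os.environ.get("OPENAI_API_KEY"):
        from litellm import completion  # imported lazily so the mock path has no hard dependency

        response = completion(
            model=model, messages=[{"role": "user", "content": prompt}]
        )
        return response.choices[0].message.content
    # Fallback: deterministic mock so the example runs without credentials
    return f"[mock response] {prompt[:60]}..."
```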

Pre-requisites

Please ensure you have done the following:

  • I have read the CONTRIBUTING.md document.
  • I have added tests to cover my changes.
  • I have based my new branch on develop and the open PR is targeting develop. If your branch wasn't based on develop, read the contribution guide on rebasing your branch onto develop.
  • IMPORTANT: I made sure that my changes are reflected properly in the following resources:
    • ZenML Docs
    • Dashboard: Needs to be communicated to the frontend team.
    • Templates: Might need adjustments (that are not reflected in the template tests) in case of non-breaking changes and deprecations.
    • Projects: Depending on the version dependencies, different projects might get affected.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other (add details above)

Summary by CodeRabbit

  • New Features

    • Introduced a comprehensive Agent Architecture Comparison example demonstrating three customer service agent designs with detailed visualizations and unified evaluation reports.
    • Added prompt versioning, materialization, and HTML visualization capabilities for managing prompt artifacts.
    • Integrated LiteLLM for real LLM usage with automatic fallback and optional Langfuse observability for tracking LLM calls and costs.
    • Provided modular pipeline steps covering data loading, intent classifier training, multi-agent comparison, and styled HTML report generation.
    • Included user-friendly scripts and detailed documentation with quickstart instructions and customization guidance.
  • Documentation

    • Expanded core documentation to unify classical ML and AI agent workflows under ZenML’s platform.
    • Enhanced READMEs and guides covering installation, deployment, secrets management, artifact handling, model and agent lifecycle, and collaboration best practices.
  • Chores

    • Added example-specific requirements and .gitignore files to support clean setup and dependency management.

coderabbitai bot (Contributor) commented Jul 9, 2025

Important

Review skipped

Auto reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This update introduces a comprehensive AI agent architecture comparison example to ZenML, including new modular agents, prompt versioning, custom materializers, and detailed pipeline steps. Documentation across the codebase is updated to reflect ZenML’s unified support for both classical ML and AI agents, with new examples, best practices, and usage instructions. The agent comparison example integrates LiteLLM, LangGraph, and Langfuse, and provides reproducible evaluation and visualization of multiple agent workflows.

Changes

| File(s) / Path(s) | Change Summary |
| --- | --- |
| README.md, examples/agent_comparison/README.md | Major overhaul and addition: updated main and example READMEs with new architecture, examples, installation, and usage instructions for unified ML/agent workflows and agent architecture comparison. |
| docs/book/introduction.md, docs/book/getting-started/hello-world.md | Revised to emphasize unified support for ML and AI agents, updated value propositions, and reorganized next steps for both ML and agent use cases. |
| docs/book/getting-started/core-concepts.md | Expanded to include AI agent pipelines, artifacts, and models; added new example step and pipeline for agent evaluation. |
| docs/book/getting-started/zenml-pro/projects.md, docs/book/getting-started/zenml-pro/workspaces.md | Clarified project and workspace isolation for ML vs. agent systems; added best practices for managing agent and ML workloads, access controls, and LLM cost tracking. |
| docs/book/how-to/artifacts/artifacts.md | Added examples for prompt templates and agent responses as artifacts; updated pipeline to show artifact management for both ML and agent components. |
| docs/book/how-to/models/models.md | Expanded model management documentation to include agent configurations, prompt templates, and agent evaluation; added new example steps for agent config and evaluation. |
| docs/book/how-to/secrets/secrets.md | Added LLM and multi-agent secret management examples; new step shows secure LLM API usage within a pipeline step. |
| examples/agent_comparison/.gitignore, examples/agent_comparison/requirements.txt | Added ignore rules and requirements for the agent comparison example, including ML, LLM, and observability dependencies. |
| examples/agent_comparison/agent_comparison_pipeline.py, examples/agent_comparison/run.py | New entry points for running the agent comparison pipeline with user-friendly console output and dashboard instructions. |
| examples/agent_comparison/agents.py | Implements three agent architectures (SingleAgentRAG, MultiSpecialistAgents, LangGraphCustomerServiceAgent) with fallback logic, visualization, and structured responses. |
| examples/agent_comparison/llm_utils.py | Utilities for LiteLLM and Langfuse integration, with fallback to mock LLM responses and observability configuration. |
| examples/agent_comparison/materializers/ | Introduced custom materializers for prompt and agent objects, including visualization and metadata extraction; added prompt versioning and HTML visualization tools. |
| examples/agent_comparison/pipelines/ | New pipeline definition for comparing agent architectures, including prompt loading, model training, architecture comparison, and evaluation/report generation. |
| examples/agent_comparison/prompts/ | Added prompt templates for single agent, specialists, and LangGraph agent in text and JSON formats. |
| examples/agent_comparison/steps/ | Modularized pipeline steps for data loading, prompt loading, intent classifier training, agent architecture testing, and evaluation/reporting. |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Pipeline
    participant DataLoader
    participant PromptLoader
    participant ClassifierTrainer
    participant AgentComparer
    participant Evaluator
    participant ZenMLDashboard

    User->>Pipeline: Run compare_agent_architectures()
    Pipeline->>DataLoader: load_real_conversations()
    DataLoader-->>Pipeline: queries (DataFrame)
    Pipeline->>PromptLoader: load_prompts()
    PromptLoader-->>Pipeline: prompts (Prompt artifacts)
    Pipeline->>ClassifierTrainer: train_intent_classifier(queries)
    ClassifierTrainer-->>Pipeline: intent_classifier (model)
    Pipeline->>AgentComparer: run_architecture_comparison(queries, classifier, prompts)
    AgentComparer-->>Pipeline: results, agent_instances
    Pipeline->>Evaluator: evaluate_and_decide(queries, results)
    Evaluator-->>Pipeline: HTML report
    Pipeline-->>ZenMLDashboard: Artifacts, visualizations, report
```
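
The diagram maps onto a pipeline of roughly the following shape. This is a hedged sketch: step names come from the diagram, but the exact signatures live in examples/agent_comparison/ and may differ.

```python
from zenml import pipeline

# Step imports assume the example's steps package re-exports these names.
from steps import (
    evaluate_and_decide,
    load_prompts,
    load_real_conversations,
    run_architecture_comparison,
    train_intent_classifier,
)


@pipeline(enable_cache=False)
def compare_agent_architectures() -> None:
    queries = load_real_conversations()  # DataFrame of customer queries
    prompts = load_prompts()  # versioned Prompt artifacts
    intent_classifier = train_intent_classifier(queries)
    results, agent_instances = run_architecture_comparison(
        queries, intent_classifier, prompts
    )
    evaluate_and_decide(queries, results)  # produces the HTML comparison report
```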

Suggested labels

enhancement

Suggested reviewers

  • AlexejPenner

Poem

🐇
In ZenML’s garden, agents now bloom,
With prompts and graphs in every room.
Pipelines compare, reports are spun,
ML and agents working as one.
From secrets kept to dashboards bright,
This rabbit’s code brings pure delight!
🥕


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve to resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

github-actions bot added the internal (To filter out internal PRs and issues) label on Jul 9, 2025
socket-security bot commented Jul 10, 2025

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | langgraph@0.5.2 | 73 | 100 | 100 | 100 | 100 |
| Added | langchain-core@0.3.68 | 74 | 100 | 100 | 100 | 100 |
| Added | numpy@2.3.1 | 75 | 100 | 100 | 100 | 70 |
| Added | litellm@1.74.3 | 75 | 100 | 100 | 100 | 80 |
| Added | openai@1.95.1 | 95 | 100 | 100 | 100 | 100 |
| Added | langfuse@3.1.3 | 99 | 100 | 100 | 100 | 100 |

View full report

htahir1 marked this pull request as ready for review on July 10, 2025 at 11:11
github-actions bot (Contributor) commented Jul 10, 2025

Documentation Link Check Results

  • Absolute links check: failed. There are broken absolute links in the documentation; see the workflow logs for details.
  • Relative links check: passed.

Last checked: 2025-07-14 12:11:18 UTC

Images automagically compressed by Calibre's image-actions

Compression reduced images by 20.7%, saving 874.31 KB.

| Filename | Before | After | Improvement |
| --- | --- | --- | --- |
| docs/book/.gitbook/assets/readme_problem.png | 4.12 MB | 3.26 MB | -20.7% |

366 images did not require optimisation.

Update required: Update image-actions configuration to the latest version before 1/1/21. See README for instructions.

bcdurak and others added 3 commits July 10, 2025 19:02
…ation (#3735)

* first checkpoint

* new changes

* fixes

* new changes

* small change

* deprecate old method

* new changes

* missing import

* listen to events

* linting

* loop optimization

* changed the deprecation warning

* new condition

* switching to jobs

* formatting

* handling the store

* not allowing finished steps to be updated

* docstrings

* label param name

* removed unused function

* comment and formatting

* renamed function

* moved steps outside

* removed unused input

* additional check

* docstrings and formatting

* removed status checks

* orchestrator pod updates

* new check
htahir1 changed the title from "Update header.png file" to "Update README and add agent example" on Jul 10, 2025
@htahir1 (Contributor, Author) commented Jul 11, 2025

@AlexejPenner @schustmi I am merging this in for now over the weekend. We can redo it on Monday; please approve it when you see this.

htahir1 requested a review from AlexejPenner on July 11, 2025 at 17:42
@htahir1 (Contributor, Author) commented Jul 13, 2025

@coderabbitai summary

coderabbitai bot (Contributor) commented Jul 13, 2025

✅ Actions performed

Summary regeneration triggered.

@htahir1 (Contributor, Author) commented Jul 13, 2025

@coderabbitai review

coderabbitai bot (Contributor) commented Jul 13, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@htahir1 (Contributor, Author) commented Jul 13, 2025

@coderabbitai review

coderabbitai bot (Contributor) commented Jul 13, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai bot (Contributor) left a comment

Actionable comments posted: 21

♻️ Duplicate comments (2)
examples/agent_comparison/steps/__init__.py (1)

3-6: Relative imports are fine inside a package – no action needed

The imports resolve correctly as long as examples.agent_comparison.steps is a proper package and is referenced via its parent directory in PYTHONPATH or installed. Docker images using pip install -e . will work.

examples/agent_comparison/agents.py (1)

317-317: Token counting issue persists.

Same inaccurate token counting as noted earlier.

Also applies to: 337-337
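
If accurate counts are wanted, one option is LiteLLM's tokenizer helper; a sketch of that approach (assuming litellm.token_counter, not what the example currently does):

```python
from litellm import token_counter


def count_tokens(model: str, prompt: str, response_text: str) -> int:
    """Count prompt plus completion tokens using LiteLLM's tokenizer helper."""
    prompt_tokens = token_counter(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    completion_tokens = token_counter(model=model, text=response_text)
    return prompt_tokens + completion_tokens
```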

🧹 Nitpick comments (27)
examples/agent_comparison/.gitignore (1)

1-43: Consider de-duplicating per-project .gitignore rules

Most of these patterns already exist in the repository-root .gitignore.
Keeping two copies increases the chance they drift apart (e.g. when new build artefacts appear).
If the example really needs additional ignores, prefer adding only the deltas here (or reference the root file via a comment).

docs/book/getting-started/zenml-pro/workspaces.md (2)

12-12: Sentence getting a bit long – split for clarity

The new clause adds valuable context but makes the sentence hard to parse.
Suggestion:

-...are scoped to a workspace. This includes both traditional ML workflows and AI agent development projects.
+...are scoped to a workspace.  
+Workspaces now cover both traditional ML workflows *and* AI-agent development projects.

128-134: Indent level off on the new list items

The added “AI Agent Development Workspace” / “Agent Development Team Workspace” bullets are one space shorter than their siblings, which GitBook sometimes renders as a separate list.

-   * AI Agent Development Workspace
+    * AI Agent Development Workspace
...
-   * Agent Development Team Workspace
+    * Agent Development Team Workspace
examples/agent_comparison/pipelines/__init__.py (1)

3-5: Avoid heavy imports in __init__.py

Importing the full pipeline at package import time can pull in expensive dependencies (LangGraph, LiteLLM, etc.) even when users only want sub-modules.
Consider a lazy import pattern:

from importlib import import_module
import typing as _t

def __getattr__(name: str) -> _t.Any:  # type: ignore[override]
    if name == "compare_agent_architectures":
        return import_module(".agent_comparison", __package__).compare_agent_architectures
    raise AttributeError(name)

This preserves the public API while deferring the heavy work.

examples/agent_comparison/materializers/__init__.py (2)

7-13: Side-effect registration at import time – double-check circular import risk

Registering the PromptMaterializer during module import works, but it means every import of examples.agent_comparison.materializers triggers the registry mutation.
If any downstream code also imports PromptMaterializer (or Prompt) early, you may re-introduce the circular-import situation you just solved.

Consider guarding the call or moving it to an explicit register_materializers() helper that the pipeline’s entry-point invokes once.
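
A possible shape for that helper, sketched under the assumption that the example registers its Prompt type via ZenML's materializer registry (the exact registration call may differ):

```python
# materializers/__init__.py: explicit, idempotent registration instead of import side effects
_REGISTERED = False


def register_materializers() -> None:
    """Register the example's custom materializers exactly once, on demand."""
    global _REGISTERED
    if _REGISTERED:
        return
    from zenml.materializers.materializer_registry import materializer_registry

    from .prompt import Prompt
    from .prompt_materializer import PromptMaterializer

    materializer_registry.register_materializer_type(Prompt, PromptMaterializer)
    _REGISTERED = True
```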


15-19: Export list can be future-proofed

__all__ is great. To keep IDE auto-completions stable, sort entries or add a # noqa: F401 comment where wildcard re-exports might confuse linters.

docs/book/introduction.md (1)

17-20: Tighten the introductory paragraph for clarity

The new sentence is informative but long and marketing-heavy. A tighter alternative improves readability:

-ZenML is a unified MLOps framework that extends the battle-tested principles you rely on for classical ML to the new world of AI agents. It's one platform to develop, evaluate, and deploy your entire AI portfolio - from decision trees to complex multi-agent systems. By providing a single framework for your entire AI stack, ZenML enables developers across your organization to collaborate more effectively without maintaining separate toolchains for models and agents.
+ZenML is a unified MLOps framework that applies proven ML principles to AI agents. Use a single platform to develop, evaluate, and deploy everything from decision-tree models to multi-agent systems, without juggling separate toolchains.
examples/agent_comparison/steps/__init__.py (1)

8-14: Keep __all__ alphabetical for quick diffs

Minor style tweak – alphabetical ordering (evaluate_and_decide, load_prompts, …) makes additions easier to review.

examples/agent_comparison/agent_comparison_pipeline.py (1)

10-10: Use explicit relative import for better clarity.

The current import assumes a specific directory structure. Consider using an explicit relative import to make the dependency clearer.

-from pipelines import compare_agent_architectures
+from .pipelines import compare_agent_architectures
examples/agent_comparison/run.py (2)

8-8: Use absolute imports for better clarity

Consider using absolute imports to make the module structure clearer and avoid potential import issues.

-from pipelines import compare_agent_architectures
+from examples.agent_comparison.pipelines import compare_agent_architectures

23-23: Move imports to the top of the file

Following PEP 8 guidelines, imports should be at the top of the file unless there's a specific reason (like avoiding circular imports).

Move these imports to the top of the file:

+from llm_utils import should_use_langfuse, should_use_real_llm
+
 from pipelines import compare_agent_architectures


 def main() -> None:
     """Run the agent architecture comparison pipeline."""
     print("🚀 Starting Agent Architecture Comparison Pipeline...")
     # ... rest of the function
     
-    # Check if real LLMs will be used
-    from llm_utils import should_use_langfuse, should_use_real_llm
examples/agent_comparison/steps/evaluation.py (1)

43-45: Document the mock cost calculation as a placeholder.

The cost calculation uses a hardcoded value that may not reflect actual costs. Consider making this configurable or more clearly documenting it as a placeholder.

-            "total_cost_estimate": np.sum(tokens)
-            * 0.0001,  # Mock cost calculation
+            # TODO: Replace with actual cost calculation based on model pricing
+            "total_cost_estimate": np.sum(tokens)
+            * 0.0001,  # Mock cost: $0.0001 per token (placeholder)
docs/book/getting-started/core-concepts.md (2)

10-10: Consider more concise wording.

The phrase "In order to" could be simplified for better readability.

-**ZenML** is a unified, extensible, open-source MLOps framework for creating portable, production-ready **MLOps pipelines**. It's built for data scientists, ML Engineers, and MLOps Developers to collaborate as they develop to production. By extending the battle-tested principles you rely on for classical ML to the new world of AI agents, ZenML serves as one platform to develop, evaluate, and deploy your entire AI portfolio - from decision trees to complex multi-agent systems. In order to achieve this goal, ZenML introduces various concepts for different aspects of ML workflows and AI agent development, and we can categorize these concepts under three different threads:
+**ZenML** is a unified, extensible, open-source MLOps framework for creating portable, production-ready **MLOps pipelines**. It's built for data scientists, ML Engineers, and MLOps Developers to collaborate as they develop to production. By extending the battle-tested principles you rely on for classical ML to the new world of AI agents, ZenML serves as one platform to develop, evaluate, and deploy your entire AI portfolio - from decision trees to complex multi-agent systems. To achieve this goal, ZenML introduces various concepts for different aspects of ML workflows and AI agent development, and we can categorize these concepts under three different threads:

181-181: Simplify wording for better readability.

The phrase "have the ability to" can be more concise.

-With a deployed **ZenML Server**, users have the ability to create their own teams and project structures. They can easily share pipelines, runs, stacks, and other resources, streamlining the workflow and promoting teamwork across the entire AI development lifecycle.
+With a deployed **ZenML Server**, users can create their own teams and project structures. They can easily share pipelines, runs, stacks, and other resources, streamlining the workflow and promoting teamwork across the entire AI development lifecycle.
examples/agent_comparison/materializers/agent_materializer.py (1)

7-7: Use absolute import for better maintainability.

The relative import from agents import BaseAgent could break if the module structure changes. Consider using an absolute import path.

-from agents import BaseAgent
+from examples.agent_comparison.agents import BaseAgent
examples/agent_comparison/materializers/prompt_materializer.py (1)

8-12: Remove empty version check block.

The Python version check contains empty pass statements and serves no purpose. Either implement the version-specific logic or remove this block entirely.

-if sys.version_info >= (3, 9):
-    pass
-else:
-    pass
examples/agent_comparison/README.md (1)

99-122: Add language specification to code block.

The directory structure code block should specify a language for better syntax highlighting.

-```
+```text
examples/agent_comparison/
├── agents.py                    # Agent architecture implementations
examples/agent_comparison/agents.py (7)

11-12: Apply type ignores more selectively.

Instead of ignoring all type errors for entire import lines, consider being more specific about what needs to be ignored.

-from langchain_core.messages import BaseMessage, HumanMessage  # type: ignore
-from langgraph.graph import END, START, StateGraph  # type: ignore
+from langchain_core.messages import BaseMessage, HumanMessage  # type: ignore[import-untyped]
+from langgraph.graph import END, START, StateGraph  # type: ignore[import-untyped]

13-14: Use absolute imports for better maintainability.

Relative imports can make the code harder to refactor and maintain. Consider using absolute imports.

-from llm_utils import call_llm, should_use_real_llm
-from materializers.prompt import Prompt
+from examples.agent_comparison.llm_utils import call_llm, should_use_real_llm
+from examples.agent_comparison.materializers.prompt import Prompt

27-45: Consider using dataclass for cleaner implementation.

The AgentResponse class can be simplified using Python's dataclass decorator, which would reduce boilerplate and provide useful methods automatically.

+from dataclasses import dataclass
+
+
+@dataclass
 class AgentResponse:
     """Response from an agent with metadata."""
 
-    def __init__(
-        self, text: str, latency_ms: float, confidence: float, tokens_used: int
-    ):
-        """Initialize agent response.
-
-        Args:
-            text: Response text from the agent
-            latency_ms: Response time in milliseconds
-            confidence: Confidence score (0-1)
-            tokens_used: Number of tokens consumed
-        """
-        self.text = text
-        self.latency_ms = latency_ms
-        self.confidence = confidence
-        self.tokens_used = tokens_used
+    text: str
+    """Response text from the agent."""
+    latency_ms: float
+    """Response time in milliseconds."""
+    confidence: float
+    """Confidence score (0-1)."""
+    tokens_used: int
+    """Number of tokens consumed."""

90-158: Refactor process_query for better readability and maintainability.

The method is quite long and handles multiple concerns. Consider extracting the LLM and fallback logic into separate methods.
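
One way the split could look; a sketch only, with invented helper names, reusing the example's call_llm/should_use_real_llm utilities and the single_agent_rag prompt placeholders:

```python
import time
from typing import Tuple


class SingleAgentRAG(BaseAgent):
    """Sketch: process_query delegates to one helper per concern."""

    def process_query(self, query: str) -> AgentResponse:
        start = time.perf_counter()
        if should_use_real_llm():
            text, confidence = self._answer_with_llm(query)
        else:
            text, confidence = self._answer_with_fallback(query)
        latency_ms = (time.perf_counter() - start) * 1000
        # Token counting omitted here; see the token-counting discussion above.
        return AgentResponse(text, latency_ms, confidence, tokens_used=0)

    def _answer_with_llm(self, query: str) -> Tuple[str, float]:
        # Placeholders follow prompts/single_agent_rag.txt ({knowledge_context}, {query})
        prompt = self.prompts["single_agent_rag"].format(
            knowledge_context="", query=query
        )
        return call_llm(prompt), 0.9

    def _answer_with_fallback(self, query: str) -> Tuple[str, float]:
        # Deterministic canned reply so the demo runs without an API key
        return f"Thanks for reaching out about: {query}", 0.5
```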


246-273: Document limitations of keyword-based routing.

The simple keyword matching could misroute queries or fail to match relevant terms. Consider documenting this limitation or suggesting future improvements.

Add a docstring note:

"""Route query to appropriate specialist.

Note: This uses simple keyword matching which may not capture all relevant
queries. Consider using a more sophisticated approach like embedding-based
similarity or a trained classifier for production use.

610-618: Make response validation limits configurable.

The hardcoded length limits (20 and 500) should be configurable for flexibility.

 class LangGraphCustomerServiceAgent(BaseAgent):
     """LangGraph-based customer service agent with workflow visualization."""
 
+    MIN_RESPONSE_LENGTH = 20
+    MAX_RESPONSE_LENGTH = 500
+
     def __init__(self, prompts: Optional[List[Prompt]] = None) -> None:

...

     def _validate_response(
         self, state: CustomerServiceState
     ) -> CustomerServiceState:
         """Validate and finalize the response."""
         # Simple validation - ensure response is not empty and has reasonable length
-        if len(state["response_text"]) < 20:
+        if len(state["response_text"]) < self.MIN_RESPONSE_LENGTH:
             state["response_text"] = (
                 "I apologize, but I need more information to provide you with the best assistance. Could you please provide more details about your request?"
             )
             state["confidence"] = 0.6
-        elif len(state["response_text"]) > 500:
-            state["response_text"] = state["response_text"][:500] + "..."
+        elif len(state["response_text"]) > self.MAX_RESPONSE_LENGTH:
+            state["response_text"] = state["response_text"][:self.MAX_RESPONSE_LENGTH] + "..."
             state["confidence"] = max(0.7, state["confidence"] - 0.1)

649-649: Document that token counting is simulated.

The random token count should be documented as a simulation for demo purposes.

-            tokens_used=random.randint(80, 180),  # Simulated for demo
+            tokens_used=random.randint(80, 180),  # TODO: Implement actual token counting
README.md (3)

78-80: Avoid raw attachment URLs

The bare GitHub-attachments link is brittle (it breaks on PyPI and when GitHub GC’s the asset). Embed it as a Markdown image/video or host it elsewhere:

![Architecture comparison demo](docs/book/.gitbook/assets/agent_comparison_demo.gif)

103-118: Use a concrete return type instead of bare tuple

Precise typing makes the sample copy-paste-ready and helps static checkers.

-from zenml.types import HTMLString
+from typing import Tuple
+from zenml.types import HTMLString
 ...
-def run_architecture_comparison(queries: pd.DataFrame, classifier, prompts: dict) -> tuple:
+def run_architecture_comparison(
+    queries: pd.DataFrame,
+    classifier,
+    prompts: dict,
+) -> Tuple[dict, str]:

305-305: Grammar nit: “to” reads better than “with”

- - 🎙 [Slack](https://zenml.io/slack) - Talk with AI practitioners
+ - 🎙 [Slack](https://zenml.io/slack) - Talk to AI practitioners
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07beafb and 79fab63.

⛔ Files ignored due to path filters (2)
  • docs/book/.gitbook/assets/header.png is excluded by !**/*.png
  • docs/book/.gitbook/assets/readme_problem.png is excluded by !**/*.png
📒 Files selected for processing (31)
  • README.md (5 hunks)
  • docs/book/getting-started/core-concepts.md (5 hunks)
  • docs/book/getting-started/hello-world.md (2 hunks)
  • docs/book/getting-started/zenml-pro/projects.md (2 hunks)
  • docs/book/getting-started/zenml-pro/workspaces.md (2 hunks)
  • docs/book/how-to/artifacts/artifacts.md (3 hunks)
  • docs/book/how-to/models/models.md (6 hunks)
  • docs/book/how-to/secrets/secrets.md (5 hunks)
  • docs/book/introduction.md (1 hunks)
  • examples/agent_comparison/.gitignore (1 hunks)
  • examples/agent_comparison/README.md (1 hunks)
  • examples/agent_comparison/agent_comparison_pipeline.py (1 hunks)
  • examples/agent_comparison/agents.py (1 hunks)
  • examples/agent_comparison/llm_utils.py (1 hunks)
  • examples/agent_comparison/materializers/__init__.py (1 hunks)
  • examples/agent_comparison/materializers/agent_materializer.py (1 hunks)
  • examples/agent_comparison/materializers/prompt.py (1 hunks)
  • examples/agent_comparison/materializers/prompt_materializer.py (1 hunks)
  • examples/agent_comparison/materializers/prompt_visualizer.py (1 hunks)
  • examples/agent_comparison/pipelines/__init__.py (1 hunks)
  • examples/agent_comparison/pipelines/agent_comparison.py (1 hunks)
  • examples/agent_comparison/prompts/langgraph_workflow.txt (1 hunks)
  • examples/agent_comparison/prompts/single_agent_rag.txt (1 hunks)
  • examples/agent_comparison/prompts/specialist_prompts.json (1 hunks)
  • examples/agent_comparison/requirements.txt (1 hunks)
  • examples/agent_comparison/run.py (1 hunks)
  • examples/agent_comparison/steps/__init__.py (1 hunks)
  • examples/agent_comparison/steps/data_loading.py (1 hunks)
  • examples/agent_comparison/steps/evaluation.py (1 hunks)
  • examples/agent_comparison/steps/model_training.py (1 hunks)
  • examples/agent_comparison/steps/testing.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
docs/**/*.md

Instructions used from:

Sources:
⚙️ CodeRabbit Configuration File

🧠 Learnings (13)
📓 Common learnings
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Write comprehensive PR descriptions
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
docs/book/introduction.md (6)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/alembic/versions/*.py : Database schema changes require Alembic migrations
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/alembic/versions/*.py : Create migrations with descriptive names: alembic revision -m "Add X to Y table"
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: First line should be a concise summary (50 chars or less)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Update documentation for user-facing changes (or ensure that nothing was broken)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Use environment variables or ZenML's secret management for sensitive data
docs/book/getting-started/zenml-pro/workspaces.md (2)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Update documentation for user-facing changes (or ensure that nothing was broken)
examples/agent_comparison/requirements.txt (3)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/alembic/versions/*.py : Database schema changes require Alembic migrations
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Use environment variables or ZenML's secret management for sensitive data
docs/book/getting-started/zenml-pro/projects.md (3)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Update documentation for user-facing changes (or ensure that nothing was broken)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/alembic/versions/*.py : Database schema changes require Alembic migrations
examples/agent_comparison/.gitignore (1)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to **/*.py : The primary code style is enforced by ruff, configured in pyproject.toml
docs/book/getting-started/hello-world.md (2)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Use environment variables or ZenML's secret management for sensitive data
docs/book/how-to/artifacts/artifacts.md (2)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Use environment variables or ZenML's secret management for sensitive data
docs/book/getting-started/core-concepts.md (3)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/alembic/versions/*.py : Database schema changes require Alembic migrations
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to docs/book/**/* : Explanation of concepts in documentation
docs/book/how-to/secrets/secrets.md (2)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Use environment variables or ZenML's secret management for sensitive data
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
examples/agent_comparison/README.md (1)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
docs/book/how-to/models/models.md (3)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/alembic/versions/*.py : Database schema changes require Alembic migrations
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Use environment variables or ZenML's secret management for sensitive data
README.md (4)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/**/*.py : Document public APIs thoroughly when implementing features
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to src/zenml/alembic/versions/*.py : Database schema changes require Alembic migrations
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Update documentation for user-facing changes (or ensure that nothing was broken)
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Use environment variables or ZenML's secret management for sensitive data
🧬 Code Graph Analysis (7)
examples/agent_comparison/pipelines/__init__.py (1)
examples/agent_comparison/pipelines/agent_comparison.py (1)
  • compare_agent_architectures (35-71)
examples/agent_comparison/agent_comparison_pipeline.py (1)
examples/agent_comparison/pipelines/agent_comparison.py (1)
  • compare_agent_architectures (35-71)
examples/agent_comparison/steps/evaluation.py (2)
src/zenml/logger.py (1)
  • get_logger (173-187)
src/zenml/types.py (1)
  • HTMLString (26-27)
examples/agent_comparison/materializers/prompt.py (1)
src/zenml/utils/json_utils.py (1)
  • isoformat (44-53)
examples/agent_comparison/materializers/prompt_visualizer.py (2)
src/zenml/types.py (1)
  • HTMLString (26-27)
examples/agent_comparison/materializers/prompt.py (2)
  • Prompt (9-114)
  • get_variable_names (46-54)
examples/agent_comparison/materializers/agent_materializer.py (4)
examples/agent_comparison/agents.py (7)
  • BaseAgent (47-69)
  • get_graph_visualization (159-182)
  • get_graph_visualization (348-376)
  • get_graph_visualization (652-668)
  • get_mermaid_diagram (184-226)
  • get_mermaid_diagram (378-444)
  • get_mermaid_diagram (670-709)
src/zenml/enums.py (2)
  • ArtifactType (22-31)
  • VisualizationType (56-63)
src/zenml/materializers/base_materializer.py (1)
  • BaseMaterializer (111-395)
examples/agent_comparison/materializers/prompt_materializer.py (4)
  • load (28-39)
  • save (41-68)
  • save_visualizations (70-88)
  • extract_metadata (90-108)
examples/agent_comparison/agents.py (2)
examples/agent_comparison/llm_utils.py (2)
  • call_llm (76-119)
  • should_use_real_llm (27-38)
examples/agent_comparison/materializers/prompt.py (2)
  • Prompt (9-114)
  • format (35-44)
🪛 markdownlint-cli2 (0.17.2)
docs/book/getting-started/zenml-pro/workspaces.md

12-12: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3

(MD001, heading-increment)

README.md

99-99: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🪛 LanguageTool
docs/book/getting-started/core-concepts.md

[style] ~10-~10: Consider a more concise word here.
Context: ...n trees to complex multi-agent systems. In order to achieve this goal, ZenML introduces var...

(IN_ORDER_TO_PREMIUM)


[style] ~181-~181: The phrase ‘have the ability to’ might be wordy. Consider using “can”.
Context: ...With a deployed ZenML Server, users have the ability to create their own teams and project stru...

(HAS_THE_ABILITY_TO)

README.md

[style] ~305-~305: The preposition ‘with’ is not usually used with the verb ‘talk’. Consider using ‘speak’ instead, or replacing the preposition.
Context: ... - 🎙 Slack - Talk with AI practitioners ## ❓ FAQs from ML Eng...

(TALK_WITH)

🔇 Additional comments (24)
examples/agent_comparison/prompts/single_agent_rag.txt (1)

1-8: Prompt LGTM – verify placeholder names match code

{knowledge_context} and {query} look consistent with the loading code.
Just double-check the formatting logic in load_prompts() so that accidental whitespace/newline trimming does not break RAG inputs.

examples/agent_comparison/prompts/langgraph_workflow.txt (1)

1-8: Prompt template reads well

No issues spotted. The placeholders align with the LangGraph agent parameters.

examples/agent_comparison/prompts/specialist_prompts.json (1)

1-6: Consider escaping newlines for safer downstream parsing

The embedded \n\n sequences assume that whichever code interpolates {query} will also split on \n.
If the prompts are forwarded verbatim to an API expecting literal newlines, you’re fine; otherwise JSON-escaped newlines can be mis-interpreted.

Double-check the consumer in load_prompts to be sure the formatting renders as intended.

docs/book/getting-started/hello-world.md (3)

10-10: Excellent addition of unified framework messaging.

This clearly communicates ZenML's value proposition for both classical ML and AI agents, helping users understand the platform's comprehensive scope.


118-122: Well-crafted value proposition updates.

The unified framework and full tracking bullets effectively communicate how ZenML handles diverse AI workloads consistently. The examples (scikit-learn models vs multi-agent systems) provide concrete context.


125-139: Excellent reorganization of learning paths.

The thematic grouping with dedicated LLMs/AI agents section provides clear navigation for users with different interests. The links to specific resources (LLMOps guide, agent evaluation) are particularly valuable.

examples/agent_comparison/agent_comparison_pipeline.py (1)

12-31: Great user experience with informative console output.

The print statements provide clear feedback about what the pipeline does and where to find results. The emojis and formatting make it user-friendly without being overwhelming.

docs/book/getting-started/zenml-pro/projects.md (3)

8-8: Excellent integration of AI agent concepts.

The addition naturally extends the project isolation concept to include AI agents, providing clear value for teams working with mixed workloads.


20-20: Valuable addition to project benefits.

The experiment separation benefit clearly articulates why project isolation matters for different AI development approaches.


116-126: Practical best practices for AI agent development.

The guidance on separating traditional ML and agent development, implementing stricter controls, and tracking LLM costs provides actionable advice for teams adopting AI agents.

examples/agent_comparison/steps/model_training.py (2)

32-52: Simple but effective rule-based labeling approach.

The keyword matching provides a reasonable way to generate training labels for the demo. The logic covers the main customer service categories appropriately.


55-62: Well-configured scikit-learn pipeline.

The TF-IDF vectorizer with 100 features and English stop words, combined with LogisticRegression using a fixed random seed, provides a good baseline classifier with reproducible results.
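
That configuration corresponds to roughly this scikit-learn setup (a sketch reconstructed from the description above; the seed value is an assumption):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

intent_classifier = Pipeline(
    [
        # 100 TF-IDF features, English stop words removed
        ("tfidf", TfidfVectorizer(max_features=100, stop_words="english")),
        # Fixed seed for reproducible comparison runs (42 is assumed here)
        ("clf", LogisticRegression(random_state=42)),
    ]
)
```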

examples/agent_comparison/pipelines/agent_comparison.py (3)

8-9: Appropriate use of side-effect import for materializer registration.

This is a common ZenML pattern for ensuring custom materializers are registered before pipeline execution.


21-24: Good Docker configuration for reproducible execution.

The requirements file reference and uv package installer provide efficient and reproducible container builds.


32-71: Well-structured pipeline orchestration.

The step sequence is logical: data loading → prompt loading → model training → architecture comparison → evaluation. Disabling cache is appropriate for comparison runs to ensure fresh results.
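
In ZenML terms, that combination looks roughly like the following; a sketch, since the PR's exact settings may differ:

```python
from zenml import pipeline
from zenml.config import DockerSettings

docker_settings = DockerSettings(
    requirements="requirements.txt",  # example-specific dependencies
    python_package_installer="uv",  # fast, reproducible installs
)


@pipeline(enable_cache=False, settings={"docker": docker_settings})
def compare_agent_architectures() -> None:
    ...  # steps as defined in the example
```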

docs/book/how-to/artifacts/artifacts.md (1)

42-52: Well-structured prompt template example

The addition of the prompt template example clearly demonstrates how ZenML can manage AI artifacts beyond traditional ML data.

docs/book/how-to/secrets/secrets.md (1)

25-38: Excellent LLM secret examples

The CLI examples clearly demonstrate how to create secrets for various LLM providers and multi-agent systems.
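
Consuming such a secret inside a step typically looks like this minimal sketch; the secret and key names ("llm_credentials", "openai_api_key") are assumptions for illustration:

```python
from zenml import step
from zenml.client import Client


@step
def call_llm_securely(prompt: str) -> str:
    """Fetch the LLM API key from ZenML's secret store instead of hardcoding it."""
    secret = Client().get_secret("llm_credentials")
    api_key = secret.secret_values["openai_api_key"]
    # ...pass api_key to the LLM client of your choice...
    return f"called LLM with key ending in ...{api_key[-4:]}"
```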

examples/agent_comparison/llm_utils.py (1)

76-80: Use compatible type hints for Python < 3.9.

The type hint dict[str, Any] requires Python 3.9+. Use Dict[str, Any] from the typing module for broader compatibility.

+from typing import Any, Dict, Optional
-from typing import Any, Optional

 def call_llm(
     prompt: str,
     model: str = "gpt-3.5-turbo",
-    metadata: Optional[dict[str, Any]] = None,
+    metadata: Optional[Dict[str, Any]] = None,
 ) -> str:
⛔ Skipped due to learnings
Learnt from: CR
PR: zenml-io/zenml#0
File: CLAUDE.md:0-0
Timestamp: 2025-07-01T17:52:34.101Z
Learning: Applies to **/*.py : Use Python 3.9+ compatible code
examples/agent_comparison/materializers/agent_materializer.py (1)

77-140: Excellent visualization implementation!

The save_visualizations method is well-designed:

  • Handles both Mermaid diagrams and text visualizations gracefully
  • Generates clean, styled HTML with proper CSS
  • Correctly escapes curly braces in f-strings
  • Returns appropriate visualization type mappings
examples/agent_comparison/materializers/prompt_materializer.py (1)

13-13: Verify the MaterializersPrompt import source

The import

from materializers.prompt import Prompt as MaterializersPrompt

does not correspond to any materializers/prompt.py module in this repo. Please confirm that:

  • materializers.prompt is provided by an installed dependency or exists elsewhere in your project’s package layout.
  • If it’s for backward‐compatibility with an external package, add a comment explaining its purpose.
  • If it’s actually a local module, switch to a relative import (e.g. from .prompt import Prompt as MaterializersPrompt).

Affected location:

  • examples/agent_comparison/materializers/prompt_materializer.py, lines 13 & 25
examples/agent_comparison/materializers/prompt_visualizer.py (1)

1-489: Excellent visualization implementation!

This module provides comprehensive HTML visualizations for prompts with:

  • Proper HTML escaping throughout to prevent XSS
  • Beautiful, responsive design using ZenML's design system
  • Smart type detection in visualize_prompt_data using duck typing
  • Clean separation between single and multiple prompt visualizations

The CSS styling and HTML structure are professional and well-organized.

examples/agent_comparison/agents.py (2)

18-25: Well-structured state definition for LangGraph workflow.

The TypedDict provides clear type hints and documentation for the workflow state.


184-227: Consider security implications of returning raw HTML.

Returning raw HTML strings could pose XSS risks if not properly handled by the consumer. Ensure proper sanitization when rendering.
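
A minimal mitigation is to escape any user- or LLM-derived text before embedding it in markup; a standard-library sketch:

```python
import html


def safe_node_label(raw_text: str) -> str:
    """Escape untrusted text before embedding it in generated HTML or Mermaid output."""
    return html.escape(raw_text, quote=True)


# e.g. f"<div class='node'>{safe_node_label(agent_response.text)}</div>"
```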

README.md (1)

214-217: OpenAI client import will crash on older SDKs

openai.OpenAI() only exists in openai>=1.0.0. Either pin the version in the install command or use the backwards-compatible pattern below:

-import openai
-
-client = openai.OpenAI()  # Set OPENAI_API_KEY environment variable
+from openai import OpenAI
+
+client = OpenAI()  # Requires openai>=1.0.0

Comment on lines +46 to 51

@step
def evaluate_agent_response(prompt: str, test_query: str) -> dict:
    """Evaluates an AI agent's response to a test query."""
    response = call_llm_agent(prompt, test_query)
    return {"query": test_query, "response": response, "quality_score": 0.95}

🛠️ Refactor suggestion

Add import or comment for the undefined function.

The example uses call_llm_agent which is not defined or imported, potentially confusing readers.

 @step
 def evaluate_agent_response(prompt: str, test_query: str) -> dict:
     """Evaluates an AI agent's response to a test query."""
+    # Note: call_llm_agent is a placeholder for your LLM integration
+    # You would implement this using libraries like LiteLLM, LangChain, etc.
     response = call_llm_agent(prompt, test_query)
     return {"query": test_query, "response": response, "quality_score": 0.95}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

@step
def evaluate_agent_response(prompt: str, test_query: str) -> dict:
    """Evaluates an AI agent's response to a test query."""
    # Note: call_llm_agent is a placeholder for your LLM integration
    # You would implement this using libraries like LiteLLM, LangChain, etc.
    response = call_llm_agent(prompt, test_query)
    return {"query": test_query, "response": response, "quality_score": 0.95}
🤖 Prompt for AI Agents
In docs/book/getting-started/core-concepts.md around lines 46 to 51, the
function call_llm_agent is used but not defined or imported, which can confuse
readers. Add an import statement for call_llm_agent if it exists in the codebase
or include a comment explaining that call_llm_agent is a placeholder for the
actual function that calls the language model agent.

Comment on lines +20 to +40

def load(self, data_type: Type[Any]) -> BaseAgent:
    """Load BaseAgent object from artifact store.

    Args:
        data_type: The type to load (BaseAgent)

    Returns:
        Loaded BaseAgent object
    """
    _ = data_type  # Unused parameter
    with fileio.open(
        os.path.join(self.uri, "agent_metadata.json"), "r"
    ) as f:
        metadata = json.load(f)

    # Note: For this demo, we're not fully serializing/deserializing the agent state
    # In a production system, you'd want to save/restore the full agent configuration
    # For now, we'll create a basic agent instance with the stored name
    agent = BaseAgent(metadata["name"])
    return agent


⚠️ Potential issue

Critical: Agent state is not properly preserved during load.

The current implementation only restores the agent's name, losing all other state including prompts, specialists, knowledge bases, and other configuration. This breaks the fundamental contract of a materializer - the loaded object should be functionally equivalent to the saved one.

The comment on lines 35-37 acknowledges this limitation, but for production use, you need to implement proper serialization of the full agent state.

Would you like me to help implement proper serialization that preserves the complete agent state? This would involve saving agent type information and configuration parameters, then reconstructing the appropriate agent subclass with all its attributes.

🤖 Prompt for AI Agents
In examples/agent_comparison/materializers/agent_materializer.py between lines
20 and 40, the load method currently only restores the agent's name, losing all
other state and configuration. To fix this, update the serialization to save the
full agent state including prompts, specialists, knowledge bases, and
configuration when saving, and modify the load method to read this complete
state and reconstruct the appropriate agent subclass with all its attributes,
ensuring the loaded agent is functionally equivalent to the saved one.
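
One possible direction, sketched under the assumption that agents can reduce their prompts to serializable dicts (the to_dict/Prompt(**d) round-trip is hypothetical):

```python
import json
import os

from zenml.io import fileio

from agents import (
    BaseAgent,
    LangGraphCustomerServiceAgent,
    MultiSpecialistAgents,
    SingleAgentRAG,
)
from materializers.prompt import Prompt

# Map stored class names back to concrete agent types.
AGENT_CLASSES = {
    "SingleAgentRAG": SingleAgentRAG,
    "MultiSpecialistAgents": MultiSpecialistAgents,
    "LangGraphCustomerServiceAgent": LangGraphCustomerServiceAgent,
}


def save_agent_state(uri: str, agent: BaseAgent) -> None:
    state = {
        "class": type(agent).__name__,
        "name": agent.name,
        # Hypothetical: prompts reduced to their serializable fields
        "prompts": [p.to_dict() for p in agent.prompts.values()],
    }
    with fileio.open(os.path.join(uri, "agent_state.json"), "w") as f:
        json.dump(state, f)


def load_agent_state(uri: str) -> BaseAgent:
    with fileio.open(os.path.join(uri, "agent_state.json"), "r") as f:
        state = json.load(f)
    prompts = [Prompt(**d) for d in state["prompts"]]  # assumes Prompt accepts its dict fields
    return AGENT_CLASSES[state["class"]](prompts=prompts)
```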

Comment on lines +47 to +70

class BaseAgent:
    """Base class for all agent architectures."""

    def __init__(self, name: str, prompts: Optional[List[Prompt]] = None):
        """Initialize base agent.

        Args:
            name: Name of the agent architecture
            prompts: List of Prompt objects loaded as ZenML artifacts
        """
        self.name = name
        self.prompts = {p.name: p for p in (prompts or [])}

    def process_query(self, query: str) -> AgentResponse:
        """Process a single query and return response with metadata.

        Args:
            query: Customer service query text

        Raises:
            NotImplementedError: This method must be implemented by subclasses
        """
        raise NotImplementedError


🛠️ Refactor suggestion

Make BaseAgent a proper abstract base class.

The class should inherit from ABC and use abstractmethod decorator for better type checking and clearer intent.

+from abc import ABC, abstractmethod
+
+
-class BaseAgent:
+class BaseAgent(ABC):
     """Base class for all agent architectures."""

     def __init__(self, name: str, prompts: Optional[List[Prompt]] = None):
         """Initialize base agent.

         Args:
             name: Name of the agent architecture
             prompts: List of Prompt objects loaded as ZenML artifacts
         """
         self.name = name
         self.prompts = {p.name: p for p in (prompts or [])}

+    @abstractmethod
     def process_query(self, query: str) -> AgentResponse:
         """Process a single query and return response with metadata.

         Args:
             query: Customer service query text

         Raises:
             NotImplementedError: This method must be implemented by subclasses
         """
-        raise NotImplementedError
+        pass
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change (resulting code; the original version appears above)

from abc import ABC, abstractmethod


class BaseAgent(ABC):
    """Base class for all agent architectures."""

    def __init__(self, name: str, prompts: Optional[List[Prompt]] = None):
        """Initialize base agent.

        Args:
            name: Name of the agent architecture
            prompts: List of Prompt objects loaded as ZenML artifacts
        """
        self.name = name
        self.prompts = {p.name: p for p in (prompts or [])}

    @abstractmethod
    def process_query(self, query: str) -> AgentResponse:
        """Process a single query and return response with metadata.

        Args:
            query: Customer service query text

        Raises:
            NotImplementedError: This method must be implemented by subclasses
        """
        pass
🤖 Prompt for AI Agents
In examples/agent_comparison/agents.py around lines 47 to 70, the BaseAgent
class should be made a proper abstract base class by inheriting from ABC and
marking the process_query method with the @abstractmethod decorator. To fix
this, import ABC and abstractmethod from the abc module, have BaseAgent inherit
from ABC, and decorate the process_query method with @abstractmethod to enforce
implementation in subclasses and improve type checking.

htahir1 and others added 6 commits July 13, 2025 17:10
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
strickvl added the documentation (Improvements or additions to documentation) label on Jul 14, 2025
AlexejPenner (Contributor) left a comment

🦭 Let's get this in - we can always fine-tune as we go, as this can be updated without a release

Comment on lines 8 to 11
if sys.version_info >= (3, 9):
    pass
else:
    pass

Suggested change (remove this block):

if sys.version_info >= (3, 9):
    pass
else:
    pass

from typing import Tuple

import pandas as pd
from materializers.prompt import Prompt

This is mixing relative with non-relative imports which I don't like.

htahir1 requested a review from schustmi on July 14, 2025 at 09:59
LLM Finetuning template updates in examples/llm_finetuning have been pushed.

htahir1 merged commit 6b0047a into develop on Jul 14, 2025
6 checks passed
htahir1 deleted the misc/refine-readme-2025 branch on July 14, 2025 at 12:12
htahir1 added a commit that referenced this pull request Jul 14, 2025
* Update header.png file

* Update ZenML logo alt text in README.md

* Update core concepts and steps for ML workflows and agents

* Update model registration and promotion steps

* Enhance project and workspace organization features

* Refactor ML pipeline steps for clarity and consistency

* Update dependencies and add type hints to create_dataset().

* Update create_dataset to return split data as tuples

* Add agent comparison pipeline steps

* Refactor code for better readability

* Update integration information in README.md

* Update prompts, test architectures, and generate diagrams

* Update typing annotations to Any in prompt materializer and visualizer

* Add Langfuse observability integration to LLM utility calls

* Update Langfuse integration for LiteLLM

* Update readme_problem.png asset image

* Optimised images with calibre/image-actions

* Step Status Refresh Functionality + Kubernetes Orchestrator Implementation (#3735)

* first checkpoint

* new changes

* fixes

* new changes

* small change

* deprecate old method

* new changes

* missing import

* listen to events

* linting

* loop optimization

* changed the deprecation warning

* new condition

* switching to jobs

* formatting

* handling the store

* not allowing finished steps to be updated

* docstrings

* label param name

* removed unused function

* comment and formatting

* renamed function

* moved steps outside

* removed unused input

* additional check

* docstrings and formatting

* removed status checks

* orchestrator pod updates

* new check

* Upper limit datasets version (#3824)

* Add Docker settings to pipeline and refactor data loading steps

* Update agent visualizations with automatic generation

* Update visualizations method in Agent Architecture Comparison example

* Register agent materializer import to trigger registration

* Refactor data_loading function return annotations

* Add handling for missing OpenAI library import

* Remove detailed agent workflow print statement

* Update examples/agent_comparison/agent_comparison_pipeline.py

Co-authored-by: Alexej Penner <[email protected]>

* Update pipeline script with evaluation message

* Update README.md

* Update docs/book/how-to/secrets/secrets.md

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Update README.md

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Remove Langfuse integration and references

* Add token counting notes for accurate counting

* Add import of "re" at the top of the file

* Update README.md

* Update imports to remove unnecessary type ignore

* Update environment variables to use None as default

* Integrate ZenML MCP Server for conversational insights

* Auto-update of LLM Finetuning template

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Barış Can Durak <[email protected]>
Co-authored-by: Michael Schuster <[email protected]>
Co-authored-by: Alexej Penner <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: GitHub Actions <[email protected]>
(cherry picked from commit 6b0047a)
Labels: documentation (Improvements or additions to documentation), internal (To filter out internal PRs and issues)
Projects: none yet
Participants: 6