Update README and add agent example #3815
> **Important**: Review skipped. Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI.

📝 **Walkthrough**

This update introduces a comprehensive AI agent architecture comparison example to ZenML, including new modular agents, prompt versioning, custom materializers, and detailed pipeline steps. Documentation across the codebase is updated to reflect ZenML's unified support for both classical ML and AI agents, with new examples, best practices, and usage instructions. The agent comparison example integrates LiteLLM, LangGraph, and Langfuse, and provides reproducible evaluation and visualization of multiple agent workflows.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant User
    participant Pipeline
    participant DataLoader
    participant PromptLoader
    participant ClassifierTrainer
    participant AgentComparer
    participant Evaluator
    participant ZenMLDashboard
    User->>Pipeline: Run compare_agent_architectures()
    Pipeline->>DataLoader: load_real_conversations()
    DataLoader-->>Pipeline: queries (DataFrame)
    Pipeline->>PromptLoader: load_prompts()
    PromptLoader-->>Pipeline: prompts (Prompt artifacts)
    Pipeline->>ClassifierTrainer: train_intent_classifier(queries)
    ClassifierTrainer-->>Pipeline: intent_classifier (model)
    Pipeline->>AgentComparer: run_architecture_comparison(queries, classifier, prompts)
    AgentComparer-->>Pipeline: results, agent_instances
    Pipeline->>Evaluator: evaluate_and_decide(queries, results)
    Evaluator-->>Pipeline: HTML report
    Pipeline-->>ZenMLDashboard: Artifacts, visualizations, report
```
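For orientation, the flow above corresponds roughly to a ZenML pipeline definition like the following. This is a sketch inferred from the diagram and the step names it shows; the step imports and exact signatures are assumptions, and the real wiring lives in `examples/agent_comparison/pipelines/agent_comparison.py`:

```python
from zenml import pipeline

# Step imports are assumed; the real modules live under examples/agent_comparison/steps/.
from steps import (
    evaluate_and_decide,
    load_prompts,
    load_real_conversations,
    run_architecture_comparison,
    train_intent_classifier,
)


@pipeline(enable_cache=False)
def compare_agent_architectures() -> None:
    """Wire the steps in the order shown in the sequence diagram."""
    queries = load_real_conversations()
    prompts = load_prompts()
    intent_classifier = train_intent_classifier(queries=queries)
    results, agent_instances = run_architecture_comparison(
        queries=queries, classifier=intent_classifier, prompts=prompts
    )
    evaluate_and_decide(queries=queries, results=results)
```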
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
**Documentation Link Check Results**: ❌ Absolute links check failed
Images automagically compressed by Calibre's image-actions ✨ Compression reduced images by 20.7%, saving 874.31 KB. 366 images did not require optimisation.

Update required: update the image-actions configuration to the latest version before 1/1/21. See README for instructions.
…ation (#3735)

* first checkpoint
* new changes
* fixes
* new changes
* small change
* deprecate old method
* new changes
* missing import
* listen to events
* linting
* loop optimization
* changed the deprecation warning
* new condition
* switching to jobs
* formatting
* handling the store
* not allowing finished steps to be updated
* docstrings
* label param name
* removed unused function
* comment and formatting
* renamed function
* moved steps outside
* removed unused input
* additional check
* docstrings and formatting
* removed status checks
* orchestrator pod updates
* new check
@AlexejPenner @schustmi I am merging this in for now over the weekend. We can redo it on Monday; please approve it when you see this.
@coderabbitai summary

✅ Actions performed: Summary regeneration triggered.

@coderabbitai review

✅ Actions performed: Review triggered.

@coderabbitai review

✅ Actions performed: Review triggered.
Actionable comments posted: 21
♻️ Duplicate comments (2)
examples/agent_comparison/steps/__init__.py (1)

3-6: Relative imports are fine inside a package – no action needed

The imports resolve correctly as long as `examples.agent_comparison.steps` is a proper package and is referenced via its parent directory in `PYTHONPATH` or installed. Docker images using `pip install -e .` will work.

examples/agent_comparison/agents.py (1)

317-317: Token counting issue persists.

Same inaccurate token counting as noted earlier.

Also applies to: 337-337
🧹 Nitpick comments (27)
examples/agent_comparison/.gitignore (1)

1-43: Consider de-duplicating per-project .gitignore rules

Most of these patterns already exist in the repository-root `.gitignore`. Keeping two copies increases the chance they drift apart (e.g. when new build artefacts appear). If the example really needs additional ignores, prefer adding only the deltas here (or reference the root file via a comment).

docs/book/getting-started/zenml-pro/workspaces.md (2)
12-12: Sentence getting a bit long – split for clarity

The new clause adds valuable context but makes the sentence hard to parse. Suggestion:

```diff
-...are scoped to a workspace. This includes both traditional ML workflows and AI agent development projects.
+...are scoped to a workspace.
+Workspaces now cover both traditional ML workflows *and* AI-agent development projects.
```

128-134: Indent level off on the new list items

The added "AI Agent Development Workspace" / "Agent Development Team Workspace" bullets are one space shorter than their siblings, which GitBook sometimes renders as a separate list.

```diff
- * AI Agent Development Workspace
+  * AI Agent Development Workspace
 ...
- * Agent Development Team Workspace
+  * Agent Development Team Workspace
```

examples/agent_comparison/pipelines/__init__.py (1)
3-5: Avoid heavy imports in `__init__.py`

Importing the full pipeline at package import time can pull in expensive dependencies (LangGraph, LiteLLM, etc.) even when users only want sub-modules. Consider a lazy import pattern:

```python
from importlib import import_module
import typing as _t


def __getattr__(name: str) -> _t.Any:  # type: ignore[override]
    if name == "compare_agent_architectures":
        return import_module(
            ".agent_comparison", __package__
        ).compare_agent_architectures
    raise AttributeError(name)
```

This preserves the public API while deferring the heavy work.
examples/agent_comparison/materializers/__init__.py (2)

7-13: Side-effect registration at import time – double-check circular import risk

Registering the `PromptMaterializer` during module import works, but it means every import of `examples.agent_comparison.materializers` triggers the registry mutation. If any downstream code also imports `PromptMaterializer` (or `Prompt`) early, you may re-introduce the circular-import situation you just solved.

Consider guarding the call or moving it to an explicit `register_materializers()` helper that the pipeline's entry-point invokes once.
15-19: Export list can be future-proofed

`__all__` is great. To keep IDE auto-completions stable, sort entries or add a `# noqa: F401` comment where wildcard re-exports might confuse linters.
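For instance, a sorted export list might look like this (the exact names are assumptions based on the modules in this package, not a copy of the actual file):

```python
# Alphabetical ordering keeps additions easy to spot in diffs.
__all__ = [
    "AgentMaterializer",
    "Prompt",
    "PromptMaterializer",
]
```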
docs/book/introduction.md (1)

17-20: Tighten the introductory paragraph for clarity

The new sentence is informative but long and marketing-heavy. A tighter alternative improves readability:

```diff
-ZenML is a unified MLOps framework that extends the battle-tested principles you rely on for classical ML to the new world of AI agents. It's one platform to develop, evaluate, and deploy your entire AI portfolio - from decision trees to complex multi-agent systems. By providing a single framework for your entire AI stack, ZenML enables developers across your organization to collaborate more effectively without maintaining separate toolchains for models and agents.
+ZenML is a unified MLOps framework that applies proven ML principles to AI agents. Use a single platform to develop, evaluate, and deploy everything from decision-tree models to multi-agent systems, without juggling separate toolchains.
```
examples/agent_comparison/steps/__init__.py (1)

8-14: Keep `__all__` alphabetical for quick diffs

Minor style tweak – alphabetical ordering (`evaluate_and_decide`, `load_prompts`, …) makes additions easier to review.

examples/agent_comparison/agent_comparison_pipeline.py (1)
10-10: Use explicit relative import for better clarity.

The current import assumes a specific directory structure. Consider using an explicit relative import to make the dependency clearer.

```diff
-from pipelines import compare_agent_architectures
+from .pipelines import compare_agent_architectures
```

examples/agent_comparison/run.py (2)
8-8: Use absolute imports for better clarity

Consider using absolute imports to make the module structure clearer and avoid potential import issues.

```diff
-from pipelines import compare_agent_architectures
+from examples.agent_comparison.pipelines import compare_agent_architectures
```

23-23: Move imports to the top of the file

Following PEP 8 guidelines, imports should be at the top of the file unless there's a specific reason (like avoiding circular imports). Move these imports to the top of the file:

```diff
+from llm_utils import should_use_langfuse, should_use_real_llm
+
 from pipelines import compare_agent_architectures

 def main() -> None:
     """Run the agent architecture comparison pipeline."""
     print("🚀 Starting Agent Architecture Comparison Pipeline...")
     # ... rest of the function
-    # Check if real LLMs will be used
-    from llm_utils import should_use_langfuse, should_use_real_llm
```

examples/agent_comparison/steps/evaluation.py (1)
43-45: Document the mock cost calculation as a placeholder.

The cost calculation uses a hardcoded value that may not reflect actual costs. Consider making this configurable or more clearly documenting it as a placeholder.

```diff
-    "total_cost_estimate": np.sum(tokens)
-    * 0.0001,  # Mock cost calculation
+    # TODO: Replace with actual cost calculation based on model pricing
+    "total_cost_estimate": np.sum(tokens)
+    * 0.0001,  # Mock cost: $0.0001 per token (placeholder)
```
docs/book/getting-started/core-concepts.md (2)

10-10: Consider more concise wording.

The phrase "In order to" could be simplified for better readability.

```diff
-**ZenML** is a unified, extensible, open-source MLOps framework for creating portable, production-ready **MLOps pipelines**. It's built for data scientists, ML Engineers, and MLOps Developers to collaborate as they develop to production. By extending the battle-tested principles you rely on for classical ML to the new world of AI agents, ZenML serves as one platform to develop, evaluate, and deploy your entire AI portfolio - from decision trees to complex multi-agent systems. In order to achieve this goal, ZenML introduces various concepts for different aspects of ML workflows and AI agent development, and we can categorize these concepts under three different threads:
+**ZenML** is a unified, extensible, open-source MLOps framework for creating portable, production-ready **MLOps pipelines**. It's built for data scientists, ML Engineers, and MLOps Developers to collaborate as they develop to production. By extending the battle-tested principles you rely on for classical ML to the new world of AI agents, ZenML serves as one platform to develop, evaluate, and deploy your entire AI portfolio - from decision trees to complex multi-agent systems. To achieve this goal, ZenML introduces various concepts for different aspects of ML workflows and AI agent development, and we can categorize these concepts under three different threads:
```

181-181: Simplify wording for better readability.

The phrase "have the ability to" can be more concise.

```diff
-With a deployed **ZenML Server**, users have the ability to create their own teams and project structures. They can easily share pipelines, runs, stacks, and other resources, streamlining the workflow and promoting teamwork across the entire AI development lifecycle.
+With a deployed **ZenML Server**, users can create their own teams and project structures. They can easily share pipelines, runs, stacks, and other resources, streamlining the workflow and promoting teamwork across the entire AI development lifecycle.
```

examples/agent_comparison/materializers/agent_materializer.py (1)
7-7: Use absolute import for better maintainability.

The relative import `from agents import BaseAgent` could break if the module structure changes. Consider using an absolute import path.

```diff
-from agents import BaseAgent
+from examples.agent_comparison.agents import BaseAgent
```

examples/agent_comparison/materializers/prompt_materializer.py (1)
8-12: Remove empty version check block.

The Python version check contains empty pass statements and serves no purpose. Either implement the version-specific logic or remove this block entirely.

```diff
-if sys.version_info >= (3, 9):
-    pass
-else:
-    pass
```

examples/agent_comparison/README.md (1)
99-122: Add language specification to code block.

The directory structure code block should specify a language for better syntax highlighting.

````diff
-```
+```text
 examples/agent_comparison/
 ├── agents.py            # Agent architecture implementations
````

examples/agent_comparison/agents.py (7)
11-12: Apply type ignores more selectively.

Instead of ignoring all type errors for entire import lines, consider being more specific about what needs to be ignored.

```diff
-from langchain_core.messages import BaseMessage, HumanMessage  # type: ignore
-from langgraph.graph import END, START, StateGraph  # type: ignore
+from langchain_core.messages import BaseMessage, HumanMessage  # type: ignore[import-untyped]
+from langgraph.graph import END, START, StateGraph  # type: ignore[import-untyped]
```

13-14: Use absolute imports for better maintainability.

Relative imports can make the code harder to refactor and maintain. Consider using absolute imports.

```diff
-from llm_utils import call_llm, should_use_real_llm
-from materializers.prompt import Prompt
+from examples.agent_comparison.llm_utils import call_llm, should_use_real_llm
+from examples.agent_comparison.materializers.prompt import Prompt
```
27-45: Consider using dataclass for cleaner implementation.

The AgentResponse class can be simplified using Python's dataclass decorator, which would reduce boilerplate and provide useful methods automatically.

```diff
+from dataclasses import dataclass
+
+
+@dataclass
 class AgentResponse:
     """Response from an agent with metadata."""

-    def __init__(
-        self, text: str, latency_ms: float, confidence: float, tokens_used: int
-    ):
-        """Initialize agent response.
-
-        Args:
-            text: Response text from the agent
-            latency_ms: Response time in milliseconds
-            confidence: Confidence score (0-1)
-            tokens_used: Number of tokens consumed
-        """
-        self.text = text
-        self.latency_ms = latency_ms
-        self.confidence = confidence
-        self.tokens_used = tokens_used
+    text: str
+    """Response text from the agent."""
+    latency_ms: float
+    """Response time in milliseconds."""
+    confidence: float
+    """Confidence score (0-1)."""
+    tokens_used: int
+    """Number of tokens consumed."""
```
90-158: Refactor process_query for better readability and maintainability.

The method is quite long and handles multiple concerns. Consider extracting the LLM and fallback logic into separate methods, as sketched below.
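One possible shape for that split (a sketch only; the subclass name, helper methods, and confidence value are illustrative, not taken from the PR):

```python
import time


class SingleAgentRAG(BaseAgent):  # illustrative subclass name
    def process_query(self, query: str) -> AgentResponse:
        """Time the query and delegate to the LLM path or the fallback."""
        start = time.perf_counter()
        if should_use_real_llm():
            text, tokens = self._respond_with_llm(query)  # hypothetical helper
        else:
            text, tokens = self._respond_with_fallback(query)  # hypothetical helper
        latency_ms = (time.perf_counter() - start) * 1000
        return AgentResponse(
            text=text, latency_ms=latency_ms, confidence=0.8, tokens_used=tokens
        )
```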
246-273: Document limitations of keyword-based routing.

The simple keyword matching could misroute queries or fail to match relevant terms. Consider documenting this limitation or suggesting future improvements.

Add a docstring note:

```python
"""Route query to appropriate specialist.

Note: This uses simple keyword matching which may not capture all
relevant queries. Consider using a more sophisticated approach like
embedding-based similarity or a trained classifier for production use.
"""
```
610-618: Make response validation limits configurable.

The hardcoded length limits (20 and 500) should be configurable for flexibility.

```diff
 class LangGraphCustomerServiceAgent(BaseAgent):
     """LangGraph-based customer service agent with workflow visualization."""

+    MIN_RESPONSE_LENGTH = 20
+    MAX_RESPONSE_LENGTH = 500
+
     def __init__(self, prompts: Optional[List[Prompt]] = None) -> None:
         ...

     def _validate_response(
         self, state: CustomerServiceState
     ) -> CustomerServiceState:
         """Validate and finalize the response."""
         # Simple validation - ensure response is not empty and has reasonable length
-        if len(state["response_text"]) < 20:
+        if len(state["response_text"]) < self.MIN_RESPONSE_LENGTH:
             state["response_text"] = (
                 "I apologize, but I need more information to provide you with the best assistance. Could you please provide more details about your request?"
             )
             state["confidence"] = 0.6
-        elif len(state["response_text"]) > 500:
-            state["response_text"] = state["response_text"][:500] + "..."
+        elif len(state["response_text"]) > self.MAX_RESPONSE_LENGTH:
+            state["response_text"] = (
+                state["response_text"][: self.MAX_RESPONSE_LENGTH] + "..."
+            )
             state["confidence"] = max(0.7, state["confidence"] - 0.1)
```
649-649: Document that token counting is simulated.

The random token count should be documented as a simulation for demo purposes.

```diff
-    tokens_used=random.randint(80, 180),  # Simulated for demo
+    tokens_used=random.randint(80, 180),  # TODO: Implement actual token counting
```

README.md (3)
78-80: Avoid raw attachment URLs

The bare GitHub-attachments link is brittle (it breaks on PyPI and when GitHub GC's the asset). Embed it as a Markdown image/video or host it elsewhere:

```markdown

```

103-118: Use a concrete return type instead of bare `tuple`

Precise typing makes the sample copy-paste-ready and helps static checkers.

```diff
-from zenml.types import HTMLString
+from typing import Tuple
+from zenml.types import HTMLString
 ...
-def run_architecture_comparison(queries: pd.DataFrame, classifier, prompts: dict) -> tuple:
+def run_architecture_comparison(
+    queries: pd.DataFrame,
+    classifier,
+    prompts: dict,
+) -> Tuple[dict, str]:
```

305-305: Grammar nit: "to" reads better than "with"

```diff
- - 🎙 [Slack](https://zenml.io/slack) - Talk with AI practitioners
+ - 🎙 [Slack](https://zenml.io/slack) - Talk to AI practitioners
```
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)

- `docs/book/.gitbook/assets/header.png` is excluded by `!**/*.png`
- `docs/book/.gitbook/assets/readme_problem.png` is excluded by `!**/*.png`

📒 Files selected for processing (31)
- `README.md` (5 hunks)
- `docs/book/getting-started/core-concepts.md` (5 hunks)
- `docs/book/getting-started/hello-world.md` (2 hunks)
- `docs/book/getting-started/zenml-pro/projects.md` (2 hunks)
- `docs/book/getting-started/zenml-pro/workspaces.md` (2 hunks)
- `docs/book/how-to/artifacts/artifacts.md` (3 hunks)
- `docs/book/how-to/models/models.md` (6 hunks)
- `docs/book/how-to/secrets/secrets.md` (5 hunks)
- `docs/book/introduction.md` (1 hunks)
- `examples/agent_comparison/.gitignore` (1 hunks)
- `examples/agent_comparison/README.md` (1 hunks)
- `examples/agent_comparison/agent_comparison_pipeline.py` (1 hunks)
- `examples/agent_comparison/agents.py` (1 hunks)
- `examples/agent_comparison/llm_utils.py` (1 hunks)
- `examples/agent_comparison/materializers/__init__.py` (1 hunks)
- `examples/agent_comparison/materializers/agent_materializer.py` (1 hunks)
- `examples/agent_comparison/materializers/prompt.py` (1 hunks)
- `examples/agent_comparison/materializers/prompt_materializer.py` (1 hunks)
- `examples/agent_comparison/materializers/prompt_visualizer.py` (1 hunks)
- `examples/agent_comparison/pipelines/__init__.py` (1 hunks)
- `examples/agent_comparison/pipelines/agent_comparison.py` (1 hunks)
- `examples/agent_comparison/prompts/langgraph_workflow.txt` (1 hunks)
- `examples/agent_comparison/prompts/single_agent_rag.txt` (1 hunks)
- `examples/agent_comparison/prompts/specialist_prompts.json` (1 hunks)
- `examples/agent_comparison/requirements.txt` (1 hunks)
- `examples/agent_comparison/run.py` (1 hunks)
- `examples/agent_comparison/steps/__init__.py` (1 hunks)
- `examples/agent_comparison/steps/data_loading.py` (1 hunks)
- `examples/agent_comparison/steps/evaluation.py` (1 hunks)
- `examples/agent_comparison/steps/model_training.py` (1 hunks)
- `examples/agent_comparison/steps/testing.py` (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)

- `docs/**/*.md`

Instructions used from:
- Sources: ⚙️ CodeRabbit Configuration File
🧠 Learnings (13)

Learnt from: CR – PR: zenml-io/zenml#0, File: CLAUDE.md:0-0, Timestamp: 2025-07-01T17:52:34.101Z (identical metadata for all entries below)

📓 Common learnings
- Write comprehensive PR descriptions
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features

docs/book/introduction.md (6)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Applies to `src/zenml/alembic/versions/*.py`: Database schema changes require Alembic migrations
- Applies to `src/zenml/alembic/versions/*.py`: Create migrations with descriptive names: `alembic revision -m "Add X to Y table"`
- First line should be a concise summary (50 chars or less)
- Update documentation for user-facing changes (or ensure that nothing was broken)
- Use environment variables or ZenML's secret management for sensitive data

docs/book/getting-started/zenml-pro/workspaces.md (2)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Update documentation for user-facing changes (or ensure that nothing was broken)

examples/agent_comparison/requirements.txt (3)
- Applies to `src/zenml/alembic/versions/*.py`: Database schema changes require Alembic migrations
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Use environment variables or ZenML's secret management for sensitive data

docs/book/getting-started/zenml-pro/projects.md (3)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Update documentation for user-facing changes (or ensure that nothing was broken)
- Applies to `src/zenml/alembic/versions/*.py`: Database schema changes require Alembic migrations

examples/agent_comparison/.gitignore (1)
- Applies to `**/*.py`: The primary code style is enforced by ruff, configured in pyproject.toml

docs/book/getting-started/hello-world.md (2)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Use environment variables or ZenML's secret management for sensitive data

docs/book/how-to/artifacts/artifacts.md (2)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Use environment variables or ZenML's secret management for sensitive data

docs/book/getting-started/core-concepts.md (3)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Applies to `src/zenml/alembic/versions/*.py`: Database schema changes require Alembic migrations
- Applies to `docs/book/**/*`: Explanation of concepts in documentation

docs/book/how-to/secrets/secrets.md (2)
- Use environment variables or ZenML's secret management for sensitive data
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features

examples/agent_comparison/README.md (1)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features

docs/book/how-to/models/models.md (3)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Applies to `src/zenml/alembic/versions/*.py`: Database schema changes require Alembic migrations
- Use environment variables or ZenML's secret management for sensitive data

README.md (4)
- Applies to `src/zenml/**/*.py`: Document public APIs thoroughly when implementing features
- Applies to `src/zenml/alembic/versions/*.py`: Database schema changes require Alembic migrations
- Update documentation for user-facing changes (or ensure that nothing was broken)
- Use environment variables or ZenML's secret management for sensitive data
🧬 Code Graph Analysis (7)

examples/agent_comparison/pipelines/__init__.py (1)
- examples/agent_comparison/pipelines/agent_comparison.py (1): `compare_agent_architectures` (35-71)

examples/agent_comparison/agent_comparison_pipeline.py (1)
- examples/agent_comparison/pipelines/agent_comparison.py (1): `compare_agent_architectures` (35-71)

examples/agent_comparison/steps/evaluation.py (2)
- src/zenml/logger.py (1): `get_logger` (173-187)
- src/zenml/types.py (1): `HTMLString` (26-27)

examples/agent_comparison/materializers/prompt.py (1)
- src/zenml/utils/json_utils.py (1): `isoformat` (44-53)

examples/agent_comparison/materializers/prompt_visualizer.py (2)
- src/zenml/types.py (1): `HTMLString` (26-27)
- examples/agent_comparison/materializers/prompt.py (2): `Prompt` (9-114), `get_variable_names` (46-54)

examples/agent_comparison/materializers/agent_materializer.py (4)
- examples/agent_comparison/agents.py (7): `BaseAgent` (47-69), `get_graph_visualization` (159-182, 348-376, 652-668), `get_mermaid_diagram` (184-226, 378-444, 670-709)
- src/zenml/enums.py (2): `ArtifactType` (22-31), `VisualizationType` (56-63)
- src/zenml/materializers/base_materializer.py (1): `BaseMaterializer` (111-395)
- examples/agent_comparison/materializers/prompt_materializer.py (4): `load` (28-39), `save` (41-68), `save_visualizations` (70-88), `extract_metadata` (90-108)

examples/agent_comparison/agents.py (2)
- examples/agent_comparison/llm_utils.py (2): `call_llm` (76-119), `should_use_real_llm` (27-38)
- examples/agent_comparison/materializers/prompt.py (2): `Prompt` (9-114), `format` (35-44)
🪛 markdownlint-cli2 (0.17.2)
docs/book/getting-started/zenml-pro/workspaces.md
12-12: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
README.md
99-99: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🪛 LanguageTool
docs/book/getting-started/core-concepts.md
[style] ~10-~10: Consider a more concise word here.
Context: ...n trees to complex multi-agent systems. In order to achieve this goal, ZenML introduces var...
(IN_ORDER_TO_PREMIUM)
[style] ~181-~181: The phrase ‘have the ability to’ might be wordy. Consider using “can”.
Context: ...With a deployed ZenML Server, users have the ability to create their own teams and project stru...
(HAS_THE_ABILITY_TO)
README.md
[style] ~305-~305: The preposition ‘with’ is not usually used with the verb ‘talk’. Consider using ‘speak’ instead, or replacing the preposition.
Context: ... - 🎙 Slack - Talk with AI practitioners ## ❓ FAQs from ML Eng...
(TALK_WITH)
🔇 Additional comments (24)

examples/agent_comparison/prompts/single_agent_rag.txt (1)

1-8: Prompt LGTM – verify placeholder names match code

`{knowledge_context}` and `{query}` look consistent with the loading code. Just double-check the formatting logic in `load_prompts()` so that accidental whitespace/newline trimming does not break RAG inputs.

examples/agent_comparison/prompts/langgraph_workflow.txt (1)

1-8: Prompt template reads well

No issues spotted. The placeholders align with the LangGraph agent parameters.

examples/agent_comparison/prompts/specialist_prompts.json (1)

1-6: Consider escaping newlines for safer downstream parsing

The embedded `\n\n` sequences assume that whichever code interpolates `{query}` will also split on `\n`. If the prompts are forwarded verbatim to an API expecting literal newlines, you're fine; otherwise JSON-escaped newlines can be mis-interpreted.

Double-check the consumer in `load_prompts` to be sure the formatting renders as intended.
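A quick way to verify the behaviour (a hypothetical snippet that mirrors the JSON layout, not the example's actual loader):

```python
import json

raw = '{"billing": "You are a billing specialist.\\n\\nCustomer query: {query}"}'
prompts = json.loads(raw)

# json.loads converts the escaped \n\n into real newlines, so the template
# already contains literal line breaks by the time str.format() runs.
print(prompts["billing"].format(query="Why was I charged twice?"))
```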
docs/book/getting-started/hello-world.md (3)

10-10: Excellent addition of unified framework messaging.

This clearly communicates ZenML's value proposition for both classical ML and AI agents, helping users understand the platform's comprehensive scope.

118-122: Well-crafted value proposition updates.

The unified framework and full tracking bullets effectively communicate how ZenML handles diverse AI workloads consistently. The examples (scikit-learn models vs multi-agent systems) provide concrete context.

125-139: Excellent reorganization of learning paths.

The thematic grouping with a dedicated LLMs/AI agents section provides clear navigation for users with different interests. The links to specific resources (LLMOps guide, agent evaluation) are particularly valuable.

examples/agent_comparison/agent_comparison_pipeline.py (1)

12-31: Great user experience with informative console output.

The print statements provide clear feedback about what the pipeline does and where to find results. The emojis and formatting make it user-friendly without being overwhelming.
docs/book/getting-started/zenml-pro/projects.md (3)

8-8: Excellent integration of AI agent concepts.

The addition naturally extends the project isolation concept to include AI agents, providing clear value for teams working with mixed workloads.

20-20: Valuable addition to project benefits.

The experiment separation benefit clearly articulates why project isolation matters for different AI development approaches.

116-126: Practical best practices for AI agent development.

The guidance on separating traditional ML and agent development, implementing stricter controls, and tracking LLM costs provides actionable advice for teams adopting AI agents.
examples/agent_comparison/steps/model_training.py (2)

32-52: Simple but effective rule-based labeling approach.

The keyword matching provides a reasonable way to generate training labels for the demo. The logic covers the main customer service categories appropriately.

55-62: Well-configured scikit-learn pipeline.

The TF-IDF vectorizer with 100 features and English stop words, combined with LogisticRegression using a fixed random seed, provides a good baseline classifier with reproducible results.
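For readers who want the shape of that classifier, it presumably resembles the following (a sketch; the seed value and pipeline step names are assumptions, while the vectorizer settings match the description above):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

# TF-IDF limited to 100 features with English stop words, feeding a
# seeded LogisticRegression for reproducible results.
intent_classifier = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=100, stop_words="english")),
    ("clf", LogisticRegression(random_state=42)),
])
```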
examples/agent_comparison/pipelines/agent_comparison.py (3)

8-9: Appropriate use of side-effect import for materializer registration.

This is a common ZenML pattern for ensuring custom materializers are registered before pipeline execution.

21-24: Good Docker configuration for reproducible execution.

The requirements file reference and uv package installer provide efficient and reproducible container builds.

32-71: Well-structured pipeline orchestration.

The step sequence is logical: data loading → prompt loading → model training → architecture comparison → evaluation. Disabling cache is appropriate for comparison runs to ensure fresh results.
docs/book/how-to/artifacts/artifacts.md (1)

42-52: Well-structured prompt template example

The addition of the prompt template example clearly demonstrates how ZenML can manage AI artifacts beyond traditional ML data.
docs/book/how-to/secrets/secrets.md (1)

25-38: Excellent LLM secret examples

The CLI examples clearly demonstrate how to create secrets for various LLM providers and multi-agent systems.
examples/agent_comparison/llm_utils.py (1)

76-80: Use compatible type hints for Python < 3.9.

The type hint `dict[str, Any]` requires Python 3.9+. Use `Dict[str, Any]` from the typing module for broader compatibility.

```diff
+from typing import Any, Dict, Optional
-from typing import Any, Optional

 def call_llm(
     prompt: str,
     model: str = "gpt-3.5-turbo",
-    metadata: Optional[dict[str, Any]] = None,
+    metadata: Optional[Dict[str, Any]] = None,
 ) -> str:
```

⛔ Skipped due to learnings

> Learnt from: CR – PR: zenml-io/zenml#0, File: CLAUDE.md:0-0, Timestamp: 2025-07-01T17:52:34.101Z
> Learning: Applies to `**/*.py`: Use Python 3.9+ compatible code
examples/agent_comparison/materializers/agent_materializer.py (1)

77-140: Excellent visualization implementation!

The `save_visualizations` method is well-designed:

- Handles both Mermaid diagrams and text visualizations gracefully
- Generates clean, styled HTML with proper CSS
- Correctly escapes curly braces in f-strings
- Returns appropriate visualization type mappings
examples/agent_comparison/materializers/prompt_materializer.py (1)

13-13: Verify the `MaterializersPrompt` import source

The import `from materializers.prompt import Prompt as MaterializersPrompt` does not correspond to any `materializers/prompt.py` module in this repo. Please confirm that:

- `materializers.prompt` is provided by an installed dependency or exists elsewhere in your project's package layout.
- If it's for backward-compatibility with an external package, add a comment explaining its purpose.
- If it's actually a local module, switch to a relative import (e.g. `from .prompt import Prompt as MaterializersPrompt`).

Affected location: examples/agent_comparison/materializers/prompt_materializer.py, lines 13 & 25
examples/agent_comparison/materializers/prompt_visualizer.py (1)

1-489: Excellent visualization implementation!

This module provides comprehensive HTML visualizations for prompts with:

- Proper HTML escaping throughout to prevent XSS
- Beautiful, responsive design using ZenML's design system
- Smart type detection in `visualize_prompt_data` using duck typing
- Clean separation between single and multiple prompt visualizations

The CSS styling and HTML structure are professional and well-organized.
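The duck-typed dispatch mentioned above plausibly looks something like this (a reconstruction for illustration only; the attribute checked and the helper names are assumptions, not the module's actual code):

```python
from typing import Any, Iterable


def _render_single_prompt(prompt: Any) -> str:
    return f"<div class='prompt'>{prompt.template}</div>"  # simplified stub


def _render_prompt_list(prompts: Iterable[Any]) -> str:
    return "".join(_render_single_prompt(p) for p in prompts)


def visualize_prompt_data(data: Any) -> str:
    # Duck typing: a single Prompt exposes .template, a collection does not.
    if hasattr(data, "template"):
        return _render_single_prompt(data)
    return _render_prompt_list(data)
```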
examples/agent_comparison/agents.py (2)

18-25: Well-structured state definition for LangGraph workflow.

The TypedDict provides clear type hints and documentation for the workflow state.

184-227: Consider security implications of returning raw HTML.

Returning raw HTML strings could pose XSS risks if not properly handled by the consumer. Ensure proper sanitization when rendering.
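One defensive pattern on the consumer side (a sketch; the function name and surrounding markup are illustrative, while `html.escape` is the standard-library helper):

```python
import html


def render_diagram_safely(diagram_source: str) -> str:
    """Escape agent-provided text before embedding it in an HTML page."""
    # Escaping <, >, & and quotes stops the diagram text from being parsed
    # as markup if it ever carries user-influenced content.
    return f'<pre class="mermaid">{html.escape(diagram_source)}</pre>'
```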
README.md (1)

214-217: OpenAI client import will crash on older SDKs

`openai.OpenAI()` only exists in `openai>=1.0.0`. Either pin the version in the install command or use the backwards-compatible pattern below:

```diff
-import openai
-
-client = openai.OpenAI()  # Set OPENAI_API_KEY environment variable
+from openai import OpenAI
+
+client = OpenAI()  # Requires openai>=1.0.0
```
```python
@step
def evaluate_agent_response(prompt: str, test_query: str) -> dict:
    """Evaluates an AI agent's response to a test query."""
    response = call_llm_agent(prompt, test_query)
    return {"query": test_query, "response": response, "quality_score": 0.95}
```
🛠️ Refactor suggestion

**Add import or comment for the undefined function.**

The example uses `call_llm_agent` which is not defined or imported, potentially confusing readers.

```diff
 @step
 def evaluate_agent_response(prompt: str, test_query: str) -> dict:
     """Evaluates an AI agent's response to a test query."""
+    # Note: call_llm_agent is a placeholder for your LLM integration
+    # You would implement this using libraries like LiteLLM, LangChain, etc.
     response = call_llm_agent(prompt, test_query)
     return {"query": test_query, "response": response, "quality_score": 0.95}
```
```python
def load(self, data_type: Type[Any]) -> BaseAgent:
    """Load BaseAgent object from artifact store.

    Args:
        data_type: The type to load (BaseAgent)

    Returns:
        Loaded BaseAgent object
    """
    _ = data_type  # Unused parameter
    with fileio.open(
        os.path.join(self.uri, "agent_metadata.json"), "r"
    ) as f:
        metadata = json.load(f)

    # Note: For this demo, we're not fully serializing/deserializing the agent state
    # In a production system, you'd want to save/restore the full agent configuration
    # For now, we'll create a basic agent instance with the stored name
    agent = BaseAgent(metadata["name"])
    return agent
```
Critical: Agent state is not properly preserved during load.
The current implementation only restores the agent's name, losing all other state including prompts, specialists, knowledge bases, and other configuration. This breaks the fundamental contract of a materializer - the loaded object should be functionally equivalent to the saved one.
The comment on lines 35-37 acknowledges this limitation, but for production use, you need to implement proper serialization of the full agent state.
Would you like me to help implement proper serialization that preserves the complete agent state? This would involve saving agent type information and configuration parameters, then reconstructing the appropriate agent subclass with all its attributes.
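A minimal sketch of what full round-tripping could look like, assuming each agent can dump and restore its own configuration (the `get_config`/`from_config` hooks are illustrative, not existing APIs in this example; these would be methods on the materializer class shown above):

```python
import json
import os
from typing import Any, Dict, Type

# Map saved class names back to concrete agent classes via subclass discovery.
AGENT_CLASSES: Dict[str, Type[BaseAgent]] = {
    cls.__name__: cls for cls in BaseAgent.__subclasses__()
}


def save(self, agent: BaseAgent) -> None:
    state = {
        "class_name": type(agent).__name__,
        "name": agent.name,
        "config": agent.get_config(),  # assumed per-agent config dump
    }
    with fileio.open(os.path.join(self.uri, "agent_state.json"), "w") as f:
        json.dump(state, f)


def load(self, data_type: Type[Any]) -> BaseAgent:
    with fileio.open(os.path.join(self.uri, "agent_state.json"), "r") as f:
        state = json.load(f)
    agent_cls = AGENT_CLASSES[state["class_name"]]
    return agent_cls.from_config(state["config"])  # assumed constructor hook
```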
```python
class BaseAgent:
    """Base class for all agent architectures."""

    def __init__(self, name: str, prompts: Optional[List[Prompt]] = None):
        """Initialize base agent.

        Args:
            name: Name of the agent architecture
            prompts: List of Prompt objects loaded as ZenML artifacts
        """
        self.name = name
        self.prompts = {p.name: p for p in (prompts or [])}

    def process_query(self, query: str) -> AgentResponse:
        """Process a single query and return response with metadata.

        Args:
            query: Customer service query text

        Raises:
            NotImplementedError: This method must be implemented by subclasses
        """
        raise NotImplementedError
```
🛠️ Refactor suggestion

**Make BaseAgent a proper abstract base class.**

The class should inherit from ABC and use the abstractmethod decorator for better type checking and clearer intent.

```diff
+from abc import ABC, abstractmethod
+
+
-class BaseAgent:
+class BaseAgent(ABC):
     """Base class for all agent architectures."""

     def __init__(self, name: str, prompts: Optional[List[Prompt]] = None):
         """Initialize base agent.

         Args:
             name: Name of the agent architecture
             prompts: List of Prompt objects loaded as ZenML artifacts
         """
         self.name = name
         self.prompts = {p.name: p for p in (prompts or [])}

+    @abstractmethod
     def process_query(self, query: str) -> AgentResponse:
         """Process a single query and return response with metadata.

         Args:
             query: Customer service query text

         Raises:
             NotImplementedError: This method must be implemented by subclasses
         """
-        raise NotImplementedError
+        pass
```
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
…nto misc/refine-readme-2025
🦭 Let's get this in - we can always finetune as we go as this can be updated without a release
```python
if sys.version_info >= (3, 9):
    pass
else:
    pass
```
Suggested change:

```diff
-if sys.version_info >= (3, 9):
-    pass
-else:
-    pass
```
```python
from typing import Tuple

import pandas as pd
from materializers.prompt import Prompt
```
This is mixing relative with non-relative imports which I don't like.
LLM Finetuning template updates in …
* Update header.png file
* Update ZenML logo alt text in README.md
* Update core concepts and steps for ML workflows and agents
* Update model registration and promotion steps
* Enhance project and workspace organization features
* Refactor ML pipeline steps for clarity and consistency
* Update dependencies and add type hints to create_dataset().
* Update create_dataset to return split data as tuples
* Add agent comparison pipeline steps
* Refactor code for better readability
* Update integration information in README.md
* Update prompts, test architectures, and generate diagrams
* Update typing annotations to Any in prompt materializer and visualizer
* Add Langfuse observability integration to LLM utility calls
* Update Langfuse integration for LiteLLM
* Update readme_problem.png asset image
* Optimised images with calibre/image-actions
* Step Status Refresh Functionality + Kubernetes Orchestrator Implementation (#3735)
  * first checkpoint
  * new changes
  * fixes
  * new changes
  * small change
  * deprecate old method
  * new changes
  * missing import
  * listen to events
  * linting
  * loop optimization
  * changed the deprecation warning
  * new condition
  * switching to jobs
  * formatting
  * handling the store
  * not allowing finished steps to be updated
  * docstrings
  * label param name
  * removed unused function
  * comment and formatting
  * renamed function
  * moved steps outside
  * removed unused input
  * additional check
  * docstrings and formatting
  * removed status checks
  * orchestrator pod updates
  * new check
* Upper limit datasets version (#3824)
* Add Docker settings to pipeline and refactor data loading steps
* Update agent visualizations with automatic generation
* Update visualizations method in Agent Architecture Comparison example
* Register agent materializer import to trigger registration
* Refactor data_loading function return annotations
* Add handling for missing OpenAI library import
* Remove detailed agent workflow print statement
* Update examples/agent_comparison/agent_comparison_pipeline.py (Co-authored-by: Alexej Penner <[email protected]>)
* Update pipeline script with evaluation message
* Update README.md
* Update docs/book/how-to/secrets/secrets.md (Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>)
* Update README.md (Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>)
* Remove Langfuse integration and references
* Add token counting notes for accurate counting
* Add import of "re" at the top of the file
* Update README.md
* Update imports to remove unnecessary type ignore
* Update environment variables to use None as default
* Integrate ZenML MCP Server for conversational insights
* Auto-update of LLM Finetuning template

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Barış Can Durak <[email protected]>
Co-authored-by: Michael Schuster <[email protected]>
Co-authored-by: Alexej Penner <[email protected]>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: GitHub Actions <[email protected]>

(cherry picked from commit 6b0047a)
Describe changes
New Features
Documentation
Chores
Pre-requisites
Please ensure you have done the following:
- The branch is based on `develop` and the open PR is targeting `develop`. If your branch wasn't based on develop read the Contribution guide on rebasing branch to develop.

Types of changes
Summary by CodeRabbit
New Features
Documentation
Chores