Updated docs #235
Merged
Mark YAML tabs as "(Legacy)" and improve Python SDK tab labeling in all pipeline documentation. This is Phase 1 of progressively de-emphasizing YAML in favor of a Python-first approach.

Changes:
- "sdk" → "Python SDK" (more descriptive)
- "yaml" → "YAML (Legacy)" (marks as legacy)
- Updated 8 docs files with pipeline examples
- Configuration files unchanged

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Remove YAML-based pipeline definition examples from all concept docs
- Update all examples to show complete code files instead of snippets
- Add [concept:] markers to example files for better documentation
- Mark YAML configuration as legacy where it refers to pipeline definitions
- Preserve YAML parameter definitions as they remain valid
- Update documentation to focus on Python SDK approach
- Fix duplicate main() calls and improve code consistency

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

## YAML Legacy Removal

- Remove empty docs/architecture/yaml.md file
- Remove all "YAML (Legacy)" tabs from concept documentation
- Remove YAML pipeline examples from multiple documentation files
- Simplify project parameters to focus on environment variables
- Align documentation with Python-first approach

## Code Block Improvements

- Add missing linenums="1" to all code blocks in job_intro.md
- Add missing linenums="1" to all code blocks in parameters.md
- Fix syntax error: returns["sum_of_numbers"] → returns=["sum_of_numbers"]
- Update line number highlights in parameters.md for better focus

## Files Modified

- docs/concepts/job_intro.md: Fixed syntax, added line numbers
- docs/concepts/parameters.md: Removed YAML, added line numbers
- docs/concepts/catalog.md: Removed 2 YAML legacy sections
- docs/concepts/nesting.md: Removed YAML legacy section
- docs/reference.md: Removed all 8 YAML legacy tabs
- docs/architecture/yaml.md: Deleted empty file

Provides cleaner Python-focused experience with numbered code examples.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Add new getting started section with focused tutorials
- Reorganize concepts into building blocks, superpowers, and advanced patterns
- Enhance parameter documentation with YAML vs environment variable priority
- Add comprehensive examples showing parameter override behavior
- Improve navigation structure for better learning progression

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

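The parameter override behavior mentioned in this commit can be pictured with a small, generic sketch. It is not Runnable's implementation: the commit message does not say which source wins or how variables are named, so the sketch assumes that environment variables take precedence over file-defined values, and both the `DEMO_PRM_` prefix and the `resolve_parameters` helper are hypothetical names chosen for illustration.

```python
import json
from typing import Any, Dict, Mapping

# Hypothetical prefix for illustration only; not taken from the PR.
ENV_PREFIX = "DEMO_PRM_"


def resolve_parameters(
    file_params: Mapping[str, Any],
    environ: Mapping[str, str],
    prefix: str = ENV_PREFIX,
) -> Dict[str, Any]:
    """Merge file-defined parameters with environment overrides.

    Assumption: environment variables win over file values.
    """
    resolved = dict(file_params)
    for key, raw in environ.items():
        if not key.startswith(prefix):
            continue  # unrelated environment variable
        name = key[len(prefix):]
        try:
            # Interpret values such as "3" or "[1, 2]" as JSON when possible.
            resolved[name] = json.loads(raw)
        except json.JSONDecodeError:
            # Otherwise keep the raw string.
            resolved[name] = raw
    return resolved


if __name__ == "__main__":
    from_file = {"x": 1, "name": "from-file"}
    from_env = {"DEMO_PRM_x": "3", "PATH": "/usr/bin"}
    print(resolve_parameters(from_file, from_env))
```

Passing the environment in as a plain mapping, rather than reading `os.environ` inside the function, keeps the priority rule easy to test in isolation.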
- Reorganized documentation sections for better flow
- Enhanced advanced patterns with complete code examples
- Improved building blocks documentation with executable examples
- Updated superpowers section with clearer explanations
- Removed legacy documentation files
- Added collapsible code examples with run commands
- Improved navigation structure in mkdocs.yml

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Create new reproducibility.md in superpowers section
- Document automatic tracking of run logs, code versions, and data artifacts
- Show real examples using actual run IDs from test executions
- Include code snippets from examples directory
- Explain file hashing, parameter flow, and execution metadata
- Update navigation to include reproducibility in superpowers
- Remove duplicate reproducibility entry from nav

Features covered:
- Unique run ID generation and memorable naming
- Complete execution logs in .run_log_store/
- Data catalog organization by run ID
- Git commit tracking and code versioning
- Parameter and metric tracking
- File integrity with hashing
- Execution context capture

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Create new comparisons section with detailed analysis
- Use real ML pipeline example with parallel training and model selection
- Show complete implementations: 1 file (Runnable) vs 15+ files (Kedro)
- Highlight productivity differences: 5 minutes vs hours/days
- Include honest assessment of when to choose each tool
- Document configuration overhead: 0 YAML vs multiple required configs
- Show learning curve differences and development experience
- Add side-by-side comparison table with key metrics
- Include practical "try both yourself" section

Key insights demonstrated:
- Runnable: Zero framework lock-in, immediate productivity
- Kedro: Configuration jungle, steep learning curve
- File count: 1 vs 15+ files for same functionality
- Time to value: minutes vs hours/days
- Code changes: zero refactoring vs complete restructure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Add proper type annotations to all function signatures
- Remove incorrect int(x) returns pattern - use simple returns list
- Fix catalog access: no 'get' without prior 'put', load from S3/central storage
- Correct parameters: use YAML file or env vars, not inline execution params
- Highlight domain code separation - functions stay pure Python
- Clarify Kedro directory structure is recommended, not required
- Add note about importing functions from separate modules
- Demonstrate proper Runnable parameter file creation

Technical corrections based on actual Runnable capabilities:
- Parameters must be via file or environment variables
- Catalog 'get' only works after 'put' from previous steps
- Domain functions remain framework-agnostic
- Type hints in function definitions, not return specifications

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Add detailed comparison tables across 6 categories:
  * Core workflow features (Runnable wins on simplicity)
  * Data management (Kedro wins on sophistication)
  * Development experience (Runnable wins on productivity)
  * Production & operations (Runnable wins on portability)
  * Reproducibility & governance (split advantages)
  * Ecosystem & integration (Kedro wins on maturity)
- Highlight Runnable's unique advantages:
  * Zero framework lock-in with pure Python functions
  * Environment portability (local → K8s → Argo with same code)
  * Advanced workflow patterns (Parallel, Map, Conditional)
  * Instant productivity (5 minutes vs days)
  * Mixed task types (Python + notebooks + shell)
  * Automatic reproducibility without setup
- Document Kedro's strengths honestly:
  * Data catalog sophistication with 20+ dataset types
  * Enterprise features and governance capabilities
  * MLOps ecosystem with native integrations
  * Advanced visualization and monitoring
  * Team collaboration through opinionated structure
  * Mature plugin ecosystem
- Add practical decision matrix based on team size and priorities
- Include honest trade-off analysis: productivity vs enterprise features
- Provide clear guidance for different use cases

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Change data versioning comparison from Kedro win to Runnable win
- Explain why content-based MD5 hashing beats timestamp versioning:
  * True change detection - only versions when content actually changes
  * Automatic deduplication - same content = same hash regardless of timestamp
  * Data integrity - hash mismatch reveals corruption immediately
  * No clock skew issues across different machines
  * Performance optimized with last 5MB sampling for large files
- Add detailed comparison showing problems with timestamp versioning:
  * False changes when files are touched but unchanged
  * Missed duplicates when same content has different timestamps
  * Clock synchronization issues in distributed systems
  * No content verification capabilities
- Include technical details about large file handling (5MB sampling)
- Show JSON example of how hashing works in practice
- Update Kedro's strengths to note timestamp versioning limitation

This correction reflects a genuine technical advantage where Runnable's approach is more robust and reliable than traditional timestamp-based versioning systems.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

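The content-based hashing idea described in this commit can be sketched in a few lines. This is a minimal illustration, not Runnable's actual code: the `content_hash` helper is a hypothetical name, and the rule of hashing only the trailing window once a file exceeds 5MB is an assumed reading of "last 5MB sampling".

```python
import hashlib
import os
import tempfile

# Assumed sampling window: only the last 5MB of a large file is hashed.
TAIL_BYTES = 5 * 1024 * 1024


def content_hash(path: str, tail_bytes: int = TAIL_BYTES) -> str:
    """Return an MD5 hex digest of a file's content.

    Files larger than tail_bytes are hashed over their trailing
    tail_bytes only (an assumption about how sampling works).
    """
    digest = hashlib.md5()
    size = os.path.getsize(path)
    with open(path, "rb") as fh:
        if size > tail_bytes:
            fh.seek(size - tail_bytes)  # skip everything before the tail
        for chunk in iter(lambda: fh.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first.csv")
        second = os.path.join(tmp, "second.csv")
        for path in (first, second):
            with open(path, "wb") as fh:
                fh.write(b"a,b\n1,2\n")
        # Give the second file a different modification time.
        os.utime(second, (0, 0))
        # Same content yields the same hash regardless of timestamp.
        print(content_hash(first) == content_hash(second))
```

One trade-off of tail sampling under this assumption: two large files that differ only before the sampled window produce the same digest, so the speedup comes at the cost of full-content verification.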
- Remove confrontational "winner" language - replace with collaborative "Strengths" format that shows what each tool does well
- Use italics to indicate missing features rather than "loses"
- Correct false compliance claims:
  * Neither Runnable nor Kedro has built-in SOX/GDPR/HIPAA compliance
  * Both provide foundations that compliance can be built upon
  * Kedro has more enterprises building on top, but no turnkey compliance
- Add honest "Compliance & Governance Reality Check" section:
  * Shows what each tool actually provides for compliance foundations
  * Explains what real enterprise compliance requires (external tools)
  * Clarifies that regulatory compliance needs custom implementation
- Remove oversimplified team size assumptions:
  * Replace "small team vs big team" logic with philosophy and priorities
  * Focus on onboarding efficiency, autonomy vs standardization
  * Add nuanced scaling considerations for both directions
  * Update decision matrix to focus on actual priorities that matter
- Improve tone throughout to be informative rather than competitive
- Focus on helping users understand trade-offs for their specific needs
- Maintain technical accuracy while being respectful to both tools

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Add job executor config details covering local, container, K8s types
- Document pipeline-executor vs job-executor distinction
- Analyze job executor types: local, local-container, emulator, k8s
- Plan documentation structure with templates and cross-references
- Add metaflow comparison to navigation

Foundation for documenting missing job executor configurations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Transform Argo Workflows documentation from complex to user-friendly
  * Remove 800+ lines of overwhelming YAML dumps and technical details
  * Add clear value proposition upfront (true parallel execution)
  * Structure as simple → advanced with practical examples
  * Include production-ready configuration examples
- Improve local-container documentation flow
  * Remove duplication in container setup sections
  * Add progressive complexity (basic → advanced → debugging)
  * Use effective callouts for better visual organization
  * Clear decision guidance for when to use each executor
- Apply consistent patterns across both executors
  * Lead with simple working examples
  * Use callouts to highlight key information
  * Progressive disclosure of advanced features
  * Practical guidance over theoretical documentation

Result: Documentation that guides users from concept to production without overwhelming them with complexity upfront.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Reorganize from 8 tabs to 6 clear, progressive tabs
  * Quick Start: Introduction → first pipeline (optimized for immediate value)
  * Core Concepts: Essential knowledge (merged Building Blocks + Superpowers)
  * Advanced Patterns: Complex workflows when ready to scale
  * Production: Deployment focus (moved "Deploy Anywhere" + all configs)
  * Compare: Decision support (simplified from "Comparisons")
  * Reference: API documentation
- Optimize for quick start experience
  * Natural progression: Introduction → Your First Pipeline → Adding Data
  * Include Usage Examples in Quick Start for immediate practical value
  * "Why Runnable?" positioned for validation after hands-on experience
- Strategic content organization
  * Move "Deploy Anywhere" from Superpowers to Production section
  * Promote configurations to top-level Production tab
  * Create clearer mental model: Learn → Build → Deploy → Scale

Result: Clear user journey from first visit to production deployment without overwhelming navigation or losing content depth.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Create new get-started.md with 30-second transformation demo
- Focus on environment portability as key value proposition
- Use progressive disclosure: success → parameters → workflows → tools
- Include glue code comparison to show familiar patterns
- Move detailed tutorials to Core Concepts section
- Add clear navigation from introduction to get-started
- Emphasize 'write once, run anywhere' messaging

Addresses Quick Start complexity by leading with immediate gratification while breadcrumbing users toward deeper concepts.

- Merge get-started content into index.md while preserving logo and branding
- Remove Quick Start tab from navigation - home page is now actionable
- Simplify navigation: Home → Usage Examples → Why Runnable → Core Concepts
- Keep instant success approach: 30sec transformation → progressive teasers
- Maintain environment portability focus throughout landing page
- Remove redundant get-started.md file

Creates immediate actionable experience on homepage while preserving brand identity and progressive disclosure design.

Move the 6-item feature grid from why-runnable.md to index.md bottom, providing immediate value proposition on landing page. Remove redundant why-runnable page and update navigation accordingly. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Rewrite README.md to align with homepage messaging and implement progressive disclosure. Replace complex iris example with simple 30-second transformation demo and streamlined feature highlights.

Key improvements:
- 30-second transformation demo for immediate value
- Clear progression from single function to pipeline
- Concise feature highlights with benefits
- Streamlined navigation to full documentation
- Fixed broken GitHub URLs and license badge
- Removed overwhelming technical complexity

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Remove unused Dockerfile from root directory to clean up repository structure as part of documentation reorganization. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
No description provided.