A presentation demonstrating how AI can be used to create "ladders" that help people engage with complex domains, specifically showcasing automatic text alternative generation for web accessibility.
Watch the video here, view the presentation slides, or see the demoed Alt Text Prompt.
This talk explores how AI can bridge the gap between human understanding and complex technical domains, using accessibility as a concrete example. Rather than forcing people to adapt to computers, we can build systems that adapt to people's existing knowledge and workflows.
The presentation includes live demonstrations of an AI system that generates appropriate alt text for images by understanding page context, author intent, and visual semantics—essentially performing "modality translation" from visual to textual media.
- Obsidian (desktop application)
- Git (to clone this repository)
- Clone this repository:
  git clone <repository-url>
  cd "Talk - Building Ladders- Extending Human Agency with AI"
- Open in Obsidian:
  - Launch Obsidian
  - Click "Open folder as vault"
  - Select this project directory
  - Obsidian will automatically detect the vault configuration
- Plugin Configuration:
  - The .obsidian folder contains pre-configured settings
  - Advanced Slides plugin is already enabled and configured
  - No additional setup required; the presentation should work immediately
- View the Presentation:
  - Open presentation.md in Obsidian
  - Click the "Start Presentation" button (Advanced Slides icon) in the ribbon
  - Or use Command/Ctrl + P and search for "Advanced Slides: Start Presentation"
The presentation.md file uses reveal.js syntax and can also be viewed in any reveal.js-compatible renderer outside of Obsidian. It is also readable in the GitHub preview.
The talk is organized into the following sections:
- Introduction & Journey - Personal background and the accessibility → AI connection
- The Vision - Making digital experiences equitable through adaptive technology
- Paradigm Shift - From humans adapting to computers → computers adapting to humans
- Breakthrough Example - Apple Math Notes as an exemplar
- Live Demo - Automatic text alternatives generation
- Philosophy - Building ladders vs. walls, and our responsibility as builders
The core demonstration showcases an AI system that generates appropriate alt text for images by understanding context at multiple levels.
The demo system evaluates an AI model's ability to:
- Contextual Understanding: Can the AI understand how an image functions within its page context?
- Author Intent Recognition: Can it determine why an author chose to include a specific image?
- Appropriate Classification: Can it distinguish between decorative, simple informative, and complex informative images?
- Semantic Translation: Can it translate visual information into equivalent textual experiences?
- Accessibility Standards Compliance: Does it follow WCAG guidelines for alt text?
A structured prompt that guides the AI through a 5-step process:
- Page Context Analysis - Understanding the overall page purpose
- Surrounding Content Analysis - Analyzing how the image is used
- Image Classification & Author Intent - Determining image type and purpose
- Alt Text Generation - Creating appropriate alternative text
- Structured Alternative - Providing detailed alternatives for complex images
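The five steps above could be assembled into a single prompt roughly as follows. This is a minimal sketch, not the prompt used in the talk: the function name `buildAltTextPrompt`, the instruction wording, and the `<surrounding_content>` and `<image>` tags are assumptions made for illustration. Only the step titles and the `<page_context>` tag come from this README.

```javascript
// Hypothetical sketch: assemble a five-step alt text prompt.
// Step titles follow the README; the instruction wording is illustrative.
const STEPS = [
  ["Page Context Analysis", "Summarize the overall purpose of the page."],
  ["Surrounding Content Analysis", "Describe how the image is used within the nearby content."],
  ["Image Classification & Author Intent",
    "Classify the image as decorative, simple informative, or complex informative, and state why the author included it."],
  ["Alt Text Generation", "Write appropriate alternative text (empty for decorative images)."],
  ["Structured Alternative", "For complex images, provide a detailed structured alternative."],
];

function buildAltTextPrompt({ pageContext, surroundingContent, imageUrl }) {
  // Number the steps so the model works through them in order.
  const steps = STEPS
    .map(([title, instruction], i) => `${i + 1}. ${title}: ${instruction}`)
    .join("\n");

  // Tag names other than <page_context> are assumptions.
  return [
    "Work through the following steps in order:",
    steps,
    `<page_context>\n${pageContext}\n</page_context>`,
    `<surrounding_content>\n${surroundingContent}\n</surrounding_content>`,
    `<image>${imageUrl}</image>`,
  ].join("\n\n");
}
```

Keeping the steps in a data array rather than one long string makes it easier to reword or reorder a step without touching the assembly logic.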
A JavaScript function that programmatically extracts semantic page information:
- Page structure (headings, landmarks, metadata)
- Open Graph data for social context
- Navigation and content organization
- Deduplication and visibility filtering
This script can be run in a browser console to gather the <page_context> input for the AI system.
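As a rough illustration of what such an extractor might look like, here is a minimal sketch. It is not the repository's script: the function name `extractPageContext`, the returned field names, and the selectors are assumptions. It relies only on standard DOM calls (`querySelectorAll`, `getAttribute`, `textContent`) and uses `Element.checkVisibility()` when the browser provides it.

```javascript
// Hypothetical sketch of a page-context extractor; not the repository's script.
// Pass a Document, or omit the argument in a browser console to use the global one.
function extractPageContext(doc = document) {
  // Treat an element as visible unless the browser reports otherwise.
  const isVisible = (el) =>
    typeof el.checkVisibility === "function" ? el.checkVisibility() : true;

  // Collect trimmed, non-empty, de-duplicated text from visible elements.
  const uniqueText = (selector) => {
    const seen = new Set();
    for (const el of doc.querySelectorAll(selector)) {
      const text = (el.textContent || "").trim().replace(/\s+/g, " ");
      if (text && isVisible(el)) seen.add(text);
    }
    return [...seen];
  };

  // Open Graph metadata, e.g. <meta property="og:title" content="...">.
  const openGraph = {};
  for (const meta of doc.querySelectorAll('meta[property^="og:"]')) {
    openGraph[meta.getAttribute("property")] = meta.getAttribute("content");
  }

  // Landmark elements and explicit landmark roles (an illustrative subset).
  const landmarkSelector =
    'header, nav, main, aside, footer, [role="banner"], [role="navigation"], ' +
    '[role="main"], [role="complementary"], [role="contentinfo"], [role="search"]';
  const landmarks = [...doc.querySelectorAll(landmarkSelector)]
    .filter(isVisible)
    .map((el) => el.getAttribute("role") || el.tagName.toLowerCase());

  return {
    title: doc.title,
    headings: uniqueText("h1, h2, h3, h4, h5, h6"),
    landmarks: [...new Set(landmarks)],
    navigation: uniqueText("nav a"),
    openGraph,
  };
}
```

In a browser console, `JSON.stringify(extractPageContext(), null, 2)` would produce text that could be pasted between `<page_context>` tags.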
The system includes three carefully chosen test scenarios:
Tests: Video thumbnail recognition and clickable media context
- Input: iOS 26 features video thumbnail from YouTube homepage
- Challenge: Distinguishing informative thumbnails from decorative elements
- Expected Behavior: Concise description focusing on visual content preview
Tests: Functional UI element identification and decorative vs. informative classification
- Input: Plus icon in YouTube's "Create" button interface
- Challenge: Determining if icons with adjacent text are decorative
- Expected Behavior: Empty alt text ("") when function is clear from context
Tests: Complex data visualization and structured alternative generation
- Input: Scatter plot showing word embedding coordinates for "man", "woman", "boy", "girl"
- Challenge: Converting complex visual relationships into accessible formats
- Expected Behavior: Concise alt text + structured data table alternative
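To make the "structured data table alternative" concrete, here is a minimal sketch that turns labeled scatter-plot points into an HTML table. The helper names `escapeHtml` and `pointsToTable` are hypothetical, and the coordinates in the usage example are placeholders, not values from the demo. In the demo the AI generates the alternative itself, so this only illustrates one plausible output shape.

```javascript
// Hypothetical sketch: render scatter-plot points as an accessible HTML table.
const escapeHtml = (value) =>
  String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");

function pointsToTable(caption, points) {
  // One row per point; the label is a row header so screen readers announce it.
  const rows = points
    .map(({ label, x, y }) =>
      `    <tr><th scope="row">${escapeHtml(label)}</th>` +
      `<td>${escapeHtml(x)}</td><td>${escapeHtml(y)}</td></tr>`)
    .join("\n");

  return [
    "<table>",
    `  <caption>${escapeHtml(caption)}</caption>`,
    "  <thead>",
    '    <tr><th scope="col">Word</th><th scope="col">x</th><th scope="col">y</th></tr>',
    "  </thead>",
    "  <tbody>",
    rows,
    "  </tbody>",
    "</table>",
  ].join("\n");
}

// Placeholder coordinates for illustration only.
const html = pointsToTable("Word embedding coordinates", [
  { label: "man", x: 1, y: 1 },
  { label: "woman", x: 1, y: 2 },
  { label: "boy", x: 2, y: 1 },
  { label: "girl", x: 2, y: 2 },
]);
```

A table keeps each word's coordinates addressable by row and column headers, which a single sentence of alt text cannot do for more than a few points.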