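Automate Screenshot Workflows with AI
Write a short paper on computing pi in Markdown, and remember to include screenshots of the code running in the Markdown file @skill.md
Screen Skills is a skill library designed for LLMs (large language models) that lets AI assistants capture macOS screenshots intelligently. It is particularly suited to embedding live screenshots in documents, papers, or presentations. The project is under active development; stars and forks are welcome, and Windows and Linux support is planned.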
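With a simple prompt, an AI assistant can:
- 🤖 Run code automatically and capture the terminal output
- 📸 Identify windows and capture them precisely
- 📝 Embed screenshots directly into Markdown documents
- ⚡ Complete the entire workflow with zero manual intervention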
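- Full-screen capture - captures the entire screen
- Window capture - precisely captures a specific application window (Chrome, Terminal, VSCode, etc.)
- Window recognition - lists all available windows automatically, with support for Chinese and English application names
- Fuzzy matching - matches window names intelligently (e.g., "Safari" matches "Safari 浏览器")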
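- Run experiment code automatically and capture the results
- Generate complete lab reports that include runtime screenshots
- Suitable for coursework in numerical analysis, algorithm design, and similar subjects
- Automate API call demonstrations
- Capture the output of command-line tools
- Generate tutorial documents with screenshots
- Capture the live state of a software interface
- Create presentations that include screenshots of real runs
- Automate documentation of test results
- Show how a program actually behaves when run
- Verify algorithm correctness
- Create interactive code tutorials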
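# Clone the repository
git clone https://github.com/LangQi99/screen-skills.git
cd screen-skills
macOS:
# Python 3.x (installed by default)
# Swift compiler (installed by default)
Windows:
# Python 3.x (download and install from python.org)
# Install dependencies
pip install pywin32 pillow
Linux:
# Python 3.x (usually pre-installed)
# Install dependencies
pip3 install python-xlib
# Install a screenshot tool (at least one)
sudo apt-get install scrot imagemagick # Ubuntu/Debian
# or
sudo dnf install scrot ImageMagick # Fedora/RHEL
# or
sudo pacman -S scrot imagemagick # Arch Linux
On first use, macOS may ask you to grant the following permissions:
- ✅ Screen Recording - used to capture the screen
- ✅ System Events control - used to detect the active window
Go to System Preferences > Security & Privacy > Privacy to grant them (System Settings > Privacy & Security on macOS 13 and later).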
macOS:
# 1. List all windows
python3 capture-screen-macos/window_info.py list
# 2. Full-screen capture
python3 capture-screen-macos/screenshot.py full output.png
# 3. Window capture
python3 capture-screen-macos/screenshot.py window "Terminal" terminal.png
Windows:
# 1. List all windows
python capture-screen-windows/window_info.py list
# 2. Full-screen capture
python capture-screen-windows/screenshot.py full output.png
# 3. Window capture
python capture-screen-windows/screenshot.py window "Chrome" chrome.png
Linux:
# 1. List all windows
python3 capture-screen-linux/window_info.py list
# 2. Full-screen capture
python3 capture-screen-linux/screenshot.py full output.png
# 3. Window capture
python3 capture-screen-linux/screenshot.py window "Firefox" firefox.png
With a single prompt, the AI completes the whole workflow, from writing the code to embedding the screenshots:
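Write a short paper on computing pi (Monte Carlo method)
Write it in md format, and remember to include screenshots of the code running in the md file
Use the skill to take the screenshots automatically, and remember to open a native terminal to run the code
The AI automatically:
- ✍️ Creates the Python calculation script
- 🖥️ Runs the code in a terminal
- 📸 Lists the windows and captures the terminal output
- 📄 Generates the complete paper with the screenshots included
See the full example: examples/cal.md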
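The per-platform quick-start commands differ only in the interpreter name and the script directory, so a caller can pick both from the running platform. The following is a minimal sketch under that assumption; `PLATFORM_TOOLS` and `window_capture_command` are hypothetical names used for illustration and are not part of the repository.

```python
import platform

# Interpreter and script directory per platform, copied from the quick-start commands.
PLATFORM_TOOLS = {
    "Darwin": ("python3", "capture-screen-macos"),
    "Windows": ("python", "capture-screen-windows"),
    "Linux": ("python3", "capture-screen-linux"),
}


def window_capture_command(window_name, output_path, system=None):
    """Build the argv list for a window capture on the given (or current) platform."""
    system = system or platform.system()
    if system not in PLATFORM_TOOLS:
        raise ValueError(f"unsupported platform: {system}")
    interpreter, script_dir = PLATFORM_TOOLS[system]
    return [interpreter, f"{script_dir}/screenshot.py", "window", window_name, output_path]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root; building an argv list rather than a shell string avoids quoting problems with window names that contain spaces.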
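# The AI can run the tests and take screenshots automatically
python3 run_tests.py
# Then capture the test results window
python3 capture-screen-macos/screenshot.py window "Terminal" test_results.png
The project uses Swift's Accessibility API and CGWindowListCopyWindowInfo to obtain window information:
- Precisely retrieves window ID, title, position, and size
- Supports application names in multiple languages
- Outputs JSON for easy parsing
Uses the native macOS screencapture command:
- High-quality PNG output
- Supports capture by specific window ID
- No external dependencies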
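We welcome contributions of all kinds!
- Fork this repository
- Create a feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is released under the MIT License - see the LICENSE file for details
- Thanks to OpenAI and Anthropic for advancing LLM technology
- Thanks to all contributors and users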
Screen Skills is a skill library designed for LLMs (Large Language Models) to intelligently capture screenshots across macOS, Windows, and Linux platforms. This project is particularly useful for scenarios requiring embedded real-time screenshots in documents, papers, or presentations.
With simple prompts, AI assistants can:
- 🤖 Automatically run code and capture terminal output
- 📸 Intelligently identify windows and take precise screenshots
- 📝 Directly embed screenshots into Markdown documents
- ⚡ Complete the entire workflow with zero manual intervention
- Full Screen Capture - Capture the entire screen
- Window Capture - Precisely capture specific application windows (Chrome, Terminal, VSCode, etc.)
- Window Recognition - Automatically list all available windows, supporting both English and localized app names
- Fuzzy Matching - Intelligently match window names (e.g., "Safari" matches "Safari 浏览器")
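The fuzzy matching described above can be pictured as a case-insensitive substring test. The sketch below is an assumption about the behaviour, not the project's actual matching code; `match_windows` is a hypothetical helper.

```python
def match_windows(query, window_names):
    """Return the names that contain `query`, ignoring case and surrounding whitespace."""
    needle = query.strip().casefold()
    if not needle:
        return []  # an empty query matches nothing rather than everything
    return [name for name in window_names if needle in name.casefold()]


names = ["Safari 浏览器", "终端", "Visual Studio Code"]
print(match_windows("safari", names))  # ['Safari 浏览器']
```

A substring test keeps localized names such as "Safari 浏览器" reachable from the plain English query, at the cost of possibly returning several windows, so a caller would still need a rule for choosing among them.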
- Automatically run experimental code and capture results
- Generate complete experiment reports with runtime screenshots
- Suitable for coursework in numerical analysis, algorithm design, and similar subjects
- Automate API call demonstrations
- Capture command-line tool outputs
- Generate tutorial documents with screenshots
- Capture real-time software interface states
- Create presentations with actual runtime screenshots
- Automate test result documentation
- Show actual program execution results
- Verify algorithm correctness
- Create interactive code tutorials
# Clone repository
git clone https://github.com/LangQi99/screen-skills.git
cd screen-skills
macOS:
# Python 3.x (pre-installed)
# Swift compiler (pre-installed)
Windows:
# Python 3.x (download from python.org)
# Install dependencies
pip install pywin32 pillow
Linux:
# Python 3.x (usually pre-installed)
# Install dependencies
pip3 install python-xlib
# Install screenshot tools (install at least one)
sudo apt-get install scrot imagemagick # Ubuntu/Debian
# or
sudo dnf install scrot ImageMagick # Fedora/RHEL
# or
sudo pacman -S scrot imagemagick # Arch Linux
On first use, macOS may request the following permissions:
- ✅ Screen Recording - For capturing screens
- ✅ System Events Control - For detecting active windows
Navigate to System Preferences > Security & Privacy > Privacy to authorize (System Settings > Privacy & Security on macOS 13 and later).
# 1. List all windows
python3 capture-screen-macos/window_info.py list
# 2. Full screen capture
python3 capture-screen-macos/screenshot.py full output.png
# 3. Window capture
python3 capture-screen-macos/screenshot.py window "Terminal" terminal.png
With a single prompt, AI completes the entire workflow from code writing to screenshot embedding:
Write a short paper about computing pi (Monte Carlo method)
Format as markdown, include code execution screenshots
Use skill for automatic screenshots, open a native terminal to run
AI automatically:
- ✍️ Creates Python calculation script
- 🖥️ Runs code in terminal
- 📸 Lists windows and captures terminal output
- 📄 Generates complete paper with embedded screenshots
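For orientation, the calculation script from the first step might look like the sketch below. It is an illustrative Monte Carlo estimator written for this README, not the script from the linked example; `estimate_pi` and its `seed` parameter are hypothetical.

```python
import random


def estimate_pi(samples, seed=None):
    """Estimate pi as 4 times the fraction of random points in the unit
    square that land inside the quarter circle of radius 1."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / samples


if __name__ == "__main__":
    # A fixed seed makes the terminal output reproducible between runs.
    print(f"pi is approximately {estimate_pi(200_000, seed=42):.5f}")
```

The error shrinks roughly with the square root of the sample count, so each extra decimal digit costs about a hundred times more samples.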
View complete example: examples/计算 π 的蒙特卡洛方法.md
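The final step of the workflow, embedding the captured file in the paper, amounts to appending a Markdown image reference. A minimal sketch follows; `image_markdown` and `append_screenshot` are hypothetical helpers, and the file names are placeholders.

```python
from pathlib import Path


def image_markdown(alt_text, image_path):
    """Return a Markdown image reference for a screenshot file."""
    # as_posix() keeps forward slashes so the link also renders when built on Windows.
    return f"![{alt_text}]({Path(image_path).as_posix()})"


def append_screenshot(doc_path, alt_text, image_path):
    """Append the image reference to a Markdown document, set off by blank lines."""
    with open(doc_path, "a", encoding="utf-8") as doc:
        doc.write(f"\n\n{image_markdown(alt_text, image_path)}\n")


print(image_markdown("Terminal output", "images/terminal.png"))
```

Paths that contain spaces are not escaped here; keeping screenshot file names free of spaces sidesteps the issue.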
Uses Swift's Accessibility API and CGWindowListCopyWindowInfo:
- Precisely retrieves window ID, title, position, size
- Supports multilingual application names
- JSON format output for easy parsing
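Because the window list is emitted as JSON, a caller can filter it with the standard library alone. The field names in this sketch (`id`, `app`, `title`, `width`, `height`) are assumptions made for illustration; the actual keys produced by `window_info.py` may differ.

```python
import json

# Hypothetical sample; the real field names emitted by window_info.py may differ.
SAMPLE_OUTPUT = """
[
  {"id": 101, "app": "Terminal", "title": "zsh", "x": 0, "y": 25, "width": 800, "height": 600},
  {"id": 102, "app": "Safari", "title": "Start Page", "x": 100, "y": 50, "width": 1200, "height": 900}
]
"""


def largest_window(json_text, app_name):
    """Return the largest window (by area) owned by `app_name`, or None if there is none."""
    windows = [w for w in json.loads(json_text) if w["app"] == app_name]
    if not windows:
        return None
    return max(windows, key=lambda w: w["width"] * w["height"])


window = largest_window(SAMPLE_OUTPUT, "Terminal")
print(window["id"])  # 101
```

Picking the largest window is one plausible tie-break when an application owns several windows, such as a main window plus small helper panels.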
Uses macOS native screencapture command:
- High-quality PNG format
- Supports specific window ID capture
- Zero external dependencies
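Capturing by window ID maps onto the `-l` option of `screencapture`. The sketch below only builds the argument list; whether the project passes exactly these flags is an assumption, and `screencapture_command` is a hypothetical helper.

```python
def screencapture_command(window_id, output_path):
    """Build a screencapture argv list that saves one window as a PNG file."""
    if window_id < 0:
        raise ValueError("window_id must be non-negative")
    return [
        "screencapture",
        "-x",              # do not play the shutter sound
        "-t", "png",       # image format
        f"-l{window_id}",  # capture the window with this ID
        output_path,
    ]


print(screencapture_command(101, "terminal.png"))
```

On macOS the list can be handed to `subprocess.run(cmd, check=True)`; the Screen Recording permission must already be granted, otherwise the capture may show only the desktop background.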
We welcome all forms of contributions!
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see LICENSE file for details
- Thanks to OpenAI and Anthropic for advancing LLM technology
- Thanks to all contributors and users
If this project helps you, please give us a ⭐️
Made with ❤️ for the AI community

