wenerme
diff --git a/‎notes/ai/ai-awesome.md‎
Lines changed: 9 additions & 2 deletions b/‎notes/ai/ai-awesome.md‎
Lines changed: 9 additions & 2 deletions
diff --git a/‎notes/ai/ai-glossary.md‎
Lines changed: 5 additions & 0 deletions b/‎notes/ai/ai-glossary.md‎
Lines changed: 5 additions & 0 deletions
diff --git a/‎notes/ai/dev/comfyui.md‎
Lines changed: 81 additions & 0 deletions b/‎notes/ai/dev/comfyui.md‎
Lines changed: 81 additions & 0 deletions
diff --git a/‎notes/ai/lm/llama.cpp.md‎ renamed to ‎notes/ai/dev/llama.cpp.md‎ b/‎notes/ai/lm/llama.cpp.md‎ renamed to ‎notes/ai/dev/llama.cpp.md‎
diff --git a/‎notes/ai/dev/localai.md‎
Lines changed: 89 additions & 0 deletions b/‎notes/ai/dev/localai.md‎
Lines changed: 89 additions & 0 deletions
diff --git a/‎notes/ai/lm/ollama.md‎ renamed to ‎notes/ai/dev/ollama.md‎
Lines changed: 3 additions & 0 deletions b/‎notes/ai/lm/ollama.md‎ renamed to ‎notes/ai/dev/ollama.md‎
Lines changed: 3 additions & 0 deletions
diff --git a/‎notes/ai/lm/vllm.md‎ renamed to ‎notes/ai/dev/vllm.md‎ b/‎notes/ai/lm/vllm.md‎ renamed to ‎notes/ai/dev/vllm.md‎
diff --git a/‎notes/ai/diffusion/diffusion-awesome.md‎
Lines changed: 0 additions & 15 deletions b/‎notes/ai/diffusion/diffusion-awesome.md‎
Lines changed: 0 additions & 15 deletions
diff --git a/‎notes/ai/ml/huggingface.md‎
Lines changed: 15 additions & 0 deletions b/‎notes/ai/ml/huggingface.md‎
Lines changed: 15 additions & 0 deletions
diff --git a/‎notes/ai/ml/ml-awesome.md‎
Lines changed: 3 additions & 0 deletions b/‎notes/ai/ml/ml-awesome.md‎
Lines changed: 3 additions & 0 deletions
@@ -124,6 +124,9 @@ tags:
 - Agent/MCP
   - [microsoft/autogen](https://github.com/microsoft/autogen)
     - Multi Agent
+  - [magnitudedev/magnitude](https://github.com/magnitudedev/magnitude)
+    - Aapche-2.0, TS
+    - browser automation framework
   - [google-gemini/gemini-fullstack-langgraph-quickstart](https://github.com/google-gemini/gemini-fullstack-langgraph-quickstart)
     - Apache-2.0, TS, Python, React
     - building Fullstack Agents using Gemini 2.5 and LangGraph
@@ -159,15 +162,14 @@ tags:
 - Platform
   - [langfuse/langfuse](https://github.com/langfuse/langfuse)
     - MIT+EE, TS
-    -  LLM Observability, metrics, evals, prompt management, playground, datasets.
+    - LLM Observability, metrics, evals, prompt management, playground, datasets.
 - https://github.com/NielsRogge/Transformers-Tutorials
 - news/trending
   - https://huggingface.co/models
   - https://aihot.today/
 - usecase/showcase
   - [601 real-world gen AI use cases from the world's leading organizations](https://cloud.google.com/transform/101-real-world-generative-ai-use-cases-from-industry-leaders)
 
-
 ## 应用 {#applications}
 
 **产品/服务**
@@ -205,6 +207,11 @@ tags:
 
 ---
 
+- [automattic/harper](https://github.com/automattic/harper)
+  - Apache-2.0, Rust
+  - privacy-first grammar checker
+  - English grammar checker
+  - alternative to Grammarly
 - [openrecall/openrecall](https://github.com/openrecall/openrecall)
   - AGPLv3, Python
   - Windows 11 Recall 替代
 
@@ -67,6 +67,11 @@ tags:
 - open-vocabulary detection
   - 开放词汇检测
   - 识别和处理未在训练数据中出现过的词汇或短语的能力
+- CAM (Class Activation Mapping)（Heatmap）
+  - 类激活映射
+  - 用于可视化卷积神经网络（CNN）在图像分类任务中关注的区域
+  - 通过将特定类别的预测与输入图像的特征图进行关联，生成热力图，显示模型在做出决策时关注的图像区域
+  - 模型可解释性
 
 | en               | cn       |
 | ---------------- | -------- |
 
@@ -18,6 +18,15 @@ title: ComfyUI
   - ComfyDeploy
   - 另存 (API 格式)
   - 调用 /prompt
+- 参考
+  - https://docs.comfy.org/tutorials/
+
+:::caution
+
+- 只能单一显卡
+  - https://github.com/comfyanonymous/ComfyUI/discussions/4139
+
+:::
 
 ```bash
 git clone --depth 1 https://github.com/comfyanonymous/ComfyUI ComfyUI
@@ -36,6 +45,9 @@ uv pip install --pre torch torchvision torchaudio --extra-index-url https://down
 uv pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu121
 #uv pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128
 uv pip install -r requirements.txt
+
+# Nvidia APEX normalization not installed, using PyTorch LayerNorm
+uv pip install xformers
 ```
 
 ```bash title="mps.py"
@@ -77,6 +89,33 @@ No module named pip
 
 ## Notes
 
+好的，遵照您的要求，这里是精简后的版本，仅包含**目录**和**主要用途说明**两列。
+
+### ComfyUI Models 目录结构详解
+
+| dir                | for                                                                                                                                                       |
+| :----------------- | :-------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| `checkpoints`      | 核心基础模型，也叫“大模型”。这是文生图的起点，决定了图像生成的基础风格和能力。例如 Stable Diffusion v1.5, SDXL, 以及社区训练的各种整合模型。              |
+| `loras`            | LoRA 模型。这些是小型微调文件，用于向基础模型添加特定的角色、画风、概念或服装，灵活性极高。                                         |
+| `vae`              | VAE  模型。用于图像的编码和解码。独立的 VAE 文件可以修正图像的色彩（如改善灰蒙蒙的问题）或修复手部等细节问题。SDXL 模型通常不需要额外 VAE。 |
+| `controlnet`       | ControlNet 模型。用于精确控制图像的生成，例如通过姿势骨架、深度图、线稿、二维码等来引导构图和内容。                                                       |
+| `upscale_models`   | 图像放大模型。用于“图像放大 (模型)”节点，提升图片分辨率并优化细节。例如 ESRGAN, SwinIR, 4x-UltraSharp 等。                                                |
+| `embeddings`       | 文本反演 (Textual Inversion) 嵌入，也叫 Embedding。这些是极小的文件，通过一个关键词触发特定的概念、角色或画风。常用于负面提示词（如 `bad-hands-5`）。     |
+| `clip`             | CLIP 文本编码器模型。通常 ComfyUI 会自动从大模型中加载，但你也可以把独立的 CLIP 模型放在这里，供高级工作流使用。                                          |
+| `clip_vision`      | CLIP Vision 模型。用于分析图像内容，是 IPAdapter、PhotoMaker 等“图像提示”功能的核心组件。                                                                 |
+| `style_models`     | 风格模型。主要用于 T2I-Adapter，功能与 ControlNet 类似，但更侧重于风格的迁移。                                                                            |
+| `hypernetworks`    | Hypernetwork 模型。一种比 LoRA 更早出现的微调技术，现在已不常用，但 ComfyUI 仍然支持加载。                                                                |
+| `unet`             | U-Net 模型。U-Net 是 Stable Diffusion 模型的核心降噪网络。普通用户几乎不会用到这个目录，主要用于模型开发和研究，将 U-Net 单独分离出来加载。               |
+| `text_encoders`    | 文本编码器模型。与 `unet` 类似，用于模型研究，允许单独加载和替换文本编码器部分。                                                                          |
+| `photomaker`       | PhotoMaker 模型。一种专门用于根据输入人脸照片生成统一角色的模型。                                                                                         |
+| `sams`             | SAM (Segment Anything Model) 模型。由 Meta 开发的图像分割模型，在 ComfyUI 中用于精确地创建和分离遮罩 (Mask)。                                             |
+| `gligen`           | GLIGEN 模型。用于“限定区域生成”，允许你通过画框来指定某个物体在图像中的特定位置和大小。                                                                   |
+| `diffusers`        | 用于存放 Hugging Face 的 Diffusers 格式模型。这种格式是一个包含多个子目录和文件的文件夹，而不是单个文件。ComfyUI 可以直接加载这种格式。                   |
+| `configs`          | 配置文件。存放一些旧的 `.ckpt` 模型所需要的 `.yaml` 配置文件，以帮助 ComfyUI 识别其模型架构（如 v1 或 v2）。现在的 `.safetensors` 模型通常不需要。        |
+| `vae_approx`       | VAE 近似解码器模型。这些是极小的、速度极快的模型，用于在 KSampler 采样过程中生成快速预览图，而不是每次都调用完整的 VAE。                                  |
+| `onnx`             | ONNX 模型。用于存放已转换为 ONNX (Open Neural Network Exchange) 格式的模型，通常用于在非 NVIDIA 硬件（如 AMD 显卡）上通过 DirectML 或 Olive 进行推理。    |
+| `diffusion_models` | 扩散模型组件。一个更通用的目录，类似于 `unet`，用于存放扩散模型的某些部分。主要供模型开发者使用。                                                         |
+
 **AI Art**
 
 - Text2Img
@@ -119,6 +158,22 @@ No module named pip
   - SD 1.5
   - LAION 5B
   - SDXL
+- Upscaler
+  - ESRGAN
+  - SwinIR
+  - 4x-UltraSharp
+  - OmniSR
+  - MoSR
+  - DRCT
+  - ADT
+  - DAT
+  - RealPLKSR
+  - SPAN
+  - RGT
+  - HAT
+  - SRFormer
+  - SwiftESRGAN
+  - SPSR
 - KSampler
   - 用于采样生成图像
   - sampler
@@ -142,6 +197,22 @@ No module named pip
   - 912x1216
   - 1008x1344
 - 9:16
+  - 512x896
+  - 576x1024
+  - 768x1366
+  - 1024x1820
+
+输出可以包含日期
+
+```
+%date:yyyy-MM-dd%/ComfyUI
+```
+
+## API
+
+```bash
+
+```
 
 ## 参考 {#reference}
 
@@ -226,3 +297,13 @@ CUDA kernel errors might be asynchronously reported at some other API call, so t
 For debugging consider passing CUDA_LAUNCH_BLOCKING=1
 Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
 ```
+
+## ImportError: cannot import name 'guidedFilter' from 'cv2.ximgproc'
+
+```bash
+uv pip uninstall opencv-python opencv-python-headless opencv-contrib-python-headless opencv-contrib-python
+uv pip install opencv-python opencv-python-headless opencv-contrib-python-headless
+uv pip install opencv-contrib-python
+```
+
+- https://github.com/chflame163/ComfyUI_LayerStyle/issues/5
@@ -0,0 +1,89 @@
+---
+tags:
+  - Engine
+---
+
+# LocalAI
+
+- [mudler/LocalAI](https://github.com/mudler/LocalAI)
+  - MIT, Go
+  - 支持后端 llama.cpp, vLLM, Diffusers, Transformers, whisper.cpp, [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)
+  - 支持 文、图、音
+  - 支持 文生文、文生图、文生音、语音转文本、向量嵌入
+  - OpenAI 兼容 API
+- 参考
+  - https://models.localai.io/
+  - 兼容模型 https://localai.io/model-compatibility/index.html
+
+```bash
+# CPU: latest
+# Nvidia: latest-gpu-nvidia-cuda-12, latest-gpu-nvidia-cuda-11, latest-nvidia-l4t-arm64
+# AMD GPU ROCm: latest-gpu-hipblas
+# 推荐 AIO/All in One - 例如 latest-aio-cpu, latest-aio-gpu-nvidia-cuda-12
+# 健康 http://localhost:8080/readyz
+# 参考 https://localai.io/basics/container/
+# 会直接下载一些模型 https://github.com/mudler/LocalAI/tree/master/aio/gpu-8g
+# qwen2.5-7b, whisper-1, DreamShaper, MiniCPM-V-2_6
+# 内置模型名 whisper-1, stablediffusion
+# 支持 https_proxy
+docker run --rm -ti --gpus all \
+  -p 8080:8080 \
+  -v $PWD/localai/models:/build/models:cached \
+  --name localai localai/localai:latest-aio-gpu-nvidia-cuda-12
+
+docker exec -it localai bash
+./local-ai
+```
+
+```yaml
+version: '3.9'
+services:
+  api:
+    image: localai/localai:localai/localai:latest-aio-gpu-nvidia-cuda-12
+    healthcheck:
+      test: ['CMD', 'curl', '-f', 'http://localhost:8080/readyz']
+      interval: 1m
+      timeout: 20m
+      retries: 5
+    ports:
+      - 8080:8080
+    environment:
+      - DEBUG=true
+    volumes:
+      - ./models:/build/models:cached
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
+```
+
+| env         | default       | desc                |
+| ----------- | ------------- | ------------------- |
+| MODELS_PATH | /build/models |
+| THREADS     | nproc-1       |
+| PORT        | 8080          |
+| API_KEY     |
+| MODELS      |               | list of models YAML |
+| PROFILE     |               | cpu, gpu-8g         |
+
+- 容器镜像
+  - quay.io/go-skynet/local-ai
+  - docker.m.daocloud.io/localai/localai
+- MODELS_PATH
+  - 容器 /build/models
+  - /usr/share/local-ai/models
+
+## RuntimeError: operator torchvision::nms does not exist
+
+
+- 确认版本一致
+- `torch.__version__` 和 `torchvision.__version__`
+
+```bash
+python -c "import torch; print(torch.__version__)"
+python -c "import torchvision; print(torchvision.__version__)"
+```
+
@@ -31,6 +31,7 @@ title: ollama
 - ~~Support tools in OpenAI-compatible API [#4386](https://github.com/ollama/ollama/issues/4386)~~
 - phi4-multimodel https://github.com/ollama/ollama/issues/9387
 - qwen3 rerank https://github.com/ollama/ollama/issues/10989
+- InternVL3 https://github.com/ollama/ollama/issues/10248
 
 :::
 
@@ -78,6 +79,8 @@ ollama pull llama3.2-vision:11b
 
 - https://github.com/ollama/ollama/blob/main/envconfig/config.go
 - https://github.com/ollama/ollama/issues/2941
+- OLLAMA_KV_CACHE_TYPE 依赖 flash attention
+  - https://github.com/ollama/ollama/blob/502028968ddca04bd19c0859a73fb4e0cbeac3e1/llm/server.go#L221-L223
 
 ```shell
 /set verbose
 
@@ -86,18 +86,3 @@ tags:
 - CompVis
   - 慕尼黑大学 (LMU Munich) 的计算机视觉与学习研究组 (Computer Vision & Learning Group)
 - LAION: 大型人工智能开放网络 (Large-scale Artificial Intelligence Open Network)，提供了训练数据集。
-
-| date    | model              | size     | author            | notes |
-| ------- | ------------------ | -------- | ----------------- | ----- |
-| 2024-10 | FLUX.1             | dev 12B  | Black Forest Labs |
-| 2024-10 | SD 3.5             | 2.5B, 8B |
-| 2024-02 | SD 3.0             | 800M, 8B |
-| 2023-11 | SDXL Turbo         |          |
-| 2023-07 | SDXL 1.0           | 3.5B     |
-| 2022-12 | SD v2.1            |
-| 2022-11 | SD v2.0            |          |
-| 2022-10 | SD 1.5             | 983M     | RunwayML          |
-| 2022-08 | SD 1.1 1.2 1.3 1.4 |          | CompVis           |
-
-- CFG - Classifier-Free Diffusion Guidance (2022)
-- [black-forest-labs/flux](https://github.com/black-forest-labs/flux)
@@ -82,6 +82,21 @@ completion = client.chat.completions.create(
 )
 ```
 
+## modelscope
+
+```bash
+# 国内镜像下载 HF Repo
+pip install modelscope
+```
+
+- cache_dir= ~/.cache/modelscope/hub
+- allow_patterns
+- ignore_patterns
+- --include
+- --exclude
+- 模型缓存目录
+  - cache_dir/MODEL_ID/THE_MODEL_FILES
+
 # FAQ
 
 ## The model Qwen/Qwen2.5-VL-72B-Instruct is too large to be loaded automatically (146GB > 10GB).
 
@@ -85,6 +85,9 @@ tags:
   - [Tencent/ncnn](https://github.com/Tencent/ncnn)
     - BSD-3, C++, C
     - neural network inference framework optimized for the mobile platform
+  - [gpustack/gpustack](https://github.com/gpustack/gpustack)
+    - Apache-2.0, Python
+    - backend llama.cpp, stable-diffusion.cpp , vLLM, vox-box
   - ~~[johnolafenwa/deepstack](https://github.com/johnolafenwa/deepstack)~~
     - Apache-2.0, Go, Python
     - Cross Platform AI Engine for Edge Devices