wenerme
diff --git a/notes/ai/ai-awesome.md b/notes/ai/ai-awesome.md
Lines changed: 5 additions & 1 deletion
diff --git a/notes/ai/llm/llm-faq.md b/notes/ai/llm/llm-faq.md
Lines changed: 2 additions & 46 deletions
diff --git a/notes/ai/llm/llm-pricing.md b/notes/ai/llm/llm-pricing.md
Lines changed: 62 additions & 0 deletions
diff --git a/notes/ai/ml/ml-awesome.md b/notes/ai/ml/ml-awesome.md
Lines changed: 36 additions & 1 deletion
diff --git a/notes/ai/ml/dataset.md b/notes/ai/ml/ml-dataset.md
Renamed from notes/ai/ml/dataset.md to notes/ai/ml/ml-dataset.md
Lines changed: 4 additions & 0 deletions
diff --git a/notes/ai/ml/ml-glossary.md b/notes/ai/ml/ml-glossary.md
Lines changed: 32 additions & 0 deletions
diff --git a/notes/ai/ml/ml-models.md b/notes/ai/ml/ml-models.md
Lines changed: 24 additions & 0 deletions
diff --git a/notes/ai/ml/paddle/paddle-nlp.md b/notes/ai/ml/paddle/paddle-nlp.md
Lines changed: 9 additions & 0 deletions
diff --git a/notes/ai/ml/paddle/paddleocr.md b/notes/ai/ml/paddle/paddle-ocr.md
Renamed from notes/ai/ml/paddle/paddleocr.md to notes/ai/ml/paddle/paddle-ocr.md
Lines changed: 2 additions & 0 deletions
diff --git a/notes/ai/ml/paddle/paddle-speech.md b/notes/ai/ml/paddle/paddle-speech.md
Lines changed: 8 additions & 0 deletions
@@ -8,8 +8,12 @@ tags:
 - [LLM Awesome](./llm/llm-awesome.md)
 - [ML Awesome](./ml/ml-awesome.md)
 - [Stable-Diffusion Awesome](./diffusion/diffusion-awesome.md)
-- Voice
+- [Voice](./voice/voice-awesome.md)
   - [jianchang512/clone-voice](https://github.com/jianchang512/clone-voice)
+  - [FunAudioLLM/CosyVoice](https://github.com/FunAudioLLM/CosyVoice)
+    - ASR, TTS
+  - [wenet-e2e/wenet](https://github.com/wenet-e2e/wenet)
+    - STT
 - Search/RAG
   - Perplexity
   - [ItzCrazyKns/Perplexica](https://github.com/ItzCrazyKns/Perplexica)
 
@@ -13,52 +13,8 @@ tags:
   - https://huggingface.co/blog/rlhf
   - https://github.com/yizhongw/self-instruct
   - https://platform.openai.com/docs/model-index-for-researchers
-
-## Pricing
-
-|                                 Model |  1M input/output | notes     |
-| ------------------------------------: | ---------------: | --------- |
-|                                gpt-4o |   $5.00 / $15.00 |
-|                            o1-preview |  $15.00 / $60.00 |
-|                               o1-mini |   $3.00 / $12.00 |
-|                           gpt-4o-mini |   $0.15 / $00.60 |
-|                   Gemini 1.5 Pro 128K |   $3.50 / $10.50 |
-|                 Gemini 1.5 Flash 128K |  $0.075 / $00.30 |
-|               Gemini 1.5 Flash > 128K |   $0.15 / $00.60 |
-|            Gemini 1.5 {Flash/Pro}-002 |              50% | limits\*2 |
-|                Claude 3.5 Sonnet 200K |   $3.00 / $15.00 |
-|                   Claude 3 Haiku 200K |   $0.25 / $01.25 |
-|               Anthropic Claude 3 Opus |  $15.00 / $75.00 |
-|     Groq Llama 3.1 70B Versatile 128k |   $0.59 / $00.79 |
-|                 Groq Whisper V3 Large |         $0.111/h |
-| DeepInfra Llama-3.1-70B-Instruct 128k |    $0.35 / $0.40 |
-|               DeepInfra Qwen2-72b 32k |    $0.35 / $0.40 |
-
-:::tip
-
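-- Among small models, gpt-4o-mini currently offers the best price-performance
-- Open-source models can be served very fast, which makes multi-agent patterns practical
-- Gemini 1.5 Flash supports a 1M context window
-- Gemini 1.5 Pro supports a 2M context window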
-
-:::
-
-| app          | price                                       | quota             |
-| ------------ | ------------------------------------------- | ----------------- |
-| ChatGPT Plus | $20                                         | 4o 80/3h, 4 40/3h |
-| ChatGPT Team | $25 (annual billing), $30 (monthly billing) | 2\*Plus           |
-
-- ChatGPT Plus/Team limits
-  - https://help.openai.com/en/articles/6950777-what-is-chatgpt-plus#h_d78bb59065
-- https://www.together.ai/pricing
-- https://deepinfra.com/pricing
-- https://groq.com/pricing/
-- https://openai.com/api/pricing/
-- https://www.anthropic.com/pricing
-- https://fireworks.ai/pricing
-- https://www.anyscale.com/pricing
-- References
-  - https://www.vellum.ai/blog/llama-3-1-70b-vs-gpt-4o-vs-claude-3-5-sonnet
+- Tokenizer
+  - https://github.com/QwenLM/Qwen/blob/main/tokenization_note_zh.md
 
 ## model metrics
 
 
@@ -0,0 +1,62 @@
+---
+tags:
+  - VS
+  - Pricing
+---
+
+# Pricing
+
+| Model                                 | 1M input/output                         | notes     |
+| :------------------------------------ | :-------------------------------------- | --------- |
+| gpt-4o                                | $5.00 / $15.00                          |
+| o1                                    | $15.00 / $60.00                         |
+| o1-mini                               | $3.00 / $12.00                          |
+| gpt-4o-mini                           | $0.15 / $0.60                           |
+| OpenAI Realtime API GPT 4o            | $5.00 / $20.00, Audio $100.00 / $200.00 |
+| OpenAI Whisper                        | $0.006 / minute                         |
+| OpenAI TTS                            | $15.00 / 1M chars                       |
+| OpenAI TTS HD                         | $30.00 / 1M chars                       |
+| Gemini 1.5 Pro 128K                   | $3.50 / $10.50                          |
+| Gemini 1.5 Flash 128K                 | $0.075 / $0.30                          |
+| Gemini 1.5 Flash > 128K               | $0.15 / $0.60                           |
+| Gemini 1.5 {Flash/Pro}-002            | 50%                                     | limits\*2 |
+| Claude 3.5 Sonnet 200K                | $3.00 / $15.00                          |
+| Claude 3 Haiku 200K                   | $0.25 / $1.25                           |
+| Anthropic Claude 3 Opus               | $15.00 / $75.00                         |
+| Groq Llama 3.1 70B Versatile 128k     | $0.59 / $0.79                           |
+| Groq Whisper V3 Large                 | $0.111/h                                |
+| DeepInfra Llama-3.1-70B-Instruct 128k | $0.35 / $0.40                           |
+| DeepInfra Qwen2-72b 32k               | $0.35 / $0.40                           |
+| Aliyun qwen-long 10M                  | ¥0.50 / ¥2.00                           |
+| Aliyun qwen-turbo                     | ¥0.30 / ¥0.60                           |
+| Aliyun qwen-plus                      | ¥0.80 / ¥2.00                           |
+| Aliyun qwen-max                       | ¥20.00 / ¥60.00                         |
+
+:::tip
+
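+- Among small models, gpt-4o-mini currently offers the best price-performance
+- Open-source models can be served very fast, which makes multi-agent patterns practical
+- Gemini 1.5 Flash supports a 1M context window
+- Gemini 1.5 Pro supports a 2M context window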
+
+:::
+
+| app          | price                                       | quota             |
+| ------------ | ------------------------------------------- | ----------------- |
+| ChatGPT Plus | $20                                         | 4o 80/3h, 4 40/3h |
+| ChatGPT Team | $25 (annual billing), $30 (monthly billing) | 2\*Plus           |
+
+- ChatGPT Plus/Team limits
+  - https://help.openai.com/en/articles/6950777-what-is-chatgpt-plus#h_d78bb59065
+- https://openai.com/api/pricing/
+- https://www.together.ai/pricing
+- https://deepinfra.com/pricing
+- https://groq.com/pricing/
+- https://openai.com/api/pricing/
+- https://www.anthropic.com/pricing
+- https://fireworks.ai/pricing
+- https://www.anyscale.com/pricing
+- https://openrouter.ai/models?fmt=table
+- https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-thousand-questions-metering-and-billing
+- References
+  - https://www.vellum.ai/blog/llama-3-1-70b-vs-gpt-4o-vs-claude-3-5-sonnet
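Review note: the per-1M-token prices in the table above turn into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming the table's values are USD per 1M input/output tokens; the `Price` type, the `PRICES` dictionary, its model keys, and `request_cost` are hypothetical names for illustration and are not part of the notes or of any provider SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """USD per 1M tokens, as listed in the pricing table (illustrative type)."""
    input_per_1m: float
    output_per_1m: float


# A few rows copied from the table; keys are hypothetical identifiers.
PRICES = {
    "gpt-4o": Price(5.00, 15.00),
    "gpt-4o-mini": Price(0.15, 0.60),
    "claude-3.5-sonnet": Price(3.00, 15.00),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request given its token counts."""
    price = PRICES[model]
    total = input_tokens * price.input_per_1m + output_tokens * price.output_per_1m
    return total / 1_000_000


if __name__ == "__main__":
    # 10k prompt tokens and 2k completion tokens on each listed model.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 10_000, 2_000):.4f}")
```

Listed prices change often, so the numbers should be re-checked against the provider pricing pages linked in the hunk before relying on them.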
@@ -171,17 +171,19 @@ tags:
   - [facebookresearch/detectron2](https://github.com/facebookresearch/detectron2)
   - [open-mmlab/mmdetection](https://github.com/open-mmlab/mmdetection)
   - [google-research/big_vision](https://github.com/google-research/big_vision)
-  - [Yolo](./yolo.md) - You Only Look Once
+  - [Yolo](./yolo/README.md) - You Only Look Once
     - YOLO-NAS - Neural Architecture Search
     - [WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7)
     - [YOLOv7 Breakdown](https://blog.roboflow.com/yolov7-breakdown/)
+  - CLIP
 - [lastmile-ai/aiconfig](https://github.com/lastmile-ai/aiconfig)
   - MIT, Python
   - config-based framework to build generative AI applications
 - Dataset
   - https://annas-archive.org/llm
   - https://www.opendatanetwork.com/
   - https://datasetsearch.research.google.com/
+  - Kaggle
   - [datumaro](./datumaro.md)
     - Dataset management
   - [OpenOrca](https://huggingface.co/datasets/Open-Orca/OpenOrca)
@@ -358,6 +360,39 @@ tags:
 - [TheAlgorithms/Python](https://github.com/TheAlgorithms/Python)
 - [freeCodeCamp/freeCodeCamp](https://github.com/freeCodeCamp/freeCodeCamp)
 
+## Institute
+
+- OpenAI
+- DeepMind
+- Microsoft Research
+- SAIL - Stanford AI Lab
+- Carnegie Mellon University Robotics Institute
+- Google AI
+- CSAIL - MIT Computer Science and Artificial Intelligence Laboratory
+- FAIR - Facebook AI Research
+- IBM Research
+
+**China**
+
+- BAAI - Beijing Academy of Artificial Intelligence - 智源研究院
+  - [baaivision](https://github.com/baaivision)
+  - BGE - BAAI General Embedding
+    - [FlagOpen/FlagEmbedding](https://github.com/FlagOpen/FlagEmbedding)
+      - Retrieval and Retrieval-augmented LLMs
+  - https://huggingface.co/BAAI
+  - https://baai.ac.cn/
+- Baidu Research - 百度研究院
+  - https://research.baidu.com/
+- Alibaba DAMO Academy - 阿里巴巴达摩院
+  - https://damo.alibaba.com/
+- Tencent - 腾讯 - https://yuanbao.tencent.com/
+- SenseTime - 商汤科技
+  - https://www.sensetime.com/
+- Megvii - 旷视科技
+  - https://www.megvii.com/
+- CloudWalk - 云从科技
+  - https://www.cloudwalk.cn/
+
 ## UI/Desktop/GUI/WebUI {#ui}
 
 - omnimodel
 
@@ -6,6 +6,8 @@ title: Dataset
 
 - https://roboflow.com/formats
 - https://github.com/ultralytics/yolov5/blob/master/data/coco128.yaml
+- COCO - Common Objects in Context
+  - by Microsoft Research, 2014
 - coco128
   - YOLOv5 Tutorial Dataset
   - https://www.kaggle.com/datasets/ultralytics/coco128
@@ -14,3 +16,5 @@ title: Dataset
 - [ultralytics/JSON2YOLO](https://github.com/ultralytics/JSON2YOLO)
   - Convert JSON annotations into YOLO format
 - openlibrary
+- Baidu AI Studio datasets https://aistudio.baidu.com/datasetoverview
+  - VAT invoice dataset, adapted for PaddleOCR https://aistudio.baidu.com/datasetdetail/165561
@@ -50,6 +50,10 @@ tags:
 | MNN      |
 | TNN      |
 | NCNN     |
+| CRNN     | Convolutional Recurrent Neural Network                  | 卷积循环神经网络        |
+| DTRB     | Deep Text Recognition Benchmark                         | 深度文本识别基准        |
+
+**Voice**
 
 | abbr. | stand for                              | cn                       |
 | ----- | -------------------------------------- | ------------------------ |
@@ -67,13 +71,41 @@ tags:
 | NMS   | Non-Maximum Suppression                | 非极大值抑制             |
 | IoU   | Intersection over Union                | 交并比                   |
 | mAP   | Mean Average Precision                 | 平均精度                 |
+| SRN   | Semantic Reasoning Network             | 语义推理网络             |
+| STR   | Scene Text Recognition                 | 场景文本识别             |
+| SER   | Semantic Entity Recognition            | 语义实体识别             |
+| RE    | Relation Extraction                    | 关系抽取                 |
+| KIE   | Key Information Extraction             | 关键信息提取             |
+| PSE   | Progressive Scale Expansion            | 渐进式尺度扩展           |
+
+- Text detection algorithms
+  - DB, EAST, SAST, PSE, DB++, FCE
+- Text recognition algorithms
+  - CRNN, SRN, RARE, NRTR, SAR, ViTSTR, ABINet, VisionLAN, SPIN, RobustScanner, SVTR, SVTR_LCNet
+- End-to-end text spotting algorithms (detection + recognition)
+  - PGNet
+
+**Visual**
+
+| abbr. | stand for                               | cn                 |
+| ----- | --------------------------------------- | ------------------ |
+| MIM   | Masked Image Modeling                   | 掩码图像建模       |
+| CLIP  | Contrastive Language-Image Pre-training | 对比语言图像预训练 |
+| OBB   | Oriented Bounding Box                   | 有向边界框         |
+| COCO  | Common Objects in Context               | 上下文中的通用对象 |
+| OKS   | Object Keypoint Similarity              | 对象关键点相似度   |
 
 | en                   | cn       |
 | -------------------- | -------- |
 | Contrastive Learning | 对比学习 |
 | Inpainting           | 局部重绘 |
 | Outpainting          | 扩展绘制 |
 
+- CLIP - generalizes well and scales well - modular, reusable, scalable
+- MIM - suited to specific vision tasks such as classification, detection, and segmentation
+
+---
+
 - ClassicML
   - Regression
   - Classification
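
Review note: the glossary rows in this hunk name IoU (Intersection over Union) and NMS (Non-Maximum Suppression) without showing how they relate. The following is a minimal sketch, assuming axis-aligned boxes given as `(x1, y1, x2, y2)` tuples and the common greedy form of NMS; the function names and the `0.5` default threshold are illustrative choices, not taken from the notes or from any specific library.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed x1 <= x2, y1 <= y2


def area(box: Box) -> float:
    """Area of an axis-aligned box."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two boxes, in [0, 1]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: keep the highest-scoring box, drop boxes overlapping a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep box i only if it does not overlap any already kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))  # indices of the boxes that survive suppression
```

The pairwise loop is quadratic in the number of boxes, which is fine for illustration; production detectors typically use a vectorized implementation.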
 
@@ -0,0 +1,24 @@
+---
+tags:
+  - Models
+  - Awesome
+---
+
+# Models
+
+- [tensorflow/models](https://github.com/tensorflow/models)
+- https://www.tensorflow.org/resources/models-datasets
+- https://huggingface.co/models
+- https://huggingface.co/timm
+  - ONNX, PyTorch
+- https://huggingface.co/spaces/mteb/leaderboard
+- https://docs.ultralytics.com/models/
+- ONNX
+  - https://github.com/onnx/models
+- PaddlePaddle
+  - [PaddlePaddle/models](https://github.com/PaddlePaddle/models)
+  - [PaddlePaddle/PaddleHub](https://github.com/PaddlePaddle/PaddleHub)
+- Platforms
+  - https://aistudio.baidu.com/modelsoverview
+  - https://www.modelscope.cn/
+  - https://www.modelzoo.co/
@@ -0,0 +1,9 @@
+---
+tags:
+- NLP
+---
+
+# PaddleNLP
+
+- [PaddlePaddle/PaddleNLP](https://github.com/PaddlePaddle/PaddleNLP)
+  - Apache-2.0, Python
@@ -24,6 +24,8 @@ tags:
   - ONNX https://huggingface.co/OleehyO/paddleocrv4.onnx/tree/main
 - Cache directory
   - ~/.paddleocr/whl/det/ch/ch_PP-OCRv4_det_infer/ch_PP-OCRv4_det_infer.tar
+- References
+  - Model list https://github.com/frotms/PaddleOCR2Pytorch/blob/main/doc/doc_ch/models_list.md
 
 ```bash
 # Versions https://pypi.org/project/paddlepaddle/#history
 
@@ -0,0 +1,8 @@
+---
+tags:
+- TTS
+- ASR
+---
+
+# PaddleSpeech
+- [PaddlePaddle/PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech)