Stars
Code release for "SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers" [NeurIPS D&B, 2024]
Explore the Multimodal “Aha Moment” on 2B Model
[NeurIPS 2024] From Instance Training to Instruction Learning: Task Adapters Generation from Instructions
Training Large Language Model to Reason in a Continuous Latent Space
PaSa -- an advanced paper search agent powered by large language models. It can autonomously make a series of decisions, including invoking search tools, reading papers, and selecting relevant refe…
[NeurIPS 2024] Calibrated Self-Rewarding Vision Language Models
A Self-adaptation Framework🐙 that adapts LLMs for unseen tasks in real-time!
Code and data for OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis
METEOR: Evolutionary Journey of Large Language Models from Guidance to Self-Growth
PSPO*: An Effective Process-supervised Policy Optimization for Reasoning Alignment
Official PyTorch implementation for LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior (ICLR 2025 Oral).
Collection of AWESOME vision-language models for vision tasks
Real-time, fine-grained LLM synthetic data involves data of, by, and for LLMs.
A brief and partial summary of RLHF algorithms.
A curated list of awesome LLM Inference-Time Self-Improvement (ITSI, pronounced "itsy") papers from our recent survey: A Survey on Large Language Model Inference-Time Self-Improvement.
Building a comprehensive and handy list of papers for GUI agents
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Official code for Attention-driven GUI Grounding, AAAI 2025
Periodically fetches updates on relevant papers from Google Scholar and arXiv (the code is a single .py file, fairly simple and commented)
This is a repository for listing papers on scene graph generation and application.
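2024 recommendations for VPN and circumvention software in China, with pitfalls to avoid; stable and easy to use. Compares SSR subscription services, Lantern, V2ray, Laowang VPN, self-hosted VPS proxies, and other circumvention tools; latest recommended VPN downloads for China, for accessing ChatGPT.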