Awesome-Knowledge-Fusion


If you have any questions about the library, please feel free to contact us. Email: [email protected]


A comprehensive list of papers about 'Knowledge Fusion: The Integration of Model Capabilities'.

Abstract

As the comprehensive capabilities of large foundation models rapidly improve, similar general abilities have emerged across different models, making capability transfer and fusion between them increasingly feasible. Knowledge fusion aims to integrate existing LLMs of diverse architectures and capabilities into a more powerful model through efficient methods such as knowledge distillation, model merging, mixture of experts (MoE), and parameter-efficient fine-tuning (PEFT), thereby reducing the need for costly LLM development and adaptation. We provide a comprehensive overview of knowledge fusion methods and theories, covering their applications across various fields and scenarios, including LLMs, MLLMs, image generation, model compression, continual learning, and more. Finally, we highlight the challenges of knowledge fusion and explore future research directions.



Framework


1. Connectivity and Alignment

1.1 Model Connectivity

| Paper Title | Year | Conference/Journal |
| --- | --- | --- |
| Fine-Tuning Linear Layers Only Is a Simple yet Effective Way for Task Arithmetic | 2024 | arXiv |
| Tangent Transformers for Composition, Privacy and Removal | 2024 | ICLR |
| Parameter Efficient Multi-task Model Fusion with Partial Linearization | 2024 | ICLR |
| Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models | 2023 | NeurIPS |

1.2 Weight Alignment

| Paper Title | Year | Conference/Journal |
| --- | --- | --- |
| Equivariant Deep Weight Space Alignment | 2024 | ICML |
| Harmony in diversity: Merging neural networks with canonical correlation analysis | 2024 | ICML |
| Transformer fusion with optimal transport | 2024 | ICLR |
| Layerwise linear mode connectivity | 2024 | ICLR |
| Proving linear mode connectivity of neural networks via optimal transport | 2024 | AISTATS |
| Training-Free Pretrained Model Merging | 2024 | CVPR |
| Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering | 2024 | arXiv |
| C2M3: Cycle-Consistent Multi Model Merging | 2024 | arXiv |
| Rethink Model Re-Basin and the Linear Mode Connectivity | 2024 | arXiv |
| Weight Scope Alignment: A Frustratingly Easy Method for Model Merging | 2024 | arXiv |
| Git Re-Basin: Merging Models modulo Permutation Symmetries | 2023 | ICLR |
| Re-basin via implicit Sinkhorn differentiation | 2023 | CVPR |
| Plateau in Monotonic Linear Interpolation--A "Biased" View of Loss Landscape for Deep Networks | 2023 | ICLR |
| Linear Mode Connectivity of Deep Neural Networks via Permutation Invariance and Renormalization | 2023 | ICLR |
| REPAIR: REnormalizing Permuted Activations for Interpolation Repair | 2023 | ICLR |
| Going beyond linear mode connectivity: The layerwise linear feature connectivity | 2023 | NeurIPS |
| The role of permutation invariance in linear mode connectivity of neural networks | 2022 | ICLR |
| What can linear interpolation of neural network loss landscapes tell us? | 2022 | ICML |
| Loss Surface Simplexes for Mode Connecting Volumes and Fast Ensembling | 2021 | ICML |
| Analyzing Monotonic Linear Interpolation in Neural Network Loss Landscapes | 2021 | ICML |
| Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances | 2021 | ICML |
| Linear Mode Connectivity and the Lottery Ticket Hypothesis | 2020 | ICML |
| Optimizing mode connectivity via neuron alignment | 2020 | NeurIPS |
| Model fusion via optimal transport | 2020 | NeurIPS |
| Uniform convergence may be unable to explain generalization in deep learning | 2019 | NeurIPS |
| Explaining landscape connectivity of low-cost solutions for multilayer nets | 2019 | NeurIPS |
| Essentially no barriers in neural network energy landscape | 2018 | ICML |
| Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs | 2018 | NeurIPS |
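Several of the papers above (e.g. Git Re-Basin, REPAIR) merge models by first permuting hidden units so that functionally matching neurons line up before interpolating weights. Below is a minimal numpy sketch of weight matching on a single layer; it uses a greedy cosine-similarity assignment in place of the optimal linear assignment those papers actually solve, and the toy weights are illustrative assumptions:

```python
import numpy as np

def match_permutation(wa, wb):
    """Greedily find a permutation perm so that wb[perm] best matches wa,
    scoring hidden units by cosine similarity of their weight vectors.
    (Git Re-Basin solves an optimal assignment; greedy is only a sketch.)"""
    sim = (wa / np.linalg.norm(wa, axis=1, keepdims=True)) @ \
          (wb / np.linalg.norm(wb, axis=1, keepdims=True)).T
    perm = -np.ones(wa.shape[0], dtype=int)
    used = set()
    # assign highest-similarity (row, column) pairs first
    for i, j in sorted(np.ndindex(*sim.shape), key=lambda ij: -sim[ij]):
        if perm[i] < 0 and j not in used:
            perm[i] = j
            used.add(j)
    return perm

rng = np.random.default_rng(0)
w1 = rng.normal(size=(4, 3))   # first-layer weights of model A
pi = rng.permutation(4)        # model B's hidden units are a shuffle of A's
w1b = w1[pi]                   # model B's first layer
perm = match_permutation(w1, w1b)
# after re-basin, B's units are back in A's ordering
assert np.allclose(w1b[perm], w1)
```

In a full network, the same permutation must also be applied to the layer's biases and to the columns of the next layer's weights, so the network's function is unchanged.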

2. Parameter Merging

2.1 Merging Methods

Gradient based

| Paper Title | Year | Conference/Journal |
| --- | --- | --- |
| Composing parameter-efficient modules with arithmetic operation | 2023 | NeurIPS |
| Editing models with task arithmetic | 2023 | ICLR |
| Model fusion via optimal transport | 2020 | NeurIPS |
| Weight averaging for neural networks and local resampling schemes | 1996 | AAAI Workshop |
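Several entries above ('Editing models with task arithmetic', 'Composing parameter-efficient modules with arithmetic operation') build on a simple recipe: subtract the pre-trained weights from each fine-tuned checkpoint to obtain a task vector, then add a scaled sum of task vectors back onto the base. A toy numpy sketch, where the scaling coefficient `lam` and the toy weight vectors are illustrative assumptions:

```python
import numpy as np

def task_vector(finetuned, base):
    """A task vector is the delta a fine-tuning run applied to the base."""
    return finetuned - base

def merge(base, task_vectors, lam=1.0):
    """Task arithmetic: theta = theta_base + lam * sum of task vectors."""
    return base + lam * np.sum(task_vectors, axis=0)

base = np.zeros(3)
ft_math = np.array([1.0, 0.0, 0.0])   # checkpoint fine-tuned on task 1
ft_code = np.array([0.0, 1.0, 0.0])   # checkpoint fine-tuned on task 2
tvs = [task_vector(ft_math, base), task_vector(ft_code, base)]
merged = merge(base, tvs, lam=0.5)    # scaled sum of both task deltas
```

Negating a task vector before adding it is the same framework's recipe for *removing* a capability; `lam` is typically tuned on held-out data.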

Task Vector based

| Paper Title | Year | Conference/Journal |
| --- | --- | --- |
| Knowledge Composition using Task Vectors with Learned Anisotropic Scaling | 2024 | arXiv |
| MetaGPT: Merging Large Language Models Using Model Exclusive Task Arithmetic | 2024 | arXiv |
| Checkpoint Merging via Bayesian Optimization in LLM Pretraining | 2024 | arXiv |
| Arcee’s MergeKit: A Toolkit for Merging Large Language Models | 2024 | arXiv |
| Evolutionary optimization of model merging recipes | 2024 | arXiv |
| XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts | 2024 | ACL |
| AdaMerging: Adaptive Model Merging for Multi-Task Learning | 2024 | ICLR |
| Model Merging by Uncertainty-Based Gradient Matching | 2024 | ICLR |
| Merging by Matching Models in Task Subspaces | 2024 | TMLR |
| Fisher Mask Nodes for Language Model Merging | 2024 | LREC-COLING |
| Erasure Coded Neural Network Inference via Fisher Averaging | 2024 | ISIT |
| Dataless Knowledge Fusion by Merging Weights of Language Models | 2023 | ICLR |
| Merging models with fisher-weighted averaging | 2022 | NeurIPS |
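The Fisher-based entries above (e.g. 'Merging models with fisher-weighted averaging') merge checkpoints by weighting each parameter coordinate with its diagonal Fisher information, so coordinates a model is more certain about dominate the average. A minimal sketch with toy parameter and Fisher vectors (the values and `eps` stabilizer are illustrative assumptions):

```python
import numpy as np

def fisher_merge(params, fishers, eps=1e-8):
    """Element-wise Fisher-weighted average of parameter vectors:
    theta* = sum_i F_i * theta_i / sum_i F_i, per coordinate."""
    params = np.asarray(params)
    fishers = np.asarray(fishers)
    return (fishers * params).sum(axis=0) / (fishers.sum(axis=0) + eps)

theta_a = np.array([1.0, 0.0])
theta_b = np.array([0.0, 1.0])
# model A is "certain" about coordinate 0, model B about coordinate 1
f_a = np.array([10.0, 1.0])
f_b = np.array([1.0, 10.0])
merged = fisher_merge([theta_a, theta_b], [f_a, f_b])
```

With uniform Fisher weights this reduces to plain weight averaging; the benefit appears exactly when the models' certainties differ per coordinate, as in the example.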

2.2 During or After Training

During Training

| Paper Title | Year | Conference/Journal |
| --- | --- | --- |
| Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch | 2024 | ICML |
| Localizing Task Information for Improved Model Merging and Compression | 2024 | ICML |
| Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging | 2024 | ICLR |
| Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic | 2024 | arXiv |
| Activated Parameter Locating via Causal Intervention for Model Merging | 2024 | arXiv |
| PAFT: A Parallel Training Paradigm for Effective LLM Fine-Tuning | 2024 | arXiv |
| DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling | 2024 | arXiv |
| EMR-Merging: Tuning-Free High-Performance Model Merging | 2024 | arXiv |
| Model breadcrumbs: Scaling multi-task model merging with sparse masks | 2023 | arXiv |
| Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion | 2023 | arXiv |
| Resolving Interference When Merging Models | 2023 | NeurIPS |
| Task-Specific Skill Localization in Fine-tuned Language Model | 2023 | ICML |
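Several entries above ('Resolving Interference When Merging Models', DELLA-Merging, Model breadcrumbs) reduce interference by sparsifying task vectors and reconciling conflicting parameter signs before averaging. A rough numpy sketch of a TIES-style trim / elect-sign / disjoint-merge pipeline; the trim fraction `k` and the toy task vectors are illustrative assumptions:

```python
import numpy as np

def ties_merge(task_vectors, k=0.5):
    """TIES-style merge sketch: (1) trim each task vector to its top-k
    fraction of magnitudes, (2) elect a per-coordinate sign from the summed
    mass, (3) average only the values that agree with the elected sign."""
    tvs = np.array(task_vectors, dtype=float)  # copy; rows are task vectors
    # 1. trim: zero all but the largest-magnitude entries in each vector
    for tv in tvs:
        thresh = np.quantile(np.abs(tv), 1 - k)
        tv[np.abs(tv) < thresh] = 0.0
    # 2. elect a sign per coordinate
    sign = np.sign(tvs.sum(axis=0))
    # 3. disjoint mean over entries matching the elected sign
    agree = (np.sign(tvs) == sign) & (tvs != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    return (tvs * agree).sum(axis=0) / counts

tv1 = np.array([ 1.0, 0.1, 2.0])
tv2 = np.array([-1.5, 0.2, 2.0])   # conflicts with tv1 on coordinate 0
merged = ties_merge([tv1, tv2], k=1.0)
```

On the conflicting coordinate, only the model whose sign wins the election contributes, instead of the two updates partially cancelling as in a plain average.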

After Training

2.3 For LLMs and MLLMs

For LLMs

For MLLMs

3. Model Ensemble

3.1 Ensemble Methods

Weighted Averaging

Routing

Voting
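The ensemble methods in this subsection combine model *outputs* rather than weights. A minimal numpy sketch of two of them, weighted logit averaging and hard majority voting, over toy logits (the weights and values are illustrative assumptions):

```python
import numpy as np

def weighted_logit_ensemble(logits_list, weights):
    """Weighted averaging in output space: combine member logits with
    per-model weights, then softmax the combined scores."""
    z = sum(w * z_i for w, z_i in zip(weights, logits_list))
    e = np.exp(z - z.max())          # stable softmax
    return e / e.sum()

def majority_vote(predictions):
    """Hard voting: the class predicted by the most members wins."""
    values, counts = np.unique(predictions, return_counts=True)
    return values[counts.argmax()]

# two members disagree; the higher-weighted member dominates the soft ensemble
p = weighted_logit_ensemble([np.array([2.0, 0.0]), np.array([0.0, 1.0])],
                            weights=[0.8, 0.2])
vote = majority_vote([1, 0, 1])
```

Routing methods (the third family listed) replace the fixed `weights` with a learned, input-dependent gate, as in mixture-of-experts architectures.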

3.2 Ensemble Object

Entire Model

Adapter

4. Decouple and Reuse

4.1 Reprogramming

4.2 Mask

5. Distillation

5.1 Transformer

5.2 CNN

5.3 GNN
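Distillation transfers a teacher's knowledge by training the student to match temperature-softened teacher outputs. A numpy sketch of the classic KL-based distillation loss; the temperature `T=2` and the toy logits are illustrative assumptions:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; higher T gives softer distributions."""
    zT = z / T
    e = np.exp(zT - zT.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """Hinton-style distillation: KL(teacher_T || student_T), scaled by T^2
    so gradient magnitudes stay comparable across temperatures."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return T * T * np.sum(p * (np.log(p) - np.log(q)))

teacher = np.array([3.0, 1.0, 0.2])
student_good = np.array([2.9, 1.1, 0.1])   # nearly matches the teacher
student_bad = np.array([0.0, 0.0, 3.0])    # disagrees with the teacher
# the loss shrinks as the student's soft predictions approach the teacher's
```

In practice this term is mixed with the ordinary cross-entropy on ground-truth labels; the same loss shape applies whether the backbone is a Transformer, CNN, or GNN.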

6. Model Reassemble

6.1 Model Stitch

6.2 Model Evolution

7. Others

7.1 External Data Retrieval

7.2 Other Surveys


Star History

Star History Chart


Contact

We invite all researchers to contribute to this repository, 'Knowledge Fusion: The Integration of Model Capabilities'. If you have any questions, please feel free to contact us.

Email: [email protected]
