yangwang201911 commented Nov 28, 2025

This is a clean rebase of PR#2714 to simplify the review process.

  • Implement conditional visual token pruning for Qwen-VL models.
    -- Paper: CDPruner (arXiv)
    -- Code: GitHub Repository
  • Add the corresponding pruning configurations to benchmark.py and the WWB tools.
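
For reviewers new to conditional token pruning, here is a minimal sketch of the kernel idea, not the code in this PR. It assumes each visual token's relevance is its cosine similarity to a single pooled text embedding, min-max normalized, and that token-to-token similarity is also cosine. The names `conditional_kernel`, `visual_tokens`, and `text_embedding` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0


def conditional_kernel(visual_tokens, text_embedding, eps=1e-6):
    """Build L[i][j] = r_i * S[i][j] * r_j.

    S is the cosine similarity between visual tokens (diversity term) and
    r is each token's min-max normalized similarity to the text embedding
    (relevance term). Both choices are assumptions made for this sketch.
    """
    raw = [cosine(v, text_embedding) for v in visual_tokens]
    lo, hi = min(raw), max(raw)
    span = hi - lo
    # Normalize to [eps, 1]; eps keeps the kernel diagonal strictly positive.
    rel = [eps + (1 - eps) * ((x - lo) / span if span > 0 else 1.0) for x in raw]
    n = len(visual_tokens)
    return [
        [rel[i] * cosine(visual_tokens[i], visual_tokens[j]) * rel[j] for j in range(n)]
        for i in range(n)
    ]


# Toy input: tokens 0 and 1 point roughly along the text direction, token 2 does not.
tokens = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
text = [1.0, 0.2]
L = conditional_kernel(tokens, text)
```

In this toy case the off-topic token 2 gets a near-zero diagonal entry, so a determinant-maximizing selector would avoid it, while tokens 0 and 1 stay relevant but penalize each other for being similar.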

Tickets: CVS-173220
Related PRs:

github-actions bot added labels on Nov 28, 2025:
  • category: visual language (Visual language pipeline)
  • category: continuous batching (Continuous batching)
  • category: sampling (Sampling / Decoding algorithms)
  • category: cmake / build (Cmake scripts)
  • category: Python API (Python API for GenAI)
  • category: CPP API (Changes in GenAI C++ public headers)
  • no-match-files
  • category: GGUF (GGUF file reader)

- Implement CDPruner interface with FastDPP algorithm
- Add OpenCL acceleration for GPU processing
- Support multi-frame video pruning with chunking
- Add comprehensive performance optimizations
- Integrate with InputsEmbedder pipeline
- Add configuration parameters and error handling
- Include comprehensive testing and documentation
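
To make the selection step concrete, below is a rough sketch of greedy MAP inference for a determinantal point process. It assumes "FastDPP" in the commit message refers to the usual incremental-Cholesky greedy scheme; the actual implementation in this branch may differ. The function name `greedy_dpp_select` and the toy kernel are hypothetical.

```python
import math


def greedy_dpp_select(kernel, k, eps=1e-10):
    """Greedily pick up to k indices that approximately maximize det(kernel[S][S]).

    Uses incremental Cholesky-style updates, so each step costs O(n * |S|)
    instead of recomputing a determinant from scratch.
    """
    n = len(kernel)
    # gain[i] is the marginal determinant gain of adding item i to the current set.
    gain = [kernel[i][i] for i in range(n)]
    # rows[i] holds item i's coefficients against the items selected so far.
    rows = [[] for _ in range(n)]
    chosen = [False] * n
    selected = []
    while len(selected) < min(k, n):
        j = max((i for i in range(n) if not chosen[i]), key=lambda i: gain[i])
        if gain[j] <= eps:
            break  # remaining items add (numerically) no new volume
        selected.append(j)
        chosen[j] = True
        root = math.sqrt(gain[j])
        for i in range(n):
            if chosen[i]:
                continue
            dot = sum(a * b for a, b in zip(rows[j], rows[i]))
            e = (kernel[j][i] - dot) / root
            rows[i].append(e)
            gain[i] -= e * e
    return selected


# Toy kernel: items 0 and 1 are near-duplicates, item 2 is distinct.
K = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 0.8],
]
kept = greedy_dpp_select(K, 2)
```

With this kernel the selector keeps items 0 and 2: once item 0 is chosen, its near-duplicate item 1 contributes little additional volume, so the distinct item 2 wins the second slot.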
yangwang201911 force-pushed the ywang2/vlm-cdpruner-clean branch from 09ad8c9 to bf3ea74 on November 28, 2025 07:26

Copilot AI left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.
