Motivation
Reduce the host-bound overhead of MindSpore operators and optimize the dispatch mechanism.
Dispatch
- Tensor.device: use the MindSpore native API instead of the Python-added property (only supported on MindSpore >= 2.7.0); see the sketch after this list.
- Dispatcher: use a shortcut that only inspects the first Tensor input instead of comparing all inputs against each other in Python (roughly a 2x speedup on small models); see the sketch after this list.
- Multi-backend: still supports both aclnn and aclop (old primitive ops).
- API patch: controlled by a switch that can be turned on or off (see ENABLE_API_PATCH below).
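
A minimal sketch of the Tensor.device change, assuming MindSpore >= 2.7.0 exposes a native `device` attribute on `Tensor`; the fallback to the global context for older versions is a hypothetical illustration, not the actual patched property:

```python
import mindspore as ms

def get_device(tensor: ms.Tensor):
    """Prefer the native Tensor.device attribute (MindSpore >= 2.7.0);
    fall back to a Python-side lookup on older versions (assumption)."""
    if hasattr(tensor, "device"):
        # Native attribute, no Python patching needed.
        return tensor.device
    # Hypothetical fallback for MindSpore < 2.7.0: derive the device
    # from the global context instead of a patched property.
    return ms.get_context("device_target")
```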
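
A hedged sketch of the first-Tensor-only dispatch shortcut. The registry layout (`_OP_TABLE`), the `dispatch` signature, and the use of a `device` key are assumptions for illustration; the point is that the backend is chosen from the first Tensor argument alone rather than by cross-checking every input in Python:

```python
import mindspore as ms
from mindspore import ops

# Hypothetical registry: op name -> {backend key: callable}.
_OP_TABLE = {
    "add": {"default": ops.add},
    "matmul": {"default": ops.matmul},
}

def _first_tensor(args):
    """Return the first positional Tensor argument, if any."""
    for a in args:
        if isinstance(a, ms.Tensor):
            return a
    return None

def dispatch(op_name, *args, **kwargs):
    """Pick the implementation from the first Tensor input only,
    instead of comparing every input's device in Python."""
    table = _OP_TABLE[op_name]
    t = _first_tensor(args)
    key = getattr(t, "device", None) if t is not None else None
    impl = table.get(key, table["default"])
    return impl(*args, **kwargs)
```

Skipping the pairwise device comparison keeps the Python-side dispatch cost constant in the number of inputs, which is where the small-model speedup comes from.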
New environment variables
- ENABLE_DISPATCH: whether to use the Python dispatcher.
- ENABLE_PYBOOST: whether to use pyboost ops (used for OrangePi and the GE graph).
- ENABLE_API_PATCH: whether to apply the API patch to speed up models.
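
A small sketch of how these switches might be read; the variable names come from this change, but the accepted truthy values ("1"/"true"/"on") and the `_env_on` helper are assumptions:

```python
import os

def _env_on(name: str, default: str = "0") -> bool:
    """Interpret an environment switch as a boolean (assumed convention)."""
    return os.getenv(name, default).lower() in ("1", "true", "on")

ENABLE_DISPATCH = _env_on("ENABLE_DISPATCH")    # Python dispatcher on/off
ENABLE_PYBOOST = _env_on("ENABLE_PYBOOST")      # pyboost ops (OrangePi / GE graph)
ENABLE_API_PATCH = _env_on("ENABLE_API_PATCH")  # API patch for model speedup
```

For example, a run could enable the dispatcher and the API patch from the shell with `ENABLE_DISPATCH=1 ENABLE_API_PATCH=1 python train.py` (the script name is illustrative).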