
Custom suggester based on POS tags: Shape mismatch for blis.gemm #13887

@qacollective

Description


Background

I noticed a fairly concrete pattern in the spans I was trying to categorize using Spancat, so I built a custom suggester function based on POS tags and noun chunks. It works really well, reducing the predicted spans to only good candidates.

I am now trying to train my pipeline with that suggester, having understood that I need to include more components in the pipeline so that records reach the suggester with the requisite annotations. However, I can't get past an error raised deep inside spaCy's libraries, and I can't easily see how my code and the error are connected. I understand abstractly that a layer's dimensions don't match what was expected, but I'm not sure how those expectations get set.
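For reference, the suggester contract (as I understand it from the spaCy docs) is a function that takes a batch of Docs and returns a Ragged array of (start, end) token offsets plus a per-doc span count. Here is a simplified sketch of just that data layout in plain numpy, using made-up span lists instead of real Docs (the real implementation wraps this in a thinc Ragged):

```python
import numpy as np

def flatten_spans(per_doc_spans):
    """Flatten per-document lists of (start, end) token offsets into
    the layout a spancat suggester returns: an (n_spans, 2) integer
    array of offsets plus a per-doc length array."""
    lengths = np.array([len(spans) for spans in per_doc_spans], dtype="int32")
    flat = [pair for spans in per_doc_spans for pair in spans]
    data = np.array(flat, dtype="int32").reshape(-1, 2)
    return data, lengths

# Two fake docs: one with two candidate spans, one with a single span.
data, lengths = flatten_spans([[(0, 2), (2, 5)], [(1, 3)]])
```

My custom suggester produces this same shape of output, just with the (start, end) pairs taken from noun chunks and POS-tag patterns rather than hard-coded lists.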

Error

The error I'm getting is: ValueError: Shape mismatch for blis.gemm: (40, 384), (1024, 384).
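If it helps, my reading of the error is just the usual matrix-multiplication dimension rule; here is a numpy sketch (numpy standing in for blis, and the variable roles are my guess at what these shapes correspond to):

```python
import numpy as np

# The two shapes from the error message.
x = np.zeros((40, 384))    # e.g. 40 candidate-span vectors of width 384
w = np.zeros((1024, 384))  # a layer apparently expecting width-1024 input

# A plain matrix product needs the inner dimensions to agree,
# so (40, 384) @ (1024, 384) fails just like blis.gemm does:
try:
    _ = x @ w
    ok = True
except ValueError:
    ok = False

# Only a transposed product lines up: (40, 384) @ (384, 1024) -> (40, 1024)
out = x @ w.T
```

So somewhere a layer was built expecting 1024-wide input but is being fed 384-wide vectors; what I can't work out is where those widths come from in my config.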

I have tried:

  • replacing my custom suggester with the standard n-gram suggester, which gave much the same error: ValueError: Shape mismatch for blis.gemm: (670, 384), (1024, 384)
  • changing the order of the components in the pipeline; I've tried ["tok2vec", "tagger", "parser", "attribute_ruler", "spancat"], ["tagger", "parser", "attribute_ruler", "tok2vec", "spancat"] and ["tok2vec", "tagger", "attribute_ruler", "parser", "spancat"]
  • removing the replace_listeners setting from each of the statically sourced components upstream of spancat
  • not sourcing the tok2vec component at all, i.e. starting with a new trainable one and unfreezing it, as per https://github.com/adrianeboyd/workshop-dh2023/blob/main/litbank/configs/spancat_subtree_lg.cfg
  • removing each of the pipeline components individually, though of course each is there for a good reason, with the custom suggester needing the POS tagging
  • reading everything on the internet

As this is driving me a bit nuts, I would appreciate even a whack over the head telling me what I'm doing wrong! I'm happy to post more code if needed; I'm trying to get this working for an urgent workplace project.

How to reproduce the behaviour

[paths]
train = null
dev = null
vectors = "en_core_web_lg"
init_tok2vec = null

[system]
gpu_allocator = null
seed = 0

[nlp]
lang = "en"
pipeline = ["tok2vec", "tagger", "attribute_ruler", "parser", "spancat"]
batch_size = 1000
disabled = []
before_creation = null
after_creation = null
after_pipeline_creation = null
tokenizer = {"@tokenizers":"spacy.Tokenizer.v1"}
vectors = {"@vectors":"spacy.Vectors.v1"}

[components]

[components.spancat]
factory = "spancat"
max_positive = null
scorer = {"@scorers":"spacy.spancat_scorer.v1"}
spans_key = "sc"
threshold = 0.5

[components.spancat.model]
@architectures = "spacy.SpanCategorizer.v1"

[components.spancat.model.reducer]
@layers = "spacy.mean_max_reducer.v1"
hidden_size = 128

[components.spancat.model.scorer]
@layers = "spacy.LinearLogistic.v1"
nO = null
nI = null

[components.spancat.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = 256
upstream = "*"

[components.spancat.suggester]
@misc = "my_custom_suggester.v1"

[components.tok2vec]
source = "en_core_web_lg"

[components.tagger]
source = "en_core_web_lg"
replace_listeners = ["model.tok2vec"]

[components.parser]
source = "en_core_web_lg"
replace_listeners = ["model.tok2vec"]

[components.attribute_ruler]
source = "en_core_web_lg"
replace_listeners = ["model.tok2vec"]

[corpora]

[corpora.dev]
@readers = "spacy.Corpus.v1"
path = ${paths.dev}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[corpora.train]
@readers = "spacy.Corpus.v1"
path = ${paths.train}
max_length = 0
gold_preproc = false
limit = 0
augmenter = null

[training]
dev_corpus = "corpora.dev"
train_corpus = "corpora.train"
seed = ${system.seed}
gpu_allocator = ${system.gpu_allocator}
dropout = 0.1
accumulate_gradient = 1
patience = 1600
max_epochs = 0
max_steps = 20000
eval_frequency = 200
frozen_components = ["tok2vec","tagger","parser","attribute_ruler"]
annotating_components = ["tok2vec","tagger","parser","attribute_ruler"]
before_to_disk = null
before_update = null

[training.batcher]
@batchers = "spacy.batch_by_words.v1"
discard_oversize = false
tolerance = 0.2
get_length = null

[training.batcher.size]
@schedules = "compounding.v1"
start = 100
stop = 1000
compound = 1.001
t = 0.0

[training.logger]
@loggers = "spacy.ConsoleLogger.v1"
progress_bar = false

[training.optimizer]
@optimizers = "Adam.v1"
beta1 = 0.9
beta2 = 0.999
L2_is_weight_decay = true
L2 = 0.01
grad_clip = 1.0
use_averages = false
eps = 0.00000001
learn_rate = 0.001

[training.score_weights]
spans_sc_f = 1.0
spans_sc_p = 0.0
spans_sc_r = 0.0

[pretraining]

[initialize]
vectors = ${paths.vectors}
init_tok2vec = ${paths.init_tok2vec}
vocab_data = null
lookups = null
before_init = null
after_init = null

[initialize.components]

[initialize.tokenizer]

Your Environment

  • spaCy version: 3.8.4
  • Platform: Windows-11-10.0.22631-SP0
  • Python version: 3.12.9
  • Pipelines: en_core_web_lg (3.8.0), en_core_web_trf (3.8.0)
