feat: switch to stdlib iterators by alecthomas · Pull Request #1144 · alecthomas/chroma

alecthomas · 2025-09-19T14:03:20Z

No description provided.

rogpeppe

A few drive-by thoughts.

rogpeppe · 2025-09-19T15:49:33Z

iterator.go

 	var out []Token
-	for t := i(); t != EOF; t = i() {
+	for t := range i {
+		if t == EOF {


FWIW I don't see that this would be necessary: in fact I 'd strongly suggest it's not necessary to define an EOF token at all. Then this function can be as simple as:

func (i Iterator) Tokens() []Token { return slices.Collect(i) }

which to my mind somewhat calls into question whether the Tokens method is worth having at all.
Maybe Iterator could be defined just as an alias:

type Iterator = iter.Seq[Token]

But then again, it's quite possibly worth preserving surface API compatibility even if the underlying representation changes.

This thought had crossed my mind and I don't exactly recall why there's an EOF token TBH, but there was a good reason for it at some point, so I left it in.

FWIW the EOF token definitely seems a bit "surprising" to me. An analogy that springs to mind is that it feels a little like passing around length-delimited strings but still keeping a zero-byte at the end and asking everyone to ignore the last character.

I've not seen an example of this pattern before and I personally would think very carefully through (and explicitly document) the reasons for why it needs to be this way, assuming it really does.

It existed before to signify that the stream had reached EOF, which is obviously redundant with Go iterators.

FWIW I was aware of its previous use/need, but wondered if there was a reason you'd left it around when moving to the new iterator API.

Ah I see. No, this is basically a half hour PoC, not ready for merge. I was mostly pondering whether switching to iterators would be worth a major version bump. But in combination with some other cleanup, like eradicating EOF, it could be worthwhile I think.

rogpeppe · 2025-09-19T15:51:26Z

iterator.go

-	return func() Token {
-		if len(tokens) == 0 {
-			return EOF
+	return func(yield func(Token) bool) {


return slices.Values(tokens)

rogpeppe · 2025-09-19T15:55:47Z

lexers/http.go

-					return EOF
 				}
 			}
+			if !yield(token) {


Given this is at the end of the function, I think this test is redundant.

rogpeppe · 2025-09-19T15:58:56Z

regexp.go

-// Iterator returns the next Token from the lexer.
-func (l *LexerState) Iterator() Token { // nolint: gocognit
+// Iterator returns a Go iterator over tokens from the lexer.
+func (l *LexerState) Iterator(yield func(Token) bool) { // nolint: gocognit


FWIW it's more conventional to return an actual iterator rather than have the method be an iterator, even if the iterator isn't reusable.

Suggested change

func (l *LexerState) Iterator(yield func(Token) bool) { // nolint: gocognit

func (l *LexerState) Iterator() chroma.Iterator {

return func(yield func(Token) bool) {

etc

}

}

rogpeppe · 2025-09-19T16:01:47Z

regexp.go

-			n := len(l.iteratorStack) - 1
-			t := l.iteratorStack[n]()
+		// Exhaust the token stack, if any.
+		for len(l.tokenStack) > 0 {


I haven't looked in any depth, but I suspect it might be possible to make this much simpler by taking advantage of the push-based iterator and just writing a recursive function rather than a hand-maintained stack.

feat: switch to stdlib iterators

26ea2fc

alecthomas mentioned this pull request Sep 19, 2025

Make it easier to use standard iterators as a token source #1143

Open

1 task

rogpeppe reviewed Sep 19, 2025

View reviewed changes

refactor: Iterator -> iter.Seq[Token]

ca57a0b

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: switch to stdlib iterators#1144

feat: switch to stdlib iterators#1144
alecthomas wants to merge 2 commits intomasterfrom
aat/stdlib-iterator

alecthomas commented Sep 19, 2025

Uh oh!

rogpeppe left a comment

Uh oh!

rogpeppe Sep 19, 2025

Uh oh!

alecthomas Sep 19, 2025

Uh oh!

rogpeppe Sep 19, 2025

Uh oh!

alecthomas Sep 20, 2025

Uh oh!

rogpeppe Sep 20, 2025

Uh oh!

alecthomas Sep 20, 2025

Uh oh!

rogpeppe Sep 19, 2025

Uh oh!

rogpeppe Sep 19, 2025

Uh oh!

rogpeppe Sep 19, 2025

Uh oh!

rogpeppe Sep 19, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

alecthomas commented Sep 19, 2025

Uh oh!

rogpeppe left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants