It occurs to me that a C-based custom token matcher is overkill in most cases. For example:
.token hex '0x0BADCAFE' '0x248c'
.token oct '0644' '0777771'
.token bin '11001011b' '10b'
These could be defined entirely in the language specification using regexes:
.token hex /0x[0-9A-Fa-f]+/
.token oct /0[1-7][0-7]*/
.token bin /[01]+b/
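
As a quick sanity check, here is a minimal sketch in Python that runs the three regexes above against the sample lexemes from the example-based definitions. The names `TOKEN_PATTERNS`, `SAMPLES`, and `classify` are made up for illustration and are not part of any existing tool. It also assumes a token must match the whole lexeme, which may not be how the real matcher anchors its patterns.

```python
import re

# Patterns copied from the proposed regex-based .token definitions.
TOKEN_PATTERNS = {
    "hex": re.compile(r"0x[0-9A-Fa-f]+"),
    "oct": re.compile(r"0[1-7][0-7]*"),
    "bin": re.compile(r"[01]+b"),
}

# Sample lexemes copied from the example-based .token definitions.
SAMPLES = {
    "hex": ["0x0BADCAFE", "0x248c"],
    "oct": ["0644", "0777771"],
    "bin": ["11001011b", "10b"],
}


def classify(lexeme):
    """Return the names of every token kind whose pattern matches the whole lexeme."""
    return [name for name, pattern in TOKEN_PATTERNS.items()
            if pattern.fullmatch(lexeme)]


# Print each sample next to the token kinds that accept it.
for kind, lexemes in SAMPLES.items():
    for lexeme in lexemes:
        print(kind, lexeme, classify(lexeme))
```

Under that whole-lexeme assumption, each sample is accepted by exactly one pattern, so the three definitions do not overlap on these inputs. One thing the sketch makes visible: a bare `0` is not matched by the oct pattern as written, since it requires a non-zero digit after the leading zero.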