Replies: 2 comments
-
As I also answered in bab2min/kiwi-farm#1, KiwiTokenizer is included in the kiwipiepy.transformers_addon package, not in kiwipiepy:
from kiwipiepy.transformers_addon import KiwiTokenizer
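For anyone who wants to check this from a notebook, here is a minimal sketch that attempts the import from the submodule named in this reply and reports the failure instead of raising. The helper name load_kiwi_tokenizer is hypothetical (it is not part of kiwipiepy), and the sketch assumes nothing about the installed environment beyond the module path given in this reply.

```python
import importlib


def load_kiwi_tokenizer():
    """Return the KiwiTokenizer class, or None if it cannot be imported.

    Hypothetical helper for illustration; not part of kiwipiepy.
    """
    try:
        # Module path as given in the reply: kiwipiepy.transformers_addon
        module = importlib.import_module("kiwipiepy.transformers_addon")
    except ImportError as exc:
        # Raised when kiwipiepy (or something the addon imports) is missing or broken
        print(f"could not import kiwipiepy.transformers_addon: {exc}")
        return None
    return getattr(module, "KiwiTokenizer", None)


KiwiTokenizer = load_kiwi_tokenizer()
print("KiwiTokenizer available:", KiwiTokenizer is not None)
```

If this prints False together with an ImportError message, the problem is in the installation rather than in the import path.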
-
from kiwipiepy import Kiwi
from tokenizers import Tokenizer, models, pre_tokenizers, trainers

# Initialize the Kiwi morphological analyzer
kiwi = Kiwi()

# Initialize the tokenizer
tokenizer = Tokenizer(models.WordLevel(unk_token="<unk>"))

# Define the special tokens
special_tokens = ["<pad>", "<unk>", "<s>", "</s>", "<mask>"]

# Custom tokenizing function based on Kiwi
def kiwi_tokenizer(text):
    return [word for word, tag, start_pos, end_pos in kiwi.analyze(text)[0][0]]

# Apply the custom tokenizer
tokenizer.pre_tokenizer = pre_tokenizers.PreTokenizer.custom(kiwi_tokenizer)

# Configure the trainer
trainer = trainers.WordLevelTrainer(vocab_size=36000, special_tokens=special_tokens)

# Train the tokenizer
data_files = ["./data/1234567.txt"]  # training data file
tokenizer.train(trainer, data_files)

# Save the tokenizer
tokenizer.save("kiwi_korean_tokenizer.json")
ImportError                               Traceback (most recent call last)
Cell In[1], line 1
----> 1 from kiwipiepy import Kiwi
      2 from tokenizers import Tokenizer, models, pre_tokenizers, trainers
      4 # Initialize the Kiwi morphological analyzer

File D:\projects\kiwi\kiwipiepy\__init__.py:7
      5 from kiwipiepy._version import __version__
      6 from kiwipiepy._wrap import Kiwi, Sentence, TypoTransformer, TypoDefinition, HSDataset, MorphemeSet, PretokenizedToken
----> 7 import kiwipiepy.sw_tokenizer as sw_tokenizer
      8 import kiwipiepy.utils as utils
      9 from kiwipiepy.const import Match

File D:\projects\kiwi\kiwipiepy\sw_tokenizer.py:15
     11 import warnings
     13 import tqdm
---> 15 from _kiwipiepy import Sw_Tokenizer
     17 from kiwipiepy import Kiwi, Token
     19 @dataclass
     20 class SwTokenizerConfig:

ImportError: cannot import name 'Sw_Tokenizer' from '_kiwipiepy' (D:\anaconda3\Lib\site-packages\_kiwipiepy.cp311-win_amd64.pyd)

I also ran the tomotopy example from your blog, and it ends in the same error. Is it just me? I created a kiwi virtual environment and deleted and reinstalled it several times, but the end result is always the same error. I have searched Google and asked ChatGPT and Claude 2.1, but found no clear solution.
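One thing the traceback above may be worth checking for: the Python sources are loaded from D:\projects\kiwi\kiwipiepy, while the compiled _kiwipiepy extension comes from D:\anaconda3\Lib\site-packages, so the two could belong to different versions. This is only a possible reading, not a confirmed diagnosis. The sketch below uses only the standard library to print where each module would be loaded from; the helper name module_origin is hypothetical.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be loaded from, or None if not found.

    Hypothetical helper for illustration; uses only the standard library.
    """
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Compare the location of the Python package with that of the compiled extension.
for name in ("kiwipiepy", "_kiwipiepy"):
    print(f"{name}: {module_origin(name)}")
```

If the two paths point into different directories, running the same check from a different working directory, or from a fresh environment, would show whether a local source checkout is shadowing the installed package.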
On Sun, Dec 17, 2023 at 9:38 PM, Minchul Lee ***@***.***> wrote:
… As I also answered in bab2min/kiwi-farm#1,
KiwiTokenizer is included in the kiwipiepy.transformers_addon package, not in kiwipiepy.
from kiwipiepy.transformers_addon import KiwiTokenizer
-
I wanted to try out 'KiwiTokenizer' and followed the example exactly as given, but I get the error below. All of the examples in kiwi-farm also seem to fail because of 'KiwiTokenizer'.

import kiwipiepy
from kiwipiepy import KiwiTokenizer

ImportError                               Traceback (most recent call last)
Cell In[3], line 2
      1 import kiwipiepy
----> 2 from kiwipiepy import KiwiTokenizer

ImportError: cannot import name 'KiwiTokenizer' from 'kiwipiepy' (D:\projects\kiwi\kiwipiepy\__init__.py)