How to Get Synonyms and Antonyms from NLTK WordNet in Python

WordNet is a large lexical database of English nouns, adjectives, adverbs, and verbs, available in Python through the Natural Language Toolkit (NLTK). Its words are grouped into sets of cognitive synonyms called synsets.

To use WordNet, first install the NLTK module, then download the WordNet package.

$ sudo pip3 install nltk
$ python3
>>> import nltk
>>> nltk.download('wordnet')

WordNet contains groups of words that share the same meaning.

In the first example, we will see how WordNet returns the meaning of a word along with other details. When usage examples are available for the word, it can return those as well.

Example Code

from nltk.corpus import wordnet   #Import wordnet from the NLTK
synset = wordnet.synsets("Travel")
print('Word and Type : ' + synset[0].name())
print('Synonym of Travel is: ' + synset[0].lemmas()[0].name())
print('The meaning of the word : ' + synset[0].definition())
print('Example of Travel : ' + str(synset[0].examples()))

Output

$ python3 322a.word_info.py
Word and Type : travel.n.01
Synonym of Travel is: travel
The meaning of the word : the act of going from one place to another
Example of Travel : ['he enjoyed selling but he hated the travel']
$
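The value printed as Word and Type above, travel.n.01, follows WordNet's lemma.pos.number naming pattern. As a minimal sketch, the helper below splits such a name into its parts. parse_synset_name and the POS_LABELS table are illustrative names introduced here, not part of NLTK.

```python
# Hypothetical helper (not part of NLTK): split a synset name such as
# "travel.n.01" into its lemma, part of speech, and sense number.
POS_LABELS = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}

def parse_synset_name(name):
    # Split from the right so a lemma that itself contains dots stays intact
    lemma, pos, sense = name.rsplit(".", 2)
    return lemma, POS_LABELS.get(pos, pos), int(sense)

print(parse_synset_name("travel.n.01"))  # ('travel', 'noun', 1)
```

The sense number shows which meaning of the word a synset represents, so travel.n.01 is the first noun sense of travel.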

The previous example returned the details of a single word. Here we will see how WordNet can return the synonyms and antonyms of a given word.

Example Code

import nltk
from nltk.corpus import wordnet   #Import wordnet from the NLTK
syn = list()
ant = list()
for synset in wordnet.synsets("Worse"):
    for lemma in synset.lemmas():
        syn.append(lemma.name())    # add the synonyms
        if lemma.antonyms():    # when antonyms are available, add the first one to the list
            ant.append(lemma.antonyms()[0].name())
print('Synonyms: ' + str(syn))
print('Antonyms: ' + str(ant))

Output

$ python3 322b.syn_ant.py
Synonyms: ['worse', 'worse', 'worse', 'worsened', 'bad', 'bad', 'big', 'bad', 'tough', 'bad', 'spoiled', 'spoilt', 'regretful', 'sorry', 'bad', 'bad', 'uncollectible', 'bad', 'bad', 'bad', 'risky', 'high-risk', 'speculative', 'bad', 'unfit', 'unsound', 'bad', 'bad', 'bad', 'forged', 'bad', 'defective', 'worse']
Antonyms: ['better', 'better', 'good', 'unregretful']
$
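The synonym list above repeats entries such as bad many times, because the same lemma name appears in several synsets. One possible way to tidy it, sketched here with a hypothetical helper named unique_in_order, is to drop duplicates while keeping first-seen order.

```python
def unique_in_order(names):
    # dict keys preserve insertion order (Python 3.7+), so this keeps
    # the first occurrence of each name and drops later repeats
    return list(dict.fromkeys(names))

# A shortened sample taken from the output above
sample = ['worse', 'worse', 'worse', 'worsened', 'bad', 'bad', 'big', 'bad', 'tough']
print(unique_in_order(sample))  # ['worse', 'worsened', 'bad', 'big', 'tough']
```

In the script above, the same call could be applied to syn and ant just before printing them.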

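NLTK's WordNet interface has another useful feature: it can check how closely related two words are. Given a pair of synsets, wup_similarity returns a similarity score, where values closer to 1 indicate more closely related meanings.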

Example Code

import nltk
from nltk.corpus import wordnet     #Import wordnet from the NLTK
first_word = wordnet.synset("Travel.v.01")
second_word = wordnet.synset("Walk.v.01")
print('Similarity: ' + str(first_word.wup_similarity(second_word)))
first_word = wordnet.synset("Good.n.01")
second_word = wordnet.synset("zebra.n.01")
print('Similarity: ' + str(first_word.wup_similarity(second_word)))

Output

$ python3 322c.compare.py
Similarity: 0.6666666666666666
Similarity: 0.09090909090909091
$
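wup_similarity computes the Wu-Palmer measure, which compares the depth of the deepest ancestor shared by two synsets with the depths of the synsets themselves: 2 * depth(ancestor) / (depth(a) + depth(b)). To show the arithmetic, here is a minimal sketch on a tiny made-up taxonomy. The PARENT table and the helper functions are assumptions for illustration only; NLTK works on the real WordNet hypernym graph and handles cases, such as synsets with several parents, that this toy version ignores.

```python
# Toy taxonomy (hypothetical, not WordNet data): child -> parent
PARENT = {
    "animal": "entity",
    "object": "entity",
    "dog": "animal",
    "cat": "animal",
}

def path_to_root(node):
    # Return [node, parent, ..., root]
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(node):
    # The root counts as depth 1
    return len(path_to_root(node))

def wu_palmer(a, b):
    ancestors_of_a = set(path_to_root(a))
    # Walking up from b, the first node shared with a is the deepest common ancestor
    common = next(n for n in path_to_root(b) if n in ancestors_of_a)
    return 2 * depth(common) / (depth(a) + depth(b))

print(wu_palmer("dog", "cat"))     # 2 * 2 / (3 + 3)
print(wu_palmer("dog", "object"))  # 2 * 1 / (3 + 2)
```

In this toy tree, dog and cat share the nearby ancestor animal and score higher than dog and object, which only meet at the root. The same pattern appears in the output above, where travel and walk score far higher than good and zebra.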