Python

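Python packages for Hangul jamo decomposition

April 26, 2022 by Hong

Hangul jamo decomposition (splitting a syllable into its consonant and vowel letters) is used for tasks such as typo handling, spam detection, and detecting profanity or sexually suggestive keywords. It is fun to try a few times, but once you set out to build something real with it, it turns out to have few practical uses.

Below is a collection of Hangul jamo decomposition packages and code snippets that can be used from Python.

List of Python Hangul jamo decomposition packages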

– hangul-toolkit: https://github.com/bluedisk/hangul-toolkit

– hangul-utils: https://github.com/kaniblu/hangul-utils

– Hangulpy: https://github.com/rhobot/Hangulpy

– python-jamo: https://github.com/JDongian/python-jamo

– Hangul consonant/vowel decomposition (code): https://frhyme.github.io/python/python_korean_englished/

– Hangul Unicode jamo decomposition: https://nunucompany.tistory.com/28

Most of them cover all the main features, are easy to use, and work well.

Pick whichever one you prefer.
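
To make the idea concrete, here is a minimal standard-library sketch of jamo decomposition. It relies only on the Unicode layout of precomposed Hangul syllables (U+AC00 to U+D7A3), where each syllable's code point encodes an initial, a vowel, and an optional final. It is not the API of any package listed above; `decompose` and the constant names are made up for illustration.

```python
# Stdlib-only sketch; names are illustrative, not taken from any listed package.
CHOSEONG = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"          # 19 initial consonants
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"      # 21 vowels
# 27 final consonants, preceded by "" for "no final consonant"
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

HANGUL_BASE = 0xAC00  # first precomposed syllable
HANGUL_LAST = 0xD7A3  # last precomposed syllable
VOWEL_FINAL = len(JUNGSEONG) * len(JONGSEONG)  # 21 * 28 = 588 syllables per initial


def decompose(text: str) -> str:
    """Split every precomposed Hangul syllable into compatibility jamo.

    Characters outside the Hangul syllable block are passed through unchanged.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if HANGUL_BASE <= code <= HANGUL_LAST:
            offset = code - HANGUL_BASE
            out.append(CHOSEONG[offset // VOWEL_FINAL])
            out.append(JUNGSEONG[(offset % VOWEL_FINAL) // len(JONGSEONG)])
            out.append(JONGSEONG[offset % len(JONGSEONG)])
        else:
            out.append(ch)
    return "".join(out)


print(decompose("한글"))  # ㅎㅏㄴㄱㅡㄹ
```

The packages above add conveniences on top of this arithmetic, such as recomposition and handling of standalone jamo, so a hand-rolled version like this is mainly useful for understanding what they do.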

Uncategorized

Installing the Mecab morphological analyzer on Ubuntu

April 22, 2022 by Hong

This post shows how to install the Mecab (은전한닢, Eunjeon Hannip) morphological analyzer on Ubuntu.

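Mecab in brief

In short:

Mecab is a Japanese morphological analyzer written in C++.
Mecab-ko is a Korean morphological analyzer adapted from Mecab, and is known as “은전한닢” (Eunjeon Hannip).

There is an earlier post on this topic; see it for background.

The MeCab morphological analyzer: what is a morphological analyzer? Word segmenters and morphological analyzers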

Installation

Mecab-ko requires the Mecab core module and Mecab-ko-dic to be installed first, which is tedious.

The short install script that ships with konlpy makes this easy.

sudo apt-get install curl git
bash <(curl -s https://raw.githubusercontent.com/konlpy/konlpy/master/scripts/mecab.sh)

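Mecab on its own is awkward to test or use for morphological analysis, so install the Python module as well.

# Python 3.10 still had problems at the time of writing, so play it safe with a slightly older version.
# python3.10 -m pip install mecab-python3
python3.8 -m pip install mecab-python3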
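
Before testing, it can help to confirm that both pieces landed where Python can see them. The following is a small sketch, not part of konlpy or mecab-python3: `mecab_install_status` is a hypothetical helper, and it assumes the install script put a `mecab` executable on the PATH and that mecab-python3 exposes a module named `MeCab`.

```python
import importlib.util
import shutil


def mecab_install_status() -> dict:
    """Report whether the mecab binary and the MeCab Python module are visible.

    Hypothetical helper for illustration; it only checks for presence and
    does not verify that the dictionary (mecab-ko-dic) is configured.
    """
    return {
        # Assumes the install script placed an executable named "mecab" on PATH.
        "mecab_binary": shutil.which("mecab") is not None,
        # Assumes mecab-python3 installs an importable module named "MeCab".
        "MeCab_module": importlib.util.find_spec("MeCab") is not None,
    }


print(mecab_install_status())
```

Run it with the same interpreter that was used for pip (python3.8 here), since a module installed for one Python version is not visible to another.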

Start Python and test it.

from konlpy.tag import Mecab  # assumes konlpy is installed (python3.8 -m pip install konlpy)

mecab = Mecab()
' '.join(mecab.morphs("무궁화꽃이피었습니다."))
# '무궁화 꽃 이 피 었 습니다 .'

Linux

E: Unmet dependencies. Try ‘apt-get -f install’ with no packages (or specify a solution).

April 22, 2022 by Hong

If a package installation on Ubuntu leaves dependencies broken, every later apt run fails with the error below, and it is usually stubborn to fix.

E: Unmet dependencies. Try 'apt-get -f install' with no packages (or specify a solution).

Running the following command fixes it.

sudo apt-get -o Dpkg::Options::="--force-overwrite" install --fix-broken

Source: https://askubuntu.com/questions/1044817/failed-installation-of-package-breaks-apt-get

Uncategorized

[Jenkins] Could not initialize class org.eclipse.jgit.internal.storage.file.FileSnapshot

April 22, 2022 by Hong

If you set up Jenkins to poll a git repository, detect pushed code, and then build automatically, the job sometimes fails with this error in the log.

Could not initialize class org.eclipse.jgit.internal.storage.file.FileSnapshot

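The cause is a faulty jgit-related class in Jenkins, and there is no good direct fix for it.

The practical workaround is to install a different Jenkins version, for example the latest non-LTS release.

Uncategorized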

jenkins install Certificate verification failed: The certificate is NOT trusted. The certificate chain uses expired certificate. Could not handshake: Error in the certificate verification.

April 21, 2022 by Hong

You may run into the following error while installing Jenkins.

sudo apt update
sudo apt install jenkins

The error looks like this.

Certificate verification failed: The certificate is NOT trusted. The certificate chain uses expired certificate. Could not handshake: Error in the certificate verification.

Reinstalling the CA certificates package resolves it.

sudo apt install ca-certificates
sudo apt update
sudo apt install jenkins
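
To see whether the system trust store is the problem before or after reinstalling, a TLS handshake can be attempted from Python, which on Ubuntu normally reads the same CA bundle that ca-certificates maintains. This is only a rough sketch: `check_tls` is a hypothetical helper, and apt uses its own TLS stack, so a passing check here is a hint rather than proof.

```python
import socket
import ssl


def check_tls(host: str, port: int = 443, timeout: float = 5.0):
    """Try a verified TLS handshake and return (ok, message).

    Hypothetical helper for illustration; it uses Python's default trust
    settings, which normally point at the system CA store.
    """
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True, "certificate chain verified"
    except ssl.SSLCertVerificationError as exc:
        # Raised when the chain is untrusted or contains an expired certificate.
        return False, f"certificate verification failed: {exc.verify_message}"
    except OSError as exc:
        # DNS failures, refused connections, timeouts, and other TLS errors.
        return False, f"connection problem: {exc}"
```

For example, `check_tls("pkg.jenkins.io")` would test the host that usually serves the Jenkins apt repository; that host name is an assumption, so substitute whatever URL appears in your Jenkins sources list entry.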

Total Data Science

Monthly archive: April 2022
