How to Install and Configure Sphinx on Ubuntu 16.04

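In this article, we will learn how to install and configure Sphinx on Ubuntu 16.04. Sphinx is an open source search engine that supports full-text search and is well suited to searching large volumes of data efficiently. The data can come from a variety of sources, such as SQL databases and plain text files.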

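Features of Sphinx

A good tool for advanced indexing and querying.
High search and indexing performance.
Advanced post-processing of result sets.
Easily scalable, with advanced search features.
Integrates with SQL and XML data sources.
Scales to handle large volumes of data with thousands of queries.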

Prerequisites

Before getting started, you need a few prerequisites.

An Ubuntu machine with a non-root user who has sudo privileges.
MySQL installed on the machine.

Installing Sphinx on the Machine

Sphinx can be installed directly from Ubuntu's default package repository using apt-get. The command below installs Sphinx.

$ sudo apt-get install sphinxsearch
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
libmysqlclient20 libstemmer0d
The following NEW packages will be installed:
libmysqlclient20 libstemmer0d sphinxsearch
0 upgraded, 3 newly installed, 0 to remove and 92 not upgraded.
Need to get 2,608 kB of archives.
After this operation, 20.5 MB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 https://in.archive.ubuntu.com/ubuntu xenial/universe amd64 libstemmer0d amd64 0+svn585-1 [62.1 kB]
Get:2 https://in.archive.ubuntu.com/ubuntu xenial-updates/main amd64 libmysqlclient20 amd64 5.7.15-0ubuntu0.16.04.1 [809 kB]
Get:3 https://in.archive.ubuntu.com/ubuntu xenial/universe amd64 sphinxsearch amd64 2.2.9-1build1 [1,737 kB]
Fetched 2,608 kB in 2s (986 kB/s)
Selecting previously unselected package libstemmer0d:amd64.
(Reading database ... 117542 files and directories currently installed.)
Preparing to unpack .../libstemmer0d_0+svn585-1_amd64.deb ...
Unpacking libstemmer0d:amd64 (0+svn585-1) ...
Selecting previously unselected package libmysqlclient20:amd64.
Preparing to unpack .../libmysqlclient20_5.7.15-0ubuntu0.16.04.1_amd64.deb ...
Unpacking libmysqlclient20:amd64 (5.7.15-0ubuntu0.16.04.1) ...
Selecting previously unselected package sphinxsearch.
Preparing to unpack .../sphinxsearch_2.2.9-1build1_amd64.deb ...
Unpacking sphinxsearch (2.2.9-1build1) ...
Processing triggers for libc-bin (2.23-0ubuntu3) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for systemd (229-4ubuntu4) ...
Setting up libstemmer0d:amd64 (0+svn585-1) ...
Setting up libmysqlclient20:amd64 (5.7.15-0ubuntu0.16.04.1) ...
Setting up sphinxsearch (2.2.9-1build1) ...
Adding system user `sphinxsearch' (UID 119) ...
Adding new group `sphinxsearch' (GID 125) ...
Adding new user `sphinxsearch' (UID 119) with group `sphinxsearch' ...
Not creating home directory `/var/run/sphinxsearch'.
Processing triggers for libc-bin (2.23-0ubuntu3) ...
Processing triggers for ureadahead (0.100.0-19) ...
Processing triggers for systemd (229-4ubuntu4) ...
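
To confirm that the installation worked, a quick check along the following lines may help. This is a minimal sketch: it assumes the package puts the indexer and searchd binaries on the PATH, and the function name check_commands is purely illustrative.

```shell
#!/bin/sh
# Report whether each named command is on PATH.
# Returns non-zero if at least one of them is missing.
check_commands() {
    missing=0
    for cmd in "$@"; do
        if command -v "$cmd" >/dev/null 2>&1; then
            echo "found: $cmd"
        else
            echo "missing: $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Assumption: these are the two binaries the sphinxsearch package provides.
check_commands indexer searchd || echo "Sphinx does not appear to be fully installed"
```

If either command is reported as missing, re-running the apt-get step is a reasonable first thing to try.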

Creating a Test Database for Sphinx

Now we create a test database using the sample data that ships with the package, so that we can test Sphinx searches in later steps.

Log in to MySQL to create the test database and import the sample data.

$ mysql -u root -p
mysql> create database test;
Query OK, 1 row affected (0.01 sec)
mysql> SOURCE /etc/sphinxsearch/example.sql;
Query OK, 0 rows affected, 1 warning (0.01 sec)
Query OK, 0 rows affected (0.03 sec)
Query OK, 4 rows affected (0.01 sec)
Records: 4 Duplicates: 0 Warnings: 0
Query OK, 0 rows affected, 1 warning (0.00 sec)
Query OK, 0 rows affected (0.00 sec)
Query OK, 10 rows affected (0.01 sec)
Records: 10 Duplicates: 0 Warnings: 0
mysql> quit
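
The same two steps can also be run without an interactive MySQL session. The sketch below is one possible way to do that, not part of the packaged tooling: load_sample_db and the MYSQL_CMD override are hypothetical names, and it assumes the root account authenticates with a password as in the session above.

```shell
#!/bin/sh
# MYSQL_CMD is a hypothetical override (useful for dry runs); it defaults to the real client.
MYSQL_CMD=${MYSQL_CMD:-mysql}

# Create the test database and load the sample data that ships with the package.
load_sample_db() {
    sql_file=${1:-/etc/sphinxsearch/example.sql}
    if [ ! -r "$sql_file" ]; then
        echo "cannot read $sql_file" >&2
        return 1
    fi
    # IF NOT EXISTS makes this step safe to repeat.
    # Each call prompts for the MySQL root password.
    "$MYSQL_CMD" -u root -p -e 'CREATE DATABASE IF NOT EXISTS test' || return 1
    "$MYSQL_CMD" -u root -p < "$sql_file"
}

# Uncomment to run against a live MySQL server:
# load_sample_db
```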

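Configuring Sphinx for Searching

Sphinx requires three main blocks (index, searchd, and source) to be edited and configured for your environment. They are defined in the configuration file /etc/sphinxsearch/sphinx.conf. Copy the existing sample configuration file, sphinx.conf.sample, to sphinx.conf in the /etc/sphinxsearch folder, then open it for editing.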

$ sudo cp /etc/sphinxsearch/sphinx.conf.sample /etc/sphinxsearch/sphinx.conf
$ sudo vi /etc/sphinxsearch/sphinx.conf

The configuration file should contain the blocks shown below.

Source block in sphinx.conf

source src1
{
   type = mysql
   # SQL settings (for 'mysql' and 'pgsql' types)
   sql_host = localhost
   sql_user = root
   sql_pass = ubuntu
   sql_db = test
   sql_port = 3306 # optional, default is 3306
   sql_query = \
   SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content \
   FROM documents
   sql_attr_uint = group_id
   sql_attr_timestamp = date_added
}

Index block in sphinx.conf

index test
{
   source = src1
   path = /var/lib/sphinxsearch/data/test
   docinfo = extern
}

Searchd block in sphinx.conf

searchd
{
   listen = 9312:sphinx #SphinxAPI port
   listen = 9306:mysql41 #SphinxQL port
   log = /var/log/sphinxsearch/searchd.log
   query_log = /var/log/sphinxsearch/testquery.log
   read_timeout = 5
   max_children = 30
   pid_file = /var/run/sphinxsearch/testsearchd.pid
   seamless_rotate = 1
   preopen_indexes = 1
   unlink_old = 1
   binlog_path = /var/lib/sphinxsearch/datatest
}
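
Before indexing, it can be useful to confirm that all three blocks are actually present in the edited file. The following is a rough sketch that only looks for the block headers with grep; it does not validate the settings inside them, and check_conf_blocks is an illustrative name rather than a Sphinx tool.

```shell
#!/bin/sh
# Check that a sphinx.conf-style file declares source, index, and searchd blocks.
# Only the block header lines are matched; the block contents are not validated.
check_conf_blocks() {
    conf=$1
    status=0
    for block in source index searchd; do
        if [ "$block" = searchd ]; then
            # searchd is unnamed and sits alone on its header line.
            pattern='^[[:space:]]*searchd[[:space:]]*$'
        else
            # source and index headers are followed by a name, e.g. "source src1".
            pattern="^[[:space:]]*${block}[[:space:]]+[A-Za-z0-9_]+"
        fi
        if grep -Eq "$pattern" "$conf"; then
            echo "ok: $block block found"
        else
            echo "missing: $block block"
            status=1
        fi
    done
    return "$status"
}

if [ -f /etc/sphinxsearch/sphinx.conf ]; then
    check_conf_blocks /etc/sphinxsearch/sphinx.conf || echo "configuration looks incomplete"
fi
```

A line such as "source = src1" inside the index block is deliberately not counted as a source block, because the name pattern does not match the equals sign.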

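Once the configuration has been edited as required, Sphinx is ready for indexing.

Managing Indexes in Sphinx

Here we build the index using the configuration file edited in the previous step.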

$ sudo indexer --all
Sphinx 2.2.9-id64-release (rel22-r5006)
Copyright (c) 2001-2015, Andrew Aksyonoff
Copyright (c) 2008-2015, Sphinx Technologies Inc (https://sphinxsearch.com)
using config file '/etc/sphinxsearch/sphinx.conf'...
indexing index 'test'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 193 bytes
total 0.007 sec, 24319 bytes/sec, 504.03 docs/sec
total 4 reads, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
total 12 writes, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg

In a production environment the index must be kept up to date, so we create a cron job for that.

$ crontab -e

Add the following at the end of the file.

# Edit this file to introduce tasks to be run by cron.
#
# Each task to run has to be defined through a single line
# indicating with different fields when the task will be run
# and what command to run for the task
#
# To define the time you can provide concrete values for
# minute (m), hour (h), day of month (dom), month (mon),
# and day of week (dow) or use '*' in these fields (for 'any').
#
# Notice that tasks will be started based on the cron's system
# daemon's notion of time and timezones.
#
# Output of the crontab jobs (including errors) is sent through
# email to the user the crontab file belongs to (unless redirected).
#
# For example, you can run a backup of all your user accounts
# at 5 a.m every week with:
# 0 5 * * 1 tar -zcf /var/backups/home.tgz /home/
#
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
@hourly /usr/bin/indexer --rotate --config /etc/sphinxsearch/sphinx.conf --all
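
If you would rather add the entry without opening an editor, something like the sketch below could work. It is only an illustration: with_indexer_job is a hypothetical helper that reads an existing crontab on standard input and appends the hourly indexer line only when it is not already there, so repeated runs do not create duplicates.

```shell
#!/bin/sh
# The cron entry from this article.
CRON_LINE='@hourly /usr/bin/indexer --rotate --config /etc/sphinxsearch/sphinx.conf --all'

# Read an existing crontab on stdin and print it with CRON_LINE appended once.
with_indexer_job() {
    existing=$(cat)
    if [ -n "$existing" ]; then
        printf '%s\n' "$existing"
    fi
    # -F: fixed string, -x: whole-line match, so only an exact duplicate is skipped.
    if ! printf '%s\n' "$existing" | grep -Fqx -- "$CRON_LINE"; then
        printf '%s\n' "$CRON_LINE"
    fi
}

# To install for the current user (left commented out on purpose):
# crontab -l 2>/dev/null | with_indexer_job | crontab -
```

Keeping the helper as a pure text filter makes it easy to preview the resulting crontab before installing it.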

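Starting the Sphinx Service

Now that the index has been built from the configuration file, we can start the Sphinx daemon. By default the daemon is not started, so we need to edit the file /etc/default/sphinxsearch and set START to yes.

$ sudo vi /etc/default/sphinxsearch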
#
# Settings for the sphinxsearch searchd daemon
# Please read /usr/share/doc/sphinxsearch/README.Debian for details.
#
# Should sphinxsearch run automatically on startup? (default: no)
# Before doing this you might want to modify /etc/sphinxsearch/sphinx.conf
# so that it works for you.
START=yes
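
The same change can be made without an editor. The sketch below assumes GNU sed (as shipped on Ubuntu) and a file that already contains a START= line; enable_sphinx_start is an illustrative name, not part of the package.

```shell
#!/bin/sh
# Set START=yes in a Debian-style defaults file and confirm the change took effect.
# Assumes GNU sed for the in-place (-i) option.
enable_sphinx_start() {
    defaults_file=$1
    sed -i 's/^START=.*/START=yes/' "$defaults_file" &&
        grep -q '^START=yes$' "$defaults_file"
}

# On the real system the file is owned by root, so the equivalent one-liner would be:
# sudo sed -i 's/^START=.*/START=yes/' /etc/default/sphinxsearch
```

The function returns non-zero when the file has no START= line to rewrite, which is a hint to fall back to editing it by hand.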

The command below starts the Sphinx daemon.

$ sudo systemctl restart sphinxsearch.service

After restarting the sphinxsearch service, check its status with the command below.

$ sudo systemctl status sphinxsearch.service
sphinxsearch.service - LSB: Fast standalone full-text SQL search engine
Loaded: loaded (/etc/init.d/sphinxsearch; bad; vendor preset: enabled)
Active: active (exited) since Mon 2016-09-19 13:00:20 IST; 1h 10min ago
Docs: man:systemd-sysv-generator(8)
Tasks: 0 (limit: 512)
Memory: 0B
CPU: 0
Sep 19 13:00:20 ubuntu-16 systemd[1]: Starting LSB: Fast standalone full-text SQL search engine...
Sep 19 13:00:20 ubuntu-16 sphinxsearch[7804]: To enable sphinxsearch, edit /etc/default/sphinxsearch and set START=ye
Sep 19 13:00:20 ubuntu-16 systemd[1]: Started LSB: Fast standalone full-text SQL search engine.

Testing Sphinx Search

Now connect to SphinxQL on port 9306 using the MySQL client.

$ mysql -h0 -P9306
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 2.2.9-id64-release (rel22-r5006)
Copyright (c) 2000, 2016, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql>

Searching the index for the word "test"

mysql> SELECT * FROM test WHERE MATCH('test'); SHOW META;
+------+----------+------------+
| id   | group_id | date_added |
+------+----------+------------+
|    1 |        1 | 1474272578 |
|    2 |        1 | 1474272578 |
|    4 |        2 | 1474272578 |
+------+----------+------------+
3 rows in set (0.00 sec)
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| total         | 3     |
| total_found   | 3     |
| time          | 0.000 |
| keyword[0]    | test  |
| docs[0]       | 3     |
| hits[0]       | 5     |
+---------------+-------+
6 rows in set (0.00 sec)
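
For scripted checks, the same query can be piped into the MySQL client instead of being typed at the prompt. The helper below is a hedged sketch: build_match_query is a hypothetical name, and it deliberately accepts only plain alphanumeric keywords so that no quoting or escaping of the search term is needed.

```shell
#!/bin/sh
# Print a SphinxQL statement that searches an index for a single keyword.
# Only alphanumeric keywords are accepted, which sidesteps escaping entirely.
build_match_query() {
    index=$1
    word=$2
    case $word in
        ''|*[!A-Za-z0-9]*)
            echo "keyword must be alphanumeric" >&2
            return 1
            ;;
    esac
    printf "SELECT * FROM %s WHERE MATCH('%s'); SHOW META;\n" "$index" "$word"
}

# With searchd listening on port 9306, the statement could be run as:
# build_match_query test test | mysql -h0 -P9306
```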

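With the setup and configuration above, Sphinx can be tuned further into a powerful search engine that handles large volumes of data efficiently. Sphinx search can handle billions of documents and terabytes of data, and can run thousands of search queries per second.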