๋ณธ๋ฌธ ๋ฐ”๋กœ๊ฐ€๊ธฐ
  • ๐Ÿ‘ฉ๐Ÿปโ€๐Ÿ’ป ๐ŸŒฎ ๐Ÿ’ฌ
๐Ÿ‘ฉ๐Ÿป‍๐Ÿ’ป/database

[MySQL] match ... against ํ•œ๊ธ€์ž ๊ฒ€์ƒ‰ ์˜ค๋ฅ˜ (innodb_ft_min_token_size, ngram_token_size)

by ๋ฐ”์ฟ„๋ฆฌ 2025. 5. 30.

๊ฐœ์š”

SELECT * FROM articles_cn WHERE MATCH(ner_text) AGAINST('็ซ' IN BOOLEAN MODE);

 

์ด ์ฟผ๋ฆฌ๋กœ ๊ฒ€์ƒ‰์„ ํ•˜๋‹ˆ, ์•„๋ฌด๊ฒƒ๋„ ์ถœ๋ ฅ๋ฐ›์ง€ ๋ชปํ–ˆ๋‹ค.

 

 

ํ˜น์‹œ๋‚˜ ํ•ด์„œ ๋‹ค๋ฅธ ์ฟผ๋ฆฌ๋ฅผ ์‚ฌ์šฉํ•ด ๊ฒ€์ƒ‰ํ–ˆ์„ ๋•

SELECT * FROM articles_cn WHERE JSON_CONTAINS(ner, '"็ซ"');

 

์ž˜ ๋‚˜์˜จ๋‹ค.. ๋ญ ๋•Œ๋ฌธ์— ์•ˆ๋‚˜์˜ค๋Š” ๊ฑธ๊นŒ?

 

ํ™•์ธ

CREATE TABLE `articles_cn` (
  `id` int unsigned NOT NULL AUTO_INCREMENT,
  `title` varchar(255) NOT NULL,
  `content` text NOT NULL,
  `ner` json,
  `ner_text` text,
  PRIMARY KEY (`id`),
  FULLTEXT KEY `idx_ner_text` (`ner_text`) WITH PARSER ngram
) ENGINE=InnoDB;

 

์ € ํ…Œ์ด๋ธ”์€ ์ด DDL๋กœ ์ƒ์„ฑ๋˜์—ˆ๋‹ค.

ner_text๋ฅผ FULLTEXT KEY๋กœ ๋“ฑ๋กํ•ด์„œ ์‚ฌ์šฉํ•˜๊ณ  ์žˆ๋‹ค.

๊ทธ๋ž˜์„œ ngram ์„ค์ •์„ ๋จผ์ € ํ™•์ธํ–ˆ๋‹ค.

SHOW VARIABLES LIKE 'ngram_token_size';

MySQL์€ ngram ํŒŒ์„œ๋ฅผ ์‚ฌ์šฉํ•  ๋•Œ Default token size = 2๋กœ ์„ค์ •๋˜์–ด์žˆ๊ธฐ๋•Œ๋ฌธ์—

1๊ธ€์ž์งœ๋ฆฌ ๋‹จ์–ด๋Š” ์ธ๋ฑ์Šค์— ๋“ฑ๋ก๋˜์ง€ ์•Š๋Š”๋‹ค.

 

์ง„ํ–‰

1. my.cnf ํŒŒ์ผ ์ˆ˜์ • (my.cnf ๊ด€๋ จํ•ด์„œ ์ด์Šˆ๊ฐ€ ์žˆ์—ˆ๋‹ค ใ… ใ…  -> ์ •๋ฆฌ: https://bonory.tistory.com/166)

[mysqld]
innodb_ft_min_token_size = 1

 

2. mysql restart

3. ngram_token_size ํ™•์ธ

SHOW VARIABLES LIKE 'ngram_token_size';

 

4. ํ…Œ์ด๋ธ” ์‚ญ์ œ ํ›„ ์žฌ์ƒ์„ฑ / index ์‚ญ์ œ ํ›„ ์žฌ์ƒ์„ฑ ํ›„ ํ…Œ์ŠคํŠธ

SELECT * FROM articles_cn WHERE MATCH(ner_text) AGAINST('็ซ' IN BOOLEAN MODE);

 

์˜ค์˜ˆ ํ•œ๊ธ€์ž๋„ ์ž˜ ๊ฒ€์ƒ‰๋œ๋‹ค !!!