
[EC2 Instance] file descriptor leak issue

by 바쿄리 2024. 8. 6.

Overview

An EC2 instance running Airflow on Docker kept becoming unreachable over SSH, roughly once every month or two.

The instance state in AWS was still "running", but Airflow was down and SSH access failed.

This server runs a heavy workload, so I assumed it was a CPU or memory problem.

I logged memory usage periodically and checked CPU utilization in EC2 instance monitoring, but could not find a clear cause.

 

Investigation

Going through the logs one by one from around the time Airflow went down, I found this:

[ERR] Connection could not be made due to the following error: 
 (pymysql.err.OperationalError) (2003, "Can't connect to MySQL server on '...' ([Errno -3] Temporary failure in name resolution)")
(Background on this error at: https://sqlalche.me/e/20/e3q8)

 

Huh? The database connection had failed.

At this point our director immediately asked whether we had simply run out of file descriptors.

 

Every file descriptor had been used up, so new SSH connections could not be opened and neither could database connections.
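
To see which process is actually holding the descriptors, one option is to count the entries under /proc/<pid>/fd for each process. The following is only a sketch: the function names `count_fds` and `top_fd_users` are made up for illustration, and without root it can only read processes owned by the current user.

```shell
#!/usr/bin/env bash
# Count open file descriptors of one process by listing /proc/<pid>/fd.
# Returns non-zero if the directory does not exist or is not readable.
count_fds() {
  local pid="$1"
  local entries
  entries=$(ls -A "/proc/$pid/fd" 2>/dev/null) || return 1
  if [ -z "$entries" ]; then
    echo 0
  else
    printf '%s\n' "$entries" | wc -l | tr -d ' '
  fi
}

# Print "count pid command" for every readable process, highest count first.
top_fd_users() {
  local limit="${1:-10}"
  local dir pid n
  for dir in /proc/[0-9]*; do
    pid="${dir#/proc/}"
    n=$(count_fds "$pid") || continue   # skip processes we cannot read
    printf '%s %s %s\n' "$n" "$pid" "$(cat "$dir/comm" 2>/dev/null)"
  done | sort -rn | head -n "$limit"
}

top_fd_users 10
```

If one process keeps climbing in this list over days, that is a likely candidate for the leak.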

 

Fix

I raised the file descriptor limit.

$ vi /etc/security/limits.conf

* hard nofile 65535
* soft nofile 65535

root hard nofile 65535
root soft nofile 65535

 

I changed the value from its default of 1024 to 65535.

$ ulimit -Sn
65535

$ ulimit -Hn
65535

I confirmed the increased values and rebooted the instance.
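
`ulimit` only reports the limit of the current shell. Settings in limits.conf are applied to login sessions through PAM, so a process started some other way (for example inside a Docker container) may be running with a different value. A small sketch for checking what an already running process really has, reading /proc/<pid>/limits; the function name `nofile_limits` is an assumption for illustration:

```shell
#!/usr/bin/env bash
# Print the soft and hard "Max open files" limits of a running process.
# The values come from /proc/<pid>/limits, so they reflect what the
# process actually has, not what the current shell would get.
nofile_limits() {
  local pid="$1"
  awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "/proc/$pid/limits"
}

# Example: limits of the current shell. Replace $$ with the PID of the
# process you care about (e.g. the Airflow process as seen from the host).
nofile_limits "$$"
```

If the process of interest still reports 1024 here, the limit probably has to be raised where that process is launched rather than in limits.conf alone.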

 

Afterwards

I checked the number of open files while the server was under heavy load.

$ cat /proc/sys/fs/file-nr
15744	0	9223372036854775807

System-wide, 15744 file handles were allocated out of a maximum of 9223372036854775807 (the middle field is the number of allocated but unused handles).

 

Plenty of headroom.

I'll keep monitoring.
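
For the ongoing monitoring, something like the sketch below could append a timestamped reading of /proc/sys/fs/file-nr to a log. The function names and the log path /tmp/file-nr.log are placeholders I made up, not part of the original setup.

```shell
#!/usr/bin/env bash
# Format one log line from the three fields of /proc/sys/fs/file-nr:
# allocated handles, allocated-but-unused handles, system-wide maximum.
format_file_nr() {
  local allocated="$1"
  local unused="$2"
  local max="$3"
  printf '%s allocated=%s unused=%s max=%s\n' \
    "$(date '+%Y-%m-%dT%H:%M:%S')" "$allocated" "$unused" "$max"
}

# Append the current reading to a log file (the path is a placeholder).
log_file_nr() {
  local logfile="${1:-/tmp/file-nr.log}"
  local allocated unused max
  read -r allocated unused max < /proc/sys/fs/file-nr
  format_file_nr "$allocated" "$unused" "$max" >> "$logfile"
}

log_file_nr
```

Run periodically (from cron, for example), this gives a history to look back on the next time the instance becomes unreachable. Note that it tracks the system-wide count only; the `nofile` limit set above applies per process.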