IT System Preventive Maintenance
What is Preventive Maintenance?
You may have seen pop-up notices in apps or on websites for games, securities trading, banking, or online shopping announcing that the service will be unavailable for a few hours, usually in the early morning on a weekend. These windows are used for regular preventive maintenance (PM): the work is scheduled for when the fewest users are on the service, and its purpose is to keep the underlying servers running stably.
Companies above a certain size run many business processes, and numerous IT systems exist to support them. Examples include an ERP (Enterprise Resource Planning) system for managing the company's resources efficiently, an MES (Manufacturing Execution System) for managing manufacturing information and equipment along the production process in manufacturing companies, and an HRS (Human Resource System) for managing employees' basic personnel information. Running these IT systems 24/7 without maintenance can lead to unexpected security incidents (hacking), performance degradation caused by memory problems, and, in severe cases, server downtime.
Details of PM Activities
To guard against such failures, preventive maintenance should be performed on the production servers regularly, at least once every quarter or once every six months. Typical activities include:
- Power Recycle: Rebooting the server (powering it off and back on) to clear stale data that has accumulated while it was running.
- Driver Update: Applying the latest drivers and firmware to prevent hardware recognition errors between the hardware and the middleware software.
- Security & Patch: Regularly applying important updates (patches) to server operating systems such as Windows, Unix, and Linux, keeping them up to date so that security holes and known bugs do not leave the servers vulnerable.
- HA Failover Test: Verifying and improving the HA (High Availability) failover operation so that service recovers quickly after a temporary interruption. *In an HA configuration, when the primary system fails or stops operating, the secondary system takes over its functions.
- Integration Check: Looking beyond the servers themselves and confirming that the interfaces between different application systems (sender/receiver) are working properly.
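As a rough illustration of what the HA Failover Test item verifies, the sketch below models an active/standby pair in plain Python. The names `Node`, `HAPair`, and `failover_test` are hypothetical and are not taken from any real cluster product. The test simulates a primary failure, checks that the secondary takes over, and then restores the original state.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One server in a hypothetical active/standby pair."""
    name: str
    healthy: bool = True


class HAPair:
    """Illustrative HA pair: the primary serves unless it is unhealthy."""

    def __init__(self, primary: Node, secondary: Node):
        self.primary = primary
        self.secondary = secondary

    def active(self) -> Node:
        # The primary serves while healthy; otherwise the secondary takes over.
        if self.primary.healthy:
            return self.primary
        if self.secondary.healthy:
            return self.secondary
        raise RuntimeError("no healthy node available")


def failover_test(pair: HAPair) -> bool:
    """Simulate a primary failure and report whether the secondary took over."""
    original = pair.primary.healthy
    pair.primary.healthy = False  # simulated outage, assumption for this sketch
    try:
        took_over = pair.active() is pair.secondary
    except RuntimeError:
        took_over = False
    finally:
        pair.primary.healthy = original  # always restore the primary's state
    return took_over
```

In a real PM window the "simulated outage" would be an actual controlled stop of the primary, and the check would be made against the live service rather than an in-memory flag.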
PM Sequence
Before starting, list every DB and application server with its host name, IP address, service, purpose, OS, and whether it is a PM target. During the reboot, check each layer in order: OS, DBMS, Web Application Server, and finally the application level.
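The inventory and check order described here could be captured in a small script. The following is a minimal sketch, assuming each layer's health check is supplied as a function that returns `True` or `False`. The `Server` fields mirror the columns listed above; `pm_targets`, `run_post_reboot_checks`, and the choice to skip higher layers once a lower one fails are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Layers are verified bottom-up, in the order given in the text.
CHECK_ORDER = ["OS", "DBMS", "Web Application Server", "Application"]


@dataclass
class Server:
    """One row of the PM inventory."""
    host_name: str
    ip: str
    service: str
    purpose: str
    os: str
    pm_target: bool


def pm_targets(inventory: List[Server]) -> List[Server]:
    """Return only the servers marked as PM targets."""
    return [s for s in inventory if s.pm_target]


def run_post_reboot_checks(
    checks: Dict[str, Callable[[], bool]]
) -> List[Tuple[str, str]]:
    """Run the layer checks in order and record ok / failed / skipped."""
    results: List[Tuple[str, str]] = []
    blocked = False
    for layer in CHECK_ORDER:
        if blocked:
            # Assumption: a higher layer cannot be trusted if a lower one failed.
            results.append((layer, "skipped"))
            continue
        ok = checks[layer]()
        results.append((layer, "ok" if ok else "failed"))
        if not ok:
            blocked = True
    return results
```

For example, passing a `checks` dictionary whose "DBMS" entry returns `False` yields "ok" for the OS, "failed" for the DBMS, and "skipped" for the two layers above it, which makes the point of failure easy to see in the PM report.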