Duties:
1. IDC construction and operations:
• Responsible for the planning, design and implementation of the company's IDC construction, ensuring the stability, high availability and security of the company's data center infrastructure.
• Plan resources in line with business requirements, and manage and optimize the data center's hardware, network devices, and storage systems to ensure business continuity and efficient operation.
• Monitor and maintain the IDC network architecture, ensuring that network resources are scheduled and allocated sensibly.
2. Public cloud and hybrid cloud operations:
• Responsible for the daily operation and maintenance of public clouds (AWS, Azure, GCP), including defining and carrying out cloud resource management, network configuration, load balancing, and related work.
• Plan and manage large-scale hybrid cloud environments, design multi-cloud architecture solutions, and achieve network interconnection, resource scheduling, and security management in multi-cloud environments.
• Understand in depth and deploy cloud-native networking (CNI plugins such as Calico and Flannel), and integrate and optimize it with cloud platforms.
3. Network planning and optimization:
• Responsible for the design, planning, and optimization of the company's network architecture, including LAN, WAN, cloud, and data center networks.
• Apply strong knowledge of the TCP/IP protocol suite and the Linux network stack and its layers to independently diagnose and resolve complex network failures.
• Optimize network topology, improve network performance, reliability and security, and ensure stable online business operations.
4. Network automation operations:
• Participate in the design and development of a network automation operations system that automates the management, configuration, monitoring, and failure recovery of network devices and services.
• Optimize network operations through automated processes, reduce manual intervention, and improve fault response speed.
5. Network monitoring and fault emergency response:
• Establish and improve the network monitoring system, tracking performance, traffic, bandwidth, and other key indicators of the data center and cloud environments in real time.
• Regularly conduct failure drills to enhance the team's emergency response capabilities in scenarios such as sudden failures and network attacks, ensuring the continuous and stable provision of network services.
• Quickly locate and resolve network failures to keep business systems highly available and running without interruption.
6. Network operation process and standardization:
• Design and optimize network operations processes to ensure that network device installation, configuration, updates, and maintenance operations comply with best practices.
Requirements:
1. Education background and experience:
• Bachelor's degree or above in Computer Science, Communications, Network Engineering, or a related field.
• More than 5 years of experience in network operations or infrastructure operations, with extensive experience in IDC construction and public cloud operations.
• Experience operating large-scale hybrid cloud environments, with the ability to efficiently schedule and manage resources across private and public clouds.
2. Computer network proficiency:
• Proficient in the TCP/IP protocol suite, with an in-depth understanding of the network protocol stack and of routing and switching principles, and able to independently diagnose network faults and optimize network performance.
• Familiar with the basic principles and practices of network security, able to design and implement a secure and efficient network architecture.
3. Cloud platform operations skills:
• Proficient in the principles and operations of public cloud products (AWS, Azure, GCP), and able to efficiently manage network resources in the cloud environment.
• Familiar with cloud-native technologies, with practical experience in cloud-native networking (CNI plugins such as Calico), and able to independently complete network design and optimization for cloud platforms.
4. Script development skills:
• Experience developing and managing automated operations systems, with the ability to design and implement script-based network automation solutions.
5. Network device management and operation:
• Familiar with common network devices such as routers, switches, load balancers, and firewalls, including their operating principles and configuration management, and able to independently configure and troubleshoot them.
• Extensive experience managing network devices at scale and designing network architectures, with the ability to keep the network stable under high load.