- Connect with customers, understand users' requirements, and lead the team to complete operations and maintenance delivery.
- Manage the team: train and coordinate team members, and prepare maintenance plans and daily work arrangements.
- Manage technical documents and report to customers, including weekly reports, monthly reports, and incident reports.
- Be responsible for delivery quality and drive continuous improvement.
- Be responsible for building, maintaining, and monitoring the company's big data cluster, including early warning and fault handling, and continuously improve the big data platform to ensure stability and security.
- Be responsible for cluster capacity planning, capacity expansion, upgrades, and performance optimization, and participate in the architecture design and improvement of the big data infrastructure environment.
- Solve the day-to-day problems encountered by developers and organize the relevant processes.
- Be responsible for the deployment, operation, maintenance, and troubleshooting of container products, including mode optimization, cluster management, live-network fault location, and performance analysis and optimization.
- Be responsible for writing deployment and optimization documentation for container products.
- Bachelor's Degree or above in Information Technology or equivalent
- 5+ years of experience managing IT operations and IT support
- 3+ years of team management experience, locally and internationally
- 1+ year of Big Data experience
- 2+ years of working experience with container and cluster operations
- In-depth understanding of cloud computing, virtualization, SaaS, PaaS, and IaaS
- Able to respond quickly to emergencies, with rich experience handling issues remotely
- Proficient in improving operations and maintenance competency by combining various methods, such as policy, technical skills, and communication
- Excellent sense of service, communication skills, and team spirit; strong capability in critical thinking, analysis, problem solving, and execution
- Familiarity with Linux
- Deep understanding of installing, troubleshooting, tuning, and expanding HDFS/Hive/Spark/YARN/MongoDB/HBase/Redis/Kafka/ZooKeeper (meeting 80% of these is acceptable)
- Expert in Python, Bash, or Ansible
- Familiar with K8s, Ceph, Harbor, and Prometheus components
- Adept at network performance optimization and troubleshooting
Bernice Mae Nocum Rallonza EA License No.: 02C3423 Personnel Registration No.: R1442141