Important:
Please use Google Chrome when applying for this job to avoid system errors caused by browser incompatibility.
Senior Platform Reliability Engineer
Overview
The Platform Reliability Engineer is responsible for ensuring the safety and stability of Akamai’s global platform. This role requires in-depth knowledge of proprietary software and services in addition to a “big picture” understanding of the entire platform. The Platform Reliability Engineer works directly with multiple teams at Akamai to ensure that the needs of a dynamic and evolving distributed platform are met.
As a Senior Platform Reliability Engineer, you will:
* Provide operational management and support of Akamai’s global production network by collaborating with multiple teams on multiple products that extend across the company’s global sites.
* Conduct and run software rollouts that span massive, multi-service, complex networks.
* Drive platform improvements and increase operational efficiencies by analyzing behaviors and trends, implementing code, increasing automation, and creating and improving alerts and notifications.
* Utilize Key Performance Indicators to identify issues early, guide operational responses and build processes to improve TTR.
* Resolve complex technical issues that can impact multiple systems, tools, products and customers.
* Represent GPO Technologies in standard incident reviews and other operational events.
* Handle issues from our Network Operations Command Center with a persistent eye toward process improvement and incident prevention.
* Collaborate with Development, Performance and Operations teams across the globe and drive activities that promote platform stability and operational growth.
About the Team
Global Performance & Operations is responsible for managing, monitoring, and maintaining Akamai’s distributed computing platform in order to achieve the highest possible availability, agility, safety, efficiency, and scalability.
Required Skills and Education:
* 5 years of relevant experience and a Bachelor's degree, or
* 3 years of relevant experience and a Master's degree
Required Skills
* 5+ years of experience in operations or system administration
* 5+ years of experience with Unix/Linux systems
* 5+ years of experience with internet protocols such as DNS, HTTP, SSL
* 3+ years of experience with a scripting language (Python, Perl, shell)
* 3+ years of experience with networking concepts (TCP/IP, UDP, etc.)
* 3+ years of experience working with SQL and writing SQL queries
Desired Skills
* Excellent communication skills and ability to effectively work in a multi-team environment
* Network Operations experience with Internet-scale systems
* Experience with complexity analysis and software design
* Thorough understanding of distributed systems and Internet principles
* Ability to stay organized and keep multiple priorities on target
* Self-motivated with a consistent drive towards results