The Sr Manager, Cloud Site Reliability Engineering - Platform will be a key leader driving the cultural transformation of Deltek's Platform SRE teams to improve the customer experience and build 24x7x365 operational capabilities to support our Cloud products.
Driven to continuously improve and expand the teams' capabilities, you will be comfortable leading the teams, their projects, and their incident management activities.
Working with teams across the organization, you will constantly innovate to enhance our service delivery to all internal and external customers through both technical and operational improvements.
Lead the transformation of the Platform SRE teams (Infrastructure and Database) into an agile entity capable of addressing all issues uniformly and efficiently.
Identify opportunities for operational improvements and drive the realization of those improvements across the organization.
Serve as an escalation point for both customer management and complex troubleshooting needs.
Create, review, and update policy and technical controls on an ongoing basis to ensure compliance and process optimization across the teams.
Serve as a guide and mentor to members of the Platform SRE teams to aid in their growth and development.
Lead, in collaboration with other Cloud SRE teams, Cloud Delivery, and Architecture, the continuous improvement of application logging, monitoring, alerting, and resilience, and act as steward of the related tools and roadmap.
Serve as a partner to Product Engineering, Cloud Delivery, and PMO, providing leadership for technology solutions and the adoption of CI / CD tools and best practices, and helping to drive efficiency gains.
Provide clear and consistent communication and guidance to Cloud Senior Leadership and key business stakeholders regarding opportunities for innovation, risk areas, and trends to watch.
Bachelor's degree in Computer Science or equivalent. Master's degree preferred.
8+ years of experience managing enterprise cloud infrastructure at scale (AWS and Azure) through CI / CD solutions, automation and configuration management, infrastructure as code, microservices, containers, orchestration, and systems operations.
5+ years of hands-on systems engineering, software development, or architect experience.
5+ years of experience leading a team of systems engineers or software developers.
3+ years working in a role with a strong focus on developing DevOps solutions and cloud architectures.
Strong technical proficiency to effectively troubleshoot and solve problems in a complex multi-platform environment.
Extensive experience managing applications based on a variety of common technology stacks (Microsoft, Linux, IIS, LAMP, WebLogic, Oracle, SQL, etc.).
Experience with automated continuous integration and builds (CI / CD) using Jenkins, Git, CloudFormation, Terraform, Ansible and other modern tools.
Extensive experience with monitoring tools like AppDynamics, Splunk, PRTG, SolarWinds DPA, Nagios, CloudWatch, Azure Analytics, and / or PagerDuty.
A love of metrics and statistics.
Excellent communicator: able to communicate with simple messages, craft compelling business cases, and establish trust and credibility with stakeholders.