As a senior member of the Global Cloud Database Reliability Engineering team, you will work on scaling our systems, building automation, and driving operational database improvements for Deltek’s Global Cloud Products.
We are seeking a driven individual with a passion for database performance management to help ensure an outstanding customer experience.
Work closely with Cloud SRE, Delivery, and Architecture to build highly resilient, performant and globally replicated databases.
Build automation around database problems like performance tuning, backups, restores, failover, schema changes and disaster recovery testing.
Provide feedback on database performance and availability during architectural design and code reviews.
Instrument observability and analytics tooling (AppDynamics, Oracle Enterprise Manager, SolarWinds DPA) to help detect database performance bottlenecks and prevent issues before they happen.
Administer and manage a variety of cloud resources on both Amazon Web Services and Azure.
Analyze and troubleshoot complex application and database issues.
Work within one of our 24x7 schedules (Sunday to Thursday or Tuesday to Saturday) and shifts (morning, mid, or night).
Participate in maintenance activities and on-call rotations as required.
Execute disaster recovery plans and report on metrics related to those activities.
Bachelor’s degree in a Computer Science field or equivalent. Master’s degree preferred.
7+ years of experience operating and administering Oracle databases in large production environments at scale on public cloud infrastructure (Amazon Web Services is desired).
5+ years of experience with Linux, Windows, and related technologies, especially in troubleshooting production systems.
5+ years of experience applying an automation-first approach to problem solving, leveraging configuration management tools and scripting (e.g., Bash, Python, PowerShell).
Strong understanding of database optimization, database schema, and systems architecture.
Ability to troubleshoot SQL performance issues in public cloud production environments.
Experience with data replication, high availability, and data migration tools and techniques.
Extensive experience with monitoring tools like AppDynamics, Splunk, PRTG, SolarWinds DPA, Nagios, CloudWatch, Azure Analytics, and/or PagerDuty.
Must be detail-oriented, task-driven, and have excellent English communication skills.
Experience with continuous integration and continuous deployment tools (e.g., GitHub, Azure DevOps, Jenkins).
Experience with configuration management and orchestration (e.g., Terraform, CloudFormation, Ansible) is a plus.
Passionate and curious about ways to leverage technology while continually learning.
Certifications in any of the above technologies (Professional level preferred).