Site Reliability Engineer
Ascendum Solutions
This job is no longer accepting applications
Software Engineering
United States
Posted on Jul 27, 2025
Position Summary
The Senior Site Reliability/DevOps Engineer will specialize in developing scalable methods for building, deploying, and supporting cloud, on-prem, and store-focused enterprise services and systems. The engineer will work closely with Software Engineers to deploy and operate solutions; automate and streamline processes; build and maintain tools for deployment and platform monitoring; and troubleshoot and resolve issues in development, test, and production environments. The engineer is also expected to demonstrate the company's core values of respect, honesty, integrity, diversity, inclusion, and safety.
Job Functions
- Design and build infrastructure and systems that provide high levels of scalability, reliability, and performance for the company's stack, while balancing security, maintainability, and operational excellence
- Work with the engineering team to continuously implement and improve reliable and speedy build environments for DEV & QA; provide timely build status updates; automate as much as possible to improve efficiency and quality
- Promote innovation, outside-of-the-box thinking, teamwork, and self-organization
- Ensure traceability, observability, and retrievability of system behavior
- Build logging, monitoring, and alerting systems to identify bottlenecks and assist with debugging, analysis, and optimization in cloud, on-prem and store environments
- Experiment with and recommend new technologies that simplify or improve the company tech stack
- Craft solid and clearly explained designs, playbooks, and documentation, for consumption by teammates and the larger engineering organization
- Participate in an off-hours on-call rotation, and perform periodic off-hours work during maintenance windows
- Must be able to perform the essential job functions of this position with or without reasonable accommodation
Minimum Skills
- Bachelor's degree in computer science or equivalent related experience (8+ years), with strong theoretical fundamentals (data structures, algorithms, lock-free data structures, multithreaded architectures, etc.)
- Any experience with always-on, high-volume web server stacks, Azure/GCP PaaS and Azure/Google networking, and provisioning native Managed Apps and CI/CD pipelines
- Any understanding of SSH, VPN, TCP/IP, DNS, HTTP(S), network routing and subnets
- 4+ years of experience in cloud SRE, DevOps, infrastructure, or a related field
- Proven knowledge of technology to support omnichannel experiences
- Knowledge of Linux architecture, security, administration, performance monitoring/tuning, troubleshooting, and production operations
- Fluent in shell scripting, with experience implementing automation and monitoring using shell scripts and related tools
- Proven knowledge of service-oriented architecture/Cloud
Desired Skills
- Master's Degree
- PhD in computer science, information systems, or a related field
- Any experience with CI/CD pipelines using tools such as Jenkins, Spinnaker, Azure DevOps, TeamCity, etc.
- Any experience with Azure DevOps services such as Pipelines, Test Plans, Artifacts, etc.
- Any experience with Nginx, HAProxy, Squid
- 1 year of experience managing system observability tooling (ELK, PagerDuty, Datadog, New Relic, Azure Monitor, Grafana, etc.)
- 1 year of experience with technologies such as Kafka, RabbitMQ, SQS, Ansible, Terraform, Docker and Kubernetes, Jenkins, Spinnaker, Azure DevOps, TeamCity
- 2+ years of experience configuring and managing cloud infrastructure (AWS, GCP, Azure)
- 4+ years of experience designing or working on high-volume eCommerce applications
- Microsoft Azure Certification
- Experience in retail or healthcare industries