DevOps SRE Professional, Unitforce Technologies Consulting

  • Company name: Unitforce Technologies Consulting
  • Working location: Office Location
  • Job type: Full Time

Experience: 0 - 2 years required

Pay: INR 1 - INR 10001 /Month

Location: Bengaluru

Skills: Engineering services, Redhat, Git, Linux, ISO 9001, Shell Scripting, Consulting, Social networking

About Unitforce Technologies Consulting

Job Description

  • System Reliability: Design, implement, and maintain systems and infrastructure to ensure high availability, reliability, and performance of software applications and services.
  • Incident Response: Monitor system health, performance metrics, and alerts to detect and respond to incidents, outages, and service disruptions in real time. Implement incident response procedures, runbooks, and escalation protocols to minimize downtime and impact on users.
  • Service Level Objectives (SLOs): Define, measure, and enforce service level objectives (SLOs) and service level agreements (SLAs) to establish performance targets and reliability goals for critical systems and services.
  • Automation and Tooling: Develop automation scripts, tools, and processes to streamline system provisioning, configuration management, deployment, monitoring, and incident response workflows.
  • Capacity Planning: Perform capacity planning, load testing, and performance tuning to ensure that systems can handle expected traffic loads, scale dynamically, and meet demand spikes without degradation in performance or reliability.
  • Change Management: Implement change management processes, version control practices, and configuration management tools to manage changes, releases, and updates to production systems in a controlled and predictable manner.
  • Infrastructure as Code (IaC): Implement infrastructure automation using infrastructure as code (IaC) tools such as Terraform, Ansible, or Chef to provision, configure, and manage cloud resources and environments.
  • Monitoring and Observability: Set up monitoring, logging, and observability tools to collect, analyze, and visualize system metrics, logs, and traces for proactive monitoring, troubleshooting, and performance analysis.
  • Continuous Improvement: Continuously evaluate, optimize, and improve system architecture, reliability patterns, and operational processes based on incident postmortems, performance analysis, and lessons learned from production incidents.