Service Reliability Engineer

Extelligence is an intelligent partner that goes the extra mile. We provide customized information management solutions for major industries. Our team in Prague and Bucharest works with international companies, transforming and adding value to their businesses every day. We are growing quickly and are looking to bring more talented individuals onto our team.


Responsibilities:

  • Design, build, and maintain highly available and scalable services or applications, focusing on observability.
  • Develop and implement monitoring, tracing, and logging solutions to provide deep insights into service behavior and performance.
  • Collaborate with cross-functional teams to define key performance indicators (KPIs) and service level objectives (SLOs) for the services.
  • Establish and maintain service dashboards and visualizations to provide real-time visibility into service health and performance.
  • Develop and maintain alerting systems to proactively detect and respond to service anomalies or degradation.
  • Analyze and troubleshoot complex issues related to service performance, reliability, and availability.
  • Conduct post-incident analysis using observability tools and implement improvements to prevent similar incidents in the future.
  • Drive continuous improvement of the observability platform, including evaluating and adopting new tools and technologies.
  • Participate in capacity planning exercises and provide recommendations to ensure optimal service performance.
  • Collaborate with development teams to design and implement monitoring, tracing, and logging instrumentation within the services.


Requirements:

  • Bachelor’s degree in Computer Science, Information Technology, or a related field.
  • Strong experience in software development or system administration, with a focus on observability.
  • Proficiency in programming and scripting languages such as Python, Java, Ruby, or Bash.
  • Solid understanding of distributed systems, microservices architectures, and observability principles.
  • Experience with observability tools such as Prometheus, Grafana, Jaeger, Elasticsearch, or Splunk.
  • Familiarity with distributed tracing and logging frameworks, such as OpenTelemetry or Fluentd.
  • Knowledge of cloud technologies (e.g., AWS, GCP, Azure) and containerization (e.g., Docker, Kubernetes).
  • Understanding of infrastructure-as-code tools such as Terraform or Ansible.
  • Strong problem-solving and troubleshooting skills, with the ability to analyze complex system behavior using observability data.
  • Excellent communication and collaboration skills to work effectively with various teams.

Working with Extelligence:

  • We take care of the things that matter to contractors. For example, we guarantee on-time payment for your work, so you will never have to chase us for payment.
  • We seek long-term relationships with our team and look for opportunities to extend cooperation beyond the first contract or project.
  • Extelligence is a multicultural team with more than 15 different nationalities.
  • We also organize events to bring our team together, including team-building activities and social events.
Job Type: Contract
Job Location: Prague, Czech Republic (Hybrid)
