Truevo Payments Limited

DevOps & Site Reliability Engineer

  • Basis:  Full-Time
  • Closing Date:  01 Sep, 2022
  • Job Ref:  KMP-74548

Job Description

We are looking for DevOps and Site Reliability Engineers to join our team and contribute to the automation of our infrastructure and releases, and to the availability, health and performance of our payment platform, which is built primarily on AWS services and a microservice architecture.

As a DevOps / SRE you will need to be passionate about automating our current infrastructure build-out, maintaining and improving our release and deployment pipeline, and improving our platform quality through capacity management, health detection and recovery tooling.

If you are purpose-driven, relish solving challenges with creative compassion, and want a flexible remote working lifestyle and an above-market salary, you are looking in the right place.

Your challenge

  • Automate manual processes and build pipelines that release only versioned and quality-tested artefacts (software and infrastructure) to our different environments (sandbox, stage, production)
  • Build and manage distributed infrastructure, applications and monitoring through a defined set of SLOs and SLIs, all through automation
  • Create a self-service sandbox environment that mimics production for our software development teams to consume
  • Perform low-level analysis and troubleshooting together with our software engineering teams to identify the root cause of technical incidents
  • Oversee our alerting and logging mechanisms to ensure our platform achieves the desired availability, and provide feedback on standards we can adopt
  • Review code, database scripts and other application configuration settings before they are released to production
  • Keep an eye on platform performance and identify areas of improvement that can help shape the direction and scalability of our platforms
  • Propose innovative technologies, tool sets and capabilities that help in the overall performance and availability of the platform
  • Promote a can-do and work-to-finish attitude, recognising that delivering some projects on time may occasionally require long hours
  • Understand the overall working functions of the microservices platform, its ecosystem and the value it brings to customers and their objectives
  • Support the software development teams with their requirements
  • Participate in our 24/7 on-call rotation

Essential skills

  • A passion for overall systems engineering and automation
  • Desire to wear multiple hats
  • Excellent communication and teamwork skills
  • Experience with TCP/IP and network stacks, including load balancers, firewalls, WAFs and CDNs
  • Experience working with Linux and Windows environments and cloud platforms such as AWS
  • Experience with scripting and automation
  • Great problem-solving skills and the ability to find the root cause of issues through troubleshooting and blameless retros

Bonus skills (additional points if you have some of these skills)

  • Experience using Terraform to provision infrastructure
  • Experienced and comfortable working with configuration management tools (Ansible, Chef)
  • Experience with container technologies
  • Experience with databases, in-memory cache technologies and message queues – MS SQL, MySQL, PostgreSQL, Redis, etc.
  • Experience working with Git and CI/CD tools such as Bitbucket, Bitbucket Pipelines, Octopus Deploy, TeamCity, Jenkins, etc.