SRE Engineer

Company: Matchtech
Location: Basingstoke
Closing Date: 23/10/2024
Salary: £60000 - £65000/annum
Hours: Full Time
Type: Permanent
Job Requirements / Description
MAIN RESPONSIBILITIES
* As a Site Reliability Engineer (SRE), you will support our 24/7 Service Level Objectives (SLOs) and our customer Service Level Agreements (SLAs).
* Monitoring key performance indicators on the platforms and working collaboratively with engineers across multiple teams and with external service providers to build a sustainable, scalable, distributed, fault-tolerant system.
* Maintaining and monitoring the deployment pipelines and build processes for development, staging and production environments, managing third-party activities and preparing communications to service users in advance of any downtime.
* Ensuring the quality (e.g. security, resilience, performance) of the applications and cloud infrastructure we design and deploy, co-ordinating with testing, security and development teams as needed to ensure issues are flagged and resolved in a timely manner.
* Managing the end-to-end process for identifying production and staging issues, troubleshooting, performing root cause analysis, and ultimately resolving any issues from the network and application layers all the way down to the system level. This might include anything from digging into source code (our own or from open-source projects) to hunting memory leaks, tracing bottlenecks in upstream networks, or database query optimisation.
* Onboarding new customers, including the provisioning of data within various connected systems to enable our service users.
* Working as part of a cross-disciplinary team to identify key areas of improvement within the platform and build solutions, to automate monitoring for the next-generation platform and to drive efficiencies.
* Engaging with our regional leadership teams to enable service capacity planning and demand forecasting, anticipating performance bottlenecks, and preventing or mitigating any impact.
* Designing and implementing the tools and processes used for deployment and change management.
* Planning and executing disaster recovery drills.
* Promoting a DevOps/DevSecOps culture and the adoption of its principles throughout the organisation.

REQUIRED EDUCATION AND QUALIFICATIONS
Education Level:
* Degree in IT, Computer Science or a related technology field, or relevant experience.
Qualifications:
* Cloud Provider certifications are an advantage, including IBM, Microsoft and AWS.

REQUIRED SKILLS AND COMPETENCIES
* Comprehensive knowledge of containerisation, IaaS and PaaS services, along with proven experience of managing production Kubernetes containers and container orchestration.
* Strong experience with cloud security, along with an understanding and awareness of risks, vulnerabilities and mitigations.
* Strong software engineering experience, with an ability to work in multiple programming languages including but not limited to ECMAScript, Node.js, PowerShell, C++ and Java.
* Strong experience with process automation tools and methodologies.
* Proven ability to communicate clearly and effectively across language barriers and to users of differing technical levels. Fluency in speaking and writing English is essential.
* Experience of creating system-level documentation.
* Experience with IT infrastructure best practices, including networks, firewalls and load balancers, and a broad knowledge of monitoring, logging and analytics best practices. Experience with implementing SIEM tools and services is desirable.
* Self-starter with experience of working as part of a distributed team, along with the ability to collaborate with multiple teams across several time zones.
* Agile mindset, strong DevOps/DevSecOps alignment, and a will to continuously improve both the platform and our ways of working.