We're looking for a DevOps Site Reliability Engineer (SRE) who loves coding and is interested in working closely with engineers to help design, build and maintain high-performing, scalable, testable and reliable services.
We believe SREs play a crucial part in providing engineers with tools, best practices and expertise to help them be responsible for the software they build.
We offer:
- Fun startup culture with diverse and inclusive international teams
- Happy Friday sharing sessions and happy-hour gatherings
- Flexible working hours and unlimited work-from-home
- 5-day work week and 15 days of annual leave
- Medical and dental coverage and a performance bonus
- Employment visa sponsorship (if required)
Our tech stack:
- Cloud: Amazon Web Services, Google Cloud Platform
- Database: MongoDB, Google Spanner, Redis, DynamoDB, Amazon Aurora
- Monitoring: New Relic, Pingdom
- Automation: Ansible, Packer
- Continuous Integration: Jenkins, Travis CI
- Tools: Atlassian Suite
- Other SaaS: Algolia, Mixpanel, SendGrid, Twilio, Contentful, Google Analytics
- Deployment: Kubernetes, Docker
- DNS: Cloudflare
Responsibilities:
- Ensuring continuous service delivery with high reliability and operational performance to meet SLAs
- Managing optimisation of SaaS and infrastructure usage and cost
- Collaborating with engineering teams to troubleshoot, investigate and respond to infrastructure outages
- Designing and implementing automated architecture solutions to minimise the infrastructure workload of product engineering teams
Requirements:
- 2+ years of hands-on experience with Amazon Web Services or Google Cloud Platform
- Experience with DNS, load balancing, failover strategies, and blue-green and canary deployments
- Ability to set up automated monitoring and alerting systems
- Dedication to uptime and service-level objectives
- Excellent password hygiene and strong security awareness
- Some experience with log aggregation, analysis and troubleshooting
- Ability to communicate with multiple teams and IaaS / SaaS vendors effectively