No. of Positions: 2
Tentative Start Date: September 30, 2021
Work From: Offsite
Rate: $15 - $25 (Hourly)
Experience: 5 to 8 Years
Here are the job details:
Important aspects of the job include:
It's MySQL: thousands of instances in hundreds of replication hierarchies, some of them seeing substantial load. It is the foundation of our Application Data Infrastructure.
It's automated. But as our systems are evolving, this automation needs improvement, extension and refactoring to meet the changing requirements of a different environment.
It's Python and Go. Being at the center of most, if not all, applications, it is literally talking to everything else.
It's moving to all the platforms, including OpenStack, Kubernetes and the public cloud.
It's dynamic. With automated capacity testing, restore testing, failover testing and disaster recovery testing, it needs to be able to adapt to planned and unplanned changes in the production conditions and environments.
Sometimes it has problems. Sometimes our customers cause problems. Good monitoring and alerting are required to be aware of problems as they develop, or ideally before they develop.
It's in multiple data centers, both our own and in the public cloud. Replication and communication over long distances pose their own scaling and performance problems.
As an SRE in the data infrastructure team, you will be responsible for planning, building, improving and refactoring solutions that solve these problems. You will also share the on-call rotation and be an escalation contact for incidents. You will work in close collaboration with multi-functional teams in Core Infrastructure and in the Application Teams.
What will you bring to the role?
To give you some background information: this cluster in the current infrastructure is moving 1 TB of data per second.
Nice to have