Arcesium
Site Reliability Engineer
Fin Tech
SaaS
B2B
Seed
Start-up
MnC
Software
1000-5000 Employees
8y - 14y
(Competitive pay)
Hyderabad, Bengaluru/ Bangalore
Python, Kubernetes, observability, monitoring, Dynatrace
Job Description
What you'll do:
- Design, develop, and implement scalable and reliable monitoring solutions for distributed systems at scale.
- Define and implement monitoring requirements in collaboration with cross-functional teams.
- Lead the development of monitoring architectures and strategies.
- Integrate monitoring tools into existing infrastructure.
- Maintain and support monitoring systems.
- Demonstrate strong technical breadth/depth, driving innovation, evaluating new technologies, and deciphering the technical vision for engineering teams.
- Own key contributions to technical design and architecture decisions: weigh the trade-offs of each choice, manage risk, make decisions independently where appropriate, and present reasoned options for decision-making by others.
- Lead the way by writing exemplary code, documentation, and RFCs.
- Identify, propose, develop, deploy, and own R&D projects in accordance with the technical vision and needs of the team, turning problem statements into solutions, and operating independently as needed.
What makes you a great fit:
- 10+ years of experience in SRE or a related field.
- Proven experience in designing, developing, and implementing monitoring solutions.
- Deep understanding of monitoring technologies and tools, including Prometheus, Grafana, Loki, and Tempo.
- Experience with cloud-based monitoring systems, such as New Relic, Datadog, and Grafana Cloud.
- Experience with log analysis tools, such as Splunk, Logstash, Fluent, and Sumo Logic.
- Experience implementing distributed tracing with OpenTelemetry and Jaeger.
- Strong understanding of SRE principles and practices.
- Experience with incident response and management.
- Reliability: Exposure to Chaos Engineering and other reliability practices, including disaster recovery, is good to have.
- Experience with cloud computing platforms such as AWS.
- Experience with Kubernetes.
- Experience with Agile practices (Scrum).
- Excellent analytical, problem-solving, and troubleshooting skills.
- Excellent communication and presentation skills.
All about us
Arcesium
- Arcesium is a global financial technology and professional services firm, delivering solutions to some of the world’s most sophisticated financial institutions, including hedge funds, banks, and institutional asset managers.
- Expertly designed to achieve a single source of truth throughout a client’s ecosystem, Arcesium’s cloud-native technology is built to systematize the most complex tasks.
Employee count
1000-5000 Employees
Employment Type
Full Time Job
Company Type
Start-up, MnC
Headquarters
New York, New York, United States
Our links
https://www.arcesium.com