Arcesium

Site Reliability Engineer

Fin Tech

SaaS

B2B

Seed

Start-up

MNC

Software

1000-5000 Employees

8y - 14y

(Competitive pay)

Hyderabad, Bengaluru/Bangalore

Python, Kubernetes, observability, monitoring, Dynatrace

Job Description

What you'll do:

Design, develop, and implement scalable, reliable monitoring solutions for large distributed systems.
Define and implement monitoring requirements in collaboration with cross-functional teams.
Lead the development of monitoring architectures and strategies.
Integrate monitoring tools into existing infrastructure.
Maintain and support monitoring systems.
Demonstrate strong technical breadth/depth, driving innovation, evaluating new technologies, and deciphering the technical vision for engineering teams.
Own key contributions to technical design and architecture decisions: weigh the trade-offs of each choice, manage risk, make decisions independently where appropriate, and present reasoned options for decision-making by others.
Lead the way by writing exemplary code, documentation, and RFCs.
Identify, propose, develop, deploy, and own R&D projects in accordance with the technical vision and needs of the team, turning problem statements into solutions, and operating independently as needed.

What makes you a great fit:

10+ years of experience in SRE or a related field.
Proven experience in designing, developing, and implementing monitoring solutions.
Deep understanding of monitoring technologies and tools, including Prometheus, Grafana, Loki, and Tempo.
Experience with cloud-based monitoring systems, such as New Relic, Datadog, and Grafana Cloud.
Experience with log analysis tools, such as Splunk, Logstash, Fluent, and Sumo Logic.
Experience implementing distributed tracing with OpenTelemetry and Jaeger.
Strong understanding of SRE principles and practices.
Experience with incident response and management.
Reliability: Exposure to chaos engineering and other reliability practices, including disaster recovery, is good to have.
Experience with cloud computing platforms such as AWS.
Experience with Kubernetes.
Experience with Agile practices (Scrum).
Excellent analytical, problem-solving, and troubleshooting skills.
Excellent communication and presentation skills.

All about us

Arcesium

Arcesium is a global financial technology and professional services firm, delivering solutions to some of the world’s most sophisticated financial institutions, including hedge funds, banks, and institutional asset managers. Expertly designed to achieve a single source of truth throughout a client’s ecosystem, Arcesium’s cloud-native technology is built to systematize the most complex tasks.

Employee count

1000-5000 Employees

Employment Type

Full Time Job

Company Type

Start-up, MNC

Headquarters

New York, New York, United States

Our links

https://www.arcesium.com
