Lead Site Reliability Engineer-Infrastructure Technology
Company: JPMorganChase
Location: Plano
Posted on: April 3, 2026
Job Description:
Assume a critical role in defining the future of a globally
recognized firm and have a direct and significant effect in a realm
tailored for top achievers in site reliability.

As a Lead Site Reliability Engineer at JPMorgan Chase within the
Infrastructure & Production Management sector of Consumer &
Community Banking, you hold a leadership role in your team,
demonstrate strong knowledge across multiple technical domains, and
advise others on the technical and business issues facing them. You
take the lead in conducting resiliency design reviews, break up
complex problems into digestible work for other engineers, act as a
technical lead for medium- to large-sized products, and provide
advice and mentoring to other engineers.

Job responsibilities:
- Advocate and embody site reliability principles, fostering a
  culture of excellence and technical influence within your team.
- Leverage AI tools to enhance operational effectiveness and automate
  processes, ensuring high-quality customer service.
- Spearhead projects aimed at enhancing the reliability and stability
  of applications and platforms.
- Utilize data-driven analytics and AI technologies to automate
  detection, diagnosis, and resolution processes, elevate service
  levels, and drive continuous improvement.
- Engage stakeholders to establish realistic service level objectives
  and error budgets, ensuring alignment with customer expectations.
- Exhibit advanced technical proficiency in one or more domains,
  proactively addressing technology-related bottlenecks.
- Employ AI-driven solutions to streamline processes and enhance
  operational efficiency.
- Serve as the primary contact during major incidents, demonstrating
  the ability to swiftly identify and resolve issues to prevent
  financial losses.
- Act as a culture carrier by documenting and disseminating knowledge
  through internal forums and communities of practice.
- Mentor team members, guiding them in the strategic adoption of AI
  technologies to enhance operational effectiveness and customer
  service.

Required qualifications, capabilities, and skills:
- Formal training or certification on site reliability engineering
  concepts and 5 years of applied experience.
- Proven success in an SRE or senior DevOps role, with deep knowledge
  of service level indicators/objectives (SLIs/SLOs), incident
  management, postmortem analysis, and systems reliability.
- Expert with observability stacks (e.g., Datadog/Dynatrace,
  Prometheus, Grafana, Splunk, ELK, OpenTelemetry), including deep
  experience correlating telemetry across services and time.
- Hands-on skills in coding (at least one high-level programming
  language), cloud platforms (AWS or GCP), container orchestration
  (Kubernetes), infrastructure as code (Terraform), and resilient
  CI/CD pipelines.
- Active experience or deep curiosity in applying AI to operations,
  such as LLM-based copilots, anomaly detection, automated runbooks,
  and autonomous agents.
- A track record of delivering under pressure. You finish what you
  start, adapt to uncertainty, and thrive in high-accountability
  environments.
- You deconstruct complexity, organize effectively, and drive clarity
  into ambiguous operational environments. Documentation and design
  are second nature.
- Outstanding communication, empathy, and professionalism, especially
  during incidents. You recognize that great systems serve real
  people.

Preferred qualifications, capabilities, and skills:
- Experience with operational and compliance rigor in banking,
  fintech, or a similar industry.
- Experience managing and optimizing various types of databases,
  including relational and NoSQL databases.
- Experience with game days, chaos experiments, or failure-mode
  analysis to improve service robustness.
- A background in mentoring engineers or leading technical
  knowledge-sharing, especially around AI and SRE best practices.
- Ability to initiate and implement ideas to solve business problems.
- Strong communicator with excellent problem-solving, critical
  thinking, and analytical reasoning skills, along with attention to
  detail and a passion for innovation.
Keywords: JPMorganChase, Rowlett, Lead Site Reliability Engineer-Infrastructure Technology, IT / Software / Systems, Plano, Texas