Asset & Wealth Management - Birmingham - Associate - Site Reliability Engineer
Job Description
What We Do
At Goldman Sachs, our Engineers don’t just make things – we make things possible. Change the world by connecting people and capital with ideas. Solve the most challenging and pressing engineering problems for our clients. Join our engineering teams that build massively scalable software and systems, architect low latency infrastructure solutions, proactively guard against cyber threats, and leverage machine learning alongside financial engineering to continuously turn data into action. Create new businesses, transform finance, and explore a world of opportunity at the speed of markets.
Engineering, which comprises our technology division and global strategists groups, is at the critical center of our business, and our dynamic environment requires innovative strategic thinking and immediate, real solutions. Want to push the limit of digital possibilities? Start here.
What We Look For
Goldman Sachs Engineers are innovators and problem-solvers, building solutions in risk management, big data, mobile and more. We look for creative collaborators who evolve, adapt to change and thrive in a fast-paced global environment.
Energetic, self-directed and self-motivated, able to build and sustain long-term relationships with clients and colleagues
Approach problems intuitively and with an open mind, working toward solutions as part of a team
Exceptional analytical skills, able to apply knowledge and experience in decision-making to arrive at creative and commercial solutions
Strong desire to learn and contribute solutions and ideas to a broad team
Can manage multiple tasks and use sound judgment when prioritizing
Strong verbal and written communication skills
In this role you will
Collaborate with Application Development Engineers, Technology Infrastructure teams and vendor teams to ensure that the solutions being implemented are scalable and highly automated, from infrastructure provisioning through code deployment and support
Design, implement, and maintain robust monitoring solutions using tools such as Datadog, Prometheus, Grafana, and Splunk to ensure the health, performance, and availability of our applications and infrastructure.
Automate activities and operations to make processes faster and more efficient
Utilize Splunk to analyze and correlate logs, troubleshoot issues, and identify trends that could impact system reliability.
Uplift existing services from the firm's private cloud to the latest technologies on public cloud
Collaborate with Senior SREs and Leads to contribute to the development of short- and long-term reliability strategies.
Support and enhance roadmap initiatives in production management by implementing reliability best practices and automation.
Troubleshoot problems encountered by both technology teams and end users of our applications
Build highly available systems based on a microservices architecture
Skills and experience we are looking for
Minimum 3 years of relevant professional experience
B.S. or higher in Computer Science (or equivalent work experience)
Advanced experience in Systems Engineering
Experience working with monitoring tools such as Datadog, Prometheus, and Grafana, including dashboard creation, metric collection, and alert configuration.
Cloud infrastructure experience, preferably AWS
Understanding of the principles of Continuous Delivery, DevOps and SRE
Solid programming/scripting skills in languages such as Python, Bash, or Go, with the ability to automate tasks and improve operational efficiency.
Experience with DevOps tools such as GitHub, Terraform, Chef, Ansible, and PowerShell
Experience with deploying infrastructure-as-code to production systems and accompanying automation techniques
Preferred Qualifications
Experience as a Site Reliability Engineer or similar role, with a focus on monitoring, alerting, and incident management
Proficiency in log analysis using Splunk/ELK or similar log aggregation tools to troubleshoot and diagnose system issues
Experience with Dynatrace and/or New Relic
Knowledge of compute, storage, firewalls and networking fundamentals
Experience managing Kafka as messaging middleware
Familiarity with high-scale NoSQL solutions like MongoDB
Expertise in delivering microservices architectures
Fintech experience will be a great asset
We Offer Best-In-Class Benefits
