As a Platform Owner of AIOps and SRE, your primary responsibility is to develop comprehensive strategies for implementing AIOps and SRE practices within the organization. This involves understanding business requirements, assessing technical capabilities, and identifying areas where AI and automation can be leveraged to enhance reliability, performance, and operational efficiency.
• Strategic Leadership: Define and execute comprehensive strategies for implementing AIOps and SRE practices aligned with business objectives.
• Cloud Architecture Solutions: Design scalable and resilient cloud architectures to support energy-sector-specific applications, leveraging AIOps for predictive monitoring and automated incident response.
• SRE Implementation: Establish and promote SRE principles, including reliability engineering, service-level objectives, and monitoring strategies tailored to energy systems.
• AIOps Integration: Oversee the implementation of AIOps platforms, ensuring the seamless integration of AI-driven insights into IT operations.
• Collaboration: Partner closely with engineering and operations teams to provide technical guidance and ensure the successful implementation of AIOps and SRE practices. This involves reviewing designs, providing recommendations, and promoting best practices for building and operating reliable and efficient cloud-based applications.
• Continuous Improvement: Monitor and enhance system performance through iterative strategies that incorporate AIOps and SRE practices across the data center and cloud domains. This involves understanding business requirements, assessing technical capabilities, and identifying opportunities to leverage AI and automation for improved reliability and performance.
• AI-Driven Monitoring and Analytics: Implement AI-driven monitoring and analytics solutions within the cloud domain. This includes leveraging machine learning and data analysis techniques to identify and predict system anomalies, performance bottlenecks, and potential failures.
• Platform Management: Manage the infrastructure platform within budget guardrails to ensure alignment with company priorities and goals. Collaborate with Transversal Teams to align on Non-Functional Requirements (NFRs) and prioritize them jointly.