We are looking for an AI/ML Ops Engineer to support the deployment, monitoring, and operational reliability of AI-powered systems in production environments.

This role combines elements of DevOps, cloud engineering, and AI system support. The ideal candidate should be comfortable working with cloud infrastructure, monitoring tools, and modern AI workflows, while collaborating closely with engineering and AI teams.

Key Responsibilities

- Support deployment and operational management of AI/ML applications and services
- Monitor AI systems using logs, metrics, tracing, and observability tools
- Troubleshoot and debug AI workflows, pipelines, and runtime failures
- Assist in maintaining scalable, secure, and reliable cloud infrastructure
- Support prompt experimentation, version tracking, and A/B testing activities
- Collaborate with engineering teams to improve system reliability, performance, and automation
- Maintain CI/CD workflows and deployment pipelines for AI services
- Participate in incident investigation, root cause analysis, and operational support activities

Preferred Skills & Experience

- Experience working with AWS cloud services
- Familiarity with monitoring and observability tools such as:
  - Amazon CloudWatch
  - Datadog
- Understanding of logging, metrics, tracing, and alerting concepts
- Familiarity with AI/ML workflows and LLM-based applications
- Experience with containers, APIs, and deployment pipelines
- Familiarity with scripting or programming languages such as Python, JavaScript, or Bash
- Understanding of DevOps, cloud infrastructure, and operational best practices
- Knowledge of infrastructure-as-code tools such as Terraform or CloudFormation
- Bachelor's degree in Computer Science, Engineering, Data Science, or a related field
- 2+ years of experience in DevOps, Platform Engineering, SRE, MLOps, or AI infrastructure-related roles

What We're Looking For

- Strong problem-solving and debugging capabilities
- Interest in modern AI operational practices and production AI systems
- Good communication and collaboration skills
- Practical mindset focused on reliability, scalability, and continuous improvement
- Ability to work in a fast-paced and evolving technology environment