MLOps Engineer - 39 hours
Ref: 27561
Hours: Full-time
Contracted Hours: 39
Contract Type: Permanent
Location: Chester House, Epsom Ave, Handforth, Cheadle, Greater Manchester, SK9 3RN
Description
After the successful launch of our revolutionary Petcare Platform this year, we're embarking on an exciting journey to blend real-time event data with curated historic datasets. This initiative aims to train and deploy machine learning models in a live environment, revolutionizing how consumers interact with our products and services. We're seeking talented MLOps Engineers to join our Engineering team and collaborate closely with our Data Science division to build self-serve orchestration systems for model training, evaluation, and hosting in a multi-cloud environment.
Role Overview:
As an MLOps Engineer, your role will center on the robust deployment, monitoring, and maintenance of machine learning models within our Google Cloud Platform (GCP) and Microsoft Azure environments. You will play a crucial part in ensuring our models deliver value efficiently and reliably. While you will collaborate closely with Data Science, the emphasis of this role is on enabling model deployment through self-serve orchestration, automated testing, data transfer, and lightweight endpoint hosting. Although this position is focused on MLOps, there will be opportunities to contribute to and upskill in predictive modeling and advanced analytics.
Key Responsibilities:
Oversee the deployment, monitoring, and maintenance of machine learning models within GCP and Azure environments.
Enhance model deployment pipelines using iterative, agile development practices, ensuring efficient model building, deployment, and maintenance.
Integrate and optimize MLOps tools, templates, and processes to streamline our machine learning lifecycle and data operations.
Manage model changes and deployments securely, efficiently, and reliably.
Develop, maintain, and improve pipelines for model training, validation, and deployment.
Proactively address issues related to model performance and data drift.
Maintain comprehensive technical documentation for model deployment processes, versioning, and monitoring.
Implement and manage robust testing frameworks for model validation and performance evaluation.
Automate monitoring tools to track model performance, accuracy, and drift in real-time.
Conduct code and model reviews and participate in knowledge-sharing sessions.
Collaborate with DevOps to drive automation within CI/CD pipelines for machine learning workflows.
Ensure that data used for model training and deployment meets the highest standards of quality and compliance.
Implement and uphold secure best practices across all aspects of the MLOps lifecycle.
Essential Experience, Knowledge & Expertise
Proven experience in managing and deploying machine learning models in production environments.
Strong understanding of MLOps principles and best practices.
Experience with ETL processes and data pipelines.
Familiarity with containerization technologies (Docker) and version control systems (Git).
Hands-on experience with major cloud platforms (GCP, AWS, Azure).
Understanding of model evaluation metrics, performance monitoring, and data drift management.
Desirable Experience, Knowledge & Expertise
Experience with Vertex AI, Azure Machine Learning, or equivalent MLOps tools.
Knowledge of agile methodologies such as Scrum or Kanban.
Strong proficiency in a programming language such as Python or R.
Experience creating cloud resources through infrastructure as code (Terraform/Terragrunt or similar).
Familiarity with streaming data and real-time data processing.
Experience deploying and/or using open-source MLOps frameworks and tools.
Demonstrated ability to handle and troubleshoot real-world model deployment issues.
Role Specific Competencies
Product Mindset: Treat machine learning models and outputs as products, focusing on end-user impact and continuous improvement.
Operational Excellence: Apply best practices in MLOps, including observability, monitoring, and maintaining SLAs.
Agile Approach: Embrace iterative development and continuous feedback to refine and enhance machine learning workflows.
Technical Excellence: Demonstrate a deep understanding of machine learning model training/outputs and deployment best practices.
Why Join Us?
Be part of a dynamic and innovative team revolutionizing the pet care industry.
Work in a cutting-edge multi-cloud environment with opportunities for professional growth and development.
Collaborate with a talented Data Science team on impactful projects.
Enjoy a culture of continuous learning, innovation, and excellence.
If you are passionate about MLOps and eager to make an impact in a forward-thinking company, we would love to hear from you.
Pets just see people. They aren’t biased and they don’t discriminate. We take our inspiration from pets and we value and respect difference in all its forms. Our aim is to reflect the diversity of the communities we operate in and every colleague can help us achieve this. We encourage our people to be themselves so even if your skills and experience don’t perfectly align, if you think you can make a unique contribution through your values and behaviours, we want to hear from you!
Organisation: Pets at Home
Date Posted: 20-08-2024
Expiry Date: 29-09-2024