Job description
Our client, a start-up in the retail technology space, is looking for a Senior Machine Learning Engineer to join their team.
Senior Machine Learning Engineer Responsibilities:
- Engineer a state-of-the-art machine learning software platform
- Combine strong software engineering principles with machine learning to build scalable, reproducible and easy-to-use end-to-end machine learning workflows for advanced deep learning problems
- Build backend infrastructure to perform scalable training, evaluation, and inference in the cloud and client-side infrastructure to perform efficient inference on mobile devices
- Build comprehensive data management systems for scalable data collection, labeling, processing, and evaluation
- Aid in data ingestion, model training, deployment and monitoring
- Distribute model training and pipelines on GPU environments.
- Work closely with experts in Computer Vision to deliver final products
- Create monitoring solutions that track system accuracy and performance and enable troubleshooting of production ML models
- Identify gaps and evaluate relevant tools and technologies as needed to improve processes and systems, leveraging open-source and cloud computing technologies to build effective solutions.
- Collaborate with data scientists, data engineers, product teams, and other key stakeholders and drive ML platform projects from conception to production.
- Identify performance bottlenecks and optimize different aspects of technology pipelines on CPU and/or GPU.
- Convert ML models to platform-specific inference frameworks such as CoreML, WinML, ONNX, etc.
Senior Machine Learning Engineer Requirements:
- Bachelor's degree in a technical field such as CS, EE, Physics, Math or a related field
- Familiarity with Machine Learning tools and frameworks
- 3+ years of experience in machine learning engineering
- Familiarity with cloud services, large datasets and data visualization tools
- Proven ability to design, implement and operate large projects at scale
- Strong problem-solving skills and a drive for results
- Experience in exporting ONNX models
- Experience in managing and monitoring NVIDIA Triton Inference Server
- Experience with KServe
- Experience with ML tools and IDEs like SageMaker and Colab Pro, cloud platforms (AWS, GCP, Azure), and MLPerf
- Experience with machine/deep learning frameworks like TensorFlow or PyTorch
- Strong experience in large-scale distributed systems, Data Engineering, MLOps, Machine Learning and Data Science areas
- Experience with building and deploying ML pipelines in a production environment at scale
- Good knowledge of AWS, Python, Spark, Airflow, K8s, Docker, Terraform, etc. to build pipelines
- Must have working experience with MLOps tools such as Kubeflow, MLflow, Metaflow, or SageMaker
- Strong understanding of containerization (Docker) and container-orchestration systems like Kubernetes; experience with orchestration tools such as Airflow
- Experience with stream processing technologies such as Kafka, Spark, Samza, Flink, etc.
- Working experience and good knowledge of CI/CD tools and best practices
Expired job