Hi, Rockits!

MLOps Engineer

From 500,000 RUR
  • Moscow
  • Full-time
  • Remote work
  • More than 6 years
  • Machine Learning
  • MLOps
  • DevOps
  • Kubernetes
  • CI/CD
  • Grafana
  • Prometheus
  • Bash
  • Rust
  • Triton Inference Server
  • ONNX
  • TensorRT
  • GPU Optimization
  • English: B2 (Upper-Intermediate)

We are looking for an experienced MLOps Engineer to work on a decentralized AI protocol on Monad that leverages idle consumer hardware for swarm inference. The protocol enables Small Language Models to achieve advanced multi-step reasoning at lower cost, surpassing the performance and scalability of leading models.

Responsibilities:

  • Deploy scalable, production-ready ML services with optimized infrastructure and auto-scaling Kubernetes clusters, and create Helm templates for rapid Kubernetes node deployment;

  • Optimize GPU resources using MIG (Multi-Instance GPU) and NOS (Node Offloading System);

  • Manage cloud storage (e.g., S3) to ensure high availability and performance;

  • Deploy and manage large language models (LLM), small language models (SLM), and large multimodal models (LMM);

  • Serve ML models using technologies like Triton Inference Server, and optimize models with ONNX and TensorRT for efficient deployment;

  • Set up monitoring and logging solutions using Grafana, Prometheus, Loki, Elasticsearch, and OpenSearch;

  • Write and maintain CI/CD pipelines using GitHub Actions for seamless deployment processes.

Requirements:

  • 5+ years of experience in MLOps or ML engineering roles;

  • Proficiency in Kubernetes, Helm, and containerization technologies;

  • Experience with GPU optimization (MIG, NOS) and cloud platforms (AWS, GCP, Azure);

  • Strong knowledge of monitoring tools (Grafana, Prometheus) and scripting languages (Python, Bash);

  • Hands-on experience with CI/CD tools and workflow management systems;

  • Familiarity with Triton Inference Server, ONNX, and TensorRT for model serving and optimization.

As a plus:

  • Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field;

  • Experience with advanced ML techniques, such as multi-sampling and dynamic temperatures;

  • Knowledge of distributed training and large model fine-tuning;

  • Proficiency in Go or Rust programming languages;

  • Experience designing and implementing highly secure MLOps pipelines, including secure model deployment and data encryption.