Senior Machine Learning Operations and Data Engineer

  • Location:
    Berkeley, CA
  • Time Type:
  • Experience:
    5 years
  • Salary:
    $150,000 - $180,000

Job Description

Emancro’s mission is to build general-purpose hospital logistics robots that perform a wide variety of tasks, such as distributing medication and medical supplies within hospitals, with many more tasks to come. In this way, robots free up medical staff’s time and enable better, more resilient patient care.

We are achieving this by collecting diverse robot teleoperation data and human data at large scale, and by training billion-parameter, general-purpose robotic foundation models. To this end, we are rapidly ramping up the amount of data we process and our cloud compute capacity.

We are an ambitious and rapidly growing team pushing the boundaries of what is possible in robotics, leveraging recent, cutting-edge breakthroughs in machine learning-enabled, data-driven robotics.

The Role

  • Start date: As soon as possible, no later than June 1st, 2024
  • Location: Berkeley, CA, in person

Work Components

  • Design, develop, and maintain scalable data pipelines and ETL processes to extract, transform, and load data at large scale (on the order of hundreds of terabytes).
  • Set up and maintain cloud databases (e.g., DynamoDB, Postgres).
  • Manage containerized environments (e.g., Docker, Kubernetes) for running machine learning workloads.
  • Set up and maintain cloud multi-GPU training infrastructure (GCP, AWS, Azure) with PyTorch and JAX, covering both model and data parallelism.
  • Set up and maintain MLOps frameworks (e.g., ClearML, ZenML).
  • Implement CI/CD pipelines and automation tools to streamline the model development and deployment process.
  • Deploy ML models in the cloud for low-latency production serving.

Key Qualifications

  • Expert knowledge of using and configuring GCP (Vertex AI), AWS, and Azure
  • Python: 5+ years of experience
  • Machine Learning libraries: PyTorch, JAX, model and data parallelism
  • Development tools: Bash, Git
  • Data Science frameworks: Databricks
  • Agile Software development
  • Cloud Management: Slurm, Kubernetes
  • Data Logging: Weights & Biases
  • Orchestration, Autoscaling: Ray, ClearML, WandB, etc.

Optional Qualifications

  • Experience training LLMs and VLMs
  • ML for Robotics, Computer Vision etc.
  • Developing browser apps and dashboards, both frontend and backend (JavaScript, React, etc.)

Emancro is committed to equal employment opportunities regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.
