MLOps/LLMOps Engineer Job at Cloudious LLC, Remote

  • Cloudious LLC
  • Remote

Job Description

MLOps/LLMOps Engineer

Remote, PST hours

Operationalizing large language models (LLMs) requires expertise beyond traditional MLOps practices. LLMs present distinct operational challenges: significantly larger computational requirements, complex data pipelines, specialized infrastructure needs, and demanding performance optimization. This role ensures GenAI solutions can scale effectively from proof of concept to enterprise-wide deployment in a utility environment.

  • Ensures GenAI solutions move successfully from prototype to production with proper operational support
  • Establishes specialized monitoring for model performance, inference latency, and data quality
  • Enables efficient scaling of LLM solutions across multiple business units
  • Creates high-performance deployment architectures that balance speed, cost, and reliability
  • Develops operational data pipelines to continuously improve model performance with new utility-specific data

Key Responsibilities:

  • Design and implement LLM-specific deployment architectures with Docker containers for both batch and real-time inference
  • Configure GPU infrastructure on-premises or in the cloud with appropriate CI/CD pipelines for model updates
  • Build comprehensive monitoring and observability systems with appropriate logging, metrics, and alerts
  • Implement load balancing and scaling solutions for LLM inference, including model sharding if necessary
  • Create automated workflows for model retraining, versioning, and deployment
  • Optimize infrastructure costs through intelligent resource allocation, spot instances, and efficient compute strategies
  • Collaborate with PG&E's Cyber team on implementing appropriate security controls for GenAI applications
  • Develop automated testing frameworks to ensure consistent output quality across model updates

Expected Skillset:

  • DevOps + ML: Expertise in Kubernetes, Docker, CI/CD tools, and MLflow or similar platforms
  • Cloud & Infrastructure: Understanding of GPU instance options, cloud services (AWS/Azure/GCP), and optimization techniques
  • Automation: Proficiency in Python, Bash, and infrastructure-as-code tools like Terraform or Ansible
  • LLM-Specific Frameworks: Experience with tools like TensorBoard, MLflow, or equivalent for scaling LLMs
  • Performance Optimization: Knowledge of techniques to monitor and improve inference speed, throughput, and cost
  • Collaboration: Ability to work effectively across technical teams while adhering to enterprise architecture standards

Job Tags

Remote job
