AI Infra Architecture
Req. VR-120987
We are seeking an experienced AI Infrastructure Architect with deep expertise in designing and operating scalable, secure, and high‑performance cloud environments for Generative AI and LLM workloads. This role is ideal for someone who combines strong AWS architectural skills with hands‑on experience in GPU compute, MLOps/LLMOps, and enterprise‑grade AI platform design.
You should bring extensive experience building cloud‑native AI infrastructure, optimizing large‑scale model training and inference environments, and designing complex AI systems with detailed technical specifications. You should also be able to collaborate closely with AI/ML and multidisciplinary teams to enable advanced GenAI capabilities and ensure seamless implementation.
Responsibilities
Design and implement scalable AWS infrastructure to support Generative AI and LLM workloads, including training, fine‑tuning, and inference.
Architect secure, high‑performance environments using AWS core services such as Amazon SageMaker, Amazon Bedrock, Amazon EKS, AWS Lambda, and related cloud‑native components.
Design GPU‑based compute environments (e.g., EC2 P‑series, G‑series) optimized for distributed training, fine‑tuning, and low‑latency inference.
Implement secure VPC architectures, private endpoints, IAM policies, encryption (KMS), and enterprise‑grade data governance controls.
Build and govern MLOps/LLMOps pipelines using SageMaker Pipelines, CodePipeline, and CI/CD best practices.
Architect RAG infrastructure, including vector databases (OpenSearch, Aurora PostgreSQL with pgvector) and scalable storage solutions (S3).
Establish monitoring and observability using CloudWatch, model monitoring tools, logging frameworks, and performance dashboards.
Optimize infrastructure for latency, autoscaling, high availability, and cost efficiency, leveraging Spot Instances, Savings Plans, and right‑sizing strategies.
Define disaster recovery (DR) and backup strategies across multi‑AZ and multi‑region AWS setups.
Implement Infrastructure as Code (IaC) using Terraform or CloudFormation for consistent, repeatable provisioning of AI environments.
Collaborate with AI/ML teams to support LLM fine‑tuning, prompt orchestration, inference endpoints, and model deployment workflows.
Stay current with AWS GenAI advancements, evaluating new services, architectural patterns, and best practices for enterprise adoption.
Must have
Extensive experience (typically 7+ years) in cloud architecture, infrastructure engineering, or platform engineering, with a strong focus on AWS.
Proven expertise designing and operating AI/ML and Generative AI infrastructure at scale.
Deep knowledge of AWS services relevant to AI workloads (SageMaker, Bedrock, EKS, EC2 GPU instances, Lambda, VPC, IAM, KMS, S3).
Hands‑on experience with GPU compute, distributed training, and high‑performance inference environments.
Strong understanding of MLOps/LLMOps practices, CI/CD pipelines, and model deployment workflows.
Experience architecting secure, compliant, and highly available cloud environments.
Proficiency with Infrastructure as Code (Terraform or CloudFormation).
Familiarity with vector databases, RAG architectures, and scalable data storage patterns.
Strong collaboration skills and the ability to work closely with AI/ML, DevOps, and engineering teams.
Excellent documentation and communication skills.
Nice to have
n/a
Languages
English: C1 Advanced
Seniority
Lead
London, United Kingdom of Great Britain and Northern Ireland
AI/ML
BCM Industry
13/02/2026