Research Engineering, Inference
Bitdeer AI Lab is seeking a Research Engineer to focus on ML systems research for large-scale inference optimization. The role involves working with SGLang and vLLM stacks to optimize serving for LLMs and reasoning models on large GPU clusters. Responsibilities include GPU kernel-level tuning, inter-node communication, and designing end-to-end serving systems. This position offers the opportunity to work on frontier AI models within a rapidly growing global computing infrastructure company.

Experience
Experience not specified
Function
Engineering
Work mode
Onsite, Singapore
Company
Tier 2
TAL's take
Solid mid-tier role focused on high-performance infrastructure and inference at scale. The technical mission is clear, but the listing gives no explicit seniority level or years-of-experience (YOE) requirement.
The scope is very clear: SGLang/vLLM inference optimization, with specific stack requirements and well-defined team goals.
Must haves
- Experience with SGLang or vLLM for large-scale model serving
- Understanding of inference optimization for LLMs and reasoning models
- Understanding of GPU kernel-level optimization
- Knowledge of distributed inference communication patterns and collective communication libraries such as NCCL
- Understanding of latency, throughput, stability, and GPU efficiency metrics
About the company
Publicly traded company with global infrastructure, focused on specialized crypto/AI hardware and cloud, but not a FAANG-tier software brand.