
DGX Cloud Performance Engineer

Nvidia | Hyderabad, Telangana, India | Posted 20 May 2026

Experience

10+ years

Function

Engineering

Work mode

Onsite, India

Company

Tier 1

What you will work on

Nvidia is seeking a senior performance engineer to drive performance analysis, optimization, and modeling for DGX Cloud clusters. The role involves developing benchmarks and experiments to identify bottlenecks across hardware/software boundaries in large-scale distributed AI systems. You will collaborate with cross-functional teams to influence the architecture, design, and roadmap of Nvidia's AI infrastructure. Proficiency in C/C++ and Python and experience with cloud infrastructure at scale are required.

TAL's take

Quality 90/100 | 5/5 clarity | Tier 1 company

High-impact role at a Tier 1 company focusing on cutting-edge AI infrastructure and hardware-software codesign.

JD provides clear objectives, core responsibilities, and a defined technology stack for high-scale performance engineering.

Salaries at Nvidia

29.5 LPA average

Based on 43 Grapevine salary entries for Nvidia.

Engineering | 0 - 2 years | IC2 | 14 LPA average | Range: 5 - 21 LPA

Engineering | 2 - 4 years | IC3 | 21 LPA average | Range: 20 - 22 LPA

Engineering | 4 - 6 years | IC3 | 29 LPA average | Range: 23 - 35 LPA

Engineering | 6 - 8 years | SDE 2 | 30 LPA average | Range: 17 - 37 LPA

Must haves

  • 10+ years of experience with parallel and distributed accelerator-based systems
  • Expertise in optimizing the performance of AI workloads
  • Experience with performance modeling and benchmarking at scale
  • Strong background in Computer Architecture, Networking, and Storage
  • Proficiency in Python and C/C++
  • Expertise with at least one public cloud infrastructure provider

Tools and skills

parallel and distributed systems, computer architecture, networking, storage systems, accelerators, pytorch, tensorflow, jax, megatron-lm, tensor-llm, vllm, python, c/c++, gcp, aws, azure, oci

Nice to have: cuda, xla.

About the company

Nvidia is a global leader in AI and semiconductor technology.
