Remote, Senior, AI/ML infra

Freelance Agent Evaluation Engineer

Mindrift
Hyderabad, Telangana, India
Posted 20 May 2026

Mindrift is looking for freelance engineers to evaluate AI coding agents by building challenging tasks and test criteria. You will work on creating realistic simulated developer environments using Python, Docker, and full-stack tools. The role requires deep experience in testing complex systems and understanding model failure modes. This is a project-based, remote opportunity for experienced software engineers.

Experience

5+ years

Function

Engineering

Work mode

Remote, India

Company

Tier 2

TAL's take

Quality 35/100, 5/5 clarity, Tier 2 company, 4 watchouts

This is a freelance, project-based gig with low compensation, despite requiring significant engineering seniority.

The JD is highly specific regarding the responsibilities, tech stack, and the unique nature of evaluating AI agents.

Watchouts

project-based
not permanent employment
freelance
low compensation

Must haves

5+ years in software development
Proficiency in Python (FastAPI, pytest, async/await)
Experience in full-stack development with React
Experience writing functional and integration tests
Familiarity with Docker and infrastructure tools
B2 level English proficiency

Tools and skills

Python, FastAPI, pytest, async/await, subprocess, file operations, React, JavaScript, TypeScript, Docker, PostgreSQL, Kafka, Redis, GitHub Actions

About the company

Mindrift is an emerging platform for AI-related project work, categorized as a mid-tier entity in the AI services space.