AI Benchmark Software Engineer (Contract)
Turing · Ghana
Job description
About the role
Turing is seeking an experienced software engineer to design and build high‑quality multi‑agent benchmark tasks that reflect real‑world software engineering workflows. You will create tasks grounded in open‑source code changes such as bug fixes, migrations, and refactors, enabling evaluation of AI agents’ ability to understand large codebases and produce correct, testable outputs.
Key responsibilities
- Build multi‑agent benchmark tasks based on real‑world open‑source code changes (bug fixes, migrations, refactors).
- Use the Harbor evaluation framework to run and validate tasks inside Docker environments.
- Write clear, precise task instructions specifying file paths, function signatures, expected behavior, and constraints.
- Design and implement Python‑based verification scripts to validate correctness of agent‑generated code changes.
- Create decomposition strategies that split complex code changes across multiple independent sub‑agents.
- Run, debug, and refine tasks within containerised environments to ensure reproducibility and determinism.
- Evaluate task performance signals and continuously improve task quality, clarity, and difficulty.
Required profile
- 5+ years of professional experience developing in Python and JavaScript.
- Strong experience reading and navigating large open‑source codebases (e.g., Django, Flask, FastAPI, Node.js).
- Familiarity with Git workflows, including pull requests, diffs, cherry‑picking, and working with specific commits.
- Comfortable working with Docker (writing Dockerfiles, building images, debugging container issues).
- Experience writing test scripts using pytest, unittest, or custom assertion‑based testing.
- Ability to write clear, precise, and unambiguous technical specifications.
Required skills
- Python
- JavaScript
- Docker
- Git
- pytest
- unittest
What we offer
- Opportunity to work on cutting‑edge AI projects with leading foundation‑model companies.
- Remote, flexible work with a global team.
- Contract role with an 8‑hour daily commitment and a 4‑hour overlap with PST.
Published 2 hours ago
Expires 1 month from now