Jobiglo

AI Benchmark Software Engineer (Contract)

Turing · Ghana

Remote · Senior · English

Job description

About the role

Turing is seeking an experienced software engineer to design and build high‑quality multi‑agent benchmark tasks that reflect real‑world software engineering workflows. You will create tasks grounded in open‑source code changes such as bug fixes, migrations, and refactors, enabling evaluation of AI agents’ ability to understand large codebases and produce correct, testable outputs.

Key responsibilities

  • Build multi‑agent benchmark tasks based on real‑world open‑source code changes (bug fixes, migrations, refactors).
  • Use the Harbor evaluation framework to run and validate tasks inside Docker environments.
  • Write clear, precise task instructions specifying file paths, function signatures, expected behavior, and constraints.
  • Design and implement Python‑based verification scripts to validate correctness of agent‑generated code changes.
  • Create decomposition strategies that split complex code changes across multiple independent sub‑agents.
  • Run, debug, and refine tasks within containerised environments to ensure reproducibility and determinism.
  • Evaluate task performance signals and continuously improve task quality, clarity, and difficulty.

Required profile

  • 5+ years of professional experience developing in Python and JavaScript.
  • Strong experience reading and navigating large open‑source codebases (e.g., Django, Flask, FastAPI, Node.js).
  • Familiarity with Git workflows, including pull requests, diffs, cherry‑picking, and working with specific commits.
  • Comfortable working with Docker (writing Dockerfiles, building images, debugging container issues).
  • Experience writing test scripts using pytest, unittest, or custom assertion‑based testing.
  • Ability to write clear, precise, and unambiguous technical specifications.

Required skills

  • Python
  • JavaScript
  • Docker
  • Git
  • pytest
  • unittest

What we offer

  • Opportunity to work on cutting‑edge AI projects with leading foundation‑model companies.
  • Remote, flexible work with a global team.
  • Contract role with an 8‑hour daily commitment and a 4‑hour overlap with PST.

Frequently asked questions

The recruiter does not disclose the salary publicly. You can apply and negotiate directly with Turing.
Click "Apply now" at the top of the page. You can upload your CV in one click; Jobiglo automatically extracts your information and applies for you.

Published 3 hours ago

Expires 1 month from now
