Jobiglo

Web Scraping Engineer

Recruitify_HR · Accra

Mid 🇬🇧 English
Scrapy · Playwright · Pydantic · JSON Schema · LLM-based semantic parsing · Model Context Protocol (MCP)

Job description

About the role

The Web Scraping Engineer builds and maintains a next‑generation data acquisition platform that treats web scraping as a declarative, specification‑driven discipline. Instead of writing fragile XPath expressions for each site, the engineer defines what data is needed using schemas, natural‑language descriptions, or visual blueprints, and lets automated pipelines retrieve it.
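
As a rough illustration of the specification‑driven approach described above, the sketch below declares what to extract and derives a JSON‑Schema‑style description from it. It uses standard‑library dataclasses as a dependency‑free stand‑in for the Pydantic models the role mentions; the names ProductPrice, to_json_schema and build_record are hypothetical and not taken from the employer's platform.

```python
from dataclasses import dataclass, field, fields
from typing import Any

# Maps Python type names to (JSON Schema type, coercion function).
_TYPES = {"str": ("string", str), "float": ("number", float), "int": ("integer", int)}


@dataclass
class ProductPrice:
    """Hypothetical specification: states what to extract, not how to find it."""
    name: str = field(metadata={"description": "Product title as displayed"})
    price: float = field(metadata={"description": "Numeric price without currency symbol"})
    currency: str = field(metadata={"description": "ISO 4217 currency code"})


def _type_name(f) -> str:
    # f.type is normally a class, but a string under postponed annotations.
    return f.type if isinstance(f.type, str) else f.type.__name__


def to_json_schema(spec: type) -> dict[str, Any]:
    """Derive a minimal JSON-Schema-style dict from the dataclass fields."""
    properties = {
        f.name: {
            "type": _TYPES[_type_name(f)][0],
            "description": f.metadata.get("description", ""),
        }
        for f in fields(spec)
    }
    return {
        "title": spec.__name__,
        "type": "object",
        "properties": properties,
        "required": [f.name for f in fields(spec)],
    }


def build_record(spec: type, raw: dict[str, Any]):
    """Coerce raw extracted values into the spec, failing on missing fields."""
    values = {}
    for f in fields(spec):
        if f.name not in raw:
            raise ValueError(f"missing field: {f.name}")
        values[f.name] = _TYPES[_type_name(f)][1](raw[f.name])
    return spec(**values)
```

In a setup like this, the schema could be handed to an extraction backend (a selector‑based spider or an LLM parser), and build_record would check whatever comes back against the same specification.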

Key responsibilities

  • Design and maintain declarative extraction specifications using Pydantic models, JSON schemas, or domain‑specific languages.
  • Implement pipelines that translate specifications into executable extraction plans, leveraging Scrapy, Playwright and LLM‑based semantic parsers.
  • Build reusable specification libraries for common data types such as product prices, tariff codes, and regulatory texts.
  • Deploy self‑healing spiders that detect website layout changes and repair themselves via Model Context Protocol (MCP) servers.
  • Integrate semantic extraction (Scrapy‑LLM, custom LLM pipelines) to reduce selector brittleness.
  • Orchestrate multi‑step browsing workflows with agentic frameworks (e.g., BMAD, TEA, AutoGPT‑like agents) that adapt to anti‑bot measures.
  • Develop a component‑based extraction platform with shared selectors, login handlers, and pagination logic, including monitoring, alerting and automatic rollback.
  • Champion ethical crawling practices such as rate limiting, robots.txt compliance and GDPR/CCPA adherence.
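
The ethical crawling practices in the last responsibility (rate limiting and robots.txt compliance) can be sketched with the standard library's urllib.robotparser. This is a minimal illustration under stated assumptions: the robots.txt content, the PoliteFetcher class and the "example-bot" user agent are all hypothetical, and a real crawler would download robots.txt from the target site rather than use an inline string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""


def make_policy(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a policy object."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class PoliteFetcher:
    """Tracks permission and pacing; the actual HTTP request is left to the caller."""

    def __init__(self, policy: RobotFileParser, user_agent: str, default_delay: float = 1.0):
        self.policy = policy
        self.user_agent = user_agent
        # Honour Crawl-delay when the site declares one, else use the default.
        declared = policy.crawl_delay(user_agent)
        self.delay = float(declared) if declared is not None else default_delay
        self._last_request = None

    def allowed(self, url: str) -> bool:
        return self.policy.can_fetch(self.user_agent, url)

    def wait_time(self, now: float) -> float:
        """Seconds to wait before the next request; `now` is a monotonic timestamp."""
        if self._last_request is None:
            return 0.0
        return max(0.0, self.delay - (now - self._last_request))

    def mark_request(self, now: float) -> None:
        self._last_request = now
```

A caller would typically check allowed(url), sleep for wait_time(time.monotonic()), issue the request, and then call mark_request(time.monotonic()). Passing the clock value in explicitly keeps the pacing logic easy to test.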

Required profile

  • Bachelor’s degree in Computer Science or a related field.
  • At least 3 years of experience in web scraping or data extraction.

Required skills

  • Python programming.
  • Scrapy and Playwright frameworks.
  • Pydantic and JSON Schema for specification definition.
  • Experience with LLM‑based semantic parsing and integration.
  • Knowledge of Model Context Protocol (MCP) servers.
  • Familiarity with agentic workflow frameworks (e.g., BMAD, TEA, AutoGPT‑like agents).

Frequently asked questions

The recruiter has not publicly disclosed the salary. You can apply and negotiate directly with Recruitify_HR.
Click "Apply now" at the top of the page. You can upload your CV in one click; Jobiglo automatically extracts your information and applies for you.

Published 1 day ago

Expires 1 month from now
