
Senior AI Engineer - APM Features

Posted 18 Mar 2026 (4d ago)
Data Science Scale-up Senior Hybrid
AI Summary

As a Senior AI Engineer, you will develop AI-driven troubleshooting features for APM workflows, using LLMs and agentic systems to diagnose complex performance issues. This role offers the chance to create innovative solutions that turn large volumes of observability data into clear insights, contributing to better customer outcomes in a dynamic, collaborative environment.

Job Description

The APM Features team builds intelligent troubleshooting experiences that help customers quickly understand and resolve performance issues in complex distributed systems. We work at the intersection of observability, product, and AI, transforming large volumes of noisy telemetry data into clear explanations, insights, and actionable conclusions.

This team is in an early, highly exploratory phase of applying LLMs and agentic workflows to real production APM problems. Engineers collaborate closely to prototype, test, and iterate on ideas, learning from experimentation and focusing on useful, reliable customer outcomes. Correctness, clarity, and product impact are central to how we work.

As one of the first AI Engineers in EMEA for this team, you’ll help shape how AI engineering is practiced locally, working closely with peers to influence the future of AI-powered troubleshooting at Datadog.

At Datadog, we place value in our office culture - the relationships that it builds, the creativity it brings to the table, and the collaboration of being together. We operate as a hybrid workplace to ensure our employees can create a work-life harmony that best fits them.

What you’ll do:
- Design and build AI-powered troubleshooting features for APM workflows using LLMs and agentic systems
- Help users diagnose and resolve performance issues by synthesizing large volumes of observability data, including traces, metrics, and logs
- Prototype, experiment, and iterate on AI-driven experiences, using evidence and user feedback to guide decisions and focus on real user value
- Define inputs, outputs, and success criteria for LLM-based systems operating in evolving and sometimes ambiguous environments
- Build agentic workflows with strong guardrails, balancing autonomy, safety, correctness, and reliability
- Lead features end-to-end in collaboration with peers and partners, from problem discovery through production and iteration
- Design and maintain evaluation loops, including offline evaluations, benchmarks, and A/B tests
- Write and own production backend services, contributing to reliable, scalable systems

Who you are:
- A senior, product-minded engineer with experience shipping AI systems to production
- Comfortable working in evolving problem spaces and proactively identifying meaningful opportunities to build
- Hands-on experience with LLMs or agentic systems, including prompting, tooling, evaluation, and guardrails
- Experience using AI coding tools such as Cursor, Claude Code, or similar, with the ability to reflect on what worked, what didn’t, and why
- A strong sense for correctness, failure modes, and how to measure and improve quality in AI systems
- Comfortable experimenting, learning from outcomes, and iterating thoughtfully
- Solid ML and applied science fundamentals, including experiment design and statistics
- You have demonstrated ability to use AI coding tools in day-to-day workflows and validate, critique, and refine AI-generated output

Bonus points:
These are helpful but not required - we don’t expect candidates to have experience with everything listed below.
- Exposure to agent frameworks, tool-use orchestration, retrieval-augmented generation (RAG), and indexing large-scale telemetry data
- Familiarity with SLO/SLA practices and incident response
- Hands-on experience with distributed tracing systems (OpenTelemetry, Datadog APM), profilers, or logs and metrics pipelines
- You’re motivated to push the boundaries of how AI can improve software engineering best practices and contribute to building AI-enabled products
- Distributed systems fundamentals and familiarity with observability concepts

Datadog values people from all walks of life. We understand not everyone will meet all the above qualifications on day one. That's okay. If you’re passionate about technology and want to grow your skills, we encourage you to apply.

Benefits and Growth:
- New hire stock equity (RSUs) and employee stock purchase plan (ESPP)
- Continuous professional development, produ