Tech Lead - Agentic AI & AWS Backend
Northbound
Software Engineering, Data Science
Berlin, Germany
At Northbound, we are building AI agents that automate complex workflows in global logistics and customs operations for large enterprise customers.
As Tech Lead - Agentic AI & AWS Backend, you will be the driving force behind how our AI agents reason, plan, and act. You will own the design and evolution of our agentic architecture – the orchestration layer that coordinates multi-step workflows across regulatory lookups, document validation, downstream system integrations, and complex decision-making.
You will work directly with the technical founder to translate messy, real-world logistics problems into reliable, production-grade agentic systems. This means designing the right abstractions – deciding when to use single- vs. multi-agent architectures, how to manage context and memory across long-running workflows, and how to build guardrails that keep autonomous systems safe and predictable.
This is a hands-on role for someone who lives and breathes the rapidly evolving world of agentic AI and wants to apply it to hard, real-world problems at enterprise scale.
What You’ll Do:
- Design and build the agentic orchestration layer that powers our core product – coordinating planning, tool use, memory, retrieval, and self-verification across multi-step workflows.
- Architect multi-agent systems with clear abstractions – deciding when to decompose into specialized agents vs. single-agent flows, and how agents communicate and hand off tasks.
- Design and build tools that agents use to interact with external systems – APIs, databases, document processors, and downstream services.
- Design guardrails, failure handling, and human-in-the-loop mechanisms for autonomous workflows.
- Integrate and operate LLMs in production, making informed decisions on model selection, cost-performance trade-offs, and multi-model orchestration strategies.
- Build and maintain document processing pipelines – extracting structured data from PDFs, Excel files, and scanned documents as inputs to agent workflows.
- Design and operate resilient backend systems on AWS, including serverless architectures, to support agent workflows at scale.
- Contribute to the Python backend platform, ensure software quality through reviews, testing, and CI/CD, and mentor engineers across the team.
Required Skills & Experience:
Agentic AI (Primary)
- At least 1 year of professional experience building and deploying LLM-powered agentic systems in production environments.
- Hands-on experience with LLM platforms such as Amazon Bedrock, OpenAI, and Anthropic Claude.
- Strong understanding of orchestration patterns: multi-agent coordination, tool use, planning, memory management, context routing, and state machines.
- Experience designing and building tools for agents – writing reliable interfaces that let agents interact with APIs, databases, and external services.
- Experience with the LangChain ecosystem (LangGraph, LangSmith) and with observability tools such as Langfuse. Familiarity with alternatives such as CrewAI, AutoGen, or the OpenAI Agents SDK is a plus.
- Experience implementing structured evaluation frameworks for measuring agent reliability, accuracy, and cost.
- Working knowledge of reflection, ReAct, and self-correction patterns.
- Familiarity with the Model Context Protocol (MCP) for standardized tool integration and agent-to-system communication.
Engineering Fundamentals
- 7+ years of professional software development experience with strong Python and backend engineering skills.
- Proven experience shipping and operating production systems in real-world environments.
- Solid understanding of API design, distributed systems, and event-driven architectures.
- Experience with document processing pipelines (PDF/Excel extraction, OCR) and familiarity with services such as Amazon Textract or comparable tools.
Highly Valued
- Public GitHub projects or open-source contributions related to agentic AI, LLM tooling, or agent orchestration.
- AWS experience, especially serverless services (Lambda, API Gateway, S3, DynamoDB, SQS). AWS certification is a strong plus.
- Experience with Flask/FastAPI and RESTful API design.
- Knowledge of Docker and containerized deployments.
- Experience with Infrastructure as Code (AWS SAM, Terraform, CloudFormation).
- Experience working in early-stage or high-growth startups.
- Exposure to enterprise security and compliance environments.
What We Offer:
- Direct collaboration with the Technical Founder on core product and AI strategy.
- Real ownership of the agentic AI platform powering global enterprise customers.
- Opportunity to shape the technical direction of an enterprise AI company at an early, high-impact stage.
- A modern, well-equipped office in Berlin Kreuzberg, shared with a strong tech community.
- A focused, high-trust engineering culture with minimal bureaucracy.
- Competitive compensation and ESOP/VESOP package.