Beyond "Extract-&-Dump": High-Fidelity Migrations with Drupal AI - Jonathan Bourland & Mike Gifford
Video Description
A common failure in legacy migrations is the "extract-and-dump" trap: taking unstructured legacy HTML and shoving it into a single Drupal "Body" field because field mapping costs too much time and money. This creates immediate technical debt that sabotages content flexibility and site extensibility; it dooms search, SEO, and future redesigns. Funneling unstructured legacy content into sophisticated target schemas by hand remains a resource-intensive, high-friction endeavor.
While semantic migration is hard for humans, it's easy for AI. So let's leverage it to create clean, structured data, not just raw piles of HTML.
This session explores how the Drupal AI Migration module (https://www.drupal.org/project/ai_migration) bridges architectural vision and automated execution. Instead of letting AI guess your data model, we show a clever use of Drupal's Serialization API to generate a JSON Schema from well-designed content types, providing the AI with a strict "contract" for the target data.
We will discuss how this approach allows an LLM to parse raw legacy HTML and precisely populate structured fields according to your specific schema. You’ll see how combining Drupal’s core strengths with AI-driven semantic parsing creates a high-integrity, queryable data layer that will serve your sites for years to come.
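To make the "contract" idea concrete, here is a minimal sketch in Python. It is not the AI Migration module's code: the schema, its field names (title, event_date, location, body), and the helper check_against_schema are all hypothetical, and the checker covers only a tiny subset of JSON Schema (required fields, basic types, no extra properties). It only illustrates how a schema derived from a content type could be used to accept or reject what an LLM returns.

```python
import json

# Hypothetical JSON Schema for an imagined "event" content type.
# In the approach described above, a schema like this would be generated
# from the Drupal content type rather than written by hand.
EVENT_SCHEMA = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
        "event_date": {"type": "string"},
        "location": {"type": "string"},
        "body": {"type": "string"},
    },
    "required": ["title", "event_date", "body"],
    "additionalProperties": False,
}

# Maps the few JSON Schema type names this sketch supports to Python types.
_JSON_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def check_against_schema(candidate, schema):
    """Return a list of contract violations; an empty list means it conforms."""
    if not isinstance(candidate, dict):
        return ["top-level value is not an object"]

    errors = []
    properties = schema.get("properties", {})

    # Every required field must be present.
    for name in schema.get("required", []):
        if name not in candidate:
            errors.append(f"missing required field: {name}")

    for name, value in candidate.items():
        if name not in properties:
            # Reject fields the content type does not define.
            if schema.get("additionalProperties") is False:
                errors.append(f"unexpected field: {name}")
            continue
        type_name = properties[name]["type"]
        expected = _JSON_TYPES[type_name]
        # bool is a subclass of int in Python, so guard against it explicitly.
        is_stray_bool = isinstance(value, bool) and type_name != "boolean"
        if is_stray_bool or not isinstance(value, expected):
            errors.append(f"wrong type for field: {name}")

    return errors


# Stand-ins for LLM responses; a real run would get these from the model.
GOOD_RESPONSE = (
    '{"title": "Annual Meeting", "event_date": "2019-05-04", '
    '"body": "<p>Agenda and directions.</p>"}'
)
BAD_RESPONSE = '{"title": "Annual Meeting", "html_dump": "<div>everything</div>"}'

if __name__ == "__main__":
    for raw in (GOOD_RESPONSE, BAD_RESPONSE):
        print(check_against_schema(json.loads(raw), EVENT_SCHEMA))
```

The second response is the "extract-and-dump" pattern in miniature: it drops required fields and adds one the content type never defined, so the check rejects it instead of saving it. A production setup would more likely rely on a full JSON Schema validator than on a hand-rolled check like this one.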