DrupalCon Seattle 2019: Considerations of Federated Search and Drupal

Large enterprises often have many web properties split among various teams, departments, and supporting partners. Some of these properties may run on Drupal, while others may run on different platforms altogether. Federated search is a common approach to connecting these properties and helping end users find the information they seek, even if they are not on the correct website.

This talk shares an approach for creating a federated search application capable of integrating with Drupal or other web application frameworks. We’ll look at a variety of problems that needed to be solved, including web crawling/parsing, search backends, search front-end user interfaces, and infrastructure needs. A high-level architecture will be presented that connects all of the various systems. We will also dive into Drupal specifics that can help mitigate the risk of implementing such an approach with an existing application.

Tools presented will include Scrapy (a Python-based web crawling framework), Docker, React, and Drupal. The approach presented is fairly technology-agnostic and should be capable of supporting different web technologies for displaying search results.

Drupal is a registered trademark of Dries Buytaert.