Clickstream Data Warehouse and Reporting Help Orbitz Know Customer Behavior

Smarter Data Warehouse Solution Enables Critical User Behavior Reporting

To improve its ability to make intelligent marketing decisions, Orbitz, one of the leading online travel companies, began an initiative to improve its clickstream reporting. Its first attempt resulted in a solution that failed to meet the company's needs, primarily because it was not robust enough.

Specifically, the existing reporting solution was unable to provide...

  • customer online behavior data, specifically exit points, performance SLAs, navigation patterns, etc.  With this information, the company could tune and improve the effectiveness of the user experience.
  • data on customers that searched but did not purchase. Knowledge on this huge group would enable the company to properly target these potential customers.
  • search information. This data would enable marketing managers to create specials based on customer demand.
  • bot identification. The company needed to reduce the huge number of non-authorized bots that scraped information and effected website performance.

Three Critical Challenges and One Powerful Solution

Orbitz selected Dunn Solutions Group to develop a more powerful data warehouse and reporting solution that could provide that important data and overcome these three major challenges:

  • Five million to 15 million rows of data needed to be processed every day in five hours or less.
  • Identified (logged-in) and inferred members had to be tracked.
  • The data model and reporting model needed to be both high-performance and simple to use.

The foundation of the system was a flexible datamodel that provided a framework for expansion, but also would deliver high-performance. The datamodel was built specifically to take advantage of Oracle parallel processing, partitioned indexes and partitioned tables. It was made up of four temporary/staging tables, 20 transactional detail tables (previous 30 days), and two years of aggregation information.

With the data model developed, the next step was the extraction/transformation/load (ETL) process.  Built using a combination of Informatica and Oracle PL/SQL functions, ETL handled the loading and processing of 10,000-plus rows per second without requiring additional hardware resources.

The reporting environment was actually the easiest component of the project to implement: A well-designed datamodel and ETL process makes it possible to create a simple and concise reporting environment. And from this reporting environment these important data feeds were created:

  • "Search Without Buy" campaign management
  • Segment search and conversion rate report for segment managers
  • Campaign management cross-reference between an executed campaign and visit cross-reference
  • Candidate bot e-mail generation and bot rejection list