Clickstream Data Warehouse and Reporting Help Orbitz Know Customer Behavior
Smarter Data Warehouse Solution Enables Critical User Behavior Reporting
To improve its ability to make intelligent marketing decisions, Orbitz, one of the leading
online travel companies, began an initiative to improve its clickstream reporting. Its first
attempt resulted in a solution that failed to meet the company's needs, primarily because it was
not robust enough.
Specifically, the existing reporting solution was unable to provide...
- customer online behavior data, specifically exit points, performance SLAs, navigation patterns,
etc. With this information, the company could tune and improve the effectiveness of the user
experience.
- data on customers that searched but did not purchase. Knowledge on this huge group would enable
the company to properly target these potential customers.
- search information. This data would enable marketing managers to create specials based on
customer demand.
- bot identification. The company needed to reduce the huge number of non-authorized bots that
scraped information and effected website performance.
Three Critical Challenges and One Powerful Solution
Orbitz selected Dunn Solutions Group to develop a more powerful data warehouse and reporting
solution that could provide that important data and overcome these three major challenges:
- Five million to 15 million rows of data needed to be processed every day in five hours or
less.
- Identified (logged-in) and inferred members had to be tracked.
- The data model and reporting model needed to be both high-performance and simple to use.
The foundation of the system was a flexible datamodel that provided a framework for
expansion, but also would deliver high-performance. The datamodel was built specifically to take
advantage of Oracle parallel processing, partitioned indexes and partitioned tables. It was made up
of four temporary/staging tables, 20 transactional detail tables (previous 30 days), and two years
of aggregation information.
With the data model developed, the next step was the extraction/transformation/load (ETL)
process. Built using a combination of Informatica and Oracle PL/SQL functions, ETL handled
the loading and processing of 10,000-plus rows per second without requiring additional hardware
resources.
The reporting environment was actually the easiest component of the project to implement: A
well-designed datamodel and ETL process makes it possible to create a simple and concise reporting
environment. And from this reporting environment these important data feeds were created:
- "Search Without Buy" campaign management
- Segment search and conversion rate report for segment managers
- Campaign management cross-reference between an executed campaign and visit cross-reference
- Candidate bot e-mail generation and bot rejection list