Driving Data Loading to find the facts

Published: Tuesday, 24 February 2015 15:24 by Ant Phillips, Senior Developer
Big Data Analytics

It is just a few weeks since we completed the acquisition of Celebrus by IS Solutions. Since then we’ve moved offices from Newbury to Sunbury-on-Thames. Not surprisingly, that involved clearing out a whole heap of stuff that had been lying around for far too long. Along the way we unearthed several dusty product release CDs going back over a decade.

Looking back at those earlier releases, it’s interesting to contrast the focus back then with our latest release (v8 update 11). Ten years ago, data collection was focused primarily around reporting.

Lots of totals, averages and aggregations of one kind or another. And the technology matched those requirements. In Celebrus terms, this was, and still is, implemented by our Analytics Server, part of our v8 Big Data Engine. Every so often the Analytics Server fires up and calculates summary information from activity in the last five minutes, hour, day and so on. The results of that processing are written to a set of database tables. There’s nothing inherently wrong with the Analytics Server approach; it is simply that the world has moved on.
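To make the summary-style approach concrete, here is a minimal Python sketch of a periodic job that rolls up the last five minutes of activity into counts, totals and averages. It is an illustration only, not the Analytics Server implementation: the `summarize` function and the event fields (`timestamp`, `type`, `value`) are hypothetical names chosen for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def summarize(events, window_end, window):
    """Return count, total and average per event type for one time window.

    Hypothetical helper: field names are assumptions for this sketch.
    """
    window_start = window_end - window
    buckets = defaultdict(lambda: {"count": 0, "total": 0.0})
    for event in events:
        # Only events inside [window_start, window_end) contribute.
        if window_start <= event["timestamp"] < window_end:
            bucket = buckets[event["type"]]
            bucket["count"] += 1
            bucket["total"] += event.get("value", 0.0)
    return {
        event_type: {
            "count": b["count"],
            "total": b["total"],
            "average": b["total"] / b["count"],
        }
        for event_type, b in buckets.items()
    }


now = datetime(2015, 2, 24, 15, 0, 0)
events = [
    {"timestamp": now - timedelta(minutes=1), "type": "purchase", "value": 20.0},
    {"timestamp": now - timedelta(minutes=3), "type": "purchase", "value": 40.0},
    {"timestamp": now - timedelta(minutes=4), "type": "page_view"},
    {"timestamp": now - timedelta(minutes=30), "type": "purchase", "value": 99.0},
]

# Summarize the most recent five minutes; the 30-minute-old purchase is excluded.
summary = summarize(events, window_end=now, window=timedelta(minutes=5))
```

Notice what is lost: once the window is summarized, the individual events, and the customers behind them, are no longer visible in the result.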

The focus today is almost exclusively on highly detailed data about individuals, not just summary information. The data also needs to be available in near real-time. This information is crucial to understand each and every journey a customer has had with your brand. Armed with this insight into customer behavior, a whole slew of possibilities unfold which enable you to understand and optimize your business, whether that be to offer a discount to a valuable customer, or to understand why someone chose a competitor’s product. All these use cases and many more start with data.

As Sherlock Holmes once said:

“It is a capital mistake to theorize before one has data. Insensibly one begins to twist facts to suit theories, instead of theories to suit facts.”

So with all this in mind, you will see in our latest release our new Data Loader. The Data Loader is our go-forward architecture for loading data at lightning-fast speeds. The Data Loader scales to support huge amounts of traffic on some of the busiest web sites in the world, and can process tens of thousands of events per second (sustained). Not just that, it delivers the data into your systems in less than a minute, making new use cases around streaming analytics possible. We support MySQL, Microsoft SQL Server, Oracle and Teradata out of the box. Better yet, the Data Loader includes a pre-defined database schema covering more than 75 tables and models: everything you might want to understand about your digital customers.

In addition, we’ve been working hard with the folks over at MongoDB. This release has been fully certified with MongoDB Enterprise Edition. This makes MongoDB the perfect data store for Celebrus customer journey data. This customer journey data is aimed at operational applications. For example, contact center staff use this information to help them understand a customer’s interactions with your brand.

This is the first release where we have worked with a document database, and it has been a really good experience. The flexibility, simplicity and productivity of MongoDB are tremendous. For example, in MongoDB we simply store all business events in a single collection (rather than lots of normalized relational tables). Each type of business event contains some common attributes (timestamp, session number, customer identifier, event type and so on). An event also contains some data specific to that event, for example the purchase price, quantity and SKU code for a purchase transaction event. All of this just works with MongoDB, no friction, no joins, no complexity. Job done!
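As a rough illustration of the single-collection idea, the Python sketch below builds a few event documents that share common attributes and carry their own event-specific fields. The field names (`sessionNumber`, `customerId`, `eventType`, `sku` and so on) and the helpers `make_event` and `journey` are assumptions made for this example, not the actual Celebrus schema. A plain list stands in for the MongoDB collection so the sketch runs without a database.

```python
from datetime import datetime, timezone


def make_event(event_type, session_number, customer_id, **details):
    """Build one event document: common attributes plus event-specific fields.

    Hypothetical helper; all field names are assumptions for this sketch.
    """
    common = {
        "timestamp": datetime.now(timezone.utc),
        "sessionNumber": session_number,
        "customerId": customer_id,
        "eventType": event_type,
    }
    # Event-specific data sits alongside the common attributes in the same document.
    return {**common, **details}


# One "collection" holding differently shaped events side by side.
events = [
    make_event("page_view", 1, "cust-42", url="/products/widgets"),
    make_event("purchase", 1, "cust-42", price=19.99, quantity=2, sku="WID-001"),
    make_event("page_view", 7, "cust-99", url="/"),
]


def journey(collection, customer_id):
    """Return every event for one customer, in stored order, with no joins."""
    return [e for e in collection if e["customerId"] == customer_id]


customer_events = journey(events, "cust-42")
```

With a real MongoDB deployment and the PyMongo driver, the same documents could be written with `collection.insert_many(events)` and read back with `collection.find({"customerId": "cust-42"})`; the database name, collection name and connection details would depend on your installation.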
