We all know that data is growing, and it’s growing fast. In fact, IDC statistics show the big data market is growing about six times as fast as the overall IT market.
To address the need for speed and the ability to handle high data volumes, Dimensional Insight has developed a new business intelligence (BI) data engine. Our lab in Cambridge, Massachusetts has been hard at work developing this project, which we previewed to customers at our annual users conference earlier this summer.
Interested in getting inside the head of the lead developer of Diver’s data engine? Here’s an interview with Dimensional Insight’s Jamie Clark, who has the scoop on this exciting new technology.
Jamie, what is your position at Dimensional Insight?
I’m a senior developer, which means I’ve been at Dimensional Insight a while and I know almost everything there is to know about the company’s software and development. I joined Dimensional Insight in late 2001 after I graduated from MIT (and interned for a security company in Japan). I have a hand in almost every product that we develop, but I’m the primary developer on DivePort and the server components of DiveTab, and I manage and write code for Diver’s data engine.
What does your role entail?
I write a lot of code, both user-facing and behind-the-scenes. A good chunk of my time is spent making sure the software does what it’s supposed to, but I spend almost as much time designing. That means figuring out what the software should actually do in the first place. Like the other developers, I try to make sure the code is maintainable, but more than others, my particular technique for that is to tear it apart and put it back together. For example, I have torn apart and rewritten DivePort three times to ensure that it is doing exactly what we need it to do and can weather different user scenarios. I help other developers work on specifications for their pieces and I try to make sure the software is consistent, powerful and flexible. I also try to look ahead to make sure we’re prepared not just for the short-term, but for what users will want to do later. This was a key part of Diver’s data engine design during its incubation period.
Tell us about Diver’s data engine. Is it really that awesome?
It’s pretty awesome. Diver’s data engine gave us a chance to revisit lots of old axioms: it’s a different engine, built from the ground up. We were able to simplify some processes a lot, improve performance in others, and in general, design an engine optimized for recent hardware and analysis practices. The most important fundamental changes are the new storage format (column-oriented and shareable) and the preference for query-time calculations (instead of build-time). Diver’s data engine is fast and flexible, and it helps to simplify data collection. It will impact almost all our products, not only as we integrate it into existing features, but also as we apply the lessons we learned in Diver’s data engine to the next generation of features.
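To see why column-oriented storage and query-time calculation suit analytical workloads, here is a toy Python sketch. It is a generic illustration of the concept, not Dimensional Insight’s actual implementation; the data and field names are invented for the example.

```python
# Toy illustration of column-oriented storage and query-time calculation.
# In a row store, each record is kept together, so an aggregate over one
# field still walks every full record. In a column store, each field is
# its own contiguous array, so the same query reads only what it needs,
# and derived values can be computed at query time instead of at build time.

rows = [
    {"region": "East", "product": "A", "sales": 100},
    {"region": "West", "product": "B", "sales": 250},
    {"region": "East", "product": "B", "sales": 175},
]

# Row-oriented sum: touches every record in full.
row_total = sum(r["sales"] for r in rows)

# Column-oriented layout: one array per field.
columns = {key: [r[key] for r in rows] for key in rows[0]}

# The same sum now reads a single array.
col_total = sum(columns["sales"])

# Query-time calculation: group sales by region on the fly,
# rather than precomputing the grouping when the data is built.
by_region = {}
for region, sales in zip(columns["region"], columns["sales"]):
    by_region[region] = by_region.get(region, 0) + sales

print(row_total, col_total, by_region)
```

On large datasets the column layout also compresses better and keeps related values adjacent in memory, which is where much of the speedup in real engines comes from.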
What was the driving force behind developing Diver’s data engine?
At first, Diver’s data engine was just an experiment to deal with a specific kind of query: multitabs. Multitab queries traditionally take longer than other queries. When we started on Diver’s data engine, we wanted to see how fast we could generate multitab data if we didn’t need to use the traditional models. The experiment grew to encompass other features, until at some point we realized we were making a new data engine. The reasons to work on Diver’s data engine were basically the same throughout: we wanted to take advantage of hardware and software innovations to better handle new user behaviors – queries that, with the traditional approaches, were becoming more difficult as the data grew.
What will make developers say “wow” when they are using Diver’s data engine?
I think developers will appreciate how straightforward certain tasks can become in Diver’s data engine, such as setting up a new build script or QuickViews on a DivePort page. Maybe they’ll like some of the little things, like how every cBase (our new lingo for what we used to call a Model, though it’s really less structured, more like a collection of data) has a copy of the build script and log inside so you can see where the data comes from. But the biggest “wow”s from developers will probably come when we show them some of the advanced calculations that Diver’s data engine can do on-the-fly, or from seeing some query go from taking 3 minutes to just 3 seconds.
How about end-users?
If we’ve done it right so far, data consumers won’t see much of Diver’s data engine at all. DivePort users may see some pages render faster, but there are no new displays specifically for Diver’s data engine (yet). ProDiver users will be able to add columns using Diver’s data engine expressions in Markers that are built on top of cBases (whenever the console says “cBases” instead of “Models”), so they might benefit from the power of the new expression language.
What’s next on the development front?
We always have a lot of projects going on simultaneously. Here are some of the things I’m involved in: For DiveTab and Diver’s data engine, we are working on some new ways to use DiveTab for data input, and on a measures system for Diver’s data engine that will eventually benefit our product development road map for healthcare providers and may prove to be useful for other industries and domains as well. In addition, we are in the process of making Diver’s data engine builds even faster. We also have lots of little improvements to Diver’s data engine lined up for the next few months. For DivePort, our team has been working on some neat improvements to the Map Portlet, I’m adding a new “rotated” cousin of the Measures portlet for measures-based dashboards, and over the next few months, we’ll begin working on some big interface changes that will end up in the next major release.
Tell us about the culture of the Cambridge office. Is there anything that would surprise us about the team?
Little is known of the rituals and songs of those who dwell in the Cantabrigian Hermitage overlooking Central Square. Once in a while they can be glimpsed leaving their seclusion, faces shaded from the bright sun, apparently to scavenge for food. Visitors, escorted hurriedly to their cave-like Room of Conference, return confused, with scrawled, unintelligible notes. Their numbers have grown over the years, as seemingly normal folk from the area are selected to assist their mysterious work. What data and relics are hidden in their vaults and databases? When on occasion some of them run, what are they escaping? To what use do they put the enormous volumes of soda and the miniature robot figurines? What’s up with the pool table? We may never know the answers to these and many other questions.
Um, okay… that’s pretty classically Cambridge for you! Thanks for your time, Jamie.