Data is a new kind of capital: Oracle’s Senior Data Strategist

Jinoy Jose P | Updated on November 14, 2019 at 06:15 PM.

Oracle’s Senior Data Strategist Paul Sonderegger

Oracle’s Senior Data Strategist Paul Sonderegger says Oracle recognises that data really is a new kind of capital, even though accounting rules may not allow for that in all cases, and that the vast majority of data value creation in the digital economy happens inside the same companies that create the data. BusinessLine met Sonderegger at Oracle’s recent Cloud Summit in San Francisco. Excerpts from the interview:

How’s the data economy evolving and what are the new challenges in terms of data management?

What we see among our customers, especially among large enterprises, is that they’re starting to get the idea that data is a true asset, that it’s a kind of capital. What this means is that data is an economic factor of production in digital goods and services. In fact, a couple of years ago, The Economist called data the world’s most valuable asset. But it’s a strange asset, and one of the ways it’s strange is that most data that gets produced never goes to market. Of course, some data does get bought and sold, and there are some really important privacy and security considerations around that practice.

But most data never goes to market; it gets used inside the same firm that creates it. This would be like an oil company becoming a multi-billion-dollar firm by burning all of the oil that it extracts rather than also selling it to others. What this means is that the majority of the value creation from data in the data economy happens inside the enterprise that creates it.

So, in each one of our customers, in each one of these companies, there is a hidden data economy. There is a diverse supply of data coming from a growing number of applications, sensors, and smart devices, and all of these things are creating data assets, sometimes very large piles of them. So, that’s the supply side.

Then within these enterprises, there’s growing demand for data: analysts and data scientists who have to create new analytics or new algorithms are looking for datasets to work with. Now, there are a number of problems, the principal one of which is that there are no market signals inside the firm. Every company is basically a command economy; internally, companies do not work on market principles. So, there are no demand signals that would, for example, influence application developers to write their applications such that they create data that is easier to use in algorithms.

At the same time, on the demand side, there’s all kinds of latent demand from different business units. They constantly have new questions because they’re responding to new competitive threats on the outside, but they don’t have any good way to express what kind of data products they wish they could get. So, what do you do in such a case?

Well, the way we think about this problem is that this hidden data economy is hiding in plain sight inside each company. It’s not going to go away, and companies are not suddenly going to start working as if they had internal markets with actual pricing to provide signals.

Instead, what they can do is provide a market exchange of sorts -- a data exchange inside each company that brings the transaction costs of getting the data you want into the shape you need down to near zero. There are a couple of ideas we need to unpack there. Autonomous data management is the key to bringing down the transaction costs of getting data from its point of origin to its many points of use inside a large company.

How exactly is that done?

Autonomous data management has a couple of responsibilities. One is to make the data assets that have been created easier to find and discover for analysts and data scientists, so they know what data assets are available to them, which in turn makes it easier for them to create new analytics, new algorithmic services, and things like that.

Another is to reduce the time and effort these analysts and data scientists spend turning the data assets they uncover into the structure they actually need for a specific analytics use case or a specific algorithm. It also has to be the case that whenever a data asset gets used in one of these analytics or algorithms, this doesn’t make it any more difficult to use that same data asset in a different analytics or algorithm, because what you want is to get data from its point of origin to its many points of use quickly, securely, and cost-effectively.
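To make those two responsibilities concrete, here is a minimal Python sketch of the pattern he describes: discovering a data asset in an internal catalogue, then reshaping it for one analysis without altering the original asset. Everything here (the `catalog` structure, the dataset, the column names) is a hypothetical illustration, not Oracle's API:

```python
import pandas as pd

# Hypothetical internal data catalogue: asset name -> metadata.
# In a real system this would be a searchable service, not a dict.
catalog = {
    "retail_orders": {
        "path": "orders.csv",
        "tags": {"sales", "orders", "retail"},
        "owner": "order-management-app",
    },
}

def discover(keyword: str) -> list[str]:
    """Find data assets whose tags match a keyword (the discovery step)."""
    return [name for name, meta in catalog.items() if keyword in meta["tags"]]

def reshape_for_analysis(asset: str) -> pd.DataFrame:
    """Load an asset and pivot it into the structure one analysis needs,
    leaving the original asset untouched for other uses."""
    df = pd.read_csv(catalog[asset]["path"], parse_dates=["order_date"])
    # One analyst's view: monthly revenue per region.
    return (df.assign(month=df["order_date"].dt.to_period("M"))
              .pivot_table(index="month", columns="region",
                           values="revenue", aggfunc="sum"))

for asset in discover("orders"):
    print(reshape_for_analysis(asset).head())
```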

But autonomous data management also has to provide governance on this internal data market, so that the company knows exactly who is accessing this data. It also has to keep audit and log trails of what they’re doing with it: in what analytics and algorithms do these data observations participate, under what jurisdictions, and where in the world are these analyses taking place? And it would also be good if this autonomous data management could make it easier for companies to let you know how they are using your data.
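As a rough illustration of that governance layer, the sketch below keeps an append-only audit trail of who accessed which data asset, for which analysis, and in which jurisdiction. The field names and file format are assumptions made for the example, not an actual Oracle audit schema:

```python
import json
from datetime import datetime, timezone

AUDIT_LOG = "data_access_audit.jsonl"  # hypothetical append-only audit trail

def record_access(user: str, asset: str, purpose: str, jurisdiction: str) -> None:
    """Append one audit record: who touched which data asset, in what
    analytic or algorithm, and where in the world the analysis runs."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "asset": asset,
        "purpose": purpose,            # which analytic or algorithm used it
        "jurisdiction": jurisdiction,  # where the analysis takes place
    }
    with open(AUDIT_LOG, "a") as f:
        f.write(json.dumps(entry) + "\n")

record_access("analyst_42", "retail_orders",
              purpose="churn-model-v3", jurisdiction="EU")
```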

And that calls for more transparency...

Right. So that it’s more transparent to you what they’re actually doing with it, so that you can be comfortable they’re using it in effective ways, in ways that you’re okay with. How do we do this? The Autonomous Database is the first step.

With the Autonomous Database, we are automating the data management tasks around tuning the database, securing it, and monitoring the infrastructure it runs on: anticipating and avoiding sick or failed servers, but also anticipating and avoiding bottlenecks created by particular application structures and the way they write queries.

So what we’ve done with the Autonomous Database is to take decades of individually automated functions and make it possible for them to interact automatically and, working together, adapt to changing circumstances on their own.

One of the best examples of this is autonomous indexing, where the database watches the actual queries coming in, looks at the actual workloads it’s running, and analyses the structure of those queries. It then hypothesises indexes that should make those queries go faster and tests them; only the ones that produce a measurable improvement in performance get published into the database, so that they are visible and usable.

But even then, these indexes only apply to the queries whose performance they actually improve, and the Autonomous Database is doing this on a continuous basis. So, as the workloads change, it adapts to them.
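The hypothesise-test-publish loop he describes can be sketched in a few lines. The toy Python example below uses SQLite to show the idea: time a query, create a candidate index suggested by its WHERE predicate, and keep the index only if it measurably improves performance. This is an illustration of the loop, not Oracle's implementation, which is far more sophisticated:

```python
import sqlite3
import time

# Toy workload: a table with enough rows for indexing to matter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, revenue REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(i, f"region_{i % 50}", i * 1.5) for i in range(200_000)])

QUERY = "SELECT SUM(revenue) FROM orders WHERE region = 'region_7'"

def timed(sql: str, runs: int = 5) -> float:
    """Average execution time of a query (the 'test' step)."""
    start = time.perf_counter()
    for _ in range(runs):
        conn.execute(sql).fetchall()
    return (time.perf_counter() - start) / runs

baseline = timed(QUERY)

# Hypothesise an index from the query's WHERE predicate...
conn.execute("CREATE INDEX candidate_idx ON orders (region)")
with_index = timed(QUERY)

# ...and publish it only if it produces a measurable improvement.
if with_index < baseline:
    print(f"kept index: {baseline:.4f}s -> {with_index:.4f}s")
else:
    conn.execute("DROP INDEX candidate_idx")
    print("dropped index: no measurable improvement")
```

Run continuously over a changing workload, a loop like this is what lets the system adapt as query patterns shift.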

All these things, are they being done in real time?

It depends, because sometimes you want it to and sometimes you don’t. The other part of the answer is that sometimes, even when you want these analyses to happen in real time, you don’t necessarily want the system to take an action automatically; sometimes you want it to raise an alert and hand it to a human in the loop. So sometimes you do want this kind of processing, and some machine learning, to happen automatically in real time. This is the case with fraud detection, for example.
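A minimal sketch of that fraud-detection pattern: clear-cut cases trigger an automatic action in real time, while borderline cases raise an alert for a human in the loop. The score thresholds here are illustrative assumptions, not values from any real system:

```python
def handle_transaction(txn_id: str, fraud_score: float) -> str:
    """Route a scored transaction: act automatically only when the model
    is confident; otherwise keep a human in the loop."""
    if fraud_score >= 0.95:
        return f"{txn_id}: blocked automatically (score {fraud_score:.2f})"
    if fraud_score >= 0.60:
        return f"{txn_id}: alert raised for analyst review (score {fraud_score:.2f})"
    return f"{txn_id}: approved (score {fraud_score:.2f})"

for txn, score in [("txn_001", 0.98), ("txn_002", 0.72), ("txn_003", 0.10)]:
    print(handle_transaction(txn, score))
```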

So, what’s Oracle’s approach here?

We recognise that data really is a new kind of capital, even though accounting rules may not allow for that in all cases, and that the vast majority of data value creation in the digital economy happens inside the same companies that create the data. In these internal data economies, inside each one of these enterprises, the transaction costs of getting data from its point of origin to its multiple points of use are too high.

An autonomous data management platform brings those costs down. To do that, we are creating the Autonomous Database, which simplifies and reduces the effort required to create new applications, and likewise the effort required to create new analytics and algorithms.

At the same time, it also reduces the time, cost, and effort of securing and protecting that data: not only in terms of who gets access to it, but also in terms of building new services to monitor and audit whether it has been tampered with, how and where it is being used, and whether all of those things are consistent with governing laws. This is the way we’re helping companies create more value from their data and increase their return on data capital.

How will this go forward from here?

Well, there are three big impacts. The first is an increase in data productivity: right now, a lot of data assets are just not used. Autonomous data management brings down the time, cost, and effort of using these data assets, so a single dataset gets used more, for less.

The second big impact is an increase in tech labour productivity: developing applications more quickly, developing algorithms and analytics more quickly, but also a dramatic reduction in the amount of time IT specialists have to spend keeping this whole data tier performant.

And the third big impact is an increase in data value. The reason we can make that claim is that reducing the time, effort, and cost of using these data assets makes it easier to uncover and capture the option value in your data: additional uses that had not been anticipated when the data was first created.

The interviewer visited San Francisco recently to attend a cloud computing conference at the invitation of Oracle.

Published on November 14, 2019 12:38