
NetApp sets its sights on cloud data management: A Q&A with CEO Tom Georgens

Moving virtual servers around a hybrid cloud environment isn't hard, but managing the data is.
  • John Dix (Network World)
  • 22 July, 2014 06:50

Because managing data across a hybrid cloud is the hard part, NetApp wants to be "the enterprise data-management standard across the enterprise," says CEO Tom Georgens. Network World Editor in Chief John Dix recently caught up with Georgens to get his take on what's changing in the cloud computing world.

NW:     You announced a cloud strategy late last year, so why don't we start with an update on what cloud means to NetApp?

TG:      The cloud could be a ten-hour conversation, but to distill it down, I think customers are wrestling with the role of the cloud. We all get Software-as-a-Service. And certainly we see workloads that are temporary or changing in nature that benefit from infrastructure you can spin up on demand and then unwind. That transient capability is something that just can't be emulated with on-premise computing. And we see use cases with very, very low utilization, where data wouldn't be stored if it weren't for the cloud. And things like archiving and backup, where you see cloud as a repository, those use cases make a lot of sense as well. But on the other hand, cloud is not a panacea. Leaving aside things like performance and security, the cloud is not that flexible and is not that inexpensive.

A CEO said to me recently, "Nobody goes to the cloud for price." I found myself coming to the defense of Amazon, but his point was that, given the transaction orientation and the bandwidth requirements of his workloads, the economics of doing it himself were far better. Twenty-four hours later I was at a SaaS vendor that is half in the cloud and half on-premise, and they said they need to move on-premise because of the economics.

So I think there are workloads where it is compelling, where it offers a set of capabilities that can't possibly be matched on premise. But there are other workloads that are problematic from an economic point of view, or where security is the concern. It's inevitable that enterprises, for a very long period to come, are going to have some combination of on-premise and off-premise computing. The hybrid cloud is going to be the dominant model here.

The real challenge, then, is how do you enable customers to create a seamless extension of what they do on-premise, so when they go to the cloud they don't have to run different sets of data management tools. That is, if I can move my apps from on-premise to the cloud and back, how do I make the data management seamless?

Since data management is the hardest problem, NetApp wants to be the enterprise data-management provider. NetApp's value proposition is primarily the software. We already manage data on other people's hardware; we sell systems bundled with our software, and we also sell the software unbundled. We have a version of our product that basically allows access to the elastic compute of an Amazon or a Microsoft while letting you maintain control of your data, and you can expect to see us offer more tools that integrate even more closely with the cloud.

NetApp wants to manage data whether it's on our equipment or not, and whether it's on-premises or not. We want to be the enterprise data-management standard across the enterprise. We don't make disk drives. We're not Seagate. We basically make software that makes disk drives reliable, high performance and easy to manage. And if we viewed Amazon and some of these services as basically the new disk drives, we can manage those as well and enable customers to operationalize the cloud.

We have no desire to emulate the cloud. We have no desire to offer an undifferentiated cloud service. Our view is the cloud is not a target, the enterprise customer is the target and the cloud is a tool that is available to them.

NW:     Do you need to partner with cloud providers to do that?

TG:      Actually there are some clever things we're doing that are unique and make the cloud providers want to partner with us. I mentioned the cloud enabling you to spin up 500 servers for a short period of time and then spin them down, but once you move data there, it costs you money to pull it out and it costs you money to access it. So we have this collaboration with Amazon that we call NetApp Private Storage for Amazon that allows you to keep the data on your network and connect through a high-bandwidth pipe to their compute farm. So you keep the data under your control, under your tools, with all of your security, but still have access to the elastic compute.

We see people using that for everything from test and development to backup. Disaster recovery is another interesting application. You keep replicating data and if there is a disaster you spin up all the servers and networking.
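To make that pattern concrete, here is a minimal sketch, in Python with the AWS boto3 library, of the "data stays home, compute bursts out" idea: spin up a fleet of EC2 instances that mount an NFS export served from your own storage over a private link, run the transient workload, then tear the fleet down. It is an illustration only, not NetApp's product code; the AMI, subnet and export address are hypothetical placeholders.

    # Hypothetical sketch: burst compute into EC2 while the data stays on
    # your own network, reachable over a Direct Connect-style private link.
    import boto3

    ON_PREM_EXPORT = "10.0.0.50:/vol/analytics"   # placeholder NFS export on your storage
    MOUNT_POINT = "/mnt/data"

    # Cloud-init script: mount the on-premise export at boot, so instances
    # compute against data that never leaves your control.
    user_data = f"""#!/bin/bash
    mkdir -p {MOUNT_POINT}
    mount -t nfs {ON_PREM_EXPORT} {MOUNT_POINT}
    """

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Burst: spin up a fleet for the transient workload...
    response = ec2.run_instances(
        ImageId="ami-0123456789abcdef0",          # placeholder image
        InstanceType="m4.large",
        MinCount=10,
        MaxCount=10,
        SubnetId="subnet-0123456789abcdef0",      # placeholder; routed to the on-premise network
        UserData=user_data,
    )
    instance_ids = [i["InstanceId"] for i in response["Instances"]]

    # ...and unwind it when the job is done. The data never moved, so
    # there is nothing to pull back out of the cloud.
    ec2.terminate_instances(InstanceIds=instance_ids)

The disaster-recovery case he mentions is the same sequence with different timing: keep replicating to the storage, and run the spin-up step only when a disaster is declared.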

Customers see value in the cloud, but they don't want to have to use a separate set of tools, a separate set of processes, and they're having a tough time operationalizing it. We want to come in and say, "If you standardize on NetApp data management we will seamlessly integrate the cloud for you and it will make the cloud look like it's actually part of your own infrastructure."

NW:     Do your customers typically refer to their internal systems as private clouds?

TG:      The vast majority of enterprise deployments would be described as private clouds. If you go back to the server virtualization revolution of five years ago, the original premise was, I have all these servers running individual apps and they're all grossly underutilized. So, if I can encapsulate those apps and run multiple apps simultaneously on servers, I can drive up utilization rates, lower my server footprint, and the economics are compelling. And that's all true.

But now that my apps are no longer tied to hardware, I can move them from server to server for load balancing, or data center to data center for disaster recovery, so I can think about my infrastructure independent of my apps.

So I can build a very homogeneous, very highly automated, very efficient infrastructure that's application-independent, and I can run many, many apps on it with a set of tools that automate all that. That's my definition of the private cloud. And there's no doubt that companies are moving to that model for on-premise computing, outside of very specific dedicated applications, simply because of the flexibility and the cost. But that said, if you own infrastructure you'll never match the flexibility of being able to turn on 1,000 servers and then turn them off a month later like you can with an Amazon.

NW:     This idea of pooled resources started with storage before it even emerged with servers, but then seemed to stall a bit. How would you classify where we stand today?

TG:      I don't think it's stalled. In fact, it's our big bet. You've got dedicated infrastructures around a handful of apps, and the rest is being pooled for the flexibility and the economics and the commonality. So clearly the trend is towards a virtualized private cloud.

From NetApp's point of view, when virtualization came out there was this Wall Street sentiment that virtualization was bad for us because the conversation was going to be dominated by server vendors. What they didn't realize was that virtualization, and ultimately its enablement of the private cloud, had big implications for storage. The simplest rationale was that if people were going to build a common infrastructure across many apps that's highly efficient, highly automated and highly homogeneous, they wouldn't want five or six different storage products, because that would basically return them to islands of infrastructure.

So our view, and how we grew from effectively a standing start to the number two storage provider in the world, is that we have a common architecture that spans these business applications. We can come to a customer that is making a big virtualization push and say, "We've got one single architecture that is also highly efficient, highly automated, that you can put behind your virtualized infrastructure."

So when the world was relatively negative about the impact of virtualization on NetApp (it was certainly reflected in our stock price at the time), our internal mantra was, we want to be the unquestioned technology leader in storage for virtualized infrastructures, and we came out of that last downturn and grew 30% the next year and 25% the year after that.

NW:     You mentioned islands of computing, yet in some of these modern data centers companies are building around pods. What is the difference between a pod and an island?

TG:      With virtualized infrastructure, where people could be running a dozen applications per server and they could have hundreds of servers, the idea of building a storage pool behind it doesn't mean you end up with one gigantic box. In fact, that's not the practical way to build it.

We want you to basically build a cluster. NetApp's big innovation over the last few years is taking this operating system we have -- which is the number one storage operating system in the world because we're multi-protocol while our competitors have multiple point products -- and adding clustering so we can deliver effectively unlimited scale, unlimited performance and non-disruptive operation.

So while this infrastructure is a bunch of boxes, they're all networked together and pooled with transparent volume migration between them, all managed with one set of tools. From a manageability point of view and a provisioning point of view it's all one big pool, but the physical reality is it's a bunch of boxes that are interconnected with a high-speed network.
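What "one big pool" over "a bunch of boxes" means can be shown with a toy model: clients address volumes through a stable namespace, and migration changes which box holds the data without changing the name clients use. This is an illustration of the concept only, not how clustered ONTAP is actually implemented.

    # Toy model of a clustered storage pool: many boxes, one namespace,
    # transparent volume migration. Conceptual illustration only.
    class Cluster:
        def __init__(self, nodes):
            self.nodes = {n: {} for n in nodes}   # physical reality: separate boxes
            self.namespace = {}                   # logical view: volume name -> owning node

        def provision(self, volume, node, data):
            self.nodes[node][volume] = data
            self.namespace[volume] = node

        def read(self, volume):
            # Clients address the pool by volume name, never a specific box.
            return self.nodes[self.namespace[volume]][volume]

        def move(self, volume, dest):
            # Transparent migration: the data changes boxes, but the
            # client-visible name and access path do not.
            src = self.namespace[volume]
            self.nodes[dest][volume] = self.nodes[src].pop(volume)
            self.namespace[volume] = dest

    pool = Cluster(["node1", "node2", "node3"])
    pool.provision("vol_sales", "node1", b"...")
    before = pool.read("vol_sales")
    pool.move("vol_sales", "node3")               # e.g., rebalancing or draining node1
    assert pool.read("vol_sales") == before       # same name, same data, different box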

So coming back to your question about pods; this is a little bit different. NetApp has been collaborating with Cisco over the last couple of years on something we call FlexPod. FlexPod is a very tight integration between NetApp storage, Cisco server, Cisco networking and VMware. Basically we create a pre-integrated server/networking/storage stack.

The rationale is that IT spending is constrained and the customer's ability to invest in engineering is going down, so their ability to evaluate, certify, test and integrate products is limited. If like-minded players can come together, like a Cisco and a NetApp and a VMware, we can integrate our solutions and then say to a client, "We have something that is every bit as integrated as anything you could buy from an HP or an IBM, but it's made of best-of-breed components and, since it is pre-integrated, we lower your risk."

Now that doesn't mean that the pods are isolated around individual apps. The customer could buy 50 of these pods and then integrate them with the clustering software and still have a big pool of storage, even though they're buying it as pre-integrated pods.

NW:     Switching gears a bit, what do you make of the promise of big data?

TG:      People are intrigued with the possibility of gleaning more intelligence from all this data. But the transition into production is complicated. For NetApp, we have our own big data project that has to do with how we monitor all of our equipment and analyze the data that gets reported back to us. And that's been very successful.

NetApp is not an analytics company, so we don't compete with Oracle, Teradata or SAP. All three are important partners for us. The NetApp advantage is that we've got multi-protocol storage and we can build a big pool of storage, whether it's file-oriented, database-oriented or block-oriented, and it's easy to extract that data and make it accessible to your analytics tools.

NW:     Are customers that are pursuing some kind of big data application relying on existing storage infrastructure or building anew to support that?

TG:      It varies. If their data is hard to get at -- in other words, it's in all different forms -- then they will tend to extract it from its current locations and create another data set to run the analytics against. But with NetApp you don't need to do that. You've got one set of access methods to get at the data and we have one set of tools to manage it.

We see a fair number of proof-of-concepts being done with some of these newer technologies like Hadoop, and we also see proof-of-concepts being done in the cloud to avoid big upfront investments. And a lot of these activities are driven by lines of business. Marketing wants to understand customer behavior, or operations wants to understand product yields, and some of this activity is being done outside of the CIO function.

But as these things mature and they want to go to production, and if they're going to interact with the systems of record or in fact become systems of record, some tension arises between the lines of business and the core IT function of the company. Because if this is going to scale, if it's going to be reliable, if it's going to be secure, then perhaps some of this proof-of-concept work might not be on the infrastructure you want to use.

NW:     In closing, anything that surprises you in terms of how the industry is evolving?

TG:      Customers have a lot on their minds and a lot to consider ... questions about cloud, economic concerns, security concerns, new technologies like flash, software-defined everything, etc. And that is clearly slowing down decision making. But you can't just stand pat and let your competitors trample all over you, and you can't be reckless either. There are enormous opportunities for CIOs to navigate the landscape of all these new technologies, understand and manage the risks, and ultimately deliver competitive advantage to their firms.

It's like 15 years ago, when we were living with the question, "Does IT matter?" I think IT matters a lot, and I think there's a much greater range of potential outcomes, both positive and negative, for organizations based on how IT performs and how they make choices around these available technologies. So I don't know if it's a surprise; it's just that there are a lot of things coming at us.