Wednesday, June 29, 2011


As we mark the fifth anniversary of our annual study of the digital universe, it behooves us to take stock of what we have learned about it over the years. We always knew it was big: in 2010 it cracked the zettabyte barrier. In 2011, the amount of information created and replicated will surpass 1.8 zettabytes (1.8 trillion gigabytes), having grown by a factor of 9 in just five years.
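
As a rough check on these figures, the sketch below converts 1.8 zettabytes to gigabytes and works out the yearly growth rate implied by a ninefold increase over five years. It assumes decimal (SI) units and smooth compound growth, neither of which the study states explicitly, and the function names are purely illustrative.

```python
import math

ZETTABYTE = 10**21  # bytes; decimal (SI) units are an assumption
GIGABYTE = 10**9    # bytes


def zettabytes_to_gigabytes(zb: float) -> float:
    """Convert zettabytes to gigabytes using decimal units."""
    return zb * ZETTABYTE / GIGABYTE


def annual_growth_factor(total_factor: float, years: float) -> float:
    """Constant yearly multiplier that compounds to total_factor over `years`."""
    return total_factor ** (1.0 / years)


def doubling_time_years(annual_factor: float) -> float:
    """Years needed to double at a constant yearly multiplier."""
    return math.log(2.0) / math.log(annual_factor)


yearly = annual_growth_factor(9, 5)
print(f"1.8 ZB = {zettabytes_to_gigabytes(1.8):.2e} GB")
print(f"Implied growth per year: x{yearly:.2f}")
print(f"Implied doubling time: {doubling_time_years(yearly):.2f} years")
```

Under these assumptions the digital universe grows by roughly 55% a year and doubles about every 1.6 years.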

But, as digital universe cosmologists, we have also uncovered a number of other things — some predictable, some astounding, and some just plain disturbing.

While 75% of the information in the digital universe is generated by individuals, enterprises have some liability for 80% of information in the digital universe at some point in its digital life.  

The number of "files," or containers that encapsulate the information in the digital universe, is growing even faster than the information itself as more and more embedded systems pump their bits into the digital cosmos. In the next five years, the number of these files will grow by a factor of 8, while the pool of IT staff available to manage them will grow only slightly.

Less than a third of the information in the digital universe can be said to have at least minimal security or protection; only about half the information that should be protected is protected.

The amount of information individuals create themselves (writing documents, taking pictures, downloading music, etc.) is far less than the amount of information being created about them in the digital universe.

The growth of the digital universe continues to outpace the growth of storage capacity. But keep in mind that a gigabyte of stored content can generate a petabyte or more of transient data that we typically don't store (e.g., digital TV signals we watch but don't record, voice calls that are made digital in the network backbone for the duration of a call).  

So, like our physical universe, the digital universe is something to behold: 1.8 trillion gigabytes in 500 quadrillion "files," and more than doubling every two years. That's nearly as many bits of information in the digital universe as stars in our physical universe.

However, unlike our physical universe where matter is neither created nor destroyed, our digital universe is replete with bits of data that exist but for a moment, just enough time for our eyes or ears to ingest the information before the bits evaporate into a nonexistent digital dump.
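
To put the 1.8 trillion gigabytes and 500 quadrillion "files" quoted above side by side, the sketch below computes the average size of a file. It assumes decimal units (1 gigabyte = 10^9 bytes) and treats both totals as exact, which they are not; the names are illustrative.

```python
GIGABYTE = 10**9  # bytes; decimal (SI) units are an assumption

TOTAL_GIGABYTES = 1_800_000_000_000   # 1.8 trillion gigabytes
TOTAL_FILES = 500 * 10**15            # 500 quadrillion files


def average_file_size_bytes(total_gigabytes: int, total_files: int) -> float:
    """Mean bytes per file, given a total volume in gigabytes and a file count."""
    return total_gigabytes * GIGABYTE / total_files


avg = average_file_size_bytes(TOTAL_GIGABYTES, TOTAL_FILES)
print(f"Average file size: {avg:.0f} bytes")
```

An average of only a few kilobytes per file suggests that a great many of these containers are very small, which would fit with the growing share of data coming from embedded systems.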

This is not to diminish the value of the temporary existence of these bits that can serve a variety of purposes during their short lives, such as driving consumption (to increase ad revenue from Web site traffic) or real-time data analytics (to optimize existing operations and create entirely new markets).

What are the forces behind the explosive growth of the digital universe? Certainly technology has helped by driving the cost of creating, capturing, managing, and storing information down to one-sixth of what it was in 2005. But the prime mover is financial. Since 2005, the investment by enterprises in the digital universe has increased 50% — to $4 trillion. That's money spent on hardware, software, services, and staff to create, manage, and store — and derive revenues from — the digital universe.  
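
The two figures in this paragraph can be combined into a back-of-the-envelope check. The sketch below derives the 2005 investment implied by a 50% increase to $4 trillion, and then estimates how much more information the 2011 spending could support if the cost per unit of information fell to one-sixth. It assumes that cost scales linearly with the volume of information, which is a simplification of ours rather than a claim of the study; the function names are illustrative.

```python
def baseline_before_increase(current: float, pct_increase: float) -> float:
    """Value before a percentage increase that led to `current`."""
    return current / (1.0 + pct_increase / 100.0)


def affordable_volume_factor(spend_factor: float, unit_cost_factor: float) -> float:
    """How many times more volume the spending supports (linear cost assumed)."""
    return spend_factor / unit_cost_factor


investment_2005 = baseline_before_increase(4.0, 50.0)      # trillions of dollars
volume_factor = affordable_volume_factor(1.5, 1.0 / 6.0)   # 1.5x spend, 1/6 unit cost

print(f"Implied 2005 investment: ${investment_2005:.2f} trillion")
print(f"Volume supported relative to 2005: x{volume_factor:.1f}")
```

Under these assumptions, 2005 investment comes to about $2.7 trillion, and 1.5 times the spending at one-sixth the unit cost pays for about 9 times the information, which lines up with the ninefold growth reported over the same five years.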

In an information society, information is money. The trick is to generate value by extracting the right information from the digital universe — which, at the microcosmic level familiar to the average CIO, can seem as turbulent and unpredictable as the physical universe.

In fact, thanks to new tools and technologies, and new IT and organizational practices, we may be on the threshold of a major period of exploration of the digital universe. The convergence of technologies now makes it possible not only to transform the way business is conducted and managed but also to alter the way we work and live.


New capture, search, discovery, and analysis tools can help organizations gain insights from their unstructured data, which accounts for more than 90% of the digital universe. These tools can create data about data automatically, much like facial recognition routines that help tag Facebook photos. Data about data, or metadata, is growing twice as fast as the digital universe as a whole.

Business intelligence tools increasingly are dealing with real-time data, whether it's charging auto insurance premiums based on where people drive, routing power through the intelligent grid, or changing marketing messages on the fly based on social networking responses.  

New storage management tools, such as deduplication, auto-tiering, and virtualization, are available to cut the costs of the part of the digital universe we store, and content management solutions can help us decide what exactly to store.

An entire industry has grown up to help us follow the rules (laws, regulations, and customs) pertaining to information in the enterprise. It is now possible to get regulatory compliance systems built into storage management systems.

New security practices and tools can help enterprises identify the information that needs to be secured and at what level of security and then secure the information using specific threat protection devices and software, fraud management systems, or reputation protection services.

Cloud computing solutions — both public and private and a combination of the two known as hybrid — provide enterprises with new levels of economies of scale, agility, and flexibility compared with traditional IT environments. In the long term, this will be a key tool for dealing with the complexity of the digital universe (see Figure 1).

Cloud computing is enabling the consumption of IT as a service. Couple that with the "big data" phenomenon, and organizations increasingly will be motivated to consume IT as an external service rather than through internal infrastructure investments.

Journey to the Cloud  

As the digital universe expands and gets more complex, processing, storing, managing, securing, and disposing of the information in it become more complex as well.  

Consider this: Over the next decade, the number of servers (virtual and physical) worldwide will grow by a factor of 10, the amount of information managed by enterprise datacenters will grow by a factor of 50, and the number of files the datacenter will have to deal with will grow by a factor of 75, at least. Meanwhile, the number of IT professionals in the world will grow by less than a factor of 1.5.
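
One way to read these growth factors is as workload per IT professional. The sketch below divides each workload factor by the staff factor; because the study gives the file figure as "at least" 75 and the staff figure as "less than" 1.5, the results are best read as lower bounds. The dictionary keys and function name are illustrative.

```python
# Ten-year growth factors quoted in the paragraph above.
WORKLOAD_GROWTH = {
    "servers": 10.0,
    "information managed": 50.0,
    "files": 75.0,
}
IT_STAFF_GROWTH = 1.5  # upper bound given in the text


def load_per_professional(workload_factor: float, staff_factor: float) -> float:
    """Growth in workload per IT professional over the same period."""
    return workload_factor / staff_factor


for name, factor in WORKLOAD_GROWTH.items():
    per_person = load_per_professional(factor, IT_STAFF_GROWTH)
    print(f"{name}: at least x{per_person:.1f} per IT professional")
```

On these numbers, each IT professional would be responsible for roughly 7 times as many servers, 33 times as much information, and 50 times as many files as today.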

As a result, the skills, experience, and resources to manage all these bits of data will become scarcer and more specialized, requiring a new, flexible, and scalable IT infrastructure, extending beyond the enterprise. Today we call it cloud computing.

And while cloud computing accounts for less than 2% of IT spending today, IDC estimates that by 2015 nearly 20% of the information will be "touched" by cloud computing service providers — meaning that somewhere in a byte's journey from originator to disposal it will be stored or processed in a cloud. Perhaps as much as 10% will be maintained in a cloud.  

Much of the current movement to cloud architectures is being enabled by pervasive adoption of virtualization. Last year was the first year in which more virtual servers were shipped than physical servers. IDC estimates that today nearly 10% of the information running through servers is doing so on virtualized systems and expects that number to grow to more than 20% in 2015. This percentage increases along with the size of the organization. Some larger environments today operate with 100% virtualized systems.

Of course, cloud services come in various flavors: public, private, and hybrid. For organizations to offer their own cloud services, they have to do more than just run virtual servers. They must also allow for virtualized storage and networking, self-provisioning, and self-service, and provide information security and billing. Few enterprises are there yet, so the impact of private clouds on the digital universe today is small (see Figure 3). But by 2015, when the virtualized infrastructure is more common, the rate of growth will accelerate.

Big Value from Big Data

Big data is a big dynamic that seemed to appear from nowhere. But in reality, big data isn't new. Instead, it is something that is moving into the mainstream and getting big attention, and for good reason. Big data is being enabled by inexpensive storage, a proliferation of sensor and data capture technology, increasing connections to information via the cloud and virtualized storage infrastructures, and innovative software and analysis tools. Big data is not a "thing" but instead a dynamic/activity that crosses many IT borders. IDC defines it this way:

Big data technologies describe a new generation of technologies and architectures, designed to economically extract value from very large volumes of a wide variety of data, by enabling high-velocity capture, discovery, and/or analysis.

Big data is a horizontal cross-section of the digital universe and can include transactional data, warehoused data, metadata, and other data residing in ridiculously large files. Media/entertainment, healthcare, and video surveillance are obvious examples of new segments of big data growth. Social media solutions such as Facebook, Foursquare, and Twitter are the newest data sources. Essentially, they have built systems where consumers (consciously or unconsciously) are providing near continuous streams of data about themselves, and thanks to the "network effect" of successful sites, the total data generated can expand at rapid exponential rates.

It is important to understand that big data is not only about the original content stored or being consumed but also about the information around its consumption. Smartphones are a great illustration of how our mobile devices produce additional data sources that are being captured and that include geographic location, text messages, browsing history, and (thanks to the addition of accelerometers and GPS) even motion or direction (see Figure 4).


Publisher and/or Author and/or Managing Editor: Andres Agostini (@Futuretronium at Twitter)