It's not uncommon to see data described as the currency of the modern digital economy, suggesting that it is far more important and valuable than it was in the past. This shift did not happen overnight; it came through a series of developments on many fronts, such as ubiquitous connectivity and more readily available compute power for number crunching. Taken together, these factors have ushered in the era of data. The concept of big data has emerged as a new way to handle new forms of data, but it's best understood as part of a larger data strategy.
As big data practices have matured, the definition of big data has also evolved. Originally, big data was marked by “the three V's”—volume, variety and velocity. Over time, additional characteristics have been added to the definition in an attempt to describe a complete data management strategy. Variables such as veracity, validity and value should certainly be considered when trying to extract insights from data, but many of these characteristics could easily be applied to any type of data strategy. In terms of defining a threshold for big data, the original three traits still work well.
However, defining a threshold for big data can draw attention away from the fact that these traits apply across the full spectrum of data strategies. Put simply, big data refers to the use of new tools to handle data that existing tools could not handle. Many companies trying to improve their data practices can still rely on traditional tools as they grow. The three V's can describe the threshold where new big data tools are needed, but they can also describe other changes that companies may make in how they work with data.
Regardless of where a company sits on the continuum of data usage, there are some steps that should be taken to ensure a solid foundation for an ongoing data strategy. Big data tools and techniques have limited use if a business does not have solid processes in place.
The first step is understanding all the data within the company. Most businesses report some degree of data silos (and some of the businesses that do not report data silos may simply be unaware of their existence). Modern data techniques typically assume that the full set of data is accessible so that connections can be made between different components. In order to get insights that will drive business growth, there must be full knowledge of how current data is handled and a robust plan for gathering any new data in the future.
As part of understanding the corporate data blueprint, a business must understand how its data is stored. A wide variety of storage options is available, ranging from local datacenters and individual devices to a range of cloud offerings. As with the data itself, storage should ideally be tied together in some way, and different storage options should be used depending on how often the data might be needed.
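One way to picture matching storage to access frequency is a simple tiering rule. The sketch below is illustrative only: the tier names ("hot", "warm", "cold"), the read-count thresholds, and the sample datasets are all assumptions made for the example, not figures from any particular vendor or study.

```python
from dataclasses import dataclass

# Hypothetical thresholds (reads per month); a real policy would be
# tuned to a company's own access patterns and storage costs.
HOT_MIN_READS = 100
WARM_MIN_READS = 5


@dataclass
class Dataset:
    name: str
    reads_per_month: int


def choose_tier(dataset: Dataset) -> str:
    """Pick a storage tier based on how often the data is read."""
    if dataset.reads_per_month >= HOT_MIN_READS:
        return "hot"   # frequently used: fast, more expensive storage
    if dataset.reads_per_month >= WARM_MIN_READS:
        return "warm"  # occasionally used: standard storage
    return "cold"      # rarely used: cheap archival storage


# Made-up datasets for illustration only.
datasets = [
    Dataset("orders", 1200),
    Dataset("quarterly_reports", 12),
    Dataset("2015_archive", 0),
]

placement = {d.name: choose_tier(d) for d in datasets}
print(placement)
```

In practice the decision would also weigh cost, retrieval time, and regulatory requirements, but the basic idea of sorting data by how often it is needed stays the same.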
Storage is just the first of many tools in the growing data toolbox. Many types of databases, analytic software, and visualization packages exist, each offering functionality suited to specific types of data. A company should choose the applications for its data architecture based on the data it currently has, the data it expects to gather, and the goals it has for that data.
Finally, security and privacy are crucial considerations in today's environment. As data has become currency, a number of ethical questions have been raised about how it is used. Legacy security practices will be insufficient for data in a cloud and mobile world, and transparency regarding data collection will be a key factor in maintaining trust with customers and third parties.
Big data has not seen the same adoption pattern as cloud computing, mostly because data practices are not as well established as infrastructure practices at most companies. To succeed at digital transformation and build competitive advantage, businesses must create a comprehensive strategy around data collection, processing and analytics. Such a strategy will enable them to fully utilize the emerging possibilities of big data.