As data management matures, unstructured data evolves from being a storage cost center to being a value creation center.
Enterprise data is on the rise, and that is no surprise. What is staggering is the current rate of growth. In 2010, the amount of data generated, consumed, and stored was 2 zettabytes (ZB), according to Statista, and analyst firms such as IDC have projected massive growth over the next few years: from 64.2 ZB in 2020 to 175 ZB in 2025, nearly a threefold increase in five years. Nearly 80% of all data is unstructured: file and object data, including documents, medical images, video and audio files, design data, research data, and sensor data.
By some estimates, less than 5% of this data is used for any purpose, and IT teams have little insight into their data and its value. So they store it forever because that’s the safest thing to do. The result: massive storage spending and an inability to leverage data for new and valuable use cases. A recent Accenture study found that 68% of companies are unable to realize tangible, measurable value from data.
However, consider the opportunity: analyzing adverse events in real time to inform patient safety measures and new drug development; identifying product defects early in manufacturing; analyzing customer sentiment and post-launch chat to improve go-to-market strategies; or applying machine learning (ML) algorithms to real-time seismic data and satellite imagery to predict natural disasters. According to Forrester, organizations that take a data-driven approach to decision making are growing more than 30% annually.
To take advantage of unstructured data for competitive gain, it is important to develop a management strategy that meets the dual needs of cost efficiency and revenue generation. Here is a five-stage maturity model for organizations looking to modernize their unstructured data management practices.
Unmanaged unstructured data. At this stage, volumes of unstructured data are large and distributed across on-premises and cloud repositories, resulting in minimal visibility and few, if any, insights across the data storage ecosystem. In many cases, all data is treated the same way: most or all of it sits on expensive primary storage and is not managed to save money or to meet the needs of distinct groups and workloads. Meanwhile, there is pressure from above to manage costs by moving away from exorbitant data center spending on hardware and maintenance toward more flexible, on-demand cloud storage. But without proper insight into data assets, requirements, and value, it is difficult for IT and storage professionals to plan and manage effective cloud data migrations. Many will opt for a basic lift-and-shift approach, which may actually increase costs.
- Disconnected storage silos limit visibility into data assets
- Storage, backup, and data recovery costs are high as a percentage of the IT budget
- Tension between storage IT professionals and users/department heads regarding data management decisions
- Cloud storage migrations or tiering do not deliver the expected ROI.
Storage-centric data management. This stage is characterized by a move to better control data storage costs by using the storage vendor’s data management capabilities for unstructured data migration, replication, and scheduling. Storage-centric data management may be effective in environments with only one storage vendor, but most environments include multiple sites, additional vendors, and cloud deployments. Storage administrators must then use a variety of tools to migrate, replicate, and analyze data within these storage silos. This approach brings some cost savings, but it may not reduce complexity, it can limit flexibility, and it still leaves money on the table. If an organization wants to access data after it has been moved to the cloud through storage vendor tools, the IT department must keep maintaining the storage and pay egress fees.
- An unclear strategy to transition to low-cost storage
- Multiple tools used for migration and other data management tasks
- Hidden costs from storage-vendor tiering to the cloud
- Planned migrations to new platforms often fall behind schedule due to complexity.
Independent unstructured data management. With enterprise unstructured data reaching petabytes and beyond, and hybrid cloud IT infrastructure dominating, the need to separate data management from storage management becomes apparent. Storage teams look to adopt a storage-independent approach to data management, sometimes called a data fabric. Teams rely on analytics to search across storage silos and identify savings opportunities. For example, moving “cold” data that hasn’t been accessed in a year or more to cheaper storage (such as in the cloud) frees up space on expensive, high-performance NAS storage.
- Standardization of data management tools
- IT can manage data regardless of storage technology or service
- Ability to cut storage and backup costs by 70% or more by identifying cold data and moving it to secondary storage
- An unstructured data management solution should not affect end-user data access performance.
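The cold-data analysis described in this stage can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's method: the names `FileRecord`, `scan`, `select_cold`, and `reclaimable_bytes` are hypothetical, the one-year threshold comes from the example above, and the sketch assumes the file system records last-access times reliably (volumes mounted with access-time updates disabled will not).

```python
import os
import time
from dataclasses import dataclass
from typing import Iterable, Iterator, List

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical record of one file found during a scan."""
    path: str
    size_bytes: int
    last_access: float  # seconds since the epoch


def scan(root: str) -> Iterator[FileRecord]:
    """Walk a directory tree and yield one record per readable file entry."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            yield FileRecord(full, st.st_size, st.st_atime)


def select_cold(records: Iterable[FileRecord], now: float,
                max_idle_days: int = 365) -> List[FileRecord]:
    """Return the records not accessed within the last max_idle_days."""
    cutoff = now - max_idle_days * SECONDS_PER_DAY
    return [r for r in records if r.last_access <= cutoff]


def reclaimable_bytes(cold: Iterable[FileRecord]) -> int:
    """Total size of the cold files, i.e. primary capacity that tiering would free."""
    return sum(r.size_bytes for r in cold)


if __name__ == "__main__":
    cold = select_cold(scan("."), now=time.time())
    print(f"{len(cold)} cold files, {reclaimable_bytes(cold)} bytes reclaimable")
```

Keeping the selection logic separate from the scan means the same rule could be applied to records gathered from several storage systems, which is the point of managing data independently of storage.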
Policy-driven unstructured data management. Organizations at this stage go beyond cost savings to better support security, compliance, and research requirements. Data policies and open data formats are critical. Organizations automatically and continually move data to the right storage based on business priorities, cost, or monetization opportunities. For example, an electric car manufacturer wants to understand how its cars perform under different weather conditions, so it creates a data management policy that pulls tracking files from the cars at regular intervals into data lakes for analysis. Once the study is over, the policy expires, and the transferred data is deleted or moved to deep archive storage.
- Storage teams have shifted from storage-focused operations to managing data appropriately throughout its lifecycle, with self-service capabilities for users.
- Increased automation for moving data to the right storage at the right time, and expanding use cases for unstructured data management.
- Data management policies run automatically until they are changed or deleted, eliminating error-prone manual policy management.
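A policy like the one in the electric car example can be modeled as a small data structure plus a rule that decides what to do at any given moment. The sketch below is illustrative only: `DataPolicy`, `next_action`, the field names, and the `fleet://` and `lake://` locations are all hypothetical, and a real product would attach actual data movement to each action.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class DataPolicy:
    """Hypothetical policy: collect data on a schedule until the study ends."""
    name: str
    source: str           # where the files come from (placeholder URI)
    destination: str      # data lake location (placeholder URI)
    interval: timedelta   # how often to pull new files
    expires_at: datetime  # when the study is over
    on_expiry: str        # "delete" or "archive"

    def __post_init__(self) -> None:
        if self.on_expiry not in ("delete", "archive"):
            raise ValueError("on_expiry must be 'delete' or 'archive'")


def next_action(policy: DataPolicy, now: datetime,
                last_run: Optional[datetime]) -> str:
    """Decide what the policy engine should do at time `now`."""
    if now >= policy.expires_at:
        return policy.on_expiry  # study over: clean up the transferred data
    if last_run is None or now - last_run >= policy.interval:
        return "collect"         # a pull is due
    return "wait"                # nothing to do until the next interval


weather_study = DataPolicy(
    name="weather-study",
    source="fleet://telemetry",
    destination="lake://weather-study",
    interval=timedelta(hours=6),
    expires_at=datetime(2025, 1, 1),
    on_expiry="archive",
)
```

Because the expiry and the cleanup action are part of the policy itself, nobody has to remember to turn the collection off or tidy up afterward.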
The value of unstructured data management. Some datasets hold value that outlasts the original application that created them. With advances in scalable and affordable services like cloud-based data lakes and machine learning, business leaders are eager to see what their pools of stored data can offer in terms of new insights that benefit research and development, operations, and customer relationships. At this final level of unstructured data management maturity, the new prize is managing data for long-term value. Capabilities include the ability to search across storage and cloud silos to find the right data sets and then move the data to cloud analytics environments for access by analysts and data scientists. Mature organizations can tag files with additional metadata throughout the lifecycle, enhancing search and query capabilities. Storage teams work closely with business and departmental stakeholders to understand data needs for proper planning and long-term goals.
- Unstructured data management tools allow the seamless movement of data to external data analysis platforms and services.
- Comprehensive workflow automation eliminates manual steps in discovering unstructured data and delivering it to the platforms of your choice.
- Storage administrators, in turn, move up from configuring and managing storage technologies to managing data for business gain.
- Data management becomes a flexible framework that future-proofs data for new applications and business use cases as they evolve.
- IT can measure the increased revenue generated by unstructured data insights.
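The tagging and cross-silo search capabilities described for this stage can be pictured with a toy in-memory index. This is a sketch under stated assumptions: `MetadataIndex`, its methods, and the example paths and tag names are hypothetical, and a production system would persist the index and populate it from real storage scans.

```python
from collections import defaultdict
from typing import Dict, List


class MetadataIndex:
    """Hypothetical index mapping file locations (in any silo) to key/value tags."""

    def __init__(self) -> None:
        self._tags: Dict[str, Dict[str, str]] = defaultdict(dict)

    def tag(self, path: str, **tags: str) -> None:
        """Add or update tags for a file; earlier tags are kept."""
        self._tags[path].update(tags)

    def find(self, **criteria: str) -> List[str]:
        """Return the sorted paths whose tags match every given criterion."""
        return sorted(
            path
            for path, tags in self._tags.items()
            if all(tags.get(key) == value for key, value in criteria.items())
        )


index = MetadataIndex()
index.tag("nas1:/scans/a.dcm", modality="MRI", project="trial-7")
index.tag("s3://archive/b.dcm", modality="MRI", project="trial-9")
index.tag("nas1:/scans/a.dcm", reviewed="yes")  # enrich later in the lifecycle

# One query spans both the NAS and the cloud archive.
mri_files = index.find(modality="MRI")
```

The list returned by `find` could then serve as the manifest of files to copy into an analytics environment.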
No matter where your organization is on the maturity curve, it’s time to stop endlessly buying more storage without gaining insight into your data, and to stop treating all data the same way. Instead, start analyzing and understanding the data so you can manage it appropriately and according to policy, take full advantage of cloud storage, and avoid waste. Start spending time on strategies to derive more value from data, including working with data teams to build a new analytics infrastructure.