Every company today is a data company, as data and analytics (including AI) are redefining business models, enabling new revenue streams, reducing costs, and mitigating business risks. A McKinsey report says data-driven organizations see EBITDA (Earnings Before Interest, Taxes, Depreciation, and Amortization) increases of up to 25 percent [1]. A 2022 study by Boston Consulting Group found that the first nine of the 10 most innovative companies in the world are data firms [2]. Overall, data is now considered a key enabler of improved business performance, and in the process, enterprises generate and manage vast amounts of data daily.
The data lifecycle (DLC) describes the stages through which data is managed for improved operations, compliance, and analytics: creation, storage, processing, consumption, and eventually purging [3]. One of the key phases in the DLC is data storage, where created data is held for immediate or future processing. This stage also covers archival, including selecting appropriate storage technologies and systems so that data is stored in the right place at the right time.
Enterprise data can be stored on-premises or in cloud systems. On-premises storage refers to maintaining data storage infrastructure within the organization, often described as “inside the firewall”. It provides complete control and security over sensitive or regulated data and offers low-latency, high-speed access, making it ideal for performance-critical applications. However, storing data on on-premises servers requires substantial initial investment in hardware and software, plus ongoing maintenance. Scalability is a challenge, as expanding data storage often necessitates significant capital investment. Cloud storage, on the other hand, involves using remote data centers managed by third parties. It offers flexibility and scalability, allowing storage capacity to adjust quickly to demand with pay-as-you-go pricing. Cloud systems enhance collaboration and mobility through internet-based accessibility and reduce management overhead by outsourcing infrastructure management to third-party providers. Despite these advantages, cloud storage poses potential challenges, including data sovereignty concerns, latency issues, and ongoing operational costs. Hence, many enterprises adopt a hybrid storage approach, combining on-premises and cloud storage to balance performance, cost, and compliance.
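The cost trade-off described above can be made concrete with a little arithmetic. The sketch below compares cumulative on-premises cost (a one-time capital outlay plus monthly upkeep) with cumulative pay-as-you-go cloud cost, and reports the first month in which the on-premises option becomes cheaper. All figures and function names are hypothetical placeholders chosen for illustration, not vendor prices.

```python
from typing import Optional


def cumulative_on_prem(months: int, capex: float, monthly_opex: float) -> float:
    """Up-front hardware/software investment plus recurring maintenance."""
    return capex + monthly_opex * months


def cumulative_cloud(months: int, tb_stored: float, price_per_tb_month: float) -> float:
    """Pay-as-you-go: cost scales with capacity used and time."""
    return tb_stored * price_per_tb_month * months


def break_even_month(
    capex: float,
    monthly_opex: float,
    tb_stored: float,
    price_per_tb_month: float,
    horizon: int = 120,
) -> Optional[int]:
    """Return the first month where on-premises is cheaper, or None within the horizon."""
    for month in range(1, horizon + 1):
        on_prem = cumulative_on_prem(month, capex, monthly_opex)
        cloud = cumulative_cloud(month, tb_stored, price_per_tb_month)
        if on_prem < cloud:
            return month
    return None


# Hypothetical inputs: 100 TB stored at 20 per TB-month in the cloud,
# versus 60,000 up front plus 500 per month on-premises.
month = break_even_month(
    capex=60_000, monthly_opex=500, tb_stored=100, price_per_tb_month=20
)
print(month)  # 41
```

With these assumed numbers the on-premises option only pays off after more than three years. The model deliberately ignores data growth, hardware refresh cycles, and data transfer charges, any of which can shift the result in either direction, which is one reason a hybrid approach is common.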
Enterprise Data Storage Technologies
Regardless of whether the enterprise data storage system is on-premises or in the cloud, the underlying storage technologies are practically the same. There are many enterprise data storage solutions, and selecting the right one depends on factors like performance, cost, scalability, and the nature of the data being stored. Each technology serves specific use cases, and enterprises often use a combination of them to meet diverse requirements.
- Direct-Attached Storage (DAS) connects directly to a server, providing low latency and high performance for single systems, but it offers limited scalability and data sharing.
- Network-Attached Storage (NAS) offers centralized, network-accessible storage, ideal for data sharing and collaboration. While scalable, NAS may encounter performance bottlenecks in large deployments.
- Storage Area Network (SAN) delivers a high-performance network of storage devices, suitable for enterprise applications requiring high throughput and reliability. However, SAN systems are expensive and complex to implement and manage.
- Solid-State Drives (SSDs) offer faster performance and lower latency but are costlier, while Hard Disk Drives (HDDs) provide economical storage for large-scale needs, albeit with slower access speeds.
- Object Storage stores data as objects with metadata, making it highly scalable and cost-effective for unstructured data like video, audio, and backups.
- Tape Storage is a cost-effective solution for long-term archiving, though its slower retrieval times make it unsuitable for frequently accessed data.
Regardless of the technology used, storing data is expensive and has a significant environmental impact. Storage costs stem from purchasing and maintaining hardware, electricity consumption, cooling systems, and data management software. Inefficient data practices, such as retaining redundant or outdated information, inflate storage requirements and expenses, and the rising demand for high-capacity storage driven by growing data volumes pushes costs up further. On the environmental side, energy consumption in data centers is substantial, with servers and cooling systems requiring continuous power, and growing storage demand amplifies that usage. Frequent hardware upgrades also generate electronic waste, compounding the environmental challenge. Below are some strategies for enterprises to manage data storage effectively.
- First and foremost, companies should store data only if they need it. Data in an enterprise serves three purposes: operations, compliance, and analytics. If the data captured and stored serves none of these purposes, it is unnecessary.
- Secondly, enterprises should implement data tiering, categorizing data by access frequency and importance so that frequently accessed data is stored on high-performance systems and archival data on cost-effective storage.
- Thirdly, enterprises should use cloud storage for non-critical or elastic workloads to benefit from scalability and cost-efficiency, and should reduce storage requirements by compressing data and eliminating duplicates.
- Fourthly, companies should regularly audit data to identify and delete redundant or obsolete information, known as dark data [4]. They should also educate employees on data retention policies and the importance of efficient storage usage.
- Last but not least, enterprises should mitigate the carbon footprint of data storage. Organizations can run data centers on renewable energy. Optimized data retention technologies can eliminate redundant or obsolete data, freeing up storage and minimizing waste. Organizations can also use energy-efficient hardware, such as modern servers and storage devices, to further reduce power requirements. Virtualization is another effective approach, consolidating workloads and reducing the need for physical hardware. Additionally, edge computing minimizes energy-intensive data transfers by processing and storing data closer to its source. These strategies not only lower environmental impact but also offer operational efficiencies.
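Two of the strategies above, tiering data by access frequency and eliminating duplicates, lend themselves to a short illustration. The following is a minimal sketch, not a production tool: the 30-day and 365-day thresholds, the tier names, the mapping of tiers to storage media, and the function names are all assumptions made for the example. Duplicates are detected by comparing SHA-256 digests of the content, which is one common technique among several.

```python
import hashlib
from datetime import datetime, timedelta

# Hypothetical thresholds: tune these to the organization's own access patterns.
HOT_DAYS = 30
WARM_DAYS = 365


def assign_tier(last_accessed: datetime, now: datetime) -> str:
    """Classify a data item by how recently it was accessed."""
    age = now - last_accessed
    if age <= timedelta(days=HOT_DAYS):
        return "hot"      # high-performance storage, e.g. SSD
    if age <= timedelta(days=WARM_DAYS):
        return "warm"     # economical storage, e.g. HDD or object storage
    return "archive"      # long-term archive, e.g. tape


def find_duplicates(items: dict[str, bytes]) -> dict[str, list[str]]:
    """Group item names by SHA-256 digest and keep groups with more than one name."""
    by_digest: dict[str, list[str]] = {}
    for name, content in items.items():
        digest = hashlib.sha256(content).hexdigest()
        by_digest.setdefault(digest, []).append(name)
    return {d: names for d, names in by_digest.items() if len(names) > 1}


now = datetime(2025, 1, 1)
print(assign_tier(datetime(2024, 12, 20), now))  # hot
print(assign_tier(datetime(2024, 6, 1), now))    # warm
print(assign_tier(datetime(2020, 1, 1), now))    # archive

# Small in-memory stand-ins for stored files.
files = {
    "a.csv": b"id,value\n1,10\n",
    "a_copy.csv": b"id,value\n1,10\n",
    "b.csv": b"id,value\n2,20\n",
}
print(list(find_duplicates(files).values()))  # [['a.csv', 'a_copy.csv']]
```

In practice the tiering decision would usually also weigh the importance of the data, not only its age, and duplicate candidates found this way would be reviewed against retention policies before anything is deleted.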
Today, effective enterprise data storage strategies are essential for managing the growing volume of data while balancing cost, performance, and sustainability. By understanding the data lifecycle, leveraging appropriate enterprise data storage technologies, and adopting sustainable practices, enterprises can turn data into a critical business asset.
References
- [1] Bokman, Alec; Fiedler, Lars; Perrey, Jesko; Pickersgill, Andrew, “Five facts: How customer analytics boosts corporate performance”, https://mck.co/2Ju0xYo, Jul 2014.
- [2] https://www.bcg.com/publications/most-innovative-companies-the-collection
- [3] Southekal, Prashanth, “Data Quality: Empowering Businesses with Analytics and AI”, John Wiley, Feb 2023.
- [4] https://www.forbes.com/councils/forbestechcouncil/2021/04/06/can-data-be-a-liability-for-the-business/
Read Prashanth Southekal’s other article:
Is BAD DATA better than NO DATA?
I’m a consultant, author, and educator with over 80 client collaborations, including P&G, GE, Shell, Apple, and SAP. I’ve authored three books—Data for Business Performance, Analytics Best Practices, and Data Quality. My second book, Analytics Best Practices, was recognized by BookAuthority in May 2022 as the #1 analytics book of all time. Alongside consulting and advisory work, I’ve also had the privilege of training over 4,500 professionals worldwide in data and analytics.