ESG Solution Showcase
EMC Isilon: Data Lake 2.0
Date: November 2015
Author: Scott Sinclair, Analyst

Abstract: With the rise of new workloads such as big data analytics and the Internet of Things, data scales not only in the data center, but also at enterprise edge locations and in the cloud. With the release of IsilonSD Edge and Isilon CloudPools, EMC is extending data awareness and understanding outside of the data center to the next-generation data lake.

Introduction

When discussing the challenges of IT storage environments, it is easy to oversimplify the underlying culprit by focusing solely on the rapid rate of data growth. With the amount of data created and the length of time organizations wish to store data both increasing, the challenge of data growth is a very real phenomenon that can extend well beyond the simple cost of storing and managing additional capacity. Higher levels of data growth can impact backup and protection schemes, and create power and cooling challenges.

While these challenges have created and will likely continue to create concerns for IT storage leaders, many IT organizations are also grappling with an added layer of data storage complexity resulting from the advent of new-generation workloads such as business intelligence (or big data) analytics and the Internet of Things (IoT). Digital repositories for business intelligence analytics are often referred to as data lakes. While these architectures may provide the scale to store the added influx of content, a greater level of flexibility and manageability may be required to make a data lake architecture truly effective.

In many cases, these newer workloads extend the acts of data creation and access well beyond the somewhat predictable confines of the centralized data center. As businesses integrate IoT workloads, sensor data may be created at the edge (i.e., a remote site or system) just as often as it is created within the data center.
Additionally, as more departments in the business look to leverage business intelligence analytics, broad access to digital content will likely be desired from a wider range of locations. As the viability of traditional storage silos appears to be coming to an end, global organizations will likely require the next generation of the data lake architecture.

EMC, a market leader in storage, understands the evolving infrastructure demands of IT organizations and has augmented its Isilon storage technology to enable the next generation of the data lake. With the release of IsilonSD Edge and Isilon CloudPools, the capabilities of Isilon’s OneFS file system are extended well beyond the data center. IsilonSD Edge delivers Isilon’s OneFS in a software-only deployment model, providing a software-defined storage solution that can leverage new or existing commodity hardware and help simplify storage manageability at the edge. CloudPools extends the capabilities of OneFS to public cloud deployments as well. The resulting solution allows a content repository to take advantage of the benefits of a public cloud infrastructure while offering the seamless accessibility of on-premises data. With these two additions, Isilon delivers a next-generation data lake offering with an expanded level of flexibility to serve a new generation of workloads.

This ESG Solution Showcase was commissioned by EMC and is distributed under license from ESG.
© 2015 by The Enterprise Strategy Group, Inc. All Rights Reserved.
Solution Showcase: EMC Isilon: The Next-generation Data Lake

The Need for a New Storage Architecture

In recent years, IT organizations have often looked toward scale-out storage architectures as a means to keep pace with the challenges of data growth. With the advent of big data analytics, these scale-out architectures added new capabilities such as broader protocol support to serve a wider variety of applications.
The goal was to deliver what some in the industry refer to as a data lake: a single, scalable storage repository of digital content that can be leveraged for business intelligence and big data analytics.

Recently, however, innovations are driving organizations to seek a more flexible and capable storage infrastructure layer. For example, the rise of IoT workloads and the collection of sensor data have expanded the breadth of locations where data may be created. The emergence of public cloud storage has enticed IT organizations to migrate data off-premises to free up on-premises resources. The net result is an increased desire to extend the data lake concept to a more flexible and more capable storage solution.

In an attempt to quantify some of these trends, ESG recently surveyed IT decision makers responsible for their organizations’ data storage environments, which revealed a number of insights including:

• The rapid rate of data growth continues to be a top storage challenge.
• The application/workload most widely identified as driving this data growth, and the resulting storage capacity spending, over the next 24 months was business intelligence and analytics.
• There is an early awareness of and focus on IoT and its potential impact on data storage infrastructure and strategy.[1]

In other words, the demand for data lake storage infrastructure will likely continue to increase. However, as mentioned previously, the storage silo architecture, regardless of its scalability, will likely be sub-optimal for IT organizations as they seek to extend their business intelligence capabilities. These organizations will likely require a next-generation data lake.

To address growing data storage demands, next-generation data lake architectures must continue to be resilient and highly available, and for global scale, geo-dispersed protection capabilities are ideal. Planned or unplanned downtime of the data lake can have a critical impact on the business.
Additionally, the next-generation data lake cannot simply be isolated to the data center; it should integrate data from the edge and leverage public cloud resources as well.

Software-defined Storage: IsilonSD Edge Software

EMC’s Isilon storage is a market leader in scale-out file storage and offers a robust level of capabilities designed for growing unstructured data environments. In addition to providing a scale-out storage architecture, Isilon offers support for a variety of storage protocols including NFS, SMB/CIFS, HDFS, and OpenStack Swift, along with automated data migration across tiers and a solid complement of data protection capabilities including snapshots and replication.

With the advent of IsilonSD Edge, EMC is able to offer Isilon OneFS storage technology as a software-only option. With this software-defined storage component, EMC is able to extend the benefits of OneFS to remote office environments with a simple and flexible software deployment model. IsilonSD Edge can be deployed on new or even existing commodity hardware to simplify deployment and help reduce the cost of storage equipment, power, and cooling. IsilonSD Edge continues, however, to offer the same levels of capability as Isilon OneFS, and it leverages the same management tools, interfaces, and VMware integration, which increases management simplicity.

[1] Source: ESG Research Report, 2015 Data Storage Market Trends, October 2015.

ESG storage research conducted earlier in 2015 found that software-defined storage (SDS) has attracted a strong level of interest from the IT industry. When IT decision makers were asked to identify their organization’s
perception of software-defined storage, 94% reported that their organizations are either committed to SDS as a long-term strategy (68%) or at least conceptually interested in SDS (26%).[2] For additional detail on the rationale driving this high level of interest, the data in Figure 1 offers perspective on the factors responsible for the consideration of software-defined storage.[3] Although many respondents identified cost savings, through reduced operational or capital expenditures, as drivers, the most often-cited response across all factors was simplified storage management. This data lends credence to the potential impact that the flexibility enabled by SDS environments can have on simplifying storage management. It is this simplicity that contributes to a large portion of the benefit behind IsilonSD Edge and Isilon CloudPools.

Figure 1. Factors Responsible for Organization’s Consideration of Software-defined Storage

To the best of your knowledge, which of the following factors are responsible for your organization’s consideration of software-defined storage?
(Percent of respondents, N=307)

Column 1: Most important factor driving consideration of software-defined storage
Column 2: All factors driving consideration of software-defined storage

Factor                                                                          Most important   All factors
Simplified storage management                                                   17%              55%
Reduction in operational expenditures                                           15%              50%
Reduction in capital expenditures                                               13%              50%
Total cost of ownership (TCO)                                                   17%              50%
Greater agility to better align with evolving and fluid needs of the business   13%              47%
Support server virtualization workload consolidation                            14%              44%
Support virtual desktop infrastructure (VDI) deployment                         10%              43%
Don't know                                                                      1%               1%

Source: Enterprise Strategy Group, 2015

IsilonSD Edge and Isilon CloudPools: Delivering the Next-generation Data Lake

As mentioned previously, the data lake concept potentially represents only the first step in delivering an architecture to serve not only the rapid rate of data growth, but also the new types of workloads being deployed by IT organizations.

The Next-generation Data Lake

The promise of big data or business intelligence can be quite alluring: Take the data you are storing already and run some additional analysis to glean business insights, then use those insights to help your business run more efficiently and effectively. The effectiveness of these analytics applications, however, can be limited by the storage infrastructure. If the underlying storage foundation does not scale enough or does not offer the right performance, the completeness of the results could suffer. As a result, the concept of a data lake emerged, offering a storage foundation designed to present the benefits of storage consolidation, which provides simplified management and reduced infrastructure costs, but those benefits are often limited to the data center.

[2] Source: ESG Brief, Software-defined Storage Trends, September 2015.
[3] Source: ibid.
To deliver a next-generation data lake, EMC has introduced a new OneFS operating system for Isilon that provides increased reliability and availability at the core of the data center. It includes support for non-disruptive operations, non-disruptive upgrades, and rollback of upgrades. As data lakes grow in size and become critical repositories of massive-scale business data, resiliency is key for the data lake. OneFS now also includes support for Microsoft’s SMB3 Continuous Availability protocol, which enables newer Windows clients to seamlessly fail over in case of any outage.

EMC has also introduced IsilonSD Edge and Isilon CloudPools to extend the Isilon ecosystem well beyond the data center, delivering a next-generation data lake architecture that can extend the aggregation and accessibility benefits to both the edge (e.g., remote offices or sites) and the cloud. The net result significantly increases deployment and infrastructure flexibility, helping the IT organization to design the optimal storage ecosystem for its specific workload needs.

Addressing the Challenges of the Edge

The management of data at remote sites can create a challenge for IT administrators. Lack of direct accessibility to storage hardware can often add an extra layer of management complexity, slowing both planned and unplanned maintenance tasks. ESG recently conducted a research study into the challenges associated with managing remote office environments. When IT decision makers were asked to identify their top IT priorities with respect to supporting remote office/branch office (ROBO) locations, four of the top five most-cited priorities involved the protection, storage, and accessibility of data: improving information security measures (45%), managing data growth (37%), improving backup and recovery processes (37%), and improving employees’ abilities to share files/collaborate with other employees (36%).[4]

When considered in aggregate, these priorities can represent a myriad of specific IT ecosystem concerns. For example, multiple challenges such as the need for greater efficiency, management simplicity, reduced power and cooling, and superior data protection and security can fall under managing data growth. These priorities further support the rising interest in and demand for next-generation data lake environments that can consolidate data from the edge into a central data lake ecosystem, simplifying the management of data at the edge.

IsilonSD Edge Benefit Overview

• Extends enterprise data lake from data center to enterprise edge locations.
• Simple, software-only deployment model.
• Ability to leverage (new or existing and unused) commodity hardware storage infrastructure, and reduce power and cooling.
• Improved data protection at the edge with the capabilities of Isilon.
• Support for a number of emerging use cases including IoT and analysis at the edge, such as health care, video surveillance, and content collaboration.

The Promise of Cloud Infrastructure

While managing data at the edge with traditional storage can create challenges, the emergence of public cloud storage tiers introduces opportunities. Often, IT organizations look to off-premises cloud storage as a potential low-cost bastion for unused, “cold” or “frozen” data. In ESG’s aforementioned storage research, more than one-third (37%) of IT decision makers identified leveraging public cloud-based storage as an initiative expected to impact storage spending over the next 12 to 18 months. This data is understandable given the cost savings often associated with leveraging public cloud storage tiers. These savings stem from benefits such as reduced infrastructure, simpler manageability, and reduced power and cooling, to name a few.

[4] Source: ESG Research Report, Remote Office/Branch Office Technology Trends, May 2015.

For some years, Isilon has offered policy-based, automated storage tiering on an Isilon cluster with SmartPools software to provide the most appropriate storage resources for specific data sets. EMC’s Isilon CloudPools leverages the SmartPools policy engine to extend storage tiering to cloud storage resources as part of a larger Isilon storage ecosystem. With CloudPools, Isilon provides automated and policy-based data migration to the cloud as a new storage tier for less active data sets. To secure this data, all data that is moved to the cloud with CloudPools is sharded (divided up and separated) and then encrypted.

In addition to the ability to automatically migrate data to the cloud, Isilon’s CloudPools also provides the ability for the data to remain accessible as a part of the enterprise’s Isilon data lake. This capability lets organizations more effectively leverage public cloud resources by allowing local on-premises workloads to retain access to data even when it has been migrated off-premises. The net result can allow for more efficient utilization of both on- and off-premises resources, reducing the cost and complexity of data management while being transparent to users and applications.

Isilon CloudPools Benefit Overview

• Integrates data center with cloud storage resources.
• Simple solution management.
• Seamless visibility to content on the cloud.
• Access to low-cost public and private cloud resources for cold, unused data.
• Data stored on cloud resources remains accessible for analysis performed in the data center on the entire data lake.

The Bigger Truth

Ultimately, an organization’s data and its technology should enable the business to do more, be more competitive, and be more successful.
Achieving these goals requires a storage architecture similar to the one Isilon is delivering with IsilonSD Edge and CloudPools, where data can reside at the right location for the business (whether that is in the data center, at the edge, or in the cloud) while providing the manageability and simplicity of a single pool. This design has become increasingly important as organizations continue to increase their usage of analytics and extend the collection and analysis of digital content to a wider variety of locations.

The new data lake extends beyond the data center to the edge and to the cloud, which simplifies management and reduces storage costs and complexity. When looking to deploy a foundation for the next generation of digital workloads, organizations should ensure that the storage foundation can provide the resiliency and flexibility to extend to all the locations where data may be created, analyzed, and retained.

EMC understands that organizations require a storage solution that can evolve to meet the specific needs of their environments, and that those requirements will continue to evolve with the organization’s demands. As such, EMC Isilon is delivering the next-generation data lake that supports traditional and next-generation workloads.

All trademark names are property of their respective companies. Information contained in this publication has been obtained by sources The Enterprise Strategy Group (ESG) considers to be reliable but is not warranted by ESG. This publication may contain opinions of ESG, which are subject to change from time to time. This publication is copyrighted by The Enterprise Strategy Group, Inc. Any reproduction or redistribution of this publication, in whole or in part, whether in hard-copy format, electronically, or otherwise to persons not authorized to receive it, without the express consent of The Enterprise Strategy Group, Inc., is in violation of U.S.
copyright law and will be subject to an action for civil damages and, if applicable, criminal prosecution. Should you have any questions, please contact ESG Client Relations at 508.482.0188.