
Multi-tier storage saves energy, space
Power, cooling, and space limitations are key issues faced by IT managers. Demand-side contributors include decreased hardware acquisition costs, sharp increases in server density, and an exponential rise in the volume of data being stored and managed.
Many companies are growing their data centers at an exponential rate, and real estate and floor space are at a premium. Many IT organizations must either consolidate resources through virtualization and optimization of equipment and processes, or face up to a 200% increase in infrastructure cost to build or expand into a new facility.
On the supply side, electric utilities already struggle to keep up with demand. Brown-out restrictions can be commonplace during peak consumption periods, and many IT directors are discovering they cannot increase the amount of power they source from the local power grids. IT analysts believe that by the end of the decade, the world’s data centers will have run out of power.
The EPA has predicted that data-center power usage will double over the next five years. In fact, a paper by Stanford University professor Dr. Jonathan G. Koomey stated that 1.2% of all power purchased in the US was consumed in the operation and cooling of data-center equipment. With that figure rising, the need to focus on “green” computing is clear.
Multiple tiers of storage
One outcome of increasing resource pressure is a trend toward multi-tier storage architectures that use both tape and disk technology. Primary storage volumes are decreasing as companies begin to segregate their data according to business requirements for data access and retention versus operational constraints on power, space, and cooling.
The logic of this is clear: Tape cartridges in a tape library consume power at lower rates than disk systems, and tape cartridges stored in a vault consume the least of all, as well as providing the lowest aggregate cost per gigabyte. So, for data that must be retained for quarters or years and does not need fast recovery, a tape-based, tertiary storage tier conserves energy and space resources. This class of storage is typically used for data retention beyond three to five years.
The other factor at work in the evolution toward multi-tier architectures is data de-duplication, which dramatically reduces disk requirements through the elimination of redundant data. The SNIA Data Management Forum explains data de-duplication as “the process of examining a data set or I/O stream at the sub-file level and storing and/or sending only unique data. The definition of ‘what is a duplicate’ is predicated upon the method used to evaluate, identify, track, and avoid duplication. The de-duplication process includes updating tracking information, data that is new and unique, and disregarding any data that is a duplicate.”
Disk solutions with data de-duplication strategically combined with tape storage enable IT managers to effectively manage data growth, data protection, and energy usage by creating significant operational efficiencies. Through the use of data de-duplication, savings in space, power, and cooling can be significant.
Optimizing tiers with de-dupe
Data de-duplication reduces the amount of disk required to protect a given amount of primary data by detecting and eliminating redundant blocks within files and, in some cases, between different files and file types. Data de-duplication enables users to exploit disk performance while dramatically reducing capital and operating expenses, including power, cooling, and space.
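As a rough illustration of the block-level idea described above, here is a minimal sketch in Python. It assumes fixed-size 4 KB blocks and SHA-256 fingerprints; the class name DedupeStore and its methods are hypothetical, and commercial products typically use variable-length segments and far more elaborate indexing.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size, chosen only for illustration


class DedupeStore:
    """Toy block store that keeps one copy of each unique block (hypothetical)."""

    def __init__(self, block_size=BLOCK_SIZE):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block bytes (the unique data)
        self.files = {}   # file name -> ordered fingerprints (the tracking information)

    def write(self, name, data):
        recipe = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            # Keep the block only if its fingerprint is new; duplicates are not stored again.
            self.blocks.setdefault(fingerprint, block)
            recipe.append(fingerprint)
        self.files[name] = recipe

    def read(self, name):
        # Rebuild the file from its list of fingerprints.
        return b"".join(self.blocks[fp] for fp in self.files[name])

    def logical_bytes(self):
        # Size of all files as the backup application sees them.
        return sum(len(self.blocks[fp]) for recipe in self.files.values() for fp in recipe)

    def stored_bytes(self):
        # Size of the unique blocks actually kept on disk.
        return sum(len(block) for block in self.blocks.values())


if __name__ == "__main__":
    store = DedupeStore()
    store.write("monday.bak", b"A" * 8192 + b"B" * 4096)
    store.write("tuesday.bak", b"A" * 8192 + b"C" * 4096)
    print(store.logical_bytes(), store.stored_bytes())
```

In this toy run the two backups share the repeated "A" block, so the store holds three unique blocks for six logical blocks. Note that redundancy is removed both within a file and between files, which is the behavior the paragraph above describes.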
Some implementations of data de-duplication also offer enhanced disaster-recovery protection by increasing the replication frequency to improve the recovery-point objective (RPO). Using the same technology that identifies duplicate segments within data sets, some data de-duplication systems can also reduce the bandwidth needed to transmit backup sets over a network. Once systems are synchronized, whole backup sets can be replicated while only changed blocks are actually moved. For example, if a new backup is only 5% different from a previous one at a block level, bandwidth needed for transmission can be reduced by up to 95%.
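The 5% / 95% example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: fixed 4 KB blocks, SHA-256 fingerprints, and a hypothetical plan_replication helper. It ignores the small overhead of exchanging the fingerprints themselves, which is one reason the savings are "up to" 95%.

```python
import hashlib


def fingerprints(data, block_size=4096):
    """Return the SHA-256 fingerprint of each fixed-size block in data."""
    return [hashlib.sha256(data[i:i + block_size]).hexdigest()
            for i in range(0, len(data), block_size)]


def plan_replication(new_backup, remote_fingerprints, block_size=4096):
    """Return (blocks_to_send, bytes_sent, bytes_total) for one backup set.

    remote_fingerprints is the set of fingerprints the remote site already holds.
    """
    to_send = {}
    for i in range(0, len(new_backup), block_size):
        block = new_backup[i:i + block_size]
        fp = hashlib.sha256(block).hexdigest()
        # Only blocks the remote site has never seen need to cross the wire.
        if fp not in remote_fingerprints and fp not in to_send:
            to_send[fp] = block
    bytes_sent = sum(len(b) for b in to_send.values())
    return to_send, bytes_sent, len(new_backup)


def make_block(n):
    """Build a distinct 4096-byte block from an integer (test data only)."""
    return n.to_bytes(4, "big") * 1024


if __name__ == "__main__":
    previous = b"".join(make_block(n) for n in range(100))
    # New backup: 95 unchanged blocks plus 5 blocks that changed.
    new = b"".join(make_block(n) for n in range(95)) + \
          b"".join(make_block(n) for n in range(1000, 1005))
    remote = set(fingerprints(previous))
    _, sent, total = plan_replication(new, remote)
    print(f"sent {sent} of {total} bytes, reduction {1 - sent / total:.0%}")
```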
Conventional disk systems without data de-duplication make sense only for a subset of data where recovery-time objectives (RTOs) override other considerations such as performance, long-term retention, and total cost. Data sets with normal access requirements can use disk with data de-duplication technology, both as a first site for backup data and for medium-term retention. Data de-duplication technology reduces space, power, and cooling requirements enough to make disk economical as a retention medium for weeks or months, at which point tape becomes an ideal medium.
Easing the pressure
Finding power-aware solutions that keep data accessible and protected will likely be an IT priority well into the future. But a tiered architecture that combines the efficiencies of tape with de-duplicated disk storage is an approach businesses can employ today to ease operational pressures without compromising service levels. These techniques will provide much-needed breathing room as additional green solutions continue to develop.
Storage vendors are on board, offering the technologies and capabilities required to bring all these elements together. The SNIA Data Deduplication and Space Reduction (DDSR) Special Interest Group (SIG), in collaboration with the SNIA Green Storage Initiative (GSI), is promoting common metrics, features, and functionalities that could help the storage industry reduce the storage footprint through data de-duplication.
Composed of IT professionals, integrators, and storage vendors, the SNIA DDSR SIG focuses on advancing space reduction in all network storage technologies. By defining and promoting efficient network storage solutions and common implementations, the DDSR SIG is enabling sustainable data storage operations that reduce both storage cost and the environmental impact of data-center infrastructures.
You can find out more about the SNIA DDSR SIG and other SNIA forums by visiting www.snia.org.

Pakistan is the Twenty20 Champions for the Year 2009

Re: Multi-tier storage saves energy, space
Ugh, I can't read something this technical......
Hina
Re: Multi-tier storage saves energy, space
You're welcome to all of this....no interest...if I ever need something done I'll just ask you, I'm sure you'll take care of it
Hina
What is "massive array of idle disks" aka MAID?
In storage terminology, a massive array of idle disks, or MAID, is a technology that uses a large group of hard disk drives, hundreds or even thousands, with only the drives that are needed actively spinning at any given time. MAID is a storage system design that reduces both wear on the drives and power consumption. Because only specific disks spin at a given time, what is not in use is literally a massive array of idle disks, which also means the system produces less heat than other large storage systems.
One type of MAID is the Copan array (Copan Systems Inc.). The Copan array treats the drives in the array much like cartridges in a tape library (it can be presented as a VTL), where only what is needed is actually powered. A Copan array can contain hundreds of terabytes of disks that share a power supply, controller, and cabinet.

Re: What is "massive array of idle disks" aka MAID?
In computing, a massive array of idle disks (more commonly known as a MAID) is a system using hundreds to thousands of hard drives for near-line data storage. MAID is designed for Write Once, Read Occasionally (WORO) applications. In a MAID each drive is only spun up on demand as needed to access the data stored on that drive. This is not unlike a very large JBOD but with power management.
Compared to RAID technology a MAID has increased storage density, and decreased cost, electrical power, and cooling requirements. However, these advantages are at the cost of much increased latency, significantly lower throughput, and no or lower redundancy. Most large hard drives are designed for near-continuous spinning; their reliability will suffer if spun up repeatedly to save power. Drives designed for multiple spin-up/down cycles (e.g. laptop drives) are significantly more expensive. Latency may be as high as tens of seconds. MAID can supplement or replace tape libraries in hierarchical storage management.
With the advent of SATA disk drives that are designed to be powered on and off, MAID architecture has evolved into a new storage platform for long-term, online storage of persistent data. Large-scale disk storage systems based on MAID architecture allow dense packaging of drives and are designed to have only 25% of disks spinning at any one time. This allows for high throughput to get data to this platform quickly. Because persistent data is accessed infrequently, any data can be accessed at any time while the system stays within the power budget of 25% of drives spinning.
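To make the 25% power budget concrete, here is a minimal Python sketch of an array that spins drives up on demand and never exceeds the budget. The MaidArray class is hypothetical, and the policy of spinning down the least recently used drive is an assumption made for illustration; the posts above do not say which policy real MAID products use.

```python
from collections import OrderedDict


class MaidArray:
    """Toy MAID model: drives spin up on demand within a fixed power budget."""

    def __init__(self, num_drives, max_spinning_fraction=0.25):
        self.num_drives = num_drives
        # At most this many drives may spin at once (at least one).
        self.max_spinning = max(1, int(num_drives * max_spinning_fraction))
        # Spinning drives, ordered from least to most recently used.
        self.spinning = OrderedDict()
        self.spin_ups = 0

    def access(self, drive):
        """Access a drive; return True if it had to be spun up first."""
        if not 0 <= drive < self.num_drives:
            raise ValueError("no such drive")
        if drive in self.spinning:
            self.spinning.move_to_end(drive)  # mark as most recently used
            return False
        if len(self.spinning) >= self.max_spinning:
            # Budget exhausted: spin down the least recently used drive (assumed policy).
            self.spinning.popitem(last=False)
        self.spinning[drive] = None
        self.spin_ups += 1
        return True


if __name__ == "__main__":
    array = MaidArray(num_drives=8)  # budget: 2 of 8 drives spinning
    for drive in (0, 1, 0, 2):
        delayed = array.access(drive)
        print(drive, "spun up" if delayed else "already spinning", list(array.spinning))
```

Each access that returns True corresponds to the spin-up latency mentioned earlier in the thread, which can run to tens of seconds on real hardware.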

New product recap
November 20, 2008 -- Recent product announcements from LSI, Virtual Instruments, and Quantum...
LSI enhances 7900 HPC storage system
LSI's new Engenio DE6900 high-density SATA drive enclosure and 8Gbps Fibre Channel (8GFC) host connectivity are enhancements to the company's Engenio 7900 HPC storage system. Designed for compute-intensive applications and high-bandwidth workloads, the storage system combines performance with availability and reliability features for applications requiring continuous high-speed data accessibility, according to company claims.
Based on sixth-generation LSI architecture, the Engenio DE6900 SATA drive enclosure enhances the 7900 HPC system's ability to handle the massive data requirements of high-performance computing (HPC) applications. The unit is a 4U, 19-inch rack-mountable drive enclosure capable of housing a total of 60 3.5-inch high-capacity SATA drives for a maximum capacity of 60TB in a single enclosure.
The new enclosure offers a 2.8× density increase compared to the current 16-drive Fibre Channel enclosure, enabling more drives per unit of rack space and scaling capacity to 480 SATA drives within the same footprint. The increased density translates into more than 30% savings on power consumption and greater than 65% savings on floor space to achieve similar performance and capacity.
The Engenio 7900 HPC storage system is presently available through select OEM partners. LSI, www.lsi.com/7900hpc/
Virtual Instruments intros TAP
Finisar spin-off Virtual Instruments recently began shipping an enhanced Traffic Analysis Point (TAP) device that can be integrated as a component in the company's NetWisdom Enterprise SAN monitoring suite or used as a stand-alone device. The analysis appliance enables real-time Fibre Channel network transaction monitoring, analysis and diagnostics with the goal of reducing SAN downtime and optimizing performance.
NetWisdom provides application-to-SAN monitoring and "visibility," and the TAP device allows storage network administrators to view data flowing between their SANs and applications. The TAP device provides a passive diagnostic layer for network maintenance, access points for monitoring equipment and detecting failures, monitoring of disk array ports, and "health check" functionality for network optimization.
Each TAP appliance comes with four ports (with an optional 1U chassis that supports up to 16 ports), with pricing starting at $300 per port.
Virtual Instruments was spun off from Finisar in June. In its first three months of operation, the company claims to have shipped more NetWisdom products than Finisar shipped in the preceding 12 months. Finisar is a Virtual Instruments partner and investor.
Virtual Instruments, www.virtualinstruments.com
Quantum debuts streamlined de-duplication
Quantum now offers a preconfigured DXi appliance model optimized for small and medium-sized enterprise environments, as well as a new configuration program. Together, the DXi7500 Express disk backup system and QuikFit configuration program offer a streamlined path to deploying and managing de-duplication in a wide range of distributed and midrange offerings.
The DXi7500 Express is a turnkey DXi7500 de-duplication offering with all the hardware and software needed for midrange environments, including VTL and NAS interface licenses. Customers can upgrade the system with replication and path-to-tape capabilities for systems designed to be used in multi-site environments and edge-to-core architectures.
The DXi7500 Express is part of Quantum's family of DXi-Series disk-based backup solutions, which also includes the DXi3500 appliance. Quantum's DXi-Series solutions are scalable in the field to meet the needs of enterprise environments while preserving their data with direct tape creation, policy-based deduplication, and enterprise-level capacity and performance.

BUSINESS BRIEFS; December 8, 2008
RAID Inc. has entered into an OEM agreement with NEC Corporation of America. RAID Inc. will license NEC's HYDRAstor HS8 grid storage platform products and the HYDRAstor trademark and sell them under the brand Grid-X Backup. The new product line will be incorporated into RAID Inc.'s suite of storage solutions for the government, high performance computing and education markets.
Intel and Hitachi GST have announced plans to jointly develop NAND-based solid-state disk (SSD) drives, with production shipments expected in 2010.
Oracle has contributed block I/O data integrity code, which was developed in part with Emulex, to the Linux community. The code will be integrated into the Linux 2.6.27 kernel. The data integrity code exposes data protection information to the Linux kernel, enabling storage subsystems to use data integrity features. The overall goal is end-to-end data integrity (e.g., from applications to HBAs to storage devices).
DataCore Software and Promark Technology have announced a new distribution relationship under which Promark will distribute DataCore's storage virtualization software in the US.
Pillar Data Systems announced compatibility of its Application-Aware profiles with the Citrix Application Delivery Infrastructure.
Exanet has selected Chelsio's 10GbE adapters for use in its EX1500 NAS systems. Using the Chelsio adapters, Exanet achieved performance of 119,550 operations per second on its ExaStore systems.
Hifn announced that Gresham Enterprise Storage has chosen its Express DR 1050 capacity optimization cards to integrate into Gresham's Clareti Storage Director, a tape virtualization and backup virtualization solution. Utilizing the Hifn Express DR 1050 card, Gresham can double the capacity of disk and tape devices attached to the Clareti Storage Director.
PC backup vendor Rebit announced a partnership with Hammer, a European distributor dedicated to storage. Hammer is now reselling Rebit's appliances and software to its channel partners in Europe.
Nirvanix, a cloud storage platform provider, announced that its Storage Delivery Network (SDN) has been integrated into Ooyala's Backlot video management platform. Backlot is a scalable video platform with analytics, content syndication controls and monetization features.

Users weigh de-dupe options
November 26, 2008 -- End users are raising the level of conversation around data de-duplication from one of performance and data-reduction ratios to a discussion about data integrity, recoverability and ease of use, according to research from TechValidate.
The research shows that IT professionals are open-minded regarding how they evaluate compression and de-duplication technologies either as new standalone products or as features of their existing storage infrastructure.
TechValidate's research data is compiled through a Software-as-a-Service (SaaS) application that combines elements of mediated social networking, market research, and automation to allow IT professionals to anonymously share hands-on experiences with others. TechValidate uses the SaaS platform to collect and verify IT deployment details from a community of more than 18,000 IT professionals and 30 participating IT vendors.
According to TechValidate, "new entrant" vendors such as Data Domain and Sepaton have made strong market cases for a ground-up, integrated approach to data de-duplication, but vendors such as NetApp and Quantum clearly are making the argument for de-dupe as a feature of existing product lines.
TechValidate's research shows that 46% of the end users surveyed say they prefer to buy de-duplication as a new feature for legacy storage, while 44% say they want it in the form of a new, standalone product. The remaining 10% are undecided.
The research also reveals that most IT professionals (75%) want de-duplication in hardware, whether as a feature of their disk array or in the form of an appliance, rather than in software such as backup applications. Of those IT pros that identified software as their preferred approach for de-duplication, "architectural issues" was identified as a top reason. For IT professionals desiring a hardware or appliance-based approach, "ease of management" was noted as their top rationale.
It is no secret that vendors are scrambling to extend de-duplication across their product lines through in-house development or, in many cases, by collaborating with other vendors.
Several storage suppliers have made news in the de-duplication arena in the past couple of months. Dell raised some eyebrows by officially inking a deal with Quantum and EMC to develop a single de-duplication architecture -- based on Quantum's technology -- that will be used across Dell's PowerVault, EqualLogic, and Dell/EMC product lines.
De-duplication frontrunner Data Domain announced a partnership with file virtualization vendor F5 Networks to co-market a jointly developed de-duplication system that automates the movement of static and archive data from expensive primary storage to a lower cost secondary storage tier.
NetApp also increased its presence in the de-duplication market by making the technology available on its family of virtual tape libraries (VTLs). The addition of de-dupe to the VTLs represents the final piece of the puzzle for NetApp as it now offers de-duplication as a free feature on its backup, archive, and primary storage platforms. NetApp claims more than 16,000 systems deployed with de-duplication in 3,500 customer environments.
It could be that offering de-dupe as a free feature of its operating system has been the key to NetApp's initial success. Regardless of the implementation, the market is now educated and users are beyond the noise of de-duplication ratios and the post-processing versus inline argument. TechValidate CEO Brad O'Neill says the de-duplication market is showing significant signs of maturation as users move beyond the tire-kicking stage.
"The vast majority of end users are concerned with how well de-dupe technologies will integrate with their environments," says O'Neill. "They're also looking at how to deploy it in a scalable way and have concerns about data integrity and recoverability. These top concerns indicate that we are moving into a mature phase of the technology and its adoption."
In fact, more than 50% of all surveyed respondents list preservation and assured access to data as key considerations, followed by integration with the existing infrastructure. Performance and scalability are, surprisingly, lower on the list of concerns. The bottom line, according to O'Neill, is that IT professionals expect to see evidence that de-duplication solutions are reliable and preserve existing environments and workflows.
"The expectation in the customer base now is that every major vendor is addressing de-duplication in some way," says O'Neill. "Users will look for it as a feature of their existing products or as a standalone appliance. It's becoming trench warfare for the vendors."

FalconStor brings de-dupe to NAS-based backups
December 2, 2008 -- FalconStor Software today extended the data de-duplication capabilities of its widely used Virtual Tape Library (VTL) product to a new segment of the data protection market – NAS-based disk-to-disk (D2D) backups.
The company has announced a new file-interface data de-duplication system that opens up CIFS and NFS connectivity for block-level de-duplication and immediate file-level access to the de-duplication repository. The file-interface de-duplication system (which will be given a proper product name once it is released during the first quarter of 2009) presents a network share interface as a backup repository, offering users a space-saving option for writing data efficiently to disk.
With the addition of a network interface to FalconStor's de-duplication product line, customers can choose a network interface or a VTL interface or both, depending on data center requirements.
FalconStor's director of marketing, Fadi Albatal, says the new file-interface de-dupe system extends de-duplication from the SAN to the LAN. "Right now the software supports our VTL, which basically connects to a backup server," says Albatal. "Expanding it to a LAN-based network interface allows us to de-dupe disk-to-disk backups from other vendors' systems."
In addition to the de-dupe enhancements, FalconStor has upped its support for high-performance network connectivity. FalconStor's VTL 5.1 software now supports 10Gbps iSCSI and 8Gbps Fibre Channel connectivity for faster backups. With support for these technologies, FalconStor VTL 5.1 systems can now scale in performance up to 1.5GBps per node with up to eight nodes per logical deployment of a VTL. This extends the FalconStor VTL's backup capability up to 43TB per hour for each VTL deployment.
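The 43TB-per-hour figure follows from the per-node numbers, as this small Python check shows. It assumes decimal units (1 TB = 1,000 GB) and perfectly linear scaling across nodes; the function name is made up for the example.

```python
def aggregate_tb_per_hour(gb_per_sec_per_node, nodes):
    """Aggregate backup rate in TB/hour, assuming linear scaling across nodes."""
    gb_per_hour = gb_per_sec_per_node * nodes * 3600  # 3600 seconds per hour
    return gb_per_hour / 1000  # decimal units: 1 TB = 1000 GB


if __name__ == "__main__":
    # 1.5 GBps per node, eight nodes per logical VTL deployment
    print(aggregate_tb_per_hour(1.5, 8))
```

Eight nodes at 1.5GBps give 12GBps in aggregate, or about 43.2TB per hour, which matches the quoted figure after rounding.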
In addition, the software touts tighter integration with Veritas NetBackup OpenStorage on various operating systems, real-time performance statistics, tape caching with de-duplication, enhanced repository management and replication, and additional library and tape-drive emulation.
FalconStor's file-interface de-duplication system is currently in beta testing with general availability scheduled in the first quarter of 2009. List pricing starts at $13,000 for a standalone software version ready for installation on any hardware appliance, with replication included. As a virtual appliance without replication, pricing starts at $5,000, plus $2,000 for the replication option.

Brocade offers new fabric monitoring service
December 3, 2008 -- Brocade is revamping an acquired network monitoring technology to offer customers a more comprehensive remote monitoring, alerting and reporting service for data center fabrics.
The newly announced Brocade Network Monitoring Service (NMS) provides real-time information about the health and status of fabrics to help boost network efficiency, availability, and uptime. NMS also generates reports with information and rules-based business intelligence for improved planning, application availability and resource utilization.
NMS collects and analyzes a variety of performance, utilization, and throughput data. The service alerts administrators to networking issues, bottlenecks, or potential system outages in an effort to avoid downtime.
Brocade's NMS is based on a service originally offered by Computer Network Technology (CNT), which was acquired by McData in 2005. Brocade subsequently bought McData in 2006 and, according to Mike Schmitt, director of network monitoring solutions for Brocade Global Services, the company is now expanding the service.
"When we bought McData they had an existing monitoring business [via CNT] for extension devices that was entrenched with Fortune 100 customers. However, it was focused just on SAN extension devices," says Schmitt. "Our customers want this type of monitoring services, but they want them on a data center fabric level. They want an end-to-end view."
The current NMS is essentially a Software-as-a-Service (SaaS) offering that uses basic SNMP trap monitoring to monitor fabrics. Schmitt says Brocade is in the process of fleshing out NMS to achieve the ultimate goal of sustaining application uptime.
New features of NMS include expanded monitoring and reporting from extension devices into the data center across all Brocade products and some third-party products, policy-based monitoring, event correlation, "yellow light" alerts, and fault determination for faster problem resolution.
Brocade has released a roadmap for the NMS that shows several upgrades planned for 2009. The service will be expanded next year to include monitoring for third-party devices, and integration with Brocade's Data Center Fabric Manager (DCFM) and SAN Health management tools. The result should be end-to-end data center monitoring and reporting using flow-based management by the end of next year.
"This is about more than monitoring speeds and feeds. Being able to provide a yellow alert in advance of a potential problem is much more important than letting them know that a problem has already occurred," says Schmitt.
NMS is targeted at large enterprises and is packaged in Basic, Premium, and Premium-Plus tiers of service, each with its own set of monitoring and reporting options.
Pricing has not been disclosed, but NMS is sold on an annual, per-device basis. Schmitt says that an example would be $10,000 per year to monitor a Brocade DCX Backbone switch.

CA offers SaaS-based DR service
November 25, 2008 -- Add CA to the list of software vendors turning their data protection products into services. The company has announced the availability of CA Instant Recovery On Demand, a hosted and managed Software-as-a-Service (SaaS) version of its CA XOsoft High Availability technology that protects application servers in small to medium-sized business (SMB) environments.
The new business continuity and disaster recovery service is targeted at SMBs that need continuous availability but face the staff and budget restrictions inherent to today's economic climate, which prevent them from shelling out cash for in-house products.
The Instant Recovery On Demand service is based on CA XOsoft High Availability software, which protects business applications and data by providing real-time replication and automatic failover for Microsoft Exchange, SQL, IIS, SharePoint, Oracle, Blackberry Enterprise Server and File Servers. Heterogeneous replication support is available for Linux and UNIX.
Adam Famularo, senior vice president and general manager for CA's recovery management and data modeling business unit, says Instant Recovery On Demand is being sold exclusively through CA's reseller partners and that CA has built in some incentives to make the service attractive to both end users and partners.
"There is no extensive training involved with this model for our partners or customers. All the reseller has to do is connect the customer to us through a VPN and we manage the rest of the process," says Famularo. "Once the setup is complete, users can begin replicating to our site in about 20 minutes."
Famularo also explains that CA made sure the SaaS service is profitable for all those involved. "We knew we had to do this in a way that would build in hefty margins for our partners compared to the software and services they sell. Our partners are currently making 30 to 40 points on margin with this service without any capital costs," he says. "All the reseller does is manage the monthly billing and then pay us."
The pricing model for CA's Instant Recovery On Demand service is based on the number of servers protected per month and can range from $400 to $800 per server. Length of contract also plays into the pricing.
According to Famularo, an average customer deployment typically involves a three-year commitment with about three protected servers for an average monthly cost of $600 per server.
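For a sense of scale, the total contract value implied by these figures can be worked out as below. This is plain arithmetic on the quoted numbers, assuming a flat monthly rate over the whole term with no discounts or setup fees; the function name is made up for the example.

```python
def contract_cost(servers, monthly_rate_per_server, months):
    """Total cost of a contract at a flat monthly per-server rate."""
    return servers * monthly_rate_per_server * months


if __name__ == "__main__":
    months = 3 * 12  # three-year commitment
    print(contract_cost(3, 600, months))  # average deployment described above
    print(contract_cost(3, 400, months), contract_cost(3, 800, months))  # low and high end of the price range
```

At the average rate, three servers over three years come to $64,800; the $400 to $800 range puts the same deployment between $43,200 and $86,400.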
In addition to Instant Recovery On Demand, CA announced the availability of a pair of SaaS-based services for project and compliance management. The new CA GRC Manager On Demand service helps manage governance, risk and compliance initiatives, and CA Clarity PPM On Demand is a SaaS version of the company's project and portfolio management application that helps organizations govern IT expenses, staffing, investments and projects.

IBM validates cloud services
December 1, 2008 -- Cloud computing can be confusing, and in an effort to make sense of it all IBM has kicked off a new set of consulting services and the "Resilient Cloud Validation" program, designed to help users take advantage of cloud computing and validate the resiliency of any company delivering applications or services to clients in the cloud environment.
In a nutshell, cloud computing is a model under which users see only services and not the underlying implementation or infrastructure. According to IBM's director of business continuity and resiliency services, Brian Reagan, unpredictable performance and some high-profile downtime and recovery events with newer cloud services have created a challenge for customers evaluating the move to cloud-based services.
"It's a buyer beware market out there when it comes to enterprise cloud computing," says Reagan. "Not everything belongs in a cloud and our services help customers sort through it and make decisions about what to put in the cloud."
The IBM Resilient Cloud Validation program, according to Reagan, will verify to businesses that a cloud infrastructure has been designed and implemented, and is being managed, according to the resiliency standards set forth by IBM.
The program will allow businesses that collaborate with IBM through benchmarking, design validation, infrastructure hardening and redundancy, and ongoing monitoring and management to use the IBM "Resilient Cloud Proven" logo when marketing their services.
"There is no one-size-fits-all model for resilience, but there needs to be a framework for service providers to measure and validate their cloud infrastructures against," says Reagan. "Ultimately, cloud service providers are in the business of helping other businesses run. Our program assures a level of resilience and comfort that can be given to their customers."
IBM has also launched business and technology consulting services, including industry-specific Business Consulting Services for Cloud Computing and Technology Consulting, Design and Implementation Services.
Offered by IBM Global Business Services, the Business Consulting Services offering uses an economic model to assess the total cost of ownership for building private clouds, or moving data and applications off-site in a public or hybrid cloud model.
The Technology Consulting, Design and Implementation Services, delivered via IBM Global Technology Services, help customers install, configure and deliver cloud computing inside the data center.
In addition, cloud technology consulting from IBM's Information Technology Strategy and Architecture team can now evaluate how cloud computing resources, processes and investments can support business objectives. IBM consultants can assist customers in creating roadmaps for re-constructing their IT environments with an eye toward cloud computing models to streamline operations.
IBM is also expanding its research in the cloud computing market by working directly with clients to create replicable, cloud-delivered, industry-specific services such as Lender Business Process Services or Healthcare Process Services, as well as horizontal business services such as CRM and supply chain management.

Pakistan is the Twenty20 Champions for the Year 2009

-
-
BC/DR advice for SMBs
December 18, 2008 -- Large enterprises typically have solid disaster-recovery (DR) and business continuity (BC) plans in place (although they generally don't test them frequently enough). However, that is not always the case with small to medium-sized businesses (SMBs), which often don't have adequate budgets for BC/DR.
For this report, we gathered BC/DR advice and tips for SMBs from a range of vendors. One of the most common pieces of advice is simple enough: Test your DR plans.
"SMBs should test and practice their disaster-recovery plans regularly to strengthen their skills, determine more efficient logistics, and work out kinks in the system," says Mike Inkrott, Symantec's senior product manager for Backup Exec. "SMBs should also test the backup itself [i.e., recover data] to ensure that critical data is available. If SMBs neglect to back up their data, their disaster recovery plan is useless. By simply practicing the plan, assessing the critical data that needs to be backed up and testing the backup system, SMBs can be confident that the plan will work should they need to use it during an actual disaster."
"If I had to give one piece of advice to SMBs on BC and DR, it would be to practice what might take place if a disaster were to occur," says Ellen Rome, vice president of sales and marketing at STORServer. "Too often, SMBs wait until the disaster takes place and then find they are not at all prepared with a fully laid-out plan and a practiced approach to data recovery." Rome also advises IT organizations to determine which servers, applications, and storage resources are most critical, and to determine the order in which they need to be recovered.
In a business continuity survey sponsored by Stratus Technologies, only 45% of the respondents with BC plans tested them more than once a year, 35% tested yearly, and 20% never tested their BC plans.
Recommendations on how often companies should test their DR/BC plans vary widely but, generally, vendors and analysts recommend testing at least quarterly.
Low-cost BC/DR
In terms of actual BC/DR implementation, perhaps the best news for SMBs is that they are no longer restricted to the expensive "vendor lock-in" solutions that characterized BC/DR options in the past. Not surprisingly, hardware-independent vendors stress low-cost alternatives for budget-strapped SMBs.
"DR solutions should be open and flexible, easily fitting into an organization's existing IT infrastructure, and should minimize risk, implementation time, and cost," says Fadi Albatal, director of marketing at FalconStor Software. Albatal advocates hardware independence, virtualization (with an emphasis on heterogeneous array support), and resource consolidation. He also notes that IT organizations are no longer required to have the same types of storage systems at the primary and secondary (DR) sites; users can keep expensive, high-performance systems at their primary site, but deploy less expensive arrays at their remote sites.
Although most IT managers view virtualization primarily as a way to lower costs, it can also be used as a basis for a disaster-recovery program. "If you virtualize your systems and storage, your primary and backup data centers can run disparate hardware, with the virtualization layer hiding the differences," says Barry Phillips, group vice president and general manager in Citrix Systems' advanced solutions group. "Through the use of clustered computing, load balancing, replication, and remote access technologies, your downtime can be brought to zero and your data loss minimized."
FalconStor's Albatal stresses the importance of BC/DR technologies that provide full integration of the physical and virtual environments to enable DR process automation.
"The ability for virtual machines to move between servers in the event of a failure greatly simplifies and lowers the cost of application high availability and business continuance," says Chris McCall, director of product marketing at LeftHand Networks. "Combine this with storage systems that present a single volume in multiple sites, so that when a failure occurs and virtual machines migrate over, they remain connected to their volumes. Applications and storage remain online, with no data loss or manual intervention."
Software-as-a-Service (SaaS) is another cost-saving approach that might appeal to SMBs. "Managed and hosted SaaS solutions are a cost-effective alternative with limited up-front investment and IT management responsibilities," says Frank Jablonski, senior director of product marketing at CA. For in-house BC/DR implementations, Jablonski also advocates evaluating virtualization technology, which can lower costs for recovery management. He also advises a multi-level approach to BC/DR, which saves money by applying the right level of data protection according to its value to the business.
Similarly, remote on-demand data-protection services can help defray BC/DR costs. According to Brian Reagan, director of strategy in IBM's BCRS division, remote services eliminate the need for capital expenditures and can save 20% to 60% versus in-house BC/DR implementations, in part because hosted services are typically based on a "pay-as-you-use" subscription model. In addition, subscribers can more easily define and execute specific time-based data retention policies that match their business requirements.
First steps
Before actually embarking on a BC/DR implementation, SMBs should perform a business risk and business impact analysis, according to Kyle Fitze, director of marketing for SAN products in HP's StorageWorks division. Business impact can be measured in both direct (such as lost revenue) and indirect (e.g., productivity impact) dimensions. The metrics should also be measured in quantitative (revenue, costs) and qualitative (customer satisfaction, brand reputation) dimensions, according to Fitze. With this data in hand, SMBs can decide how much downtime they can tolerate for a given application or system, and how much data loss is acceptable, which in turn will determine the best technologies to use for the BC/DR infrastructure.
During the business impact analysis (BIA) phase, Edgar Jimenez, director of managed services for the EVault data-protection business unit of i365 (a Seagate company), recommends the following actions:
-- Define critical success factors that will support and enable the BIA;
-- Establish application restoration priorities (e.g., critical vs. non-critical apps);
-- Identify tasks required to resume 100% normal operation (aka business resumption); and
-- Define data recovery and backup management procedures.
After identifying risk factors (e.g., natural disasters, system failures, legal/regulatory action) and business impact (e.g., loss of productivity, revenues, or legal liabilities), users should identify the criticality of various applications.
"Conduct a session with all business managers where business applications are charted on a matrix with 'acceptable data loss' on the Y-axis and 'acceptable interruption' on the X-axis," suggests Bruce Caswell, director of marketing communications at Xiotech. "Both axes are divided into three layers, labeled 'minutes,' 'hours,' and 'days.' Each application is mapped into the appropriate section of the matrix, with discussion of the consequences involved for each application." Caswell also notes that the current level of protection from existing systems can be mapped on the matrix to demonstrate existing gaps.
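The matrix Caswell describes can be sketched in a few lines of Python. This is only an illustration: the band thresholds (under 60 minutes, under 24 hours, otherwise days), the function names, and the sample applications are assumptions made for the example, not part of Xiotech's recommendation.

```python
from collections import defaultdict

def band(minutes):
    """Classify a tolerance given in minutes as 'minutes', 'hours', or 'days'.

    The cut-offs (60 minutes, 24 hours) are assumed for illustration.
    """
    if minutes < 60:
        return "minutes"
    if minutes < 24 * 60:
        return "hours"
    return "days"

def build_matrix(apps):
    """Group application names by (data-loss band, interruption band)."""
    matrix = defaultdict(list)
    for name, loss_minutes, outage_minutes in apps:
        cell = (band(loss_minutes), band(outage_minutes))
        matrix[cell].append(name)
    return dict(matrix)

# Hypothetical applications:
# (name, acceptable data loss in minutes, acceptable interruption in minutes)
apps = [
    ("order-entry", 5, 30),
    ("email", 240, 480),
    ("archive", 2880, 4320),
]

matrix = build_matrix(apps)
for (loss_band, outage_band), names in sorted(matrix.items()):
    print(f"data loss: {loss_band:8} interruption: {outage_band:8} -> {names}")
```

Running the same function over the protection levels that existing systems actually deliver would give a second matrix, and any application that lands in a slower cell there than in this one marks a gap.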
Most vendors agree that assessing the criticality, or business value, of data and applications is a crucial step in organizing a BC/DR strategy. "Profile your applications and rank them in terms of value to the business and assess the impact to your business if there were any downtime," advises Jonathan Buckley, vice president of outbound marketing at Asempra. "And do similar exercises for data: What data is critical, and what data is not?
"We recommend bifurcating your BC/DR technologies depending on your total data requirements and your RPO [recovery point objective] and RTO [recovery time objective] requirements," says Buckley. "The servers holding second- and third-tier data can take hours, or in some instances days, to recover without financial impact."
Most BC/DR implementations today use disk-based technologies to some degree. Steve Whitner, product manager at Quantum, says that SMBs should consider disk-based backup with data de-duplication to minimize costs. However, Whitner notes that, at least for long-term retention of data, tape should also be factored into SMBs' BC/DR equation because of its lower cost per gigabyte, as well as power and cooling advantages.
"Technologies such as backup-to-disk, de-duplication, and replication enable SMBs to tailor their BC and DR strategies to fit their specific requirements," says Andrew Wenger, director of the SME segment at CommVault. "By backing up to disk, SMBs can replicate their data, move it to their main data center at headquarters, and implement a DR plan from there, making it easier to manage on an ongoing basis."


-
-
Disaster-recovery management comes of age
Corporate disaster recovery (DR) for critical applications is a big, serious business, or it should be. But the poor state of many critically important DR environments says otherwise. It's not that the elements of a DR architecture are missing. Large enterprises commonly invest in replicating data between geographically remote primary and secondary hosts, ensuring that if the primary site goes down, then the secondary site will flawlessly fail-over. Unfortunately, that flawless fail-over degenerates over time to a flawed, or even non-existent, fail-over. Why? Because changes to the primary site are not made to the secondary site. Over time, minor changes to the primary environment (adding a volume here, updating an application there) cause the two sites to diverge to the point that recovery objectives can no longer be met.
DR testing helps to uncover the gaps. But such testing is costly and disruptive, and DR testing terror tempts IT to avoid the pain at all costs. Yet without consistent DR testing, recovery configuration failure rates between primary and secondary sites can easily reach 75% or more during the course of a year. If the company has delayed its testing longer than that, or even skipped it entirely, then the percentage and the threat will only worsen. And even when companies do test, the complex dependencies between primary and secondary sites make mitigation difficult and uncertain. But without comprehensive testing, vulnerable or even failed replication continues without anyone knowing it—until IT goes to restore data from the secondary system and cannot.
Yet the challenge of managing complex DR environments is real, and environmental complexity can cripple manual change management. Critical production data is produced by different applications, housed on different storage devices, protected by different technologies, and replicated using different software. This same complex environment must be reproduced at the secondary site for every set of critical applications and data that must be quickly available from the secondary site.
For example, a critical Oracle database is housed on a massive EMC Symmetrix array and replicated via SRDF, while an equally critical SQL Server application is stored on a Veritas Cluster and replicated accordingly. Three more applications are also hosted at the primary data center and use snapshot technologies to copy to the secondary site. Making the matter even more complex, the secondary site must not only store current data and applications, but must also use the same RAID types in the same configurations as the production environment. RAID is a data-protection necessity, but mixing RAID types between primary and secondary environments can lead to sub-optimal storage utilization and performance issues. For example, if a production database uses RAID 1 for logs and RAID 5 for table spaces, then the secondary site must use exactly the same mix of RAID types. In reality, however, it often does not, leading to delays and difficulties when attempting to recover replicated production data within an urgent timeframe.
DRM for testing
Anything that IT can do to avoid manual involvement in this process is an advantage to DR testing, documentation, compliance and, of course, disaster recovery. This is where disaster-recovery management (DRM) comes in: By using tools that automate testing and change management, DRM can reliably and cost-effectively mitigate mismatches between primary and secondary replicated environments.
DRM continually monitors characteristics such as failed dependencies, inconsistent data, incomplete data sets, and breaches of service level objectives. These abilities also increase DRM’s value to highly regulated industries, which can use it not only to protect DR settings, but also to test and prove compliance.
The DRM application works by scanning the primary and secondary-site configurations and dependencies. Working from a knowledge base of product-specific interactions and best practices, it runs a dependency analysis and mitigates gaps by repairing and reporting them. The resulting topology becomes the baseline for continual testing of the DR environment for deviations. DRM tests critical dependencies between hosts—including OS, hardware, and network resource parameters—taking a holistic view of the multi-vendor combination of products that realistically comprise a “replicated solution.” Ideally, DRM should support not a single replication product path, but a comprehensive set of operating systems, databases, cluster configurations, storage technologies, and communication protocols. The DRM package should also be capable of supporting virtual environments as well as physical environments.
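The core idea of comparing primary and secondary configurations can be illustrated with a small Python sketch. It does not reflect how any commercial DRM product works internally; the configuration keys, values, and the find_gaps function are hypothetical, and a real tool would discover these settings from the infrastructure rather than from hand-written dictionaries.

```python
def find_gaps(primary, secondary):
    """Return (setting, primary_value, secondary_value) for every mismatch.

    A setting that is missing on one side is reported with None for that side.
    """
    gaps = []
    for key in sorted(set(primary) | set(secondary)):
        primary_value = primary.get(key)
        secondary_value = secondary.get(key)
        if primary_value != secondary_value:
            gaps.append((key, primary_value, secondary_value))
    return gaps

# Hypothetical configuration snapshots for one replicated database
primary = {
    "db_logs_raid": "RAID 1",
    "db_tables_raid": "RAID 5",
    "volumes": 12,
}
secondary = {
    "db_logs_raid": "RAID 1",
    "db_tables_raid": "RAID 1",  # drifted from production
    "volumes": 11,               # a volume added at the primary was never mirrored
}

gaps = find_gaps(primary, secondary)
for setting, p_value, s_value in gaps:
    print(f"GAP {setting}: primary={p_value!r} secondary={s_value!r}")
```

Running a comparison like this on a schedule, against a stored baseline, is the simplest form of the continual deviation testing described above.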
DRM automatically collects information from the IT infrastructure and scans for issues that will impact recoverability. Without this level of infrastructure discovery, replication between sites will be inconsistent. This can lead to long fail-over times and even unrecoverable data, forcing IT to recover from backup and losing hours to days of changes to the production data. This of course is an unacceptable level of risk for mission-critical production environments.

Topology map of replication environment from Continuity Software.
The level of DRM support for multi-vendor environments also comes into play. Corporations rarely have just a single replication software solution, and it is common to have as many as seven replication products running in a large data center. DRM ideally protects multiple replication operations from various vendors by sensing gaps throughout the end-to-end protection process. This ability enables IT to keep the replication tree utterly consistent, thus protecting an exact replica of data in case of data loss.
By identifying and mitigating critical gaps between hosts and deep layers of host interaction, DRM solves the problem of complex change management in the DR infrastructure. Business can be confident that it will meet the recovery point objective (RPO) and recovery time objective (RTO) for which a given replication solution was originally designed and deployed.
Another important aspect of DRM is that it should not add to environmental complexity, but should simplify it by centralizing information and management across the entire replicated environment. This makes the complex DR infrastructure far more transparent and manageable, especially if the DRM application leverages existing configuration management databases such as BMC Remedy or HP OpenView.
Let’s look at a typical scenario for DRM. A large financial institution installs a DRM package that tests for gaps across multi-vendor DR software. The application runs for 48 hours on the infrastructure between a production center and a “warm” secondary site. Even though the company previously spent money and resources on DR configuration and testing, the DRM software uncovers nearly two dozen dependency gaps that would have crippled its RTO and RPO. This is a shock to the company, both because of the level of DR risk and also because of the level of non-compliance. By using the detailed DRM analysis, the company not only identifies numerous serious gaps, but is also able to quickly mitigate those gaps. The DRM application now runs automatically at scheduled intervals to identify and close any subsequent gaps created by changes to either environment.
The competitive landscape
DRM is related to technologies such as data-protection management (DPM) and storage resource management (SRM), as well as professional DR testing and change management methodologies. DRM differentiates itself by automating a high level of risk mitigation in replicated DR environments, a complex setting that is under-served by DPM, SRM, and manual change management and test operations. DRM and DPM may potentially develop in parallel as both are concerned with monitoring and managing the recovery management space, but at present their distinction is clear.
Existing vendors in this emerging segment primarily provide gap testing only on their own replication products. EMC offers DRM functionality for SRDF and Veritas Cluster Server for its replication operations. IBM weighs in with TotalStorage Productivity Center (TPC) that manages replication for the ESS 800 (Shark), DS8000 and DS6000 arrays, and SAN Volume Controller. These replication management functions are quite useful for these specific replication paths, but leave complex multi-vendor replication environments subject to DR failure. For these environments, Continuity Software, for example, offers multi-vendor capabilities across a wide variety of replication paths, components, and software.
Protecting the critical DR environment requires complex change management and comprehensive DR testing, but all too many corporations have failed to invest in these operations to protect their critical replicated data. The new DRM technology class is stepping up to automate these manual operations, providing the promise of predictable recovery performance across multi-vendor DR solutions.
DRM can dramatically reduce the costs and time of manual DR testing by locating recoverability gaps, analyzing root causes, and mitigating the problems. This results in consistent, comprehensive, and cost-effective change management and testing in the DR infrastructure. And DRM does not make an already-complex environment even more complicated, but rather centralizes information on multiple replication paths and renders them transparent. This up-and-coming technology will prove fundamental for protecting and optimizing operations between primary and hot/warm secondary sites.


-
-
What is cloud-based storage?
Part I
November 7, 2008 -- The combination of hosted computing and data storage in the Internet cloud has long been filled with compelling promises while bearing little fruit for IT practitioners. But after several years of stalled attempts, hosted computing and storage is finally hitting the market again in a big way.
The past year has seen a constant stream of new cloud-based service vendors and solutions, a lot of talk about out-of-the-box Web service-enabled storage from major vendors, and a seemingly constant series of mergers and acquisitions around service providers such as Mozy and Arsenal Digital. Meanwhile, nearly every cloud-based service provider is experiencing significant growth in consumer, SMB, and even enterprise markets.
Skeptical IT managers may think they've seen this show before -- storage as a service has been marched on parade as the "next big thing" at least three times before. So why is this time different? The answer is simple: while the economics have always been compelling, this time around sophisticated applications and enormous sets of data are already in the cloud, and end-user access is more ubiquitous and reliable.
A variety of companies have developed very rich applications that have demonstrated to users the potential power of the cloud. Computing and storage in the cloud have become an ideal platform for developing sophisticated, economical and flexible services. Cloud-based technology is here to stay, will rapidly become pervasive, and will change the way you're doing business.
Cloud-based storage has evolved from continuing attempts to de-couple storage from applications so that each resource can be optimally scaled and managed.
More importantly, storage in the cloud, which is a fundamental building block for cloud-based computing, is changing. Storage in the cloud, which the Taneja Group refers to as cloud-based storage, or CBS, has evolved into the second generation. CBS requires more than just block or file storage with a few extra services. Cloud-based storage delivers capabilities that will change how users manage storage. Whether you're a customer or a service provider, cloud-based storage will set new precedents for storage economics and how storage is used.
What is it?
Simply put, storage in the cloud de-couples storage and applications so that access to either can be more flexible, and data storage and applications can easily scale in response to changing user demands. In a Web-centric world, where large service providers host storage and computing, and customers buy storage and computing on a pay-per-use basis, this makes the IT infrastructure elastic and cost-optimized.
The industry has long been struggling with de-coupling applications from data so that each can be more flexibly managed, moved, and scaled. NFS and CIFS were among the earliest ways of de-coupling applications and storage so that each could be scaled and managed more effectively. But these protocols are complex and remain restricted to the data center where resources can be expensive and difficult to scale.
The next evolution of de-coupling was to host application and data components with service providers across the Web. Unfortunately, this generation of storage was often mired in the restricted scalability and complex access of traditional remote access protocols (FTP, WebDAV) and traditional storage (file and/or block).
The storage industry has realized the potential flexibility that can be enabled by storing data in the cloud, and several vendors are beginning to march into this space with a new generation of cloud-based storage.
Cloud-based technology wraps traditional IT applications and infrastructure in new, simplified APIs and access semantics. APIs, or sets of application and/or storage commands, are served up as self-contained, discoverable Web services that are accessed via HTTP or other protocols and integrated into lightweight, easy to develop, distributed applications.
This allows users to put less effort into developing complex application sub-routines, and instead better serve their businesses with combinations of already available and reusable Web services and data. In turn, the increased independence of these services allows each component to scale up and down in performance as end-user demands change. When distributed onto the enormous data centers of one or multiple service providers, this makes the infrastructure truly elastic.
Cloud-based storage is the next generation of hosted storage and represents a new paradigm in storage and data access and management whereby users can integrate hosted, location-abstracted data storage into applications and infrastructure in unique ways. CBS will serve as the foundation for elastic computing, where economical, highly manageable and easily integrated computing resources dynamically scale and change according to business demands. Cloud-based storage will:
--Allow users to integrate and access storage and data in new ways, usually through Web services APIs. In turn, complex storage interactions and data access will be simplified and easily integrated into business processes and applications;
--Serve up storage and data that is location-abstracted, so that requests can be transparently redirected across locations to improve availability and enable scalability by distributing requests across multiple or larger systems as user demands change;
--Include simplified, self-service storage management capabilities (provisioning, service-level tiering, data protection, etc.) that can be easily integrated with other applications or business logic; and
--Be innovative in data organization and management so that data can be easily stored, shared, and managed in a simplified manner compared to traditional storage.
For IT managers, an API-enabled storage cloud may lay a foundation for building workflows to provision, monitor, scale, or tear down complete storage environments, including VMware server images, applications, and databases. IT managers today are using this approach to acquire storage in the cloud, rapidly provision new sets of servers, populate databases with replicated subsets of information, and fire up applications, Web servers, and database servers as user demands increase. Then, just as rapidly, these organizations turn off these new servers when user demands subside. Meanwhile, users and IT managers never think twice about where each component of the infrastructure resides, and every HTTP-carried transaction is transparently optimized and transported between different locations across the cloud.
This article focuses on file storage, which we believe will be at the heart of abstracting data storage from applications. Block storage will also have a role in the cloud, but this role may be restricted to supporting hosted virtual servers and data centers. We expect block-based storage vendors to support automation over Web services frameworks and to provide richer quality of service capabilities and deeper visibility to meet the requirements of hosted virtual infrastructures, but this infrastructure will remain tied to location and remain a distinctly different product than cloud-based storage. Meanwhile, hosted virtual infrastructures may also make use of file storage by turning to NFS-based virtual image storage.
Why store in the cloud?
Economies of scale allow service providers to deliver cloud-based storage at extremely low price points compared to that of a traditional infrastructure. But cloud-based computing and storage go far beyond this benefit by extending storage capabilities.
For IT customers, cloud-based storage is more distributable, scalable, accessible, and manageable than a traditional storage infrastructure. Once the locality of data becomes irrelevant, users can integrate data from anywhere. And when the location of data is no longer important, it is easy to scale performance by distributing or moving data across any system according to demand. In turn, this can eliminate disruptive data migrations or service events. Finally, since cloud-based storage is often acquired on a pay-per-use basis and is simplified in management and organization, administration is extremely low. Moreover, users can roll their own processes into lightweight Web applications to automate storage and data management.
A storage cloud can make an organization's IT infrastructure more versatile. APIs can be wrapped with lightweight Web logic for data archiving, where users can request space, have it automatically provisioned within policy, and self-manage archiving and access to data. And managers can build business logic to automatically provision new storage, duplicate virtual guest disk images, start up a virtual environment for periodic batch processes, and tear it down when finished. Users could also integrate enterprise applications with potentially rich metadata in cloud-based storage in order to automatically classify, move, or secure data according to contents and file characteristics.
Let's look at an example of what cloud-computing architectures, supported by cloud-based storage, look like.
Scalable applications
For some time, the industry has tackled sticky issues surrounding how to scale front-end application and Web servers, while never coming up with better approaches to database scalability than limited scale-up to larger multi-processor servers. But in the last two years, database architectures have rapidly moved in the direction of pseudo scale-out through data partitioning and replication. For example, large-scale order entry systems may partition data into different databases by customer, or even business process, and use multiple database servers (dozens or hundreds) to scale. Data that is common to each customer is replicated into every database, and complete sets of data for reporting or analytics can be consolidated into a central repository. If a customer accesses this type of hosted application with thousands of extra users, then there is zero impact on other users and, more importantly, the subset of data and servers for that particular customer can be more easily scaled. Other predictable or unpredictable demands -- data reporting and analysis, monthly batch processing, etc. -- can also be handled easily.
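A minimal Python sketch of this partition-and-replicate pattern follows. The article does not say how customers are assigned to databases; hashing the customer ID is an assumption made here for simplicity, and the PartitionedStore class, its in-memory "databases," and the sample data are all hypothetical.

```python
import hashlib

def partition_for(customer_id, num_partitions):
    """Pick a partition for a customer using a stable hash (an assumed scheme)."""
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

class PartitionedStore:
    """In-memory stand-in for a set of per-customer database servers."""

    def __init__(self, num_partitions):
        self.partitions = [
            {"orders": [], "shared": {}} for _ in range(num_partitions)
        ]

    def replicate_shared(self, key, value):
        """Copy data common to all customers into every partition."""
        for partition in self.partitions:
            partition["shared"][key] = value

    def add_order(self, customer_id, order):
        """Store an order in the customer's own partition and return its index."""
        index = partition_for(customer_id, len(self.partitions))
        self.partitions[index]["orders"].append((customer_id, order))
        return index

    def consolidate(self):
        """Gather all orders into one list, as a central reporting repository would."""
        return [row for p in self.partitions for row in p["orders"]]

store = PartitionedStore(num_partitions=4)
store.replicate_shared("tax_rate", 0.08)
first = store.add_order("acme", "order-1")
second = store.add_order("acme", "order-2")
store.add_order("globex", "order-3")
print(first == second, len(store.consolidate()))
```

A lookup table from customer to database would serve the same purpose and would make it easier to move one heavily loaded customer onto dedicated servers, which is the scaling benefit the paragraph above emphasizes.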
By changing the granularity of data partitioning, it is possible to make architectures scale to support vast numbers of customers through the addition of more servers, and that is where the value of cloud-based storage and computing comes in. With CBS behind our example application architecture, businesses have begun to encapsulate each component of the application -- front-end application servers as well as business logic and database servers -- in different virtualized servers or applications so that they can protect, duplicate, and manage those servers with CBS tools. These companies create groups of servers in a given environment, and use automation toolsets to allocate storage space, snapshot and duplicate servers, and start up, shut down, or re-arrange the servers behind their application, based on demand and application performance.
As such an environment is deployed, it either starts up with an existing set of data or immediately replicates data from other databases and builds an entire new application environment for a new customer or process. A failure may trigger either a complete restart or an entire movement of the environment to another location. Excess demand may trigger further partitioning of data into more databases and the launch of more application servers.
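The demand-driven start-up and shutdown behavior can be sketched as a simple sizing rule. This is an illustration only: the per-server capacity figure, the function names, and the rule itself (size the pool to current users and never drop below a minimum) are assumptions, not a description of any vendor's automation toolset.

```python
import math

def target_servers(active_users, users_per_server, min_servers=1):
    """Number of application servers needed for the current user count."""
    if users_per_server <= 0:
        raise ValueError("users_per_server must be positive")
    needed = math.ceil(active_users / users_per_server)
    return max(min_servers, needed)

def scaling_action(current_servers, active_users, users_per_server):
    """Return ('start' | 'stop' | 'hold', count) to reach the target size."""
    target = target_servers(active_users, users_per_server)
    if target > current_servers:
        return ("start", target - current_servers)
    if target < current_servers:
        return ("stop", current_servers - target)
    return ("hold", 0)

# Hypothetical readings: demand spikes, then subsides
spike = scaling_action(current_servers=2, active_users=950, users_per_server=200)
quiet = scaling_action(current_servers=5, active_users=150, users_per_server=200)
print(spike, quiet)
```

In practice the trigger would more likely be measured application performance than a raw user count, and a failure would be handled by a separate restart or relocation path.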
With the help of solution vendors such as DataSynapse or 3tera, which specialize in the packaging and management of entire environments, some of the largest applications can handle enormous, unpredictable increases in customers without a hiccup, while optimizing costs by rapidly shutting down servers when they are not required. One vendor in this space -- Surgient -- provides a management solution that wraps the virtual infrastructure in an HTTP-accessible Web service API for manipulating nearly any aspect of provisioning, snapshotting, duplicating, and scheduling virtual guests.
Conclusion
Cloud-based storage will create enormous change in the IT infrastructure. As users begin to experiment with cloud-based solutions, the technology will create a ripple effect throughout the storage industry, with several impacts:
--Users will expect cheaper storage, as user self-service makes storage in the cloud less expensive to deliver;
--Users will expect more responsive and scalable storage, because hosted providers can respond and scale on demand; and
--Users will expect to access and manage their data in ways that were not possible before.
Vendors such as Caringo, EMC, Ibrix, Xiotech, and others are racing to provide a storage services layer or APIs that will underpin the next generation of cloud-based storage. These vendors are developing cloud-based storage infrastructures that will go well beyond the limitations of first generation products that tried to provide basic FTP- and WebDAV-type access. Next-generation solutions will provide services on top of true Web services architectures. Moreover, these next-generation solutions will provide secure partitioning, data organization, and user management services.
Parts 2 and 3 of this series will examine what end users and service providers should look for when evaluating cloud-based storage solutions.

The benefits of cloud-based storage, part 2
What are the core capabilities that end users should look for when evaluating "storage in the cloud" solutions?
November 10, 2008 -- As discussed in part one of this three-part series (see "What is cloud-based storage?"), at the heart of cloud-based computing is a loosely coupled infrastructure that is self-healing, geographically dispersed, and instantaneously scalable in response to business demands. Cloud-based computing virtualizes the location, connectivity, and resources behind loosely coupled application components in order to be elastic -- able to move and shift computing and storage resources, and rapidly deploy new systems or applications, in response to any demand. Moreover, cloud-based computing promises to make infrastructure, applications, and storage easier to manage, and much easier to integrate with other applications or changing business processes.
What does this mean to you as an end-user, application developer, or IT manager who may be considering where to store data? Cloud-based storage today can let you store and manipulate any type of data on higher-performance, more scalable, more accessible, and cheaper storage. Moreover, it can free you from the costly management overhead that surrounds data storage by serving up file storage in a self-managed, easy-to-access manner. Cloud-based storage lets users not only provision and manage storage themselves, but also store data in XML files, text files, or many other data formats. Meanwhile, cloud-based storage solutions provide users with database-like data manipulation through innovative file-filtering mechanisms, metadata tagging, and the virtual presentation of files in many places at once.
These solutions are available today. Some users have moved entire application sets to API-accessible storage services such as Amazon S3, and in turn, have access to a dynamically scaling infrastructure. Some businesses are currently shopping for, or building, similar solutions within their own corporate networks, to harness the flexibility of such an infrastructure while reducing their storage infrastructure cost of ownership by enabling user self-service.
But for many other users, cloud-based storage is a fuzzy new technology, with neither clearly defined capabilities nor benefits.
In this second part of our series, we'll look at a set of five core capabilities that are important to end users looking to store data in either private clouds or the public Internet cloud. These capabilities and the associated benefits will shed light on how cloud-based storage may be beneficial to you.
Key capabilities
Cloud-based storage is amorphous today, with neither a clearly defined set of capabilities nor any single architecture. Choices abound, with many traditional hosted or managed service providers (MSPs) offering block or file storage, usually alongside traditional remote access protocols or virtual or physical server hosting. Other solutions have emerged, typified by the Amazon S3 service, that resemble flat databases designed to store large objects.
The Taneja Group defines cloud-based storage as a specific category within the larger field of "storage in the cloud" solutions. Storage in the cloud encompasses traditional hosted storage, including offerings accessed by FTP, WebDAV, NFS/CIFS, or block protocols either remotely or from within a hosted environment. Cloud-based storage is an evolution of this hosted storage technology that wraps more sophisticated APIs, namespaces, file or data location virtualization, and management tools, around storage.
Regardless of whether you are building a multi-tenancy hosted application, or you want to move your enterprise applications to the cloud, there is a core set of capabilities that are common to emerging cloud-based storage solutions, and how well a specific solution delivers on these capabilities will be key to determining how well you can 1) integrate stored data with different applications and systems in versatile ways; 2) harness cloud-based storage performance, scalability, and distribution to increase your infrastructure flexibility, responsiveness, and availability; and 3) reduce your cost of owning and managing storage.
API-accessible. Today, businesses are surrounded by a world of Web services, scripting, lightweight development frameworks, mash-ups, and various other dynamic, easily integrated Web technologies. Access to stored data through a sophisticated API makes cloud-based storage extremely versatile, and in fact re-invents how stored data can be leveraged for the support of applications and business processes. Moreover, APIs can be tuned for general storage management as well, and allow administrators to overlay nearly any management, reporting, or governance process on storage. A few potential usage cases for API-accessible storage include:
--APIs will let administrators wrap storage management with nearly any business process, including customized automation of provisioning, snapshots, file versioning and rollback, replication, and more. Web services APIs may be easily discovered and integrated so that they remove the hurdles associated with management protocols of the past. Because Web services are self-documenting and discoverable, management capabilities can be exposed, even with different APIs for different systems, without creating lengthy standardization efforts such as SMI-S;
--APIs will let developers create, store, access, and re-use complex sets of data more easily. This will encourage lighter-weight, more flexible application architectures, easier data re-use, and rapid application development, at lower cost and effort. Think of API-accessible data as a gateway for Web application access to any unstructured data in the enterprise, with the simple efficiency of databases;
--APIs will also allow administrators and/or developers to empower user self-service by creating portals or applications where users can manage, protect, and control their own storage. This will drive down the cost of ownership for storage;
--APIs will make storage extensible, and data more portable. For example, an open API could be used to create a gateway that mimics yesterday's protocols or interfaces with today's protocols such as XAM; and
--Storage vendors are developing APIs on top of flexible infrastructures that will make Web-service-based access to storage commonplace. Innovators will make their storage even more extensible through the use of APIs, which will enable integration with other applications for data tiering, classification, control, conversion, or other file manipulation. Some examples include Ibrix's Cirrus API, which provides access to user management, data sharing, snapshots, and versioning, and Omneon's Media Services Framework, which provides API access to video transcoding and QoS-like storage optimization.
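As an illustration of what API-driven file versioning and rollback could look like, the following is a toy in-memory sketch. The `VersionedStore` class and its method names are hypothetical and do not represent any vendor's API; a real service would expose comparable operations over Web services rather than as local method calls.

```python
from typing import Optional


class VersionedStore:
    """Toy in-memory object store that keeps every version of each key."""

    def __init__(self):
        self._versions = {}  # key -> list of bytes, oldest first

    def put(self, key: str, data: bytes) -> int:
        """Store a new version and return its 1-based version number."""
        history = self._versions.setdefault(key, [])
        history.append(data)
        return len(history)

    def get(self, key: str, version: Optional[int] = None) -> bytes:
        """Return the latest version, or a specific one if requested."""
        history = self._versions[key]
        if version is None:
            return history[-1]
        if not 1 <= version <= len(history):
            raise ValueError(f"{key} has no version {version}")
        return history[version - 1]

    def rollback(self, key: str, version: int) -> int:
        """Make an older version current by re-storing it as a new version."""
        return self.put(key, self.get(key, version))
```

Rolling back by appending a copy, rather than discarding newer versions, is a design choice in this sketch: it keeps the full history available for later audit.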
Innovative in organization and management. Cloud-based storage cannot grow to the scale necessary without flexible management, organization, and presentation of storage that removes cumbersome semantics such as hostnames, directories, and permissions. When users turn to cloud-based storage, they will recognize enormous savings in the time and effort associated with administration of storage and data management. And developers can store and integrate data faster and with less administrative overhead.
Cloud-based storage providers will enable self-service storage provisioning and management of data that is not only API-enabled, but also breaks with current conventions. Innovative providers will not only cover basic storage management operations (file protection, tiering), but also provide data presentation that can mimic some of the capabilities of file virtualization through virtual views or containers for data that are completely abstracted from the on-disk location of data. Users will be able to place data into different virtual views that are accessible by different users. Such organization through virtual views and lightweight tags, coupled with self-service management of storage, may change how the industry approaches traditional file storage as well.
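The idea of virtual views built from lightweight tags can be sketched as follows. This is a simplified illustration, not a description of any product: `StoredObject`, `Catalog`, and the location strings are hypothetical names. The point is that a view is computed from tags alone, so moving the data on disk does not change what users see.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    object_id: str
    location: str                           # physical placement, hidden from users
    tags: set = field(default_factory=set)  # lightweight labels chosen by users


class Catalog:
    """Presents tag-based virtual views that ignore on-disk location."""

    def __init__(self):
        self._objects = {}

    def add(self, obj: StoredObject) -> None:
        self._objects[obj.object_id] = obj

    def view(self, *required_tags: str) -> list:
        """Return the sorted IDs of objects that carry all required tags."""
        wanted = set(required_tags)
        return sorted(o.object_id for o in self._objects.values()
                      if wanted <= o.tags)

    def relocate(self, object_id: str, new_location: str) -> None:
        """Move an object physically; its views are unaffected."""
        self._objects[object_id].location = new_location
```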
Responsive and scalable. Users of cloud-based storage should assess the responsiveness, availability, and scalability of their hosting service. Vendor innovation will drive new levels of these features that will surpass even the best enterprise systems. Users should minimize their risks through SLAs focused on performance, responsiveness, and scalability, but also through an awareness of their service provider's storage capabilities. While visibility into provider capabilities will likely always be limited, cloud-based storage should demonstrate the ability to transparently move data across locations and potentially service providers, self-heal, and scale up in performance and capacity to meet rapidly changing customer demands. Equally important, users should match service provider capabilities in these areas with current and anticipated future needs, and do so while being attentive to their planned application architecture. Users with many small, separate I/O streams may be able to easily distribute their demands and work with any, or many, provider(s), regardless of the provider's ability to move or distribute data.
Open, well-documented, portable. Today, cloud-based storage is too new for standardization, and recent attempts at standardization have been slow or have left a sour taste in the mouths of many users, both within the storage industry and across the IT field in general (XAM, SMI-S, OpenXML, and others). This leads ambivalent users to anticipate wading through a bog of APIs with excessive overhead and incompatibilities, with no hope of moving data between systems without starting from scratch. We believe concerns about standardization and portability for cloud-based storage are largely unwarranted, and that cloud-based storage will in fact remedy many of the standardization issues we have today. That is because cloud-based storage is centered on lightweight APIs and access frameworks such as HTTP-based REST that are already well established. But users should pay attention both to what these frameworks give them, and whether the frameworks are served up on top of the right underlying storage.
In our view, users of cloud-based storage will be best served by innovative storage vendors who develop deeply integrated, full-featured APIs on top of their next-generation storage systems. This creates a turnkey system that can deliver advanced storage features, such as snapshots or file versioning, while assuring both the service provider and the customer that the solution will work without incompatibilities or multi-vendor finger-pointing. Providers that turn to these solutions and APIs will be able to deploy cloud-based storage services quickly and cost-effectively.
More importantly, while we believe the lightweight and simplified nature of REST-based APIs will make application and data porting simple, out-of-the-box solutions will drive standardization. Since cloud-based storage will only be possible on top of a relatively small number of systems that can scale to huge amounts of performance and capacity, there will be a relatively small number of solution vendors and APIs in the market. Since storage Web services APIs will support similar basic operations (even if they also support more advanced operations), developers can quickly map APIs between solutions to mask differences and enable better portability.
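A rough sketch of how developers might map APIs between solutions is shown below. Both providers (`AlphaStore`, `BetaStore`) and their method names are invented for this example and do not correspond to any real service. Each adapter exposes the same basic `put`/`get` operations, which is enough to move data between providers without starting from scratch.

```python
class AlphaStore:
    """Hypothetical provider that addresses data by container and name."""

    def __init__(self):
        self._blobs = {}

    def store_blob(self, container, name, data):
        self._blobs[(container, name)] = data

    def fetch_blob(self, container, name):
        return self._blobs[(container, name)]


class BetaStore:
    """Hypothetical provider that addresses data by path."""

    def __init__(self):
        self._files = {}

    def write(self, path, data):
        self._files[path] = data

    def read(self, path):
        return self._files[path]


class AlphaAdapter:
    """Maps the common put/get operations onto AlphaStore calls."""

    def __init__(self, backend, container):
        self._backend, self._container = backend, container

    def put(self, key, data):
        self._backend.store_blob(self._container, key, data)

    def get(self, key):
        return self._backend.fetch_blob(self._container, key)


class BetaAdapter:
    """Maps the common put/get operations onto BetaStore calls."""

    def __init__(self, backend, prefix):
        self._backend, self._prefix = backend, prefix

    def put(self, key, data):
        self._backend.write(f"{self._prefix}/{key}", data)

    def get(self, key):
        return self._backend.read(f"{self._prefix}/{key}")


def migrate(source, target, keys):
    """Copy the given keys between any two adapters; return the count."""
    for key in keys:
        target.put(key, source.get(key))
    return len(keys)
```

Advanced, provider-specific operations would still need per-provider handling; the sketch covers only the basic operations that the APIs have in common.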
Ready for versatile usage cases. Flexible presentation of storage as traditional file/block, remote storage (FTP, HTTP, WebDAV), or API-accessible storage will open the doors to versatile use cases for cloud-based storage. It was not long after Amazon S3 sprang up that users were trying, and demanding, hosting of entire virtual machine computing environments. This enabled more uses for Amazon's storage cloud, and enabled users to collocate complex compute resources alongside rich Web-integrated data. Today, users are able to perform complex content creation and/or business logic processing while simultaneously using generated data within a loosely coupled and widely distributed Web application architecture. Many users will find value in combining multiple computing approaches when cloud-based storage can be accessed as traditional file/block storage in a hosted infrastructure.
This will be an ideal use case for the next generation of cloud-based storage. It is easy to imagine large hosting providers with unique speed and data differentiation that could take advantage of a Web services overlay for their data. As one example, a solution at a service provider like XASAX -- a provider that is collocating HPC-like infrastructure and financial applications next to high-speed financial data feeds for market data analysis -- may enable a new generation of dynamic Web-based data reporting/analysis and mash-up applications for financial customers.
Challenges for cloud-based storage
Users initially consider cloud-based storage for its potential cost savings and improvements in storage scalability and availability. While such savings are compelling, users should also measure the fundamental capabilities of cloud-based storage against their future strategic IT and business needs. Those capabilities take center stage when considering strategic business needs, and will differentiate providers.
In the next article in this series, we'll look at cloud-based storage capabilities that are key considerations for service providers.
Jeff Boles is a senior analyst and director of validation services at the Taneja Group research and consulting firm.
********
Challenges for cloud-based storage
Users should be aware of the potential downsides to cloud-based storage. These include issues of portability or vendor lock-in, regulatory compliance issues, and the availability of cloud-based storage when one vendor's solution is unique in architecture or APIs.
First, there are currently no standards for cloud-based storage or computing. This can make porting an infrastructure from one vendor to another dicey at best, and may mean you're subject to the whims of an infrastructure provider. This is a key issue that emerging cloud-based storage solutions will address, but nonetheless is a major challenge today. Once solutions are available from major vendors, more services with common APIs will become available, and developers will come up with mappings between other popular APIs (such as Amazon S3, and potentially even XAM).
Second, cloud-based storage solutions still fall short of meeting all IT storage needs. The biggest gap is where databases are concerned. While Amazon S3 started life looking very much like a widely distributed, extremely flat database, it has never been capable of meeting traditional enterprise database needs: It is not relational in the traditional sense, it lacks DBMS tools, and because it is designed to support loosely coupled applications, it does not support high loads of guaranteed, consistent transactions expected in traditional database environments. More importantly, without a distributable database, the cloud looks like a poor place for databases -- applications that depend on access to single instances of databases in the cloud will never be able to benefit from load-balancing, scalability, and improved availability; all of which may imply the use of multiple copies of data or stateless redirection of data connections. This is an area of critical importance in which next-generation cloud-based storage vendors must begin to innovate.
Meanwhile, legacy architectures or users relying on traditional databases face a quandary and must decide whether to re-architect their application for the cloud and/or deal with sticky issues around transactional consistency and other features that are often tied to concrete business requirements, or choose another path.
Third, users need to be sensitive to where they store their data in the face of ever-changing regulations (SOX, HIPAA, PCI, etc.). Many regulations dictate that users must be able to identify and control the location of their data, which isn't feasible when that data is virtualized across a cloud.
Finally, because of availability risks in a shared environment that is not fully under their control, users often remain hesitant to turn to cloud-based storage for anything more important than testing and development, or infrequently accessed data storage. Until cloud-based storage is highly available, users will be unable to host mission-critical data on it. Increased availability will come on two fronts: 1) the entry of enterprise-class storage systems into the cloud storage market; and 2) ubiquitous and compatible multi-provider solutions that deliver availability through data dispersal and flexible delta-based replication. Delivering more availability in the cloud may be largely a matter of dispersal -- spreading partial or complete copies of data across the cloud can keep it available even during dramatic failures. Vendors such as Cleversafe have come up with innovative and unique algorithms for this, and other technologies exist that have not yet come to market.
But this requires widespread cloud-based storage, which will come from service providers ramping up cloud services built on top of out-of-the-box offerings -- forgoing the intensive and drawn-out development cycles that created today's cloud-based storage offerings. Selecting a vendor that is using a commonly available out-of-the-box platform, rather than a custom-developed one, may put you in a position to make use of multiple service providers for increased availability sooner rather than later.
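To show the general principle behind dispersal, here is a deliberately simple sketch that uses single XOR parity, a well-known textbook technique. It is not the algorithm of Cleversafe or any other vendor, and the function names are hypothetical. The data is split into fragments that could be held by different providers, and any single lost fragment can be rebuilt from the rest.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes, k: int = 3) -> list:
    """Split data into k equal fragments plus one XOR parity fragment."""
    size = -(-len(data) // k) or 1           # ceiling division, at least 1 byte
    padded = data.ljust(size * k, b"\0")     # pad so every fragment is full
    fragments = [padded[i * size:(i + 1) * size] for i in range(k)]
    return fragments + [reduce(xor_bytes, fragments)]


def recover(fragments: list, length: int) -> bytes:
    """Rebuild the data when at most one fragment (marked None) is missing."""
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost fragment")
    fragments = list(fragments)
    if missing:
        # XOR of all surviving fragments reproduces the missing one.
        present = [f for f in fragments if f is not None]
        fragments[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity fragment and the padding.
    return b"".join(fragments[:-1])[:length]
```

Single parity survives only one failure; schemes that tolerate several simultaneous failures need stronger erasure codes than this sketch uses.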

Cloud-based storage, part 3
The final installment in our series on cloud storage focuses on capabilities for service providers and the data center.
November 11, 2008 -- As we have discussed in parts one and two of this three-part series, at the heart of cloud-based computing is a loosely coupled infrastructure that is self-healing, geographically dispersed, designed for user self-service, and instantaneously scalable in response to the ebb and flow of business demands. Cloud-based computing virtualizes the location, connectivity, and resources behind loosely coupled application components in order to be elastic--able to move and shift computing and storage resources, and rapidly deploy new systems or applications, in response to any demand. Moreover, cloud computing promises to make infrastructure, applications, and storage easier to manage, and much easier to integrate with other applications or changing business processes.
To recap, Taneja Group considers cloud-based storage (CBS) an emerging technology within a larger solution category of file-centric "storage in the cloud." Storage in the cloud has previously included remotely accessible file storage offerings accessible by way of FTP, WebDAV, or NFS/CIFS. Cloud-based storage is an evolution of hosted file storage technology that wraps sophisticated APIs, new data presentation and access semantics, location virtualization, and management tools around file storage. While file storage may be used to support block-like storage through virtual server images, CBS is about serving up data stored in files across the Internet or internal enterprise networks.
Strategic asset for service providers
An emerging set of out-of-the-box CBS solutions will be a strategic and critical consideration for managed service providers. CBS solutions today include EMC's Atmos, Ibrix's Cirrus, and Nirvanix, but we expect that every major vendor, and many scale-out NAS vendors, will soon come to market with their own take on CBS.
CBS can serve as a foundation for building more sophisticated and "stickier" Web applications around existing or new applications and upper layer storage services, including basic but unique data storage, backup, file sharing, collaboration applications, more sophisticated Web hosting and Website support, VoIP application services, data archiving and discovery, contact management, customer relationship management, and others. These "stickier" services can increase customer retention and create significant new revenue streams. In a competitive landscape where customer mindshare is increasingly dominated by services from Amazon, Microsoft (Azure), and Google, stickier competitive services are necessary business weapons.
Simultaneously, large enterprises may consider a role as a service provider to their own organization. For the large enterprise, cloud-based storage hosted internally may reduce administrative complexity and cost of ownership for storage by enabling user self-service and providing a foundation for pay-for-use utility storage. The potential pay-for-use aspect of CBS can be compelling in chargeback-driven organizations solely because accurately allocating storage utilization is so difficult in traditional infrastructures. CBS, designed to support incredibly large numbers of users, scales both up and down and can provide the granularity necessary to support pay-for-use. Moreover, cloud-based storage in the enterprise may heighten collaboration and enable more rapid, lightweight application development. It is easy to imagine Web portals that are suddenly enabled with rich file and data access, even to the point of hosting GoogleApp-like solutions within the enterprise.
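A pay-for-use chargeback calculation can be sketched in a few lines once usage is metered per user or department. The `chargeback` function, the sample format, and the rate below are assumptions made for illustration only; an actual CBS platform would supply its own metering data and pricing.

```python
from collections import defaultdict


def chargeback(samples, rate_per_gb_hour: float) -> dict:
    """Sum metered usage per department and price it.

    samples: iterable of (department, gigabytes_stored, hours_held) tuples.
    """
    gb_hours = defaultdict(float)
    for department, gigabytes, hours in samples:
        gb_hours[department] += gigabytes * hours
    # Round to cents for reporting.
    return {dept: round(total * rate_per_gb_hour, 2)
            for dept, total in gb_hours.items()}
```

For example, with a hypothetical rate of 0.001 per GB-hour, a department holding 100 GB for 24 hours and 50 GB for 12 hours accrues 3,000 GB-hours.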
Challenges
Whether it is to take advantage of the tremendous opportunities surrounding storage and application services, or respond to repeated demands by users and customers, it is clear in our conversations with service providers--both external and internal within some enterprises--that they are struggling with how they can implement cloud-based storage as a part of their service portfolio. Building a CBS infrastructure has to date been fraught with difficulty and complexity. Most mature solutions today have been built on top of white box infrastructures with millions of dollars of custom development. The development and management requirements for these infrastructures would overwhelm even sizable service providers or enterprises.
In turn, many organizations have evaluated the re-branding or resale of existing cloud-based storage such as Amazon's S3. While engaging a third party's storage infrastructure is appealing, service providers are often averse to the risks associated with depending on cloud-based services over which they have little or no control.

Cloud-based storage has evolved from continuing attempts to de-couple storage from applications so that each resource can be optimally scaled and managed.
The good news is that this is changing. We see an emerging next generation of storage solution that will provide service providers with out-of-the-box storage in the cloud. Let's take a look at what that out-of-the-box solution will look like, and its fundamental capabilities that will let service providers easily build higher-level storage and computing services, while providing cloud-based storage to customers. While service providers are tied to meeting end-user requirements for storage services, they are necessarily focused on a different set of core capabilities that will allow them to economically manage and grow a storage infrastructure across many widespread users.
CBS capabilities for MSPs
Managed service providers (MSPs) face dueling challenges in building a CBS infrastructure: A CBS service must serve the needs of the customer and have a unique set of capabilities that will allow the MSP to manage massive amounts of storage and users. Moreover, MSPs providing storage today have already managed storage long enough to realize many of the shortcomings of traditional monolithic NAS. Consequently, CBS must also reduce, mitigate, or eliminate the current issues with file storage infrastructures, including costly NAS sprawl, isolated storage silos that must be managed separately, and costly, disruptive NAS migrations and service events.
There are no hard and fast rules for what will make an ideal CBS storage platform for a given MSP. The MSP storage market is still evolving, and each MSP will have a unique combination of services on its road map, resulting in variation in their specific requirements.
Today, there is demand for backup, archiving, Web application data storage, virtual infrastructure hosting, e-discovery, and a number of other storage services in the cloud. While these services are often SMB-oriented today, we expect they'll rapidly take on other customers, including the vast potential market of individual users, as soon as MSP solutions can handle the scale and complexity associated with enormous numbers of customers. In addition, the near future will bring more enterprise application hosting, distributed parallel processing, much more comprehensive virtual infrastructure hosting (in the form of entire virtual environments or virtual private data centers, as well as virtual desktop services), more sophisticated collaboration solutions, and more.
CBS can be a key enabler of every one of these potential services. Using this range of services, and an open mind toward potential future hosted services, we've identified a set of core CBS capabilities that merit special attention by service providers evaluating CBS solutions. MSPs should consider how differentiation in each of these areas may support their plans for services and make their infrastructures more flexible in the future. Moreover, while this article is targeted toward MSPs, enterprise users considering internal service provider models for storage should take these capabilities into consideration as well.
Established, rich APIs. API enablement is at least as important to service providers as it is to end users. First, end users expect API access in true cloud-based storage and will select a provider based on the capabilities of an API. Second, API-enabled storage infrastructure can enable service providers to wrap customized storage management and business processes around their storage services and optimize their management practices. Given the enormous capacities and number of users that will be served by a CBS solution, APIs will be critical to infrastructure management. Similarly, an API for fundamental storage tasks may be the only way to build self-service portals for user creation and management of storage spaces--a key requirement for serving up storage services over the Web.
Moreover, APIs can serve as the foundation for other higher-level storage service offerings. APIs can help service providers develop unique applications or service offerings, or meet specialized needs of key customers. MSPs should carefully evaluate the depth and versatility of an API in combination with their anticipated service offerings, and evaluate whether it exposes the right storage capabilities, in the right way.
Storage management and organizational tools for Web-based storage. Service providers face a significant management hurdle when they're operating cloud-based storage. Traditional approaches to storage management--provisioning, optimization, reporting, and control--will not hold for cloud-based storage as it is too time-, effort-, and cost-intensive when scaled for enormous amounts of capacity and users. Beyond API enablement, Web-scale storage will need a new storage management paradigm. This new approach to management will have several fundamental capabilities.
First, cloud-scale storage management will be user-centric. Users will manage their own allocation, provisioning, protection, and other operations within strongly partitioned areas of the storage infrastructure. Nonetheless, strong partitioning should not retard the MSPs' ability to holistically view and manage the storage infrastructure. Even with end users made responsible for most basic storage operations, the MSP will still require either more sophisticated reporting and utilization tools than ever before, or an API that can easily enable custom development of tools for reporting, planning, accounting, and optimization.
Second, CBS will need to bridge today's sprawling NAS silos and aggregate even geographically dispersed storage systems behind huge namespaces that can effectively virtualize file locations and provide multiple views of data. With this evolution of file virtualization, CBS solutions will not only abstract data location, but will also be able to present storage for pure file storage, present the same storage for virtual environments, control different levels of visibility for different users and applications, and even meet MSP reporting requirements. Moreover, this combination of file virtualization and global namespaces is only a beginning: CBS may extend file virtualization across any system by redirecting requests and API commands to heterogeneous storage systems at other service providers, or even within a customer's own data center. By virtualizing other data behind a cloud and a single API, customers may be able to ingest foreign data into a CBS solution with ease and extend CBS-based storage services to local data or data hosted across heterogeneous storage clouds.
Third, behind a large, partitioned, but flexibly presentable CBS solution, the MSP should look for data replication and movement capabilities. This will allow the MSP to manage storage seamlessly across locations while making service events, migrations, or other outages nearly transparent. Moreover, replication can be a key component of CBS infrastructures built for high availability, and encourage customers to host more mission-critical applications in the cloud.
Storage systems that deliver simplified management and extensible APIs, and that abstract the location of data, will radically change file storage for the service provider by easing management, simplifying high availability, and making migrations and maintenance activities transparent to end users.
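The combination of a global namespace with replication and data movement can be illustrated with a small sketch. `GlobalNamespace` and the site names are hypothetical, and a real system would move the data itself rather than just update a table. What the sketch shows is that clients keep using one stable logical path while the provider replicates, fails over, or migrates behind it.

```python
class GlobalNamespace:
    """Maps stable logical paths to replicas on (hypothetical) storage sites."""

    def __init__(self):
        self._replicas = {}  # logical path -> list of site names, primary first

    def place(self, path, site):
        """Record the initial site for a path."""
        self._replicas[path] = [site]

    def replicate(self, path, site):
        """Add a replica of the path at another site."""
        if site not in self._replicas[path]:
            self._replicas[path].append(site)

    def resolve(self, path, offline=()):
        """Return the first site holding the path that is not offline."""
        for site in self._replicas[path]:
            if site not in offline:
                return site
        raise LookupError(f"no available replica for {path}")

    def migrate(self, path, old_site, new_site):
        """Move a replica; the logical path clients use does not change."""
        sites = self._replicas[path]
        sites[sites.index(old_site)] = new_site
```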
User management and access control for cloud-based storage. While users expect flexible organizational tools for Web-based storage, service providers looking to build out cloud-based storage infrastructures will face challenges of providing storage for multiple types of customers and usage cases.
We see customer usage cases for Web-based storage varying between two extremes. On one end is bucket-like storage where individuals simply store data for access. At the other extreme are customers who will want to store complex data that may be shared in complex ways across large numbers of users, multiple groups of users, and/or between organizations. These customers will want granular control and easy configuration of a CBS solution to support these complex configurations. The MSP will require a framework for managing users and providing data visibility and access control that can support the most complex organizations and inter-organization relationships. Traditional user directories and file and directory access controls simply are not flexible enough for this task. Validated architectures such as extensible metadata tags at the file level or database-like file system overlays may be the only current paths to meeting these data presentation requirements. However, more innovation is sure to come. Both of these approaches are used in and validated by file virtualization and information classification and management (ICM) vendors today. Such technologies will allow organizations to structure multiple presentations of the same data to filter and control content for different types of users and serve as a foundation for enabling user-based access control and sharing of data through simple metadata tagging.
Altogether, MSPs should carefully consider whether the depth and flexibility of user partitioning, namespace, quota, and sharing features can support their potential service offerings and customers today and in the future.
Scalable storage. While the abstractions of CBS--including namespaces, APIs, and management tools--could possibly be layered over many different types of storage systems, the storage system requirements behind CBS are nonetheless unique. Cloud-based storage will impose new demands for scalability, performance, and ease of management on a scale that has not been seen before. Moreover, there are few markets as cost-sensitive as MSPs. Service providers should look for a cloud-based storage infrastructure that will scale indefinitely and linearly in both performance and capacity with extreme cost efficiency, be easy to manage at any scale, deliver high availability, and support wide geographic dispersal with sophisticated replication and data movement tools. A number of block and file storage systems exist today that can meet these requirements.
Extensibility. While APIs can provide a foundation for extending CBS service offers with upper layer services, in the context of how an MSP plans to build out a services portfolio, extensibility merits considerable evaluation. While we expect APIs to provide a foundation for accessing core storage functions, we also believe two other areas of extensibility will emerge. First, sophisticated APIs will allow a CBS storage infrastructure to trigger and interface with other Web services. In the first generation of solutions, this capability will create a plug-in-like architecture that will support the delivery of other services, such as proactive data classification as data is ingested. Second, MSPs should carefully evaluate how much more capability a CBS solution brings to the table beyond basic storage operations. Innovative features will build a foundation for data ingest, across-the-cloud data interaction, secure data locking, file versioning, and other capabilities.
Enormous change is on the horizon around how we work with stored data, and with emerging CBS solutions MSPs will provide new approaches to storage management, data presentation, and flexible storage architectures. Moreover, while service providers will be at the forefront of adopting these technologies, these changes will eventually influence the core practices of data storage across the entire industry, from small businesses to large enterprises. Driven by MSP requirements, the changes surrounding cloud-based storage will alter infrastructure systems drastically, as vendors attempt to look at data storage with a new perspective to address the core requirements.
While the number of vendors currently in this market is small--including EMC, Ibrix, and Nirvanix--other offerings will be available from a number of other vendors in the near future.
The message to the service provider is clear: You must have cloud-based technologies on your strategic radar to remain competitive. Whether it is a core technology or a technology that competes with your core technologies, the likelihood that CBS will influence your business is high. If you are an MSP, we recommend you keep an eye on the emerging vendors, track the rapidly changing capabilities of their solutions, and fully consider how they can be integrated into your services to drive new revenue, increase cost efficiency, and enhance the competitiveness of your business.
Jeff Boles is a senior analyst and director of validation services at the Taneja Group research and consulting firm.
******
Policies and extensibility in the cloud
Sophisticated and flexible optimization of files stored within a cloud-based storage (CBS) infrastructure will be a fundamental requirement, and in many solutions will be delivered by a policy engine interacting with a CBS solution's API.
Take, for example, a video stream that starts in obscurity and then becomes very popular. To deliver the streaming performance required for that video stream, a CBS infrastructure may need to create multiple copies of the video file and even distribute them geographically. If spare capacity and bandwidth exist, perhaps a CBS solution should recognize every video file and cache multiple copies ahead of time, to be used in case popularity increases.
As a non-video example, a service provider may have different costs or classes of service, varying in performance, protection, or availability. A customer may wish to have their data automatically tiered across different storage classes based on file age, utilization, or file type.
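The automatic tiering described here can be pictured as a small rule set evaluated per file. The Python sketch below is illustrative only: the tier names, the thresholds, and the `FileInfo` fields are assumptions made for the example and do not come from any particular CBS product or API.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    name: str            # file name, used only to read the extension
    age_days: int        # days since the file was last modified
    reads_last_30d: int  # simple stand-in for "utilization"


# Hypothetical storage classes and thresholds, chosen only for illustration.
ARCHIVE_TYPES = {"bak", "zip", "tar"}
HOT_READS = 1000
COLD_AGE_DAYS = 365


def choose_tier(f: FileInfo) -> str:
    """Return the storage class a file should live on under these example rules."""
    ext = f.name.rsplit(".", 1)[-1].lower() if "." in f.name else ""
    if f.reads_last_30d >= HOT_READS:
        return "performance"   # heavily used data stays on the fastest class
    if ext in ARCHIVE_TYPES or f.age_days > COLD_AGE_DAYS:
        return "archive"       # old or archive-type files move to the cheapest class
    return "standard"


if __name__ == "__main__":
    for f in [FileInfo("launch.mp4", 3, 25000),
              FileInfo("q3-2007.zip", 40, 2),
              FileInfo("notes.txt", 12, 7)]:
        print(f.name, "->", choose_tier(f))
```

Checking utilization before age and type means a suddenly popular old file is promoted rather than archived, which mirrors the video example above.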
But there is debate about how the policy engine should interact, and even whether that policy engine should reside outside of or within a cloud-based storage solution. We see three different types of solutions currently:
--Some solutions are coming to market where a sophisticated policy engine is built into and distributed throughout the solution;
--Some solutions are coming to market where there are efficient hooks within the API or even a different API for handing off data to third-party solutions that can perform an operation on data (such as classification based on content), and return instructions to the CBS solution to take action on the original file (such as moving the file to a different tier or locking the file against future change); and
--Some solutions will provide no more than a Web services API for access, and all policies will need to be supported in a web application business rules layer that interacts with the API.
Policy-based management of data will vary in importance depending on the presence of classes of service within a cloud, the size and distribution of storage resources behind a cloud, and the need to deliver lifecycle management or other complex services against customer data.
The key questions to consider are the following:
--Do the key data interactions meet your needs around the services you plan to offer--e.g., can it take action on video files, perform file and owner characteristic-based tiering, or perform other functions you require?
--Is the architecture scalable enough to service the number of files and users you anticipate? There may be significant performance differences based on where the policy engine is, and how it interacts with data. Moreover, depending on how it interacts with data, a heavily used policy engine may cut into total system performance.
--Can policies be applied with the granularity you require--e.g., if you are building a hosted, highly automated litigation support service on top of CBS, you may have tens of thousands of users that each demand complex policies with hundreds of rules. Can the CBS architecture support isolated but complex rule sets?
--Finally, are there any unique twists in your architecture that you require a policy engine to interact with, and can you customize the solution's policy engine to do so? Today, CBS is in its infancy. But in the near future, it is possible to imagine an application delivery network to cache or distribute content that is in high demand.
Depending upon your response to these questions, any one of the architectures may be the better fit.

BUSINESS BRIEFS January 23, 2009
-- Toshiba is reportedly in final negotiations to acquire most of the hard disk drive (HDD) operations of Fujitsu. An official announcement may come before the end of January, and various reports suggest a transaction price of 30 billion to 40 billion yen (roughly $334 million to $435 million). If the deal goes through, it would give the combined company a 31% share of the 2.5-inch notebook HDD market, 19% of the enterprise drive market, and 67% of the 1.8-inch HDD market, for a 17% overall share (behind Seagate and Western Digital), according to Needham & Company.
-- Adva Optical Networking and COLT -- a provider of data, voice and managed services -- have announced a long distance 8Gbps Fibre Channel service based on Adva's FSP 3000 card. The service provides a point-to-point link beyond 135 kilometers to connect two data centers over 8Gbps Fibre Channel.
-- DNF Security, a Dynamic Network Factory (DNF) business unit, announced a partnership with On-Net Surveillance Systems (OnSSI). DNF Security recently completed certification of its Seahawk IP storage platform with OnSSI's NetDVMS Network Video Recorder (NVR).
-- QLogic has formed the SignatureHPC channel program targeting resellers focused on high performance computing (HPC) clusters based on InfiniBand networks. The new program is an extension of the company's Signature Partner channel program for storage networking.
-- Overland Storage has signed up the Nordic division of international distributor Zycko. The reseller deal encompasses Overland's Snap Server NAS devices, NEO tape libraries, REO disk-based backup appliances, and Ultamus RAID arrays.
-- Super Micro Computer has chosen LSI's 6Gbps SAS RAID-On-Chip (ROC) and MegaRAID products for its next generation servers. LSI previously announced 6Gbps SAS design wins with Dell, Fujitsu Siemens, IBM, Intel, NEC and Sun.
-- Sunbelt Software has unveiled the Sunbelt File Archiver (SFA), a new enterprise-grade file archiving product. The SFA software blends electronic document management and file archiving and uses customer-defined rules to determine how files should be archived. Sunbelt File Archiver is available now and starts at a price of $695 for 25 employees with a sliding scale discount based on number of employees. A 30-day trial version of SFA is also available.
-- SiliconSystems has introduced tools that allow OEMs to measure how long solid-state disk (SSD) drives will last. The LifeEST tool is an endurance calculation methodology that is independent of how an SSD is actually used, and SiSMART is a technology that actively monitors SSD performance and measures usage in real time.

SNIA releases 'green storage' metrics
January 21, 2009 -- The Storage Networking Industry Association's (SNIA) Green Storage Initiative (GSI) group today released the first tools for classifying and measuring the energy consumption of storage systems.
The initial Green Storage Power Measurement Specification consists of two components – a Green Storage Taxonomy and the Idle Power Measurement Metric – both of which have been developed to provide classification and measurement guidelines for standards organizations, governmental agencies, storage vendors, and end users.
The Green Storage Taxonomy is used to classify storage products based on energy consumption characteristics and application environments, while the Idle Power Measurement Metric serves as a baseline standard that can be applied as a uniform method for collecting idle power consumption measurements.
Al Thomason, vice chair of the SNIA's GSI and system storage portfolio manager with IBM, says the taxonomy was designed to ensure apples-to-apples comparisons between storage devices.
"We have put a tremendous amount of effort into creating the storage taxonomy with a focus on system characteristics and usage segments, rather than technologies," says Thomason. "It allows direct comparisons. We want to make sure that these tools are useful and not used for the purpose of green-washing technologies."
The Green Storage Taxonomy groups storage devices based on feature criteria for the application environments that they are intended to support. The SNIA has categorized those environments into five classes of storage products, ranging from small home/office applications (SOHO) to large enterprise–oriented applications.
The storage system categories covered under the Green Storage Taxonomy are: online, near-online, removable media libraries, non-removable media libraries, infrastructure appliances, and infrastructure switches.
The feature criteria for each storage system class are based on the required level of data protection, component redundancy, serviceability, data access time, and range of energy consumption.
Once a device is classified, the Idle Power Measurement specification can be used to test energy efficiency. The specification, which is currently available for public review and comment, outlines a standard for testing and measuring storage power consumption in idle mode. The idle power measurements are reported in raw GB per watt (GB/W) based on manufacturer model number, raw storage capacity, storage media rpm, and interface.
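As a rough illustration of the reporting unit, raw GB per watt is simply raw capacity divided by measured idle power. The sketch below assumes that reading of the metric; the function name and the sample figures are hypothetical and are not taken from the SNIA specification.

```python
def idle_gb_per_watt(raw_capacity_gb: float, idle_power_watts: float) -> float:
    """Raw capacity (GB) divided by idle power draw (W)."""
    if idle_power_watts <= 0:
        raise ValueError("idle power must be positive")
    return raw_capacity_gb / idle_power_watts


if __name__ == "__main__":
    # Hypothetical array: 10,000 GB raw capacity drawing 500 W at idle.
    print(idle_gb_per_watt(10_000, 500))
```

A higher figure means more capacity is kept available per watt of idle power, which is why the taxonomy matters: the comparison is only meaningful between devices in the same class.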
According to recent SNIA research, servers, networks and storage systems draw about 46% of all data center power. Thomason says storage accounts for about 13% of overall power consumption. That percentage is sure to rise with the need for more storage capacity to accommodate data growth.
Members of the SNIA's GSI include 3PAR, AMD, Brocade, Copan, Dell, EMC, Emulex, Fujitsu, Hewlett-Packard, Hitachi Data Systems, IBM, Intransa, LSI, Microsoft, NetApp, Pillar Data Systems, QLogic, Quantum, Seagate, Sun, Symantec, VMware, XIOtech, Xyratex, and other companies and industry analysts.
The SNIA is also working with several groups and industry associations to make IT greener, most notably The Green Grid and the U.S. Environmental Protection Agency's Energy Star program.
There is still much work to be done. Thomason and the GSI have mapped out their plans for 2009 and the next items on the agenda are the development of active power measurement guidelines and metrics, standardized power supply efficiency specifications, and publication of each vendor's completed test metrics.
The group also plans to add power-related reporting capabilities to the Storage Management Initiative Specification (SMI-S) profile.
Related articles:
SNIA launches SSD initiative
SNIA UPDATE: XAM standard to address long-term archiving

Nexsan combines SAS, AutoMAID
January 22, 2009 -- Nexsan Technologies has once again combined its energy-saving AutoMAID (massive array of idle disks) technology with high-performance, SAS-based RAID storage to create the new SASBeast disk array.
The SASBeast, which is set to be released next week and is the company’s second offering in the SAS market, features up to 42 disk drives in a 4U enclosure for a total of 18.9TB of SAS capacity. The SASBeast also supports SATA drives, as well as 4Gbps Fibre Channel and 1Gbps iSCSI connectivity.
Each SASBeast array features four RAID engines and up to 4GB of cache per dual-controller configuration. The array supports multiple RAID sets and multiple volumes per set, and as many as 256 LUNs per controller. The SASBeast also supports hardware-based RAID 0, 1, 1+0, 4, 5 and 6.
As with its smaller counterpart, the 4.2TB SASBoy RAID array, AutoMAID technology is a standard feature on the SASBeast platform. AutoMAID gives users the option of choosing among three different levels of energy savings, ranging from 20% power savings with response times of less than 10 seconds to 60% energy savings with a 30-second response time.
Bob Woolery, Nexsan’s senior vice president of marketing, says even high-speed systems can be designed to save power. “You don’t have to give up energy efficiency for performance anymore. Systems are not always working 24 hours a day.”
With the SASBeast, Nexsan also introduced a new horizontal mid-plane design that promotes more efficient airflow, which helps the system run significantly cooler to improve energy savings, according to Woolery.
Nexsan plays in the tier 2 storage market where retaining and protecting unstructured data is the top concern. Woolery says the SASBeast meets the storage needs of transactional applications with high IOPS requirements and the ability to scale to hundreds of terabytes of capacity in a single rack.
Pricing for a SAS-only system starts at $38,000. The SASBeast costs approximately $4,200/TB of SAS capacity.
The SASBeast is available in a variety of configurations, including a full SAS system that can be populated with up to 42 drives. Intermixed drive configurations are available with up to 14 SAS drives and 28 SATA drives, or up to 28 SAS drives and 14 SATA drives.
Related articles:
Nexsan ships AutoMAID disk array
Nexsan enables archiving-as-a-service
Nexsan targets Apple Xserve market

Dot Hill ships 2.5-inch SAS arrays
January 19, 2009 – Dot Hill Systems today announced two entry-level RAID arrays equipped with small form factor (2.5-inch) SAS drives. The model 2722 with a 4Gbps Fibre Channel host interface is available now and the model 2522, with a 3Gbps SAS host interface, is expected to ship in April. Both RAID arrays come in a 2U, 24-drive configuration.
"The systems also support SATA, but our focus for 2.5-inch drives is on SAS because of the higher performance and high spindle count you can get in a 2U form factor," says Scott McClure, a product manager at Dot Hill. "SATA is for cheap and deep storage, and the cost per GB is still better in 3.5-inch SATA drives."
On the 2722, Dot Hill claims performance of up to 230,000 I/Os per second (IOPS) in a dual-controller configuration, and throughput of 1,200MBps on reads and 800MBps on RAID-5 write operations. In terms of IOPS, the arrays are 130% faster than Dot Hill's previous generation arrays based on 3.5-inch drives, and they consume 34% less power than the company's 2U, 12-drive subsystems. The RAID arrays are based on 1.8GHz AMD Mobile processors and are compatible with the 120MHz PCI-X bus.
Dot Hill also announced support for 2.5-inch 64GB and 160GB solid-state disk (SSD) drives from Intel (although users and integrators can load the arrays with SSDs from other vendors).
Using the company's 2U, 24-bay model 2122 JBOD expansion enclosures, the systems can support up to 96 drives for a total capacity of 28.8TB with 300GB SAS drives. Drive options include 10,000rpm or 15,000rpm SAS drives (from Seagate, Fujitsu or Hitachi) or 5,400rpm SATA drives. Software options include Dot Hill's AssuredSnap snapshots and AssuredCopy volume copy software.
Like Dot Hill's existing RAID arrays, the 2722 and 2522 are ruggedized and meet military and telecommunications industry standards (including NEBS Level III and MIL-STD-810F), and are based on the company's R/Evolution architecture with dual RAID controllers. A "unified LUN" feature provides the ability to mask the two controllers so they appear as one controller to the user interface, which makes it easier to set up a dual-controller configuration, according to McClure.
List pricing for a fully-populated 2722 starts at less than $16,000.
Market research firm IDC expects shipments of "performance-optimized" 2.5-inch drives to surpass shipments of 3.5-inch drives by 2011.

Lab Review: Hyper-scalability for virtual machine I/O
Using Virtual Service Clients and Virtual Service Providers, Microsoft’s Hyper-V creates a low-overhead environment for virtual machines that can provide a high degree of scalability on a state-of-the-art SAN.
By Jack Fegreus
With the increasing turbulence in the business environment, the forces driving IT activity and reshaping the data center have intensified. Across the corporate landscape, the mandate for peak operations efficiency now has IT redoubling its focus on slowing—if not reversing—rising labor costs by alleviating management complexity. It also brings into the corporate limelight the problem of rising power consumption costs in the data center.
For today's savvy CIO, the solution is not that difficult. Fundamental technology trends and sophisticated IT strategies, such as virtualization and IT service management, are converging to create a revolutionary transformation of the data center that essentially solves the problem through emphasizing the management and automation of services within a Virtual Operating Environment (VOE). The challenge for IT is to provide a sufficiently robust physical infrastructure that meets a broader spectrum of concerns that now go beyond cost, performance, and backwards compatibility.
QLogic’s portfolio of 8Gbps Fibre Channel infrastructure solutions helps meet that challenge via support for a SAN fabric that can meet evolving virtualization technologies and satisfy demands for simplified management, higher reliability, availability, and serviceability (RAS), and lower power consumption. What’s more, QLogic’s 8Gbps Fibre Channel infrastructure enables a VOE, such as Microsoft’s Hyper-V, to scale I/O for running demanding applications within virtual machines (VMs) and running greater numbers of VMs on VOE servers. In both scalability scenarios, physical VOE servers require scalable, high-throughput I/O pipes, backed with a Quality of Service (QoS) guarantee.
A two-edged sword
The conundrum for IT is that the same forces that drive the solution also drive the problems. Multi-core, multi-processor servers often have 16 CPUs. Servers with that magnitude of processing power, however, are a two-edged sword. Well-conceived IT plans for resource consolidation will dramatically cut runaway power and cooling costs. On the other hand, introducing these powerful servers with a weak consolidation scheme will raise, rather than lower, those environmental costs.
So too, Shugart's law, which drives down the cost of storage hardware, has a very similar effect on storage resources. With 750GB disk drives, multi-terabyte arrays join the family of commodity devices; however, multiple spare 750GB disk drives quickly degrade the level of resource utilization.
This makes a VOE essential for an aggressive resource consolidation initiative. With a VOE in place, one server easily takes on the workloads of multiple servers without disruptions in the way that applications run and are managed. At a branch office, a single multi-core multi-processor server can be used to consolidate file, print, web, database, and email servers. In so doing, IT can reduce the costs of system management, maintenance, and staffing.
Nonetheless, increasing the degree to which a site’s resources are virtualized also creates greater resource abstraction, and that too can increase complexity. To control a virtualized environment, IT administrators need a consistent, simplified tool set. With Hyper-V, they can reuse their knowledge of managing physical servers, with extensions that bring role-based, dynamic, self-managing systems to the new virtualized environment.
Beyond server consolidation, VOEs can vastly improve scalability by leveraging the mobility of VMs to automate load balancing while also improving traditional RAS concerns. With physical servers, IT must trade off processing slowdowns during peak periods against the higher capital and operating costs of provisioning servers for peak usage. On the other hand, when backed by a scalable, high-throughput SAN fabric, VMs can be automatically migrated to a more robust VOE server based on real-time processing loads.
Test configuration
We set up a Dell PowerEdge 1900 server with a quad-core Xeon processor and 4GB of RAM to anchor the infrastructure supporting our VOE. For a SAN fabric, we installed an 8Gbps QLogic QLE2560 HBA in the server and connected the HBA to a port on a QLogic SANbox 5802V switch. We then installed the 64-bit Datacenter edition of Windows Server 2008. The Datacenter edition provides for the unlimited licensing of VMs hosted under Microsoft’s 64-bit Hyper-V server virtualization platform. An integrated feature of the 64-bit version of Windows Server 2008, Hyper-V is available with the full releases (Standard, Enterprise, and Datacenter) of Windows Server 2008 and a thin Core Server release, which only provides a command line interface.
With quad-core processors and fast PCI-e buses for peripherals, commodity PC servers typically host four to eight VMs. In that kind of environment, an 8Gbps infrastructure can provide an immediate payback through more efficient utilization of storage resources. From the perspective of I/O scalability, a single-port, 8Gbps QLE2560 HBA can provide a Hyper-V server with enough I/O bandwidth to support four VMs. As such, IT avoids extra costs associated with provisioning multiple HBAs and switch ports for a VOE server.
While the primary rationale for server virtualization is simplified resource management, powerful secondary benefits arise from the ease with which IT can consolidate resources in a VOE from a physical environment where most server applications require a significant amount of I/O performance. As a result, a VOE server will need to exhibit significant performance and scalability. To provide a platform for multiple server applications, a VOE server must be able to handle the aggregate I/O required by all of the consolidated application servers.
I/O scalability and performance are particularly important for Hyper-V, which is at the center of an old controversy over virtualization architecture with regard to the handling of I/O. Hyper-V, like Xen, is built on a hypervisor model. Under that architecture, a VM deals with virtual devices and a microkernel passes I/O requests from virtual devices to real device drivers. Those drivers reside in the base Windows Server 2008 or Server Core installation, which is dubbed the Parent Partition in the Hyper-V architecture.
The result is an efficient I/O delegation scheme that has very low overhead at the VM. Under Hyper-V, the main delegation components are Virtual Service Clients (VSCs), which are at the bottom of I/O stacks in child partitions; Virtual Service Providers (VSPs), which invoke the actual device drivers in the Parent Partition; and the VMbus, which sends communications across partitions. While this is the preferred method to support I/O, Hyper-V also provides fully emulated virtual devices such as a Fast (10/100) Ethernet adapter and a virtual IDE adapter.
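The delegation path described above (VSC in the child partition, VMbus between partitions, VSP and device driver in the Parent Partition) can be pictured with a toy model. The Python sketch below is purely conceptual: the class and method names are invented for illustration and do not correspond to any Hyper-V API.

```python
class DeviceDriver:
    """Stands in for a physical device driver in the Parent Partition."""

    def submit(self, request: str) -> str:
        return f"completed:{request}"


class VirtualServiceProvider:
    """Parent-partition side: invokes the actual device driver."""

    def __init__(self, driver: DeviceDriver) -> None:
        self.driver = driver

    def handle(self, request: str) -> str:
        return self.driver.submit(request)


class VMBus:
    """Carries requests across partitions to the registered provider."""

    def __init__(self) -> None:
        self._providers = {}

    def register(self, device_class: str, vsp: VirtualServiceProvider) -> None:
        self._providers[device_class] = vsp

    def send(self, device_class: str, request: str) -> str:
        return self._providers[device_class].handle(request)


class VirtualServiceClient:
    """Child-partition side: sits at the bottom of the VM's I/O stack."""

    def __init__(self, bus: VMBus, device_class: str) -> None:
        self.bus = bus
        self.device_class = device_class

    def io(self, request: str) -> str:
        return self.bus.send(self.device_class, request)


if __name__ == "__main__":
    bus = VMBus()
    bus.register("storage", VirtualServiceProvider(DeviceDriver()))
    vsc = VirtualServiceClient(bus, "storage")
    print(vsc.io("read block 42"))
```

The point of the model is that the child partition never touches a device driver directly; it only hands requests to the bus, which is what keeps overhead at the VM low.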
Figure: We configured both the Hyper-V environment and each of the five VMs running within Hyper-V using the Hyper-V Manager. In particular, we associated SCSI adapters with the QLogic drivers for the QLE2560 by attaching logical disks connected to the host server via the QLE2560, but not placed online within Windows Server 2008.
The critical metric
To test the ability of QLogic’s 8Gbps infrastructure to support Hyper-V scalability with respect to I/O, openBench Labs needed to stress the I/O capabilities of the QLE2560 HBA and I/O delegation within the Hyper-V environment for multiple VMs and multiple logical disks. That made I/Os per second (IOPS) a critical metric for our benchmarks.
Our test scenario for VOE scalability represents an absolute worst case for exposing potential SAN and VOE bottlenecks in IOPS performance. We deliberately designed a SAN fabric topology that maximized stress on the QLE2560 HBA by converging eight 4Gbps data paths (four dual-ported 4Gbps HBAs) onto one port of the 8Gbps QLE2560 HBA. In such a topology, I/O bottlenecks that are associated with arrays built on mechanical drives, and that are not germane to SAN transport and VOE issues, would degrade the clarity of our test scenario. That made using a solid-state disk (SSD) array, such as the Texas Memory Systems RamSan, essential for our analysis.
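The degree of convergence in this topology can be expressed as an oversubscription ratio: the combined nominal bandwidth of the source paths divided by that of the single target port. The helper below is a minimal sketch using nominal link rates only; the function name is an assumption for illustration.

```python
def oversubscription_ratio(source_links: int, source_gbps: float, target_gbps: float) -> float:
    """Aggregate nominal source bandwidth divided by target port bandwidth."""
    return (source_links * source_gbps) / target_gbps


if __name__ == "__main__":
    # Eight 4Gbps paths converging on one 8Gbps port, as in the test topology.
    print(oversubscription_ratio(8, 4, 8))
```

With eight 4Gbps paths feeding one 8Gbps port, the sources can nominally offer four times what the port can carry, so the HBA port is the intended choke point.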
Using Hyper-V Manager, openBench Labs configured four identical VMs, each with one CPU, 756MB of RAM, and one external disk volume hosted by the RamSan SSD. We used these VMs to test I/O scalability when multiple VMs are running on a VOE server. We also configured a fifth VM (HV10), which had four CPUs, 3GB of RAM, and four external RamSan-hosted volumes, in order to test I/O scalability for a single VM.
For I/O, Hyper-V provides VMs with virtual SCSI and virtual IDE controllers. By default, Hyper-V requires a virtual IDE controller, which is an emulated device, for the VM boot disk. The VMbus mechanism does not exist at boot time for a VM, which makes it impossible to boot from a disk associated with a virtual SCSI controller. As a result, Hyper-V does not automatically configure a virtual SCSI controller.
What’s more, there is a distinct potential for compatibility issues associated with SCSI device drivers, which must reside in the Parent Partition. As a result, administrators must add virtual SCSI adapters manually via the Hyper-V Manager. Nonetheless, there is less overhead processing involved with a VSC-VSP pair. In addition, virtual SCSI controllers support deeper SCSI command queues with multiple outstanding SCSI commands on the bus at the same time. That means a virtual SCSI controller should provide measurably higher throughput performance within the VM.
Figure: We used Iometer to drive I/O throughput in scalability tests of Windows Server 2008 and Hyper-V. All I/O requests and throughput were measured at the single QLogic HBA port. With 4KB I/O requests and multiple logical drives, we reached I/O levels that surpassed the throughput capabilities of a 4Gbps HBA.
Getting physical
From the perspective of the Hyper-V Parent Partition, disks for VMs can be either container files, which represent fixed, differencing, or dynamically expanding disks, or raw off-line physical disks. Also dubbed pass-through disks, physical disks are nearly invisible to the Parent Partition and have no size limitation other than what is imposed by the VM’s operating system. What’s more, physical disks can easily be accessed by other physical servers as well as other VMs.
The natural affinity for device sharing makes physical disks a key Hyper-V option for IT sites with Fibre Channel or iSCSI SAN fabrics. For all benchmark testing, openBench Labs used a virtual SCSI controller and physical disks exported by the Texas Memory Systems RamSan SSD. In these tests, our principal concern centered on the number of IOPS that could be sustained, which provides a critical I/O health measure for a VOE. Our secondary concern was the measurement of I/O throughput, which provides the best insight into SAN fabric infrastructure bottlenecks.
In testing I/O scalability, we first wanted to determine the amount of I/O traffic that a single port on an 8Gbps QLogic QLE2560 HBA could sustain. Second, we wanted to determine how well Hyper-V could utilize the QLE2560 to scale I/O levels using multiple VMs. Finally, we wanted to determine if a single VM could scale to an I/O load level that would require the use of an 8Gbps QLE2560.
We began by setting up three test scenarios, which used the Intel Iometer benchmark to generate I/O requests. In all of our scenarios, we employed 8GB volumes exported on individual 4Gbps controllers from the RamSan SSD. On each volume, we limited the command queue to 30 outstanding I/O requests. We then split I/O between reads and writes in a 75-to-25 ratio. As a direct result of that split in reads and writes, read channel throughput would become saturated as total throughput for the QLE2560 approached 1,075MBps.
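The saturation point follows from the read/write split: on a full-duplex link, reads fill their direction first, so the total at which that happens is the one-direction rate divided by the read fraction. The sketch below takes 800 MBps as the usable one-direction rate of an 8Gbps Fibre Channel link, which is an assumption; the article does not state the figure it used.

```python
def total_at_read_saturation(per_direction_mbps: float, read_fraction: float) -> float:
    """Total (read + write) throughput at which the read direction is full."""
    if not 0 < read_fraction <= 1:
        raise ValueError("read_fraction must be in (0, 1]")
    return per_direction_mbps / read_fraction


if __name__ == "__main__":
    # Assumed 800 MBps per direction, 75% reads as in the test setup.
    print(round(total_at_read_saturation(800, 0.75), 1))
```

Under that assumption the read direction fills at a total of roughly 1,067 MBps, which is close to the saturation point quoted above.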
For a baseline test, openBench Labs focused on assessing the capabilities of the QLE2560. With our quad-core server running Windows Server 2008 without Hyper-V, we set up four 8GB disks on the RamSan and ran Iometer on one through four drives. During the benchmarks, we measured I/O throughput in terms of both IOPS and MBps.
Our next test scenario focused on assessing the scalability of the Hyper-V VOE in terms of running multiple VMs with the QLogic 8Gbps SAN fabric infrastructure. This is a key to maximizing server resources and achieving a high consolidation ratio. In this test scenario, we configured four identical VMs. We provisioned each VM with one CPU, 756MB of RAM, and one physical test disk from the RamSan connected to a virtual SCSI controller. We then ran Iometer on one through four VMs.
Our final test examined the ability of a single VM to scale in order to handle a large-scale application. In this test we configured a VM with four CPUs and 3GB of RAM. We configured the I/O subsystem for this VM in two ways: First we connected four test disks to one virtual SCSI adapter and then we connected each test disk to its own virtual SCSI adapter. Each VM can be configured with up to four virtual SCSI adapters. We then ran Iometer on the VM using one through four test drives.
Block stress
All of these tests were run using three different I/O block sizes: 4KB, 8KB, and 64KB. We started with 4KB I/O blocks, which are used by MS Exchange to support a high volume of email transactions based predominantly on short messages. With 4KB blocks, we placed maximum stress on both the VSC-VSP mechanism within Hyper-V to pass I/O requests and the QLogic QLE2560 I/O engine to maintain I/O traffic with four independent 4Gbps controllers connected to SSDs.
Running Windows Server 2008, IOPS performance increased about 40% with the addition of another worker process and a new test disk from the RamSan. With four workers and four disks from the RamSan, performance exceeded 200,000 IOPS and the total volume of data throughput exceeded 700MBps. That meant we had surpassed the data throughput level on reads that a single 4Gbps Fibre Channel port on our server would be able to support.
Running our multiple VM test, performance closely paralleled our Windows Server 2008 test for the first two VMs. When we added the third and fourth VMs, IOPS performance increased by about 20% each time. This brought performance with four VMs to just over 160,000 IOPS. Once again, for the read component of the data that we were moving, we had exceeded the capabilities of a 4Gbps HBA.
Following our multiple VM scalability tests, we ran tests on the scalability of a multiprocessor VM with multiple drives. With each disk given its own virtual SCSI adapter, I/O performance with a single VM with multiple drives scaled identically to the I/O scaling exhibited with multiple VMs. Nonetheless, with respect to IOPS, four drives on one virtual SCSI adapter did not scale as well.
Next we utilized random 8KB I/O blocks, which typify the I/O transactions found in database-driven applications. With 8KB blocks, we continued to put significant stress on both the VSC-VSP mechanism and the QLogic QLE2560 HBA to make transactions; however, we also doubled the data throughput.
[Figure: By increasing the I/O size to 8KB, which is the size used by typical database-driven applications, I/O throughput reached wire speed for an 8Gbps Fibre Channel SAN. As a result, our scalability tests began to converge on that limit.]
By doubling the amount of data per request, we crossed a threshold not crossed with 4KB requests: we saturated the read channel of the 8Gbps HBA. With just three drives in our initial Windows Server 2008 scalability tests, we exceeded 122,000 IOPS and reached a total throughput of 960MBps. As a result, adding a fourth disk produced only a 5% improvement in IOPS and total data throughput in MB per second. More importantly, all I/O—both reads and writes—coming from the QLogic HBA and measured at the Texas Memory array was perfectly balanced across all four logical disks.
We measured that same pattern in all of our VM scalability tests. With 8KB requests, the ultimate rate limiting factor was the 8Gbps HBA. As a result, we began to see our scalability tests converge to just under 129,000 IOPS and just over 1,000MBps of data throughput.
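The IOPS and throughput figures are two views of the same measurement. The sketch below converts one to the other, assuming 1 MB = 1,024 KB; the article does not state its unit convention, so treat that as our assumption.

```python
def throughput_mbps(iops: float, block_kb: float) -> float:
    """Convert an I/O rate to data throughput, assuming 1 MB = 1,024 KB."""
    return iops * block_kb / 1024

print(round(throughput_mbps(129_000, 8)))  # 1008
print(round(throughput_mbps(200_000, 4)))  # 781
```

Under that assumption, 129,000 IOPS at 8KB works out to just over 1,000MBps, and 200,000 IOPS at 4KB lands above the 700MBps mark quoted for the 4KB tests.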
[Figure: When we viewed Iometer performance with four VMs running under Hyper-V and issuing 8KB I/O requests to a dedicated logical drive on the RamSan, we observed that I/O from the QLogic HBA was perfectly balanced across the four ports connected to the four logical drives.]
Test conclusions
In our final tests, we used 64KB I/O blocks, which are found in business intelligence applications such as on-line analytical processing (OLAP), data mining, and data warehousing. I/O in these applications is the antithesis of I/O in messaging applications: There is a limited number of users, and the speed at which large volumes of data can be moved dominates in importance. Data throughput now totally dominated our tests, and the results were identical in all cases: two drives or two VMs were enough to saturate reads on our 8Gbps fabric at wire speed.
The number of IOPS sustained in all of our tests clearly indicates that a Hyper-V VOE based on an 8Gbps SAN infrastructure is able to scale and support a high number of VMs, which will easily provide for a high consolidation ratio. Equally important, the scalability that this infrastructure provides a VM enables the hosting of the most I/O-intense applications.
Jack Fegreus is CTO of openBench Labs.
********
openBench Labs Scenario
UNDER EXAMINATION
8Gbps SAN infrastructure
WHAT WE TESTED
QLogic QLE2560 8Gbps HBA and SANbox 5802 Fibre Channel switch
HOW WE TESTED
Dell PowerEdge 6850 Server
-- (4) 3.8GHz Xeon CPUs
-- 8GB RAM
-- Windows Server 2008
-- Hyper-V
-- Multiple VMs running Windows Server 2008
BENCHMARK
-- Iometer
KEY FINDINGS
-- Near linear IOPS scalability for Windows Server 2008.
-- The QLogic 2500 Series HBA supported IOPS loads of 200,000 with 4KB I/O and full-duplex throughput at wire speed.
-- Single Hyper-V VM benchmarked at 145,000 IOPS using 4KB random read requests per second.
-- With four VMs simultaneously reading and writing data, Hyper-V sustained 159,000 4KB I/O requests per second, which represented a throughput rate 15% greater than the throughput sustainable with a 4Gbps HBA.


-
-
Nexsan ships AutoMAID disk array
By Kevin Komiega
July 21, 2008 -- Nexsan Technologies today introduced the DATABeast, a storage system capable of scaling to 4PB of capacity and designed to save power with AutoMAID (Automatic Massive Array of Idle Disks) technology.
The DATABeast stores up to 336TB per 42U enclosure and can be configured with SAS and/or SATA drives. The system includes 4Gbps Fibre Channel and NAS interfaces for simultaneous block and file access. It also includes tools for thin provisioning, storage pooling, tiering, virtualization, mirroring, snapshots, and replication.
AutoMAID technology reduces energy use without forcing users to choose between power savings and application performance.
Bob Woolery, Nexsan's senior vice president of marketing, says the common on/off approach to MAID can compromise performance, while Nexsan's techniques function more like a dimmer switch.
"While traditional MAID technology is energy-efficient, it can take several minutes to bring those systems online," says Woolery. "Our approach allows users to select the level of energy savings they want without affecting performance."
AutoMAID gives users the option of choosing among three different levels of energy savings, each with its own performance metrics. The first level offers 20% power savings with response times of less than 10 seconds, and once the first file is accessed the disks perform at full speed. For level two, AutoMAID slows down the drives to about 4,000rpm for energy savings of up to 40% and response times of 15 seconds or less. The third level puts drives into what Woolery calls a "light sleep," yielding up to 60% energy savings. Once the drive comes out of its level-three nap, the first I/O transaction takes 30 seconds or less.
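The three levels amount to a trade-off table between energy savings and first-access delay. The sketch below encodes the quoted figures and picks the deepest level that fits a tolerable delay. The class, the list, and the function are hypothetical illustrations, not Nexsan's interface.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class MaidLevel:
    level: int
    max_savings_pct: int  # energy savings quoted for the level
    max_response_s: int   # upper bound quoted for the first access

# Figures as quoted in the article; the structure itself is hypothetical.
LEVELS = [
    MaidLevel(1, 20, 10),
    MaidLevel(2, 40, 15),
    MaidLevel(3, 60, 30),
]

def deepest_level(tolerable_delay_s: float) -> Optional[MaidLevel]:
    """Pick the level with the highest savings whose first-access delay fits."""
    fitting = [lvl for lvl in LEVELS if lvl.max_response_s <= tolerable_delay_s]
    return max(fitting, key=lambda lvl: lvl.max_savings_pct, default=None)

print(deepest_level(20).level)  # 2
print(deepest_level(5))         # None
```

An application that can wait 20 seconds for its first I/O would land on level two. One that cannot tolerate even a 10-second delay gets no level at all, which corresponds to leaving the drives at full speed.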
Nexsan categorizes the DATABeast as having Tier-1 features and scalability at Tier-2 pricing. Woolery claims that competing systems with similar management, data protection and scalability features can run from $5,000 to $10,000 per terabyte, while the DATABeast ranges from $1,800 to $2,700 per terabyte, depending on configuration. An approximate starting price for the system is around $200,000, according to Woolery.
Related articles:
Nexsan ships SAS array with MAID
How green is your storage?
Nexsan targets Apple Xserve market


-
-
Nexsan enables archiving-as-a-service
By Kevin Komiega
October 6, 2008 -- In an effort to make online archiving more cost-effective for managed service providers (MSPs), Nexsan today announced the availability of Assureon 6.0, a new version of its archiving platform designed to alleviate the security, management, cost, and energy consumption concerns associated with content-addressable storage (CAS) archives.
Bob Woolery, Nexsan's senior vice president of marketing, says online archiving services have been hindered because users don't want their files co-mingling with anyone else's, which forces MSPs to deploy multiple archiving systems.
"Our new software architecture enables archive-as-a-service because it allows providers to guarantee that they can separate different organizations' data in a single archiving system, which lets them pool and share storage versus installing separate archiving systems for individual customers," Woolery says.
The Assureon 6.0 software architecture can virtualize a system into an unlimited number of physically secure archives, according to Woolery, making it a viable archiving platform for hosted storage service providers.
The system scales to support multiple virtual file systems, creating independent CAS archives within one consolidated, or federated, archive. Woolery says this capability provides physical separation of individual customers' data.
The Assureon 6.0 technology is embedded in 2U servers and attaches via Fibre Channel to Nexsan's SATABoy, SASBoy, and SATABeast disk arrays. The archive can subsequently take advantage of the power-saving features of Nexsan's arrays and AutoMAID technology.
AutoMAID places unused data into progressively more idle states in order to save energy. It gives users the option of choosing among three levels of energy savings, each with its own performance metrics. The first level offers 20% power savings with response times of less than 10 seconds, and once the first file is accessed, the disks perform at full speed. For level two, AutoMAID slows down the drives to about 4,000rpm for energy savings of up to 40% and response times of 15 seconds or less. The third level puts drives into what Woolery calls a "light sleep," yielding up to 60% energy savings. Once the drive comes out of its level-three nap, the first I/O transaction takes 30 seconds or less.
Assureon 6.0 also includes digital fingerprinting technology that tracks the chain of custody for each file for compliance purposes.
Nexsan has not disclosed pricing for Assureon 6.0.
Related articles:
Nexsan ships AutoMAID disk array
Nexsan ships SAS array with MAID
Nexsan targets Apple Xserve market


-
-
Do you really need a SAN?
Application-centric storage models challenge the dominance of SAN architectures.
By Andrew Reichman
February 10, 2009 -- Most IT organizations today struggle with high costs and complexity related to their enterprise storage needs. While the industry dogma for the past several years has pointed to use of a SAN for best application performance, data protection and consistency of management, the experience of most organizations has been less than satisfactory.
Now, a number of application vendors are proposing to take on more storage management tasks within their code base, aiming at delivering simpler, cheaper storage. While applications have a lot of ground to cover to gain parity with products offered by dedicated storage vendors, there are significant potential benefits if they get it right.
The problem
A combination of technology inflexibility and organizational dysfunction has led most enterprise storage systems down a path of inefficiency. SAN-based storage was supposed to deliver the following:
• Reduced hardware acquisition costs through increased utilization
• Less complexity through consistent management
• Better performance through increased aggregate spindle count
• Improved data protection through array-based replication and backups
Unfortunately though, the reality is that many of these goals have not been accomplished. Forrester Research interacts with hundreds of enterprises every year and the vast majority of these organizations have little good to say about their storage environments. Firms consistently identify storage as among the costliest and most complex areas of IT service delivery, and cite the following problems as specific woes:
• Low aggregate utilization leads to high cost of infrastructure
• Limited workload-sharing creates islands of stranded storage capacity
• Vendor heterogeneity limits compatibility of infrastructure components
• Block storage devices have limited context of information, hindering tiering and archiving initiatives
Application-centric storage is emerging as a possible alternative. Managing storage directly within the application could break down some of the organizational barriers that limit efficiency. Storage experts that report to application teams would be better in tune with forecasting growth and could streamline the provisioning process. Managing to defined SLAs would be easier as well if the infrastructure and application staff were more closely aligned. Applications are likely to have better success with tiering and archiving since they have more context about the value of information at any given time than do block storage systems. Configuration could become less complex since the hardware and software would be more tightly coupled, placing less reliance on architecture experts.
Finally, there is a potential for application software to manage cheaper hardware composed of industry standard components, compared to custom ASIC-based storage hardware that is far more expensive.
Instead of striving in vain for a consistent pool of efficient storage resources, building silos around applications that have the native ability to manage the resources themselves could be a viable option, and key application vendors are moving in this direction. Some of the emerging options include the following:
• Oracle offers branded storage hardware under the Exadata label. Using its Automatic Storage Management (ASM) feature to control the flow of data between application servers and storage servers provided by technology partner HP, the Exadata system is a hybrid of software and industry-standard hardware that Oracle will sell directly.
• Microsoft recommends DAS for Exchange 2007. Exchange has long had a best practice for configuration that precluded adding other workloads to the storage array it runs on, which has served to put it on an island. Now, with the addition of Cluster Continuous Replication (CCR), the application can manage its own high availability and replication.
• VMware offers native storage management capabilities. The virtualization market leader continues to add storage features to its application stack. For example, VMFS offers volume management capabilities within virtual machine management, and VMware recently announced that vStorage includes native thin provisioning and other key storage features. While many of the storage features within VMware leverage SAN array capabilities, control and management are substantially shifted to the application realm.
Clearly, the concept of application-managed storage represents a significant shift from the current understanding of maximum efficiency and capability in the enterprise, and a move to such a model would not be without pain. Enterprise storage changes very slowly, with new architectures having to fit refresh cycles to avoid disruptive and costly forklift upgrades. IT purchasers are notoriously conservative, so significant momentum will need to be present before a majority will move in this direction. Most significantly, the capabilities of the applications will need to continue to improve, because such a shift cannot happen at the expense of application performance or data availability. In the current economic times, it will be incumbent on both pure storage vendors and the application vendors that may unseat them to demonstrate that their solutions are the most effective choice, both operationally and economically. Time will tell whether this becomes a true sea change in the enterprise or just a minor blip in the current evolution of enterprise storage.
ANDREW REICHMAN is a senior analyst at Forrester Research. To receive free related research from Forrester, visit the Forrester Research campaign registration page.


-
-
6Gbps SAS


-
-
The great convergence: FCoE and Enhanced Ethernet
Ratification of the standard is expected this year, with mainstream adoption beginning next year. Here's why Fibre Channel over Ethernet makes sense for enterprise data centers.
By Christine Taylor
February 16, 2009 -- The major driver for Fibre Channel over Ethernet (FCoE) is to marry the massive economics of Ethernet to huge corporate investments in Fibre Channel. The advantages of unification are significant in the data center – reducing equipment, leveraging Fibre Channel, and centralizing storage.
FCoE could be of real benefit to large enterprise data centers. These entities pour billions of dollars per year into Fibre Channel storage, and being able to leverage that investment over 10Gbps Enhanced Ethernet is a very attractive proposition.
FCoE also makes a particularly compelling argument for applying Fibre Channel SAN storage to high-speed, short-range networks such as blade server backplanes and virtualized servers that are commonly found at the data center edge.
FCoE at-a-glance
The concept of the FCoE standard is a simple one: FCoE enables Fibre Channel frames to run over a 10Gbps Enhanced Ethernet LAN segment, enabling converged networks.
FCoE does not try to be everything to everyone, as smaller environments with Ethernet-only storage do not require FCoE connectivity. For Fibre Channel users, however, FCoE provides the ability to extend Fibre Channel storage from the data center core to its edge.
This scenario requires two related protocols: first is FCoE itself, a transport standard that enables native Fibre Channel frames to run over Ethernet. The second is Enhanced Ethernet, which FCoE requires in order to transport Fibre Channel over Ethernet.
The FCoE standard enables Fibre Channel traffic to run across multiple Enhanced Ethernet LAN segments within the same Layer 2 bridging domain. It supports SAN management domains by maintaining logical Fibre Channel SANs across a 10Gbps Enhanced Ethernet segment. FCoE enables Fibre Channel frames to run with no performance degradation and without making any changes to the frames.
Enhanced Ethernet - also called Converged Enhanced Ethernet (CEE), Data Center Ethernet or Data Center Bridging (DCB) - eliminates Layer-3 TCP/IP protocols in favor of native Layer-2 Ethernet. Traditional Ethernet commonly experiences network congestion, latency and frame dropping, which renders it unreliable for Fibre Channel traffic. However, 10Gbps Enhanced Ethernet changes this by dispensing with TCP/IP in favor of a "lossless" Ethernet fabric.
The lossless environment's basic requirements are Priority Flow Control (priority pause), ETS (scheduler), and the discovery protocol. (Congestion management is attractive but optional.) These capabilities allow the Fibre Channel frames to run directly over 10Gbps Ethernet segments with no performance degradation.
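The scheduler's role can be illustrated with a simple bandwidth-sharing model: each traffic class is guaranteed a share of the link, and bandwidth that one class leaves unused is offered to the others. The sketch below is a static model of that idea only, not the IEEE scheduler itself. The class names, share values, and function name are all assumptions for illustration.

```python
def ets_allocate(link_gbps, shares, demands):
    """Split link bandwidth among traffic classes.

    shares:  {class: guaranteed percentage of the link}, positive values
    demands: {class: offered load in Gbps}
    Each class receives up to its guaranteed share. Bandwidth a class does
    not use is re-offered to classes that still want more, in proportion
    to their shares.
    """
    alloc = {c: 0.0 for c in shares}
    remaining = float(link_gbps)
    active = {c for c in shares if demands.get(c, 0) > 0}
    while active and remaining > 1e-9:
        total_share = sum(shares[c] for c in active)
        spare = remaining  # bandwidth on offer in this round
        for c in sorted(active):
            offer = spare * shares[c] / total_share
            take = min(offer, demands[c] - alloc[c])
            alloc[c] += take
            remaining -= take
            if demands[c] - alloc[c] <= 1e-9:
                active.discard(c)  # demand met; stop offering bandwidth
    return alloc

shares = {"fcoe": 50, "lan": 30, "mgmt": 20}  # hypothetical percentages
light = ets_allocate(10, shares, {"fcoe": 8, "lan": 1, "mgmt": 0.5})
busy = ets_allocate(10, shares, {"fcoe": 8, "lan": 6, "mgmt": 0.5})
print(light)
print(busy)
```

In the first case the storage class borrows bandwidth the LAN and management classes are not using and gets its full 8Gbps. In the second, the link is oversubscribed, so storage and LAN traffic split the leftover bandwidth in proportion to their shares.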
A question of standards
Neither the FCoE nor the Enhanced Ethernet standard is complete yet, but major networking and system vendors have agreements in place and are actively qualifying their products anyway. This is not surprising because most of these vendors belong to both the T11 (FCoE) and IEEE (Enhanced Ethernet) standards groups, so they are in a good position to reach agreements as ratification winds its way through the standards process. Of the two standards, FCoE is farther along and should be ratified later this year.
Enhanced Ethernet will take longer, as the IEEE is not exactly known for its lightning-fast ratification speed. Meanwhile, the vendor community has agreed upon the Ethernet standard they will submit to IEEE and will use it to implement the initial commercialized versions of FCoE products. FCoE-enabled networking components are shipping now, with OEM qualifications expected within a few months.
The lack of ratification and the resulting basic integration levels will keep FCoE to the data center edge and non-critical server environments. But in the corporate data center most mission-critical servers are storing to Fibre Channel with a direct port connection anyway and do not require FCoE or Enhanced Ethernet to do so. There are advantages to using FCoE and Enhanced Ethernet, even in mission-critical servers, but for now the data center can leave well enough alone as the standards are ratified and interoperability improves. (Enhanced speed, unified I/O and reducing redundant servers and cabling are attractive to the data center core if the protocols are stable.)
New equipment
The equipment needed to support these protocols includes 1) Enhanced Ethernet switches to provide 10Gbps Enhanced Ethernet, 2) converged network adapters (CNAs) that support both Ethernet and Fibre Channel, and 3) an FCoE forwarder that performs the stateless encapsulation/de-encapsulation function. The FCoE forwarder is significantly lighter weight than a gateway. A gateway, such as an iSCSI-to-Fibre Channel gateway, must terminate one protocol (iSCSI) and initiate another (Fibre Channel). That approach requires two sessions: one from the initiator to the gateway, and one from the gateway to the target. With FCoE, there is one Fibre Channel session from the FCoE initiator to the Fibre Channel target.
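Stateless encapsulation can be sketched in a few lines. The Fibre Channel frame is carried unchanged inside an Ethernet frame and recovered without any session state. The layout below is simplified (no VLAN tag, no Ethernet FCS). The 14-byte FCoE header and 4-byte trailer follow our reading of the T11 FC-BB-5 layout and should be treated as assumptions, as should the delimiter codes in the example. The EtherType 0x8906 is the value assigned to FCoE.

```python
import struct

FCOE_ETHERTYPE = 0x8906  # EtherType assigned to FCoE
ETH_HEADER_LEN = 14      # destination MAC + source MAC + EtherType
FCOE_HEADER_LEN = 14     # version/reserved bits plus SOF (assumed layout)
FCOE_TRAILER_LEN = 4     # EOF plus reserved bytes (assumed layout)

def encapsulate(dst_mac: bytes, src_mac: bytes, fc_frame: bytes,
                sof: int, eof: int) -> bytes:
    """Wrap an unmodified Fibre Channel frame in an Ethernet frame."""
    eth_header = dst_mac + src_mac + struct.pack("!H", FCOE_ETHERTYPE)
    fcoe_header = bytes(FCOE_HEADER_LEN - 1) + bytes([sof])
    fcoe_trailer = bytes([eof]) + bytes(FCOE_TRAILER_LEN - 1)
    return eth_header + fcoe_header + fc_frame + fcoe_trailer

def decapsulate(eth_frame: bytes) -> bytes:
    """Recover the original Fibre Channel frame; no session state is kept."""
    (ethertype,) = struct.unpack("!H", eth_frame[12:14])
    if ethertype != FCOE_ETHERTYPE:
        raise ValueError("not an FCoE frame")
    return eth_frame[ETH_HEADER_LEN + FCOE_HEADER_LEN:-FCOE_TRAILER_LEN]

# Illustrative values only: made-up MAC addresses, a dummy 36-byte frame,
# and placeholder start/end-of-frame delimiter codes.
fc_frame = bytes(range(36))
wrapped = encapsulate(b"\x02" * 6, b"\x04" * 6, fc_frame, sof=0x2E, eof=0x41)
unwrapped = decapsulate(wrapped)
print(unwrapped == fc_frame)  # True
```

Because each frame is wrapped and unwrapped independently, the forwarder never has to terminate a session, which is the contrast with a gateway drawn above.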
FCoE requires some new investment in equipment, but the new products are not dedicated to FCoE. The Enhanced Ethernet switches will share 10Gbps Ethernet with all other Ethernet traffic, while the CNAs will provide the functionality of HBAs with additional FCoE connectivity, so the cost will not be that much more than what the enterprise is spending now on enterprise storage resources. Wider adoption rates should also bring down initial costs. There are no additional costs to using FCoE with Fibre Channel because the SAN's hardware, software and operations remain unchanged. And Enhanced Ethernet benefits not only FCoE but also Ethernet traffic by offering isolated traffic classes, lossless transmission, and 10Gbps speeds.
Managing converged fabrics
FCoE adoption will not be without its challenges. Equipment cost is not a huge factor, but troubleshooting may be. Solving errors is more straightforward in a dedicated network than a converged network, so having the ability to separately manage converged fabrics on the same physical pipe will be extremely important. VLANs are the likely solution for this issue, and will allow storage administrators to separately manage Fibre Channel as the Ethernet administrators manage Enhanced Ethernet. FCoE-enabled 10Gbps switches will replace separate LAN switches and Fibre Channel directors.
VLANs will also allow Fibre Channel administrators to retain existing operational procedures. Even so, we expect some practices overlap and management issues as FCoE is dependent on the Ethernet segment and its resources. Also, servers hosting mission-critical applications generally have large numbers of Ethernet and Fibre Channel ports, more than can reasonably be converged into a redundant pair of 10Gbps ports. Edge deployments, with fewer ports, will benefit more from convergence.
FCoE and the data center
CNAs are in corporate testbeds now. Customers' primary interest in FCoE centers on intensive computing environments that are located in the data center, but are not attached to the SAN. FCoE is about extending and leveraging existing Fibre Channel resources to these environments. Together with Enhanced Ethernet, FCoE provides three important benefits to data center administrators: It 1) enables them to replace direct-attached storage (DAS) with existing centralized storage, 2) leverages Fibre Channel investment because administrators do not have to purchase a separate iSCSI SAN, and 3) delivers 10Gbps Enhanced Ethernet networking to high performance environments.
These high performance environments face several issues that FCoE and Enhanced Ethernet are positioned to solve. These issues include the prevalence of DAS, the perceived need to deploy iSCSI SANs to centralize storage, and a large tangle of energy-consuming cables and redundant equipment. The majority of testbed deployments exist for these reasons, and we expect early adoptions in 2009 to remain in these types of environments. As deployments prove stable and the standards are ratified, we will see FCoE and Enhanced Ethernet move into mission-critical applications within a few years. Mainstream adoption will lag a year or two behind the early adopters, but given the very strong interest in FCoE and the benefits of FCoE and Enhanced Ethernet, we should see mainstream testbed deployments in 2009 and edge adoptions in 2010.
There will certainly be organizational and budgetary challenges. Still, we believe that the advantages of centralizing storage on existing Fibre Channel SANs outweigh those issues.
ROI #1: Centralize storage for intensive server environments. Blade and virtualized servers at the data center edge traditionally store data to DAS. However, a converged Ethernet fabric based on Enhanced Ethernet and FCoE provides wide bandwidth, high speed, and access to Fibre Channel SANs. Extending Fibre Channel storage to these I/O-intensive, high-performance environments allows IT to eliminate inefficient DAS and to leverage existing Fibre Channel SAN, instead of purchasing iSCSI SANs. Enhanced Ethernet benefits I/O-intensive environments, as well, because they require massive amounts of bandwidth and speed.
ROI #2: Reduce complexity and data center build-out by consolidating servers. Another early use of FCoE will be in server and network consolidation. Typical server clusters in data centers have five to seven I/O interfaces for different networks and redundant builds. FCoE and Enhanced Ethernet unify I/O through multi-protocol switches and host-based CNAs, allowing IT to sharply reduce the number of network devices, server-to-network interfaces, and cables that now interconnect the clusters. The number of interfaces shrinks down to, for instance, two 10GbE ports, two cables, and two switch ports.
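A quick calculation shows the scale of the reduction. The sketch below assumes a hypothetical cluster of 20 servers and one cable per interface. Only the five-to-seven and two-port figures come from the article.

```python
def cables_saved(servers: int, interfaces_before: int,
                 interfaces_after: int = 2) -> int:
    """Cables (and matching switch ports) removed by converging I/O."""
    return servers * (interfaces_before - interfaces_after)

# Hypothetical 20-server cluster, converged to two 10GbE ports per server
for before in (5, 7):
    print(before, cables_saved(20, before))
# prints "5 60" then "7 100"
```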
Another consolidation advantage is that FCoE-enabled CNAs provide a standardized method of Fibre Channel SAN connectivity, which simplifies physical architecture and provisioning. As with I/O connectivity, there is no need to locate available Fibre Channel services, since all data center servers will have the CNAs, which enable IT or policy-driven operations to provision Fibre Channel services at will.
ROI #3: Help achieve energy-efficient data centers. Consolidation and network unification also reduce energy costs related to networking and storage in the data center. In general, networking is not as large an energy consumer as storage, but a converged fabric will still yield energy savings by reducing the number of cables, interfaces and redundant servers in the data center. More significantly, by using FCoE to extend Fibre Channel storage to more data center servers, administrators avoid adding energy-hungry disk arrays. By centralizing storage on the SAN instead of using multiple DAS and iSCSI arrays, IT can significantly reduce the amount of power the arrays require, the rack space they take up, and the cooling they need.
ROI #4: Consistent SAN connectivity. CNAs enable dynamic Fibre Channel SAN configuration on Ethernet servers. There is no need to configure additional connections to a separate Fibre Channel port from the server. Fibre Channel connectivity also replaces inefficient DAS architectures. This simplifies physical architecture and provisioning, and does not require the equipment soup required by multiple connections to different fabrics. It also enables storage administrators to efficiently manage storage resources through Fibre Channel instead of Fibre Channel plus iSCSI and/or DAS.
Even though FCoE and Enhanced Ethernet are not yet ratified, they are buoyed by vendor agreements and qualifications. Ratification will be important as it will enable deeper integration and interoperability. This will allow data center administrators to trust FCoE and Enhanced Ethernet in the core data center environment as well as in edge-based virtualized and blade servers. But in the meantime, there is enough vendor support and enterprise interest to bring commercialized products to early adopters this year, and to the mainstream next year. Current FCoE equipment providers include Brocade, Cisco, Emulex and QLogic, but most storage and Ethernet vendors are deeply involved at the working group and integration levels.
In the meantime, InfiniBand vendors are sensing an opportunity to expand beyond their high performance computing (HPC) niche. With just a few vendors involved (relative to the packed Ethernet field), InfiniBand changes are far easier to ratify, and InfiniBand already provides much of the lossless environment that Enhanced Ethernet will provide.
FCoE/EE proponents should keep on pushing commercialization and standardization if they want to see a major data center market open up for the new protocols. At this point, we still do expect to see fully ratified FCoE and Enhanced Ethernet standards converging Fibre Channel and Ethernet for increased performance, greater data center density without spiraling energy costs, and lower capital and maintenance costs. We believe that FCoE champions and their interoperability partners will play a large part in achieving the converged data center of 2009 and beyond.
CHRISTINE TAYLOR is an analyst with The Taneja Group research and consulting firm.


-
-
Availability of GridBank Announced by Tarmin Technologies
Tarmin Technologies, a provider of active archive, content addressable storage (CAS), information life-cycle management (ILM) and intelligent storage software, announced on Thursday (12 February) general availability of its GridBank product, an enterprise-class active archiving and next-generation intelligent storage software.
According to the company, GridBank enables organisations running Windows, Linux, Solaris, HP-UX, AIX or virtual servers (VMware or Hyper-V) to reduce their storage Total Cost of Ownership (TCO). The solution reduces the cost and complexity of retaining, managing and securely accessing unstructured data, while satisfying compliance requirements by ensuring secure, long-term data retention, e-discovery support and fast search and retrieval of business records. In addition, it delivers automation to increase productivity and reduce storage and IT costs.
The product utilises affordable industry-standard, heterogeneous server and storage platforms to form a grid-based active archive and intelligent, scalable storage solution, enabling long-term, fixed-content data preservation on cost-effective secondary and tertiary storage tiers.
No pricing details were disclosed.

