Storage Basics in Cloud Computing

2024-12-07

5.3.1 DAS

DAS refers to connecting an external storage device directly to a server through a cable, as shown in Fig. 5.11. A server with directly attached external storage is structured much like a personal computer: the external data storage devices are connected directly to the internal bus using SCSI or Fibre Channel (FC) technology, and they form part of the overall server structure. In this arrangement, the data and the operating system are often not separated. DAS is a direct connection that meets the storage expansion and high-performance transfer needs of a single server, and the capacity of a single external storage system has grown from a few hundred gigabytes to several terabytes or more. With the introduction of high-capacity hard drives, the capacity of a single external storage system will continue to increase. In addition, DAS can be used to build a two-node, highly available system based on a disk array to meet data storage requirements for high availability. Judging by the trend, DAS will continue to be used as a storage mode.

DAS was the first storage technology adopted in open systems and has been in use for nearly 40 years. As in a personal computer, DAS attaches external data storage devices directly to the bus inside the server, so the storage is part of the server structure. Because the storage devices hang directly off the server, however, growing demand adds more and more storage devices to the network environment, and the server becomes a system bottleneck. The result is low resource utilization and severely restricted data sharing. As user data continues to grow, these systems increasingly burden system administrators with problems of backup, recovery, scaling, and disaster preparedness. DAS is therefore suitable only for small networks.

DAS relies on the server host operating system for data reads and writes and for storage maintenance and management. Data backup and recovery consume server host resources (including CPU and system I/O), because the data must flow back through the host before it reaches the hard disk or tape drive connected to the server. Data backup typically consumes 20–30% of the server host's resources. As a result, many enterprise users run daily backups late at night or when business systems are not busy, so that the backups do not interfere with normal operations. The greater the amount of data in DAS, the longer backup and recovery take, and the greater the dependency on, and impact on, server hardware.
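The link between data volume and backup time described above can be illustrated with simple arithmetic. The sketch below assumes a constant sustained throughput and decimal units (1 GB = 1000 MB); the function name `backup_hours` and all figures in the demo are hypothetical and are not measurements from the text.

```python
def backup_hours(data_gb: float, throughput_mb_s: float) -> float:
    """Hours needed to copy data_gb gigabytes at a sustained throughput in MB/s."""
    if throughput_mb_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_gb * 1000 / throughput_mb_s  # decimal units: 1 GB = 1000 MB
    return seconds / 3600


if __name__ == "__main__":
    # Hypothetical sizes and throughput, chosen only to show the linear growth.
    for size_gb in (500, 1000, 2000):
        print(f"{size_gb} GB at 100 MB/s: {backup_hours(size_gb, 100):.2f} h")
```

Under this simplified model, doubling the data doubles the backup window, which is why large DAS volumes push backups into off-peak hours.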

The connection channel between the DAS and the server host is usually a SCSI connection. As server CPUs become more powerful, hard disks grow larger, and the number of disks in an array increases, the SCSI channel becomes an I/O bottleneck. The SCSI IDs available to the server host are limited, so the number of devices that can be attached over a SCSI channel is limited as well. Figure 5.12 shows some common disk interfaces.

Fig. 5.12

Different types of SCSI cable interfaces

Expanding a DAS system, whether by upgrading the server host, growing from one server to a cluster of several servers, or adding storage array capacity, can cause downtime of business systems and economic loss to the enterprise. This is unacceptable for key business systems that provide 24×7 service in banking, telecommunications, media, and other industries. For these reasons, DAS is gradually being replaced by more advanced storage technologies.

5.3.2 SAN

SAN is a high-speed private storage network that is independent of the business network and uses block-level data as its basic access unit. The main implementations are the Fibre Channel storage area network (FC-SAN) and the IP storage area network (IP-SAN). Different implementations use different communication protocols and connections to transfer data, commands, and status between servers and storage devices.

Before SAN, DAS was the most widely used storage mode. Early data centers used disk arrays attached as DAS to scale storage capacity, with each server's storage devices serving only a single application. This created isolated storage environments that were difficult to share and manage. As user data grew, the weaknesses of this approach in scalability and disaster preparedness became evident. SAN solves these problems: it connects these “storage silos” over a high-speed network shared by multiple servers, which enables offsite backup of data and excellent scalability. These factors have led to the rapid development of SAN.

As an emerging storage solution, SAN mitigates the impact of transmission bottlenecks on systems and greatly improves the efficiency of remote disaster backup, thanks to faster data transfer, greater flexibility, and reduced network complexity.

SAN is a network architecture consisting of storage devices and various system components, including the servers that use storage resources, host bus adapter (HBA) cards for connecting to storage devices, and FC switches.

In SAN, all traffic related to data storage is carried on a separate network isolated from the application network, so data transferred in the SAN does not affect the data network of the existing application systems. As a result, SAN can improve the overall I/O capability of the network without reducing the efficiency of the original application network, while also adding redundant links to the storage system and supporting highly available cluster systems.

With the continuous development of SAN, three types of storage area network have formed: FC-SAN based on Fibre Channel, IP-SAN based on IP, and SAS-SAN based on the SAS bus. Here we look at FC-SAN and IP-SAN.

In FC-SAN, a storage server is typically configured with two network interface adapters. One is an ordinary network interface card (NIC) that connects to the business IP network, through which the server interacts with clients. The other is an HBA that connects to the FC-SAN, through which the server communicates with the storage devices in the FC-SAN. The FC-SAN architecture is shown in Fig. 5.13.

IP-SAN is a network storage technology that has become popular in recent years. In the early SAN environment, data was carried over Fibre Channel in block-based access units; that is, the early SAN was FC-SAN. FC-SAN must be procured and deployed separately because the FC protocols are not compatible with IP, and its high price and complex configuration are a challenge for many small and medium-sized businesses. FC-SAN is therefore used mainly for high-end storage that demands high performance, redundancy, and availability. To broaden the adoption and scope of SAN and make full use of its architectural advantages, SAN development turned toward integration with the already popular and relatively inexpensive IP network. The result is IP-SAN, which uses the existing IP network architecture. IP-SAN combines the standard TCP/IP and SCSI instruction sets to achieve block-level data transmission over IP networks.

IP-SAN and FC-SAN differ in their transport protocols and transport media. Common IP-SAN protocols include iSCSI, FCIP, and iFCP, of which iSCSI is the fastest-growing protocol standard. IP-SAN usually refers to a SAN based on the iSCSI protocol.

The purpose of an iSCSI-based SAN is to establish a SAN connection from the local iSCSI initiator (usually a server) to the iSCSI target (usually a storage device) over an IP network. The IP-SAN architecture is shown in Fig. 5.14.
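The initiator/target relationship and the idea of block-level access can be pictured with a toy model. The sketch below is not an iSCSI implementation and uses no network: `ToyTarget`, `ToyInitiator`, and the 512-byte block size are hypothetical names and values chosen for illustration, and the direct method call stands in for what would be a TCP/IP session in a real IP-SAN.

```python
BLOCK_SIZE = 512  # bytes per logical block (assumed here for illustration)


class ToyTarget:
    """Stands in for a storage device that exposes one logical unit as numbered blocks."""

    def __init__(self, num_blocks: int) -> None:
        self._blocks = [bytes(BLOCK_SIZE) for _ in range(num_blocks)]

    def read(self, lba: int) -> bytes:
        return self._blocks[lba]

    def write(self, lba: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("a write must cover exactly one block")
        self._blocks[lba] = data


class ToyInitiator:
    """Stands in for the server side: it issues block reads and writes to a target."""

    def __init__(self, target: ToyTarget) -> None:
        self._target = target  # in a real IP-SAN this would be a session over TCP/IP

    def write_block(self, lba: int, payload: bytes) -> None:
        # Pad the payload to a full block; the target knows nothing about files.
        self._target.write(lba, payload.ljust(BLOCK_SIZE, b"\x00"))

    def read_block(self, lba: int) -> bytes:
        return self._target.read(lba)


if __name__ == "__main__":
    initiator = ToyInitiator(ToyTarget(num_blocks=8))
    initiator.write_block(3, b"hello")
    print(initiator.read_block(3)[:5])
```

The point of the model is that the target only sees block numbers and raw bytes; any file system lives on the initiator side, which is what distinguishes SAN access from the file-level access NAS provides.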

Compared with FC-SAN, IP-SAN has the following advantages.

Standardized access. There is no need for dedicated HBA cards and FC switches; ordinary Ethernet cards and Ethernet switches are enough to connect storage and servers.
Long transmission distance. In theory, IP-SAN can be used wherever an IP network is reachable, and IP networks are among the most widely deployed networks.
Good maintainability. On the one hand, most network maintenance personnel already have an IP networking background, so IP-SAN is more readily accepted than FC-SAN. On the other hand, IP network maintenance tools are very mature, and IP-SAN can adopt them as they are.
Easy future bandwidth expansion. Because the iSCSI protocol is carried over Ethernet, the rapid development of Ethernet makes IP-SAN single-port bandwidth beyond 10 Gbit/s an inevitable result.

These benefits reduce the total cost of ownership (TCO). For example, the total cost of ownership of a storage system includes purchasing disk arrays and access devices (HBA cards and switches), personnel training, routine maintenance, subsequent capacity expansion, and disaster tolerance expansion. Because IP networks are so widely used, IP-SAN can significantly reduce the one-time cost of purchasing access equipment as well as maintenance costs, and the costs of later capacity and network expansion are also much lower. Other aspects of IP-SAN and FC-SAN are compared in Table 5.1.

Table 5.1 Comparison between IP-SAN and FC-SAN

5.3.3 NAS

NAS is a technology that consolidates distributed, independent data into large, centrally managed data centers for access by different hosts and application servers. Typically, NAS is defined as a special dedicated file storage server that includes storage devices such as disk arrays, CD/DVD drives, tape drives, removable storage media, and embedded system software that provides cross-platform file-sharing capabilities.

The emergence of NAS is inseparable from the development of networks. After the emergence of ARPANET, modern network technology developed rapidly. People share more and more data over networks, but sharing files over a network raises problems such as cross-platform access and data security. Early network sharing is shown in Fig. 5.15.

To solve this problem, a dedicated computer can be set up to hold the shared files and connected to the existing network, so that all users on the network share its storage space. In this way, the early UNIX network environment came to rely on “file servers” to share data.

Using a dedicated server to provide shared data storage requires a large amount of disk space, and the security and reliability of the data must be ensured. At the same time, a single server has to handle access requests from many other computers, so the file-sharing server needs to be optimized for file I/O. A computer used in this way should also run an operating system that provides only I/O functions and connects to the existing network; other functions are not required on such a server. Users on the network can access files on this server as if they were accessing files on their own workstations, which essentially meets the file-sharing needs of all users on the network. TCP/IP network sharing in the early UNIX environment is shown in Fig. 5.16.

Fig. 5.16

TCP/IP network sharing in the early UNIX environment

As networks developed, the need to share data between different computers on the network kept growing. In most cases, systems and users on the network are expected to connect to specific file systems and access data, so that remote files on shared computers can be handled in the same way as local files in the local operating system. This gives users a virtual collection of files: the files in the collection do not reside on the local computer's storage devices, and their location is virtual. One evolution of this storage approach is its integration with traditional client/server environments that support Windows operating systems. This involves issues such as Windows networking capabilities, proprietary protocols, and UNIX/Linux-based database servers. In its early stages, a Windows network consisted of a network file server, a design still in use today, that used a dedicated network system protocol. An early Windows file server is shown in Fig. 5.17.

Fig. 5.17

Early Windows file server diagram

The advent of file-sharing servers has led to the development of data storage toward centralized storage, which has led to rapid growth in centralized data and business volumes. As a result, NAS, which focuses on file-sharing services, has emerged.

NAS typically has its own node on a local area network, allowing users to access file data directly over the network without the intervention of an application server. In this configuration, NAS centrally manages and processes all shared files on the network, offloading applications and enterprise servers, which effectively reduces the total cost of ownership and protects users' investments. Simply put, a NAS device is a device that is connected to a network and provides file storage, hence the name “network file storage device.” It is a dedicated file data storage server that is centered on files, implements file storage and management, and completely separates the storage device from the server, thus freeing up bandwidth and improving performance.

Essentially, NAS is a storage device, not a server. Nor is NAS a stripped-down file server: it has features that some servers do not have. The role of a server is to process business logic; the role of a storage device is to store data. In a complete application environment, the two kinds of device should be combined organically.

The intrinsic value of NAS lies in its ability to leverage existing resources in the data center to deliver file storage services quickly and at low cost. Today's solutions are compatible across UNIX, Linux, and Windows environments and connect easily to the user's TCP/IP network. A schematic of NAS is shown in Fig. 5.18.

NAS needs to store and back up large amounts of data, and on that basis it must provide a stable and efficient data transfer service. Hardware alone cannot meet these requirements, so NAS also relies on software.

NAS devices support read/write access over CIFS, NFS, or both.

CIFS is a public, open file system developed from Microsoft's SMB. SMB is a set of file-sharing protocols defined by Microsoft on top of NetBIOS. CIFS allows users to access data on remote computers. In addition, CIFS provides a mechanism to avoid read/write conflicts and thus supports multi-user access.

For Windows and UNIX computers to share resources, so that Windows clients can use resources on UNIX computers as if they were on a Windows NT server and without changing settings, the best approach is to install software on UNIX that supports the SMB/CIFS protocol. Once all major operating systems support CIFS, “communication” between computers becomes convenient. Samba is software that helps Windows and UNIX users achieve this. An administrator sets up a CIFS-based share server and shares resources with target computers. Each target computer then mounts the shared resources from the CIFS server into its own system through a simple share mapping and uses them like local file system resources. With this simple mapping, a client computer gets all the shared resources it needs from the CIFS server.

NFS was developed by Sun to let users share files. It is designed for use between different systems, so its communication protocol is independent of hosts and operating systems. To use remote files, a user only needs the mount command to mount the remote file system under the local file system. After that, using remote files is no different from using local files.
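The claim that remote and local files are used in the same way can be made concrete from the application's point of view. In the sketch below, `append_line` is a hypothetical helper that uses only the ordinary file API; a temporary local directory stands in for an NFS mount point (a path such as `/mnt/share` would be an assumption about a particular system), since nothing in the code depends on where the directory actually lives.

```python
import tempfile
from pathlib import Path


def append_line(directory: Path, name: str, text: str) -> str:
    """Append a line to a file in the given directory and return the file's contents."""
    path = directory / name
    with path.open("a", encoding="utf-8") as handle:
        handle.write(text + "\n")
    return path.read_text(encoding="utf-8")


if __name__ == "__main__":
    # The temporary directory is only a stand-in for a mounted remote file system;
    # the same call would be made against a real mount point.
    with tempfile.TemporaryDirectory() as stand_in_mount:
        print(append_line(Path(stand_in_mount), "notes.txt", "written through the file API"))
```

Because the operating system resolves the path through the mounted file system, the application needs no NFS-specific code.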

The NFS platform-independent file-sharing mechanism is based on the XDR/RPC protocol.

External Data Representation (XDR) converts data formats. Typically, XDR transforms data into a uniform standard format so that data is represented consistently across different platforms, operating systems, and programming languages.
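As a small illustration of such a uniform format, the sketch below encodes two XDR data types by hand with Python's `struct` module: a signed 32-bit integer (four bytes, big-endian) and a string (a four-byte length followed by the bytes, zero-padded to a multiple of four). The helper names are hypothetical, and only these two types are covered; this is not a full XDR library.

```python
import struct
from typing import Tuple


def xdr_int(value: int) -> bytes:
    """Encode a signed 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">i", value)


def xdr_string(text: str) -> bytes:
    """Encode a string as a 4-byte length plus the bytes, zero-padded to a multiple of 4."""
    raw = text.encode("ascii")
    padding = (4 - len(raw) % 4) % 4
    return struct.pack(">I", len(raw)) + raw + b"\x00" * padding


def xdr_read_string(buffer: bytes, offset: int = 0) -> Tuple[str, int]:
    """Decode a string starting at offset; return it with the offset of the next item."""
    (length,) = struct.unpack_from(">I", buffer, offset)
    start = offset + 4
    text = buffer[start:start + length].decode("ascii")
    padded = length + (4 - length % 4) % 4
    return text, start + padded


if __name__ == "__main__":
    message = xdr_string("nfs") + xdr_int(2049)
    print(message.hex())
    print(xdr_read_string(message))
```

Because byte order and padding are fixed by the format rather than by the host, a little-endian and a big-endian machine produce and read the same bytes.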

Remote Procedure Call (RPC) requests service from the remote computer. The user transmits the request over the network to the remote computer, which processes the request.
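The request/reply pattern behind a remote procedure call can be sketched in a few lines. The example below is a toy, not the RPC protocol that NFS uses: `ToyServer` and `ToyClient` are hypothetical names, JSON stands in for the wire encoding, and the "network" is a direct function call so that the example runs in a single process.

```python
import json
from typing import Any, Callable, Dict


class ToyServer:
    """Holds the procedures that remote callers are allowed to invoke."""

    def __init__(self) -> None:
        self._procedures: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        self._procedures[name] = func

    def handle(self, request: bytes) -> bytes:
        # Decode the request, run the named procedure, and encode the reply.
        call = json.loads(request.decode("utf-8"))
        try:
            result = self._procedures[call["proc"]](*call["args"])
            reply = {"ok": True, "result": result}
        except Exception as exc:  # report the failure instead of crashing the server
            reply = {"ok": False, "error": repr(exc)}
        return json.dumps(reply).encode("utf-8")


class ToyClient:
    """Client stub: turns a local-looking call into a request message."""

    def __init__(self, transport: Callable[[bytes], bytes]) -> None:
        self._transport = transport  # a network send/receive in a real system

    def call(self, proc: str, *args: Any) -> Any:
        request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
        reply = json.loads(self._transport(request).decode("utf-8"))
        if not reply["ok"]:
            raise RuntimeError(reply["error"])
        return reply["result"]


if __name__ == "__main__":
    server = ToyServer()
    server.register("add", lambda a, b: a + b)
    client = ToyClient(server.handle)
    print(client.call("add", 2, 3))
```

The caller sees an ordinary function call, while the stub handles encoding, transport, and error reporting; swapping the transport for a socket would not change the calling code.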

Using the Virtual File System (VFS) mechanism, NFS sends user requests for remote data access to the server through a unified file access protocol and remote procedure calls. NFS has continued to evolve: since its emergence in 1985 it has gone through four versions and been ported to all major operating systems, becoming the de facto standard for distributed file systems. NFS appeared in an era of unstable network conditions and was initially based on UDP rather than the more reliable TCP. UDP works well on reliable LANs, but it is not up to the task on less reliable WANs such as the Internet. Today, with improvements in TCP, NFS running over TCP offers high reliability and good performance.
