Chapter 6. Providing Additional Storage Strategies

Abstract

Goal

Identify the available choices for additional cloud storage techniques, including object-based storage, network file sharing, and volumes sourced from a file sharing service.

Objectives
  • Describe the purpose, benefits and operations for object-based storage use cases. Create and manage containers, folders and objects.

  • Analyze and compare the common technologies for general object storage use cases.

  • Provide remote file sharing services for common application file share storage use cases.

Sections
  • Implementing Object Storage (and Guided Exercise)

  • Analyzing Object Storage Technologies (and Guided Exercise)

  • Implementing NFS Shared Storage (and Guided Exercise)

Lab

Providing Additional Storage Strategies

Implementing Object Storage

Objectives

After completing this section, you should be able to describe the purpose, benefits, and operations for object-based storage use cases, and create and manage containers, folders, and objects.

Introducing Object Storage

Object storage is an OpenStack resource that your OpenStack configuration personnel can implement in multiple ways. Swift is an object storage service and API for working with stored objects, regardless of the back end implementation. As an OpenStack service, Swift has two APIs: the standard Swift API and an Amazon S3 compatible API. Developers choose whichever API they prefer to use in their object-using applications. Red Hat OpenStack Platform installs Ceph as the default storage for all other storage service requirements, including the object storage requirements of the Image Service. Swift, however, is not configured to use Ceph as its back end, but instead uses local devices to build a Swift-native back end.

The comparison between having Swift use its own native storage format and using Ceph object storage is covered in the following section on object storage technologies. Ceph also offers two APIs for direct access to its object storage service: the Swift API and the Amazon S3 compatible API. However, using the Ceph APIs directly would bypass the OpenStack infrastructure, leaving OpenStack without access to object storage activities and metrics. In Red Hat OpenStack Platform environments, developers use the Swift service instead of going directly to Ceph.

Defining Object Storage Use Cases

The domain operator can be asked to advise cloud users about object storage as a resource for specific application use cases, such as the backup and archiving of images and snapshots, static content, and file sharing. An object container or object store is the best storage method for unorganized small objects. These objects can easily be shared: anyone with the public URL for objects in the container can save them locally.

The five major use cases for OpenStack object storage are described below.

Archival or backup

Extended storage for near-line access, disaster recovery, or governance compliance.

Big Data

Large datasets with the ability to use Hadoop FS compliant analytical tools.

Content repository

A scalable, resilient, distributed, redundant data store for application data, images, log records, and video. You can use Object Storage as your primary content repository for data, images, logs, and video. You can reliably store and preserve this data for a long time, and serve this content directly from Object Storage. The storage scales as your data storage needs scale.

Logging records

Keep historical logs for longer-range analysis of performance and usage patterns. You can use Object Storage to preserve application log data so that you can retroactively analyze this data to determine usage patterns and debug issues.

Data lakes

A data lake is usually a single store of all enterprise data including raw copies of source system data and transformed data used for tasks such as reporting, visualization, advanced analytics and machine learning. This is different to data warehousing. You can use Object Storage to store generated application data that needs to be preserved for future use. Pharmaceutical trials data, genome data, and Internet of Things (IoT) data are examples of generated application data that you can preserve using Object Storage.

Describing Object Characteristics

An object is stored as a binary file along with metadata which is stored in the file's extended attributes (xattrs). Objects can be text files, videos, images, emails, or virtual machine images. Objects are simply identified with a GUID and have no relationship to other objects in the container.

Objects can be stored in pseudo-directories by including a forward-slash in the object name. Using a forward-slash as the delimiter during queries makes the results appear as if the objects are laid out in directories.
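
The effect of a delimiter on a listing can be sketched in a few lines of Python. This is a simplified, local imitation of the behavior described above, not the Object Storage service's own code; the function name list_objects and the sample object names are hypothetical.

```python
def list_objects(names, prefix="", delimiter=None):
    """List object names under a prefix, rolling up deeper levels at the delimiter."""
    results = []
    seen = set()
    for name in sorted(names):
        if not name.startswith(prefix):
            continue
        if delimiter:
            rest = name[len(prefix):]
            cut = rest.find(delimiter)
            if cut != -1:
                # Collapse everything past the first delimiter into one pseudo-directory entry.
                name = prefix + rest[:cut + len(delimiter)]
        if name not in seen:
            seen.add(name)
            results.append(name)
    return results


# Flat object names; the slashes are just characters in each name.
names = [
    "logs/2020/06/app.log",
    "logs/2020/07/app.log",
    "logs/readme.txt",
    "cdr.object",
]

top_level = list_objects(names, delimiter="/")
inside_logs = list_objects(names, prefix="logs/", delimiter="/")
```

Here top_level contains cdr.object and a single logs/ entry, while inside_logs contains logs/2020/ and logs/readme.txt, even though the container itself holds only a flat set of names.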

When an object is created, it is owned by an account, such as an OpenStack project or service. The account service uses a database to track which containers are owned by which account. The container service likewise uses a database to track the objects stored in each container.

The Object Storage service uses virtual object containers to allow users to store and retrieve files and other data objects without a file system interface. Object redundancy is provided through software-based data replication. Object storage is well suited to data center deployments across different geographical areas.

Object Storage Technology

Storage Replicas are used to maintain the state of objects in the case of outage. A minimum of three replicas is recommended. Storage Zones are used to host replicas. Zones ensure that each replica of a given object can be stored separately. A zone might represent an individual disk drive or array, a server, all the servers in a rack, or even an entire data center.
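
The idea of keeping each replica in a separate zone can be illustrated with a short Python sketch. This is a deliberately simplified placement rule for illustration only and is not how the Object Storage service actually assigns replicas; the function place_replicas and the zone names are hypothetical.

```python
import hashlib


def place_replicas(object_name, zones, replicas=3):
    """Pick a distinct zone for each replica of an object (illustrative rule only)."""
    if replicas > len(zones):
        raise ValueError("need at least as many zones as replicas")
    # Hash the name so that the same object always maps to the same zones.
    digest = hashlib.sha256(object_name.encode("utf-8")).hexdigest()
    start = int(digest, 16) % len(zones)
    return [zones[(start + i) % len(zones)] for i in range(replicas)]


zones = ["zone-1", "zone-2", "zone-3", "zone-4"]
placement = place_replicas("demo-container1/cdr.object", zones)
```

Because every replica lands in a different zone, losing a single zone (a disk, a server, or a rack, depending on how zones are defined) still leaves two copies of the object available.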

Storage Regions are a group of zones sharing a location. Regions can be groups of servers or server farms, usually located in the same geographical area. Regions have a separate API endpoint per Object Storage Service installation, which allows for discrete separation of services.

The Use of Object Storage in OpenStack

Many of the OpenStack services use Swift for object storage. Glance creates images but cannot actually store them; it uses Swift as an image store. Cinder backups can also be stored in Swift. The Nova service can create snapshots of instances, which are passed to Glance for storage in a Swift container. Ironic stores introspection results as objects in Swift, and Ironic bare metal images are stored in a Swift container. Swift also supports the Amazon Simple Storage Service (S3) API.

Object Storage Commands

Swift uses the OpenStack Unified CLI. Cloud users must have the admin or member role in the project to work with the project applications that will access object storage. Additionally, project members who do not have a project admin role must also be assigned the swiftoperator role for this project.
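
The role requirement described above can be written as a small Python check. This is only a sketch of the rule as stated in this section, assuming the role names admin, member, and swiftoperator; the function can_use_object_storage is hypothetical and does not query OpenStack.

```python
def can_use_object_storage(roles):
    """Apply the rule from the text: admin, or member together with swiftoperator."""
    roles = set(roles)
    if "admin" in roles:
        return True
    return {"member", "swiftoperator"} <= roles


admin_ok = can_use_object_storage(["admin"])
member_only = can_use_object_storage(["member"])
member_with_operator = can_use_object_storage(["member", "swiftoperator"])
```

With these inputs, admin_ok and member_with_operator are True, and member_only is False because the swiftoperator role is missing.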

The openstack container command is used to manage containers in OpenStack. The openstack container create command is used to create containers. Each container has a unique ID.

[user@demo ~(admin)]$ openstack container create demo-container1
+------------------+-----------------+-------------+
| account          | container       | x-trans-id  |
+------------------+-----------------+-------------+
| AUTH_c0cb...e5cd | demo-container1 | txf6...52f5 |
+------------------+-----------------+-------------+

The openstack object create command uploads an existing object to the specified container. An MD5 hash is calculated for each object. When using the CLI, the hash is stored in the etag attribute.

[user@demo ~(admin)]$ openstack object create demo-container1 cdr.object
+------------+-----------------+-------------+
| object     | container       | etag        |
+------------+-----------------+-------------+
| cdr.object | demo-container1 | 598d...9fe5 |
+------------+-----------------+-------------+
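
Because the etag is described as the MD5 hash of the object, a local file can be compared against it. The following Python sketch assumes the object was uploaded in one piece; the helper names md5_of_file and matches_etag are hypothetical, and a throwaway temporary file stands in for cdr.object.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a local file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_etag(path, etag):
    """Compare a local file against an etag value reported for the object."""
    return md5_of_file(path) == etag.strip('"').lower()


# Demonstration with a temporary local file in place of a real upload.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    sample_path = tmp.name

local_hash = md5_of_file(sample_path)
same = matches_etag(sample_path, local_hash)
os.remove(sample_path)
```

In practice, the etag value passed to matches_etag would be copied from the output of the openstack object create or openstack object show command.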

The openstack container list command displays all containers available to the user. The openstack object show command displays information about a specific object.

[user@demo ~(admin)]$ openstack object show demo-container1 cdr.object
+----------------+---------------------------------------+
| Field          | Value                                 |
+----------------+---------------------------------------+
| account        | AUTH_c0cbb4890bcd45828bf31dc1d64fe5cd |
| container      | demo-container1                       |
| content-length | 8462                                  |
| content-type   | application/octet-stream              |
| etag           | 598d1e6b4f0a3583244e1b4e09b49fe5      |
| last-modified  | Thu, 25 Jun 2020 07:35:50 GMT         |
| object         | cdr.object                            |
+----------------+---------------------------------------+

The openstack container save command saves the contents of an existing container locally. The openstack object save command saves the contents of a specific existing object locally.

[user@demo ~(admin)]$ openstack object save demo-container1 cdr.object

Objects can be deleted using the openstack object delete command, and containers can be deleted using the openstack container delete command. If the container is not empty, add the --recursive argument to forcibly delete the container and all the objects in the container.

[user@demo ~(admin)]$ openstack container delete \
> demo-container1 --recursive

Application Use Cases for Object Storage

One of the main differences between block storage and object storage is that a volume can only be accessed via instances, and by one instance at a time, whereas any instance or service can access the objects stored in containers because all objects stored within Swift have an accessible URL.
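
As an illustration of how an object's URL is typically formed, the following Python sketch joins a storage URL, a container name, and an object name. The host name swift.example.com, the account AUTH_demo, and the function object_url are placeholders for illustration, not values from this environment.

```python
from urllib.parse import quote


def object_url(storage_url, container, object_name):
    """Build the URL of an object from the account's storage URL."""
    return "/".join([
        storage_url.rstrip("/"),
        quote(container, safe=""),
        # Keep slashes so that pseudo-directories stay readable in the URL.
        quote(object_name, safe="/"),
    ])


url = object_url(
    "https://swift.example.com/v1/AUTH_demo/",  # hypothetical storage URL
    "demo-container1",
    "logs/2020/app log.txt",
)
```

The resulting URL ends in demo-container1/logs/2020/app%20log.txt. Any instance or service that can reach the endpoint, and is permitted to read the container, can retrieve the object with an ordinary HTTP GET.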

Object storage has several distinct advantages over volume storage. Object storage is accessible from any OpenStack service, and is fully distributed. Swift is best used for large pools of small objects.

Many telecommunications companies produce massive amounts of call data records, or CDR data. Those records need to be accessed by many applications and users, possibly in many geographical locations. Object storage is therefore the best storage method for this kind of data, because it can be accessed using a public URL. It is also the most cost-effective method of storage for massive amounts of data. In object storage, you only pay for the storage that is actually used. For example, if you upload 5 GB, then you pay for that exact amount of storage. In volume storage, you pay for the size of the disk created. If you create a 50 GB volume, you pay for all 50 GB whether or not it is all used. However, be aware that if you use Swift over multiple data centers, then the cost can increase due to replication and network bandwidth requirements.

CDR data has many uses, primarily for billing, but also for analysis using Big Data techniques. Analysis of CDR data can provide near-realtime monitoring, analysis of usage patterns, or even early warning of potential issues and outages. It also allows telecommunications organizations to make capacity predictions and plan upgrades accordingly.
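
The billing difference in the 5 GB and 50 GB example can be worked through in Python. The price of 10 cents per GB is an invented figure used only to make the arithmetic concrete, and the function names are hypothetical.

```python
PRICE_CENTS_PER_GB = 10  # hypothetical price, for illustration only


def object_storage_cost(used_gb, price=PRICE_CENTS_PER_GB):
    """Object storage is billed on the data actually stored."""
    return used_gb * price


def volume_storage_cost(provisioned_gb, price=PRICE_CENTS_PER_GB):
    """Volume storage is billed on the provisioned size, used or not."""
    return provisioned_gb * price


object_cost = object_storage_cost(5)    # 5 GB uploaded
volume_cost = volume_storage_cost(50)   # 50 GB volume holding the same 5 GB
```

At the assumed price, the object storage cost is 50 cents and the volume cost is 500 cents for the same 5 GB of data, because the unused 45 GB of the volume is still billed.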

Many medical researchers work on projects simultaneously in many geographical locations. Object storage can be used to easily store and share the results of tests, diagnostics, or to archive medical data. Offloading archive data from primary storage has many benefits, the first being cost savings. It can also reduce backup complexities because properly configured object storage protects itself. Object storage requires no additional software to store data and therefore does not require any specific data protection.

Using Object Storage in the Dashboard

It is possible to create, manage, and delete containers and objects using the Dashboard.

Navigate to Project → Object Store → Containers. To create a new container, click +Container. Enter a container name, select either Public or Not Public, then click Submit.

Figure 6.1: Container creation in the Dashboard

To upload an object into the container, click the container name. Click the upload button, browse to find the file to upload, and click Upload File. To view the details of a specific object, click the actions menu and select View Details. Note that the Dashboard does not label the unique ID for the object etag; it labels it Hash. The etag and hash are two names for the same attribute. The value of this attribute is the calculated MD5 hash of the object. This hash only applies to a complete object, so it cannot be used to check the integrity of partial downloads that result from a range GET. The hash value is recalculated whenever the object is updated.

Figure 6.2: Object details in the Dashboard
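
The limits of the hash can be demonstrated locally with Python's hashlib module. This sketch uses in-memory bytes rather than a real object, and the sample data and slice sizes are arbitrary; it shows only that an MD5 value computed over part of the data, or over updated data, does not match the hash of the original complete object.

```python
import hashlib

# Stand-in for the full contents of a stored object.
data = b"call data record\n" * 100
full_hash = hashlib.md5(data).hexdigest()

# A partial download, such as the first 512 bytes returned by a range GET.
partial = data[:512]
partial_hash = hashlib.md5(partial).hexdigest()

# The same object after an update: the hash must be recalculated.
updated = data + b"one more record\n"
updated_hash = hashlib.md5(updated).hexdigest()

partial_matches = partial_hash == full_hash
updated_matches = updated_hash == full_hash
```

Both comparisons are False, which is why the hash is only useful for verifying a complete download of the current version of the object.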

An object can be deleted by clicking Delete in the Actions menu. A container can be deleted by clicking the dustbin icon next to its name. A window appears asking you to confirm the deletion; click Delete. Note that a container can only be deleted in the Dashboard if it is empty. There is no force option to delete the container and all of its objects when using the Dashboard. Containers can only be deleted recursively from the command line.

 

References

Further information is available in multiple sections of the Storage Guide for Red Hat OpenStack Platform at https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.0/html-single/storage_guide/index

Revision: cl110-16.1-4c76154