Subject: Re: [ADSM-L] Shared file systems other than GPFS
From: Frank Kraemer <kraemerf AT DE.IBM DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 23 Feb 2017 12:54:55 +0100
Steve,

Spectrum Scale (aka GPFS) is a scalable parallel filesystem for High
Performance I/O applications:

The most important thing is that Spectrum Scale (GPFS) is a parallel
filesystem, and of course it's a shared filesystem. Each GPFS server node
offers something called an "NSD" service (NSD = Network Shared Disk). A
filesystem is striped across all available NSDs; the more NSDs, the more
I/O performance you get. Each NSD is a 1:1 mapping of a local raw disk (on
AIX e.g. /dev/hdisk67, on Linux e.g. /dev/sdb). GPFS can use all sorts of
disks provided by the underlying OS (SATA, SAS, FC, SVC, SSD, NVMe, RAM
disks, etc.).

Network Shared Disk (NSD) is the key factor:

A filesystem is the sum of all the NSDs that you allocate to it. Yes, you
can span a single filesystem over AIX and Linux at the same time.
Adding/removing GPFS nodes while the system is up and running - no problem
(see the sketch below)! The new GUI helps newbies learn fast. Setting up
GPFS from the ground up is a matter of minutes...it takes longer for me to
write this email than to set up a GPFS cluster :-)
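
A minimal sketch of the online add/remove (the node name gpfs-node4 is
made up for illustration):

   # add a new node to the running cluster, license it, start GPFS on it
   mmaddnode -N gpfs-node4
   mmchlicense server --accept -N gpfs-node4
   mmstartup -N gpfs-node4

   # later: shut down and remove just that node, the rest keeps running
   mmshutdown -N gpfs-node4
   mmdelnode -N gpfs-node4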

It's all about performance:

We have customers that run workloads north of 400 GB/sec from a single
application to the filesystem. A single ESS (Elastic Storage Server model
GL6) will give you 27,000 MB/sec of file I/O per box - adding boxes adds a
multiple of this number to your performance. There are customers who run
40 or more ESS GL6 boxes as one single filesystem. TSM servers can take
full advantage of this performance: TSM DB, log, and storage pools can all
be placed on GPFS filesystems.
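
On the TSM side, a minimal sketch of pointing a FILE device class at a
GPFS directory (the devclass/pool names and the path are made up for
illustration):

   define devclass gpfsfile devtype=file mountlimit=32 maxcapacity=50G directory=/gpfs/fs1/tsmpool
   define stgpool gpfspool gpfsfile maxscratch=500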

Steps to set up GPFS on a number of nodes:

1) Install the OS on the nodes, images,...(VMware is supported - yes)
2) Install GPFS: LPP on AIX, RPMs on Linux or MSI on Windows (yes - we
have full Windows support)
3) Define/prepare OS raw disks with OS-level commands (e.g. on AIX mkvg,
mklv, ...)
4) Define/set up a GPFS cluster using the existing IP connection; define
and distribute ssh keys to the nodes; set up NTP
5) Create NSDs from the existing raw disks (it's just a 1:1 mapping)
6) Define GPFS filesystem(s) on the related NSDs
7) Mount the GPFS filesystems on all nodes in the cluster
8) -done-

(For steps 3, 5 and 6 there is just ONE text file that you can
prepare/edit and reuse for all three steps - see the sketch below. It
takes about 40 sec to do all of them even for a big system.)
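
As a rough sketch (device paths, node names and the cluster name are
illustrative, options will vary), the stanza file and the commands behind
steps 4 to 7 look something like this:

   # disks.stanza - one %nsd stanza per raw disk (the 1:1 mapping)
   %nsd: device=/dev/sdb nsd=nsd01 servers=gpfs-node1 usage=dataAndMetadata failureGroup=1
   %nsd: device=/dev/sdc nsd=nsd02 servers=gpfs-node2 usage=dataAndMetadata failureGroup=2

   # step 4: create the cluster over the existing IP network (ssh/scp)
   mmcrcluster -N gpfs-node1:quorum-manager,gpfs-node2:quorum-manager \
       -r /usr/bin/ssh -R /usr/bin/scp -C demo_cluster
   mmchlicense server --accept -N all
   mmstartup -a

   # steps 5 to 7: NSDs and filesystem from the SAME stanza file, mount all
   mmcrnsd -F disks.stanza
   mmcrfs fs1 -F disks.stanza -A yes -T /gpfs/fs1
   mmmount fs1 -a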

How to start?

1) Download the IBM GPFS VMware image from the GPFS web site. It's a
full-function image with no limits for testing & learning.
2) On a laptop, create 3 VMware machines and install the images.
3) Add some VMware-defined disks for testing.
4) Start the installation and setup. Perfect for playing with the system.
5) After 2 days you are the GPFS expert :-)

Are there other filesystems?

Yes, there are a large number of filesystems on the market. Every week
there is a new kid on the block, as long as VC money is flooding Silicon
Valley startups :-) If you think GPFS is complex, then please try Lustre
and you will find GPFS a piece of cake. Lustre is an overlay filesystem:
you need a local filesystem on each node, and then Lustre runs "on top" of
the multiple local filesystems to distribute the data. It runs only on
Linux (no AIX support), and you need much more skill and know-how from my
point of view...but it's a good system, if you can deal with the
complexity.

Gluster from Red Hat is easy to set up, but it's very slow and does not
scale well. It's not used in any large installation where performance is
relevant. Red Hat has too many filesystems on the truck; there is some
confusion about where to go. They have GFS, GFS2, Gluster and Ceph...just
too much choice for me. (Linux only, of course; no AIX support.)

BeeGFS from Kaiserslautern in the Pfalz is another alternative filesystem;
it's a spin-off from the German research community.

Hedvig - the new kid on the block... I have never seen a customer in real
life, but there is strong marketing on the web.

Ceph from ex-Inktank is very good for OpenStack block device support, but
the filesystem "plugin" is a poor implementation in my experience.

NFS - not a parallel filesystem at all; it's 1990s technology, lacks
security and has limited I/O efficiency. It works fine but is painfully
slow. Of course NetApp and EMC Isilon will tell you a different story
here.

[....]

GPFS is proven for TSM workloads, it's fully supported, and speed is just
a matter of adding more hardware to the floor. It has full support for DB2
and yes, also Oracle (!). You can perfectly well use GPFS as a target for
the Oracle RMAN tool. SAP HANA runs on GPFS, as does SAS.
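
For example (the path is illustrative), RMAN simply writes its backup
pieces to a GPFS mount point through an ordinary disk channel:

   RMAN> CONFIGURE CHANNEL DEVICE TYPE DISK FORMAT '/gpfs/fs1/rman/%U';
   RMAN> BACKUP DATABASE;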

-frank-

P.S. You can get GPFS not only from IBM - Lenovo, NEC, DDN, .... and a lot
of others can help too; this choice will help you find a good commercial
offering. You are not locked in to a single vendor like with Dell/EMC
Isilon ;-)

Frank Kraemer
IBM Consulting IT Specialist  / Client Technical Architect
Am Weiher 24, 65451 Kelsterbach
mailto:kraemerf AT de.ibm DOT com
voice: +49-(0)171-3043699 / +4970342741078
IBM Germany


> TSM server 7.1 on AIX, 7.1 TSM for VE on Linux X86_64 with storage
> agent, currently backing up to Protectier VTL.
>
> The only supported file sharing is via GPFS, and I don't think I can
> justify the complexity and expense of that. However there are a lot
> of shared filesystems out there.  Is anyone running Gluster, Lustre,
> or something similar and doing storage agent backups on that.  It
> will obviously need to run on AIX and Linux and ideally should have
> minimal set up.
>
> Ideas welcome.
>
> Steve
>
> Steven Harris
> TSM Admin/Consultant
> Canberra Australia
