IBM Spectrum Protect / Tivoli Storage Manager (TSM) to Google Cloud Platform via rclone

BackupPlan

I'm new here, and have mostly come to learn from those with significant TSM experience. I'm fairly new to TSM and can't contribute much in that regard yet, but I wanted to take the opportunity to contribute where I can, especially since this Google Cloud Platform sub-forum doesn't have a single thread yet.

I'll share some ideas for anyone looking for a way to get TSM set up with Google Drive via rclone, using GCP service accounts and APIs.

My boss was initially hesitant to try the native cloud storage integration in the version of TSM we have (8.1.x), but was willing to meet me halfway by allowing backup to a filesystem on disk, where I used rclone to present remote storage from GCP to the local TSM instance as a filesystem mount point. Rclone is free and open source software available for Linux, Windows, and more.
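To give an idea of what that looks like in practice, here is a minimal sketch of such a mount; the remote name "gdrive:", the mount point, and the device class name are placeholders, and the caching flags should be tuned for your own workload:

Code:
# Mount the rclone remote so it appears to TSM as a local directory.
# "gdrive:" and /mnt/gdrive-archive are placeholder names.
rclone mount gdrive: /mnt/gdrive-archive --daemon --allow-other --vfs-cache-mode writes

# A TSM FILE device class could then point at that directory, for example:
# DEFINE DEVCLASS gdrivefile DEVTYPE=FILE DIRECTORY=/mnt/gdrive-archive MAXCAPACITY=50G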

There are two ways you can go about using rclone with cloud storage. One is to define your remote storage configuration and then mount it as a live mount point; the other is to define the remote storage configuration but only use it for on-demand rclone commands (copy, sync, move, and so on). I'll explain why this is relevant later*.

Google Cloud Platform allows you to use many of its APIs for free, while paying only for the cost of storage. There is also a way to use the Google Drive storage that comes with Google Workspace accounts (formerly Google Apps / G Suite) without incurring any additional costs beyond what your business may already be paying for Google user accounts. The idea is that you create a shared (team) drive in drive.google.com. You then go into the Google Cloud Platform console and enable the Google Drive API. Next, in GCP, create at least one service account. The service account is given its own identity in the form of an email address (something like name@your-project-id.iam.gserviceaccount.com). You copy that service account address, go back to your shared drive on drive.google.com, choose to "share the drive with specific people", and invite that address as a member with full read and write access to the shared drive.
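If you prefer the command line over the console, roughly the same setup can be done with the gcloud CLI. This is only a sketch; the project ID, service account name, and key path are placeholders:

Code:
# Enable the Google Drive API in your project (placeholder project ID).
gcloud services enable drive.googleapis.com --project=my-backup-project

# Create a service account (placeholder name).
gcloud iam service-accounts create tsm-archive-sa --project=my-backup-project --display-name="TSM archive uploader"

# Create a JSON key that rclone will use to authenticate as this service account.
gcloud iam service-accounts keys create /etc/rclone/tsm-archive-sa.json --iam-account=tsm-archive-sa@my-backup-project.iam.gserviceaccount.com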

Next you configure rclone on the TSM host where you want the files to be accessible. You can then choose either to mount the storage live, to script around rclone and synchronize with a utility such as rsync, or to use rclone's copy, sync, and move commands as your heart desires.
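As a rough sketch, a Google Drive remote backed by a service account and a shared drive looks something like this in rclone.conf; the remote name, key path, and shared drive ID below are placeholders, and running "rclone config" will walk you through creating it interactively:

Code:
# ~/.config/rclone/rclone.conf -- placeholder values
[gdrive]
type = drive
scope = drive
service_account_file = /etc/rclone/tsm-archive-sa.json
team_drive = 0AAbbCCddEEffUk9PVA

A quick "rclone lsd gdrive:" should then list the top-level folders of the shared drive, confirming the service account can reach it.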

*Note that each Google user account is limited to uploading 750 GB per day and downloading 10 TB per day. If you need to upload more than 750 GB per day, you'll need to create multiple service accounts and simply rotate through them. I did this by creating a list of archive files stored locally, then comparing it to a list of files already stored remotely. The delta of new files needing to be uploaded goes into a list, and that list is used to generate a new file containing the upload commands, rotating through the service accounts one file at a time (a sketch of a script that generates such a list follows the example below). For instance, if I had 15 files to upload and 5 service accounts to use, my prepared command list would look something like this:

Note that the remote name, destination path, and service account key paths in these example commands are placeholders, provided to demonstrate the concept of rotating through service accounts; you'll need to adjust them to match your own rclone configuration.

Code:
rclone copy /opt/local-archives/file-A.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-1.json
rclone copy /opt/local-archives/file-B.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-2.json
rclone copy /opt/local-archives/file-C.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-3.json
rclone copy /opt/local-archives/file-D.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-4.json
rclone copy /opt/local-archives/file-E.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-5.json
rclone copy /opt/local-archives/file-F.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-1.json
rclone copy /opt/local-archives/file-G.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-2.json
rclone copy /opt/local-archives/file-H.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-3.json
rclone copy /opt/local-archives/file-I.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-4.json
rclone copy /opt/local-archives/file-J.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-5.json
rclone copy /opt/local-archives/file-K.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-1.json
rclone copy /opt/local-archives/file-L.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-2.json
rclone copy /opt/local-archives/file-M.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-3.json
rclone copy /opt/local-archives/file-N.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-4.json
rclone copy /opt/local-archives/file-O.bin gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-5.json


Note how the service accounts are rotated through for each upload. This gives me 5 x 750 GB = 3.75 TB of daily upload quota to use for this task.
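Here is a minimal sketch of how such a command list could be generated, assuming the delta of new files is already in a text file; the file names, remote name, and key paths are placeholders:

Code:
#!/bin/bash
# new-files.txt holds one local path per line (the delta of files not yet uploaded).
# sa-1.json through sa-5.json are the service account key files.
SA_COUNT=5
i=0
while read -r f; do
    sa=$(( (i % SA_COUNT) + 1 ))
    echo "rclone copy $f gdrive:tsm-archives/ --drive-service-account-file /etc/rclone/sa-$sa.json"
    i=$((i + 1))
done < new-files.txt > upload-commands.txt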

You can create up to 100 service accounts per GCP project by default, so you have a theoretical maximum of 100 x 750 GB = 75,000 GB, or 75 TB, of uploads in a single rolling 24-hour period.

You could always improve on my approach by spawning multiple upload processes at once and running simultaneous uploads to make the upload job finish faster.
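For example, once the command list exists, xargs or GNU parallel could run several of those uploads at once; this is a sketch, assuming the upload-commands.txt file generated by the script above:

Code:
# Run 4 uploads at a time from the prepared command list.
xargs -P 4 -I {} sh -c '{}' < upload-commands.txt

# Or, with GNU parallel:
# parallel -j 4 < upload-commands.txt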


Some resources for learning about rclone:

Storage systems supported by rclone: https://rclone.org/overview/

General documentation: https://rclone.org/docs/

rclone commands: https://rclone.org/commands/
 