Replication stgpool questions

smp

We have two TSM 8.1.10.100 servers running on RHEL 8.4 with identical hardware and software. We are doing cross replication from TSM1 to TSM2 and vice versa.
We are having an issue with the replication pool for TSM1 (source) on TSM2 (target): it keeps hitting ANR0522W - no space available in storage pool TSM1CONTAINER.

I have opened cases with IBM in the past, and their suggestion has always been to allocate more storage to the TSM1CONTAINER directory storage pool on the TSM2 server.
As soon as we added additional storage, it was immediately consumed by the PROTECT STGPOOL process. Because PROTECT STGPOOL was always running out of space, I quit running it and started running only the REPLICATE NODE process. I also turned off automatic defrag processing because it was causing issues. Our TSM2 server has over 30 TB more FsCapacity than the TSM1 server does.
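
For clarity, these are the two processes in question. A minimal sketch of how they are typically driven from the source server's maintenance script (the session count and the wildcard node name are placeholders, not our exact settings):

  protect stgpool TSM1CONTAINER maxsessions=10 wait=yes
  replicate node * maxsessions=10 wait=yes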

I have a few questions:
1. How does TSM determine whether to use DedupCntrFreeSpace or allocate new containers from the FsFreeSpace when storage is needed for the PROTECT STGPOOL and REPLICATE NODE processes?
2. If we have about 25 TB of DedupCntrFreeSpace on the TSM2 server, why are we getting the ANR0522W message about no space available?
3. What could be some reasons that the target server would need so much more FsCapacity than the source server?
4. How do we get everything replicated (back in sync) on our target server without continuously adding additional storage? (See the QUERY REPLNODE sketch after this list.)
5. What are the repercussions of cancelling the PROTECT STGPOOL and REPLICATE NODE processes before they complete?
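
To illustrate question 4, the per-node replication backlog can presumably be checked by comparing the file counts that the source and target report for each node, along these lines (run on the source server; the wildcard is a placeholder for specific node names):

  query replnode *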



TSM1 server information:

SD Pool TSM1CONTAINER (4):
Needs Refresh: False
Maximum Size: 0
FsCapacity: 204341548 MB
FsFreeSpace: 19348061 MB
DedupCntrAllocSpace: 174487926 MB
DedupCntrUsedSpace: 170666606 MB
DedupCntrFreeSpace: 3821319 MB
DedupUnaccessibleSpace: 0 MB
NonDedupCntrAllocSpace: 202735 MB
NonDedupCntrUsedSpace: 113845 MB
NonDedupCntrFreeSpace: 88890 MB
NonDedupUnaccessibleSpace: 0 MB
PoolCapacity: 194038722 MB
PoolFree: 23258270 MB
Pool Unaccessible: 0 MB
Dedup Reserved Space: 0 MB
Non Dedup Reserved Space: 0 MB
Compression Enabled: Yes
Encryption Enabled: No
Last Space Check: 2021-09-28 13:22:09

Directory Count: 27
Directory List:
43: /tsm1spct29
44: /tsm1spct30
45: /tsm1spct31
46: /tsm1spct32
47: /tsm1spct33
48: /tsm1spct34
49: /tsm1spct35
50: /tsm1spct36
51: /tsm1spct37
52: /tsm1spct38
53: /tsm1spct39
54: /tsm1spct40
55: /tsm1spct41
56: /tsm1spct42
57: /tsm1spct43
58: /tsm1spct44
59: /tsm1spct46
60: /tsm1spct45
62: /tsm1spct47
63: /tsm1spct48
65: /tsm1spct49
66: /tsm1spct50
67: /tsm1spct51
68: /tsm1spct52
69: /tsm1spct53
70: /tsm1spct54
71: /tsm1spct55

Current defrag processing status: Sleeping
Defrag currently processing 0 containers
Time of last defrag processing status change: 2021-07-31 11:07:10.905785

TSM2 server information:

SD Pool TSM1CONTAINER (5):
Needs Refresh: False
Maximum Size: 0
FsCapacity: 234614370 MB
FsFreeSpace: 0 MB
DedupCntrAllocSpace: 222577379 MB
DedupCntrUsedSpace: 197568352 MB
DedupCntrFreeSpace: 25009027 MB
DedupUnaccessibleSpace: 0 MB
NonDedupCntrAllocSpace: 207273 MB
NonDedupCntrUsedSpace: 89118 MB
NonDedupCntrFreeSpace: 118155 MB
NonDedupUnaccessibleSpace: 0 MB
PoolCapacity: 222784652 MB
PoolFree: 25127182 MB
Pool Unaccessible: 0 MB
Dedup Reserved Space: 6442 MB
Non Dedup Reserved Space: 0 MB
Compression Enabled: Yes
Encryption Enabled: No
Last Space Check: 2021-09-28 13:29:03

Directory Count: 31
Directory List:
40: /tsm1spct01
41: /tsm1spct02
42: /tsm1spct03
43: /tsm1spct04
44: /tsm1spct05
45: /tsm1spct06
46: /tsm1spct07
47: /tsm1spct08
48: /tsm1spct09
49: /tsm1spct10
50: /tsm1spct11
51: /tsm1spct12
52: /tsm1spct13
53: /tsm1spct14
54: /tsm1spct15
55: /tsm1spct16
56: /tsm1spct17
57: /tsm1spct18
58: /tsm1spct19
59: /tsm1spct20
60: /tsm1spct21
61: /tsm1spct22
62: /tsm1spct23
63: /tsm1spct24
64: /tsm1spct25
65: /tsm1spct26
66: /tsm1spct27
70: /tsm1spct28
71: /tsm1spct29
72: /tsm1spct30
73: /tsm1spct31
 
Hi,

There could be a lot of reasons, but here is what is always good to check:

Run GENERATE DEDUPSTATS and check whether your data really is the same on the source and the target. You also know your data consumption better than anyone.
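
For example, something along these lines, run on both servers so you can compare the reported amounts (the wildcard node name is a placeholder):

  generate dedupstats TSM1CONTAINER *
  query dedupstats TSM1CONTAINER * format=detailed
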
If you have a short retention on your backup data and it rotates a lot, you could end up with a lot of containers which are only half filled. You could check with: select stgpool_name,count(*) from containers where free_space_mb>'5000' group by stgpool_name
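
A slightly extended variant of that query, which also totals how much free space is sitting inside those partly filled containers:

  select stgpool_name, count(*) as num_containers, sum(free_space_mb) as free_mb from containers where free_space_mb > 5000 group by stgpool_name
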
I have also seen a single stgpool_dir get 100% used and throw a no-space-left error. It is not something you see very often, but it happened to me just two weeks ago (on 8.1.12), and of course it should not happen at all.
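
To spot a directory that has filled up, the per-directory view should show it (on recent levels the detailed format lists free and total space for each directory):

  query stgpooldirectory format=detailed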

In the end, if the policy is the same and it is a 1:1 pool protect/replicate, the target should exceed the source by at most about 5%.

So check your dedupstats: if the protected GB is the same on source and target, work on the storage pool directories and containers. If the dedupstats differ a lot, check your policy / inventory expiration processing.

Also, a PROTECT STGPOOL needs to finish; otherwise it does not mark/delete the chunks in the pool on the target.

You need to find and fix the reason why PROTECT STGPOOL does not complete.
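
To narrow that down, check whether the process actually runs to completion and what the server logs around the failures; a sketch:

  query process
  query actlog begindate=today-1 search=ANR0522W
  query actlog begindate=today-1 search="protect stgpool"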

The one other thing you could do if you have 100% filled filesystems: set those directories to read-only (UPDATE STGPOOLDIRECTORY ...) and/or use MOVE CONTAINER ... DEFRAG=YES to empty those heavily filled filesystems a bit (remember the reuse delay in the stgpool settings: either wait a day or set it to 0).
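
A hedged sketch of that sequence; the directory and container names are placeholders based on the target server's directory list above, and the REUSEDELAY change is the "set it to 0" option just mentioned:

  update stgpooldirectory TSM1CONTAINER /tsm1spct01 access=readonly
  select container_name, free_space_mb from containers where container_name like '/tsm1spct01/%' order by free_space_mb desc
  move container /tsm1spct01/00/0000000000000001.dcf defrag=yes
  update stgpool TSM1CONTAINER reusedelay=0

Remember to set the directory back to access=readwrite once it has been drained.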

If this is not the case, or it does not help, open another SR with IBM.

If you do add storage space, it is much better to grow each existing LUN than to add another stgpool_dir.

hth, Dietmar
 