Our issue:
We are working on a project that uses Native HA and we need some advice from your side. As you know, Native HA uses replicated logs, which are a special kind of linear logging with automatic management.
The subject: Over the past years, we have faced situations where applications were down and unable to get messages for hours, which led to queues holding around 1 million persistent messages stuck…
In that context, since we were using circular logging, the only concerns were increasing MAXDEPTH for the QL and having enough space in the data filesystem. In a previous role, at another customer, I faced a similar situation with LLA, and there were two ways to cope:
- Change IMGRCOVQ to NO for the problematic QL
- Change IMGSCHED to MANUAL for the QMGR
However, with Native HA, this is not possible. When I tried to change these parameters, I got the following error:
AMQ8894E: Command is restricted by queue manager.
What can be found in the literature (Christopher Frank – mqtechconference 2017):
Writing media images to the log can dramatically impact MQ performance – and not in a good way. There is guidance provided regarding when you should take media images, and when you should avoid doing so if possible.
Some times are good for taking a media image:
– When a queue is empty
– When the system is quiet
– When the size of the logs required for media recovery is large
– When a lot of time and/or activity has passed since the last one
If some times are good for taking a media image, that means there are also times when taking media images is not so good. For example:
– When queue(s) are large or growing
– When the queue manager is busy and there is a lot of persistent message activity taking place in the system
– When an image was just taken, even if the queue is empty
So good judgement is needed to take media images in good situations, and avoid doing so when conditions are less than ideal. Unfortunately, this judgement is often not exercised – it’s very difficult to take images at only the good times, since those times are constantly varying. In many cases MQ admins simply automate the taking of media images, so no judgement is used at all.
Another performance advantage that could be gained by having the queue manager decide when to record media images is for it to use strategies that result in less data being written to the log. One strategy would be to establish a point of recovery for an object, but not record a full copy of the object at that time in the log. This is known as a partial media image.
When the queue manager records a partial media image, it does so on behalf of a recovery point which is a little time in the past. If many (or all) of the messages on the queue were put since the recovery point, then these messages do not have to be recorded in the media image, since recovering the queue would replay all log records since the recovery point and so would include all these recently put messages. Similarly, if many or all of the messages that were on the queue at the recovery point have since been gotten off the queue, these messages wouldn't need to be recorded in the partial media image because they aren't on the queue now. So, if the cards are right, the partial media image may contain no data at all and so be very quick to record, even though the queue was never empty.
This optimization is most effective when most messages only rest on the queue for a short period of time. If your applications do things like use MQ as a database, using queues as a place to store data long-term, then this strategy won’t work, as the queue manager will have no alternative but to record full media images every time.
One of the best times to automatically record a media image is when the queue is completely empty, or almost empty, because then there is little user data to log so the media image written is very small. The queue manager watches for such times and if a media image hasn't been written for a while, but IMGINTVL or IMGLOGLN hasn't expired yet, the queue manager may decide to record a media image anyway because now is a really good time. It is much easier for the queue manager to spot such times than it is likely to be for an MQ admin, so this provides another performance boost to automatic media image recording.
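To make sure we read the partial media image description correctly, here is a minimal sketch of our understanding in Python. It is only a toy model, not MQ's actual implementation: the function name `partial_media_image` and the message IDs are hypothetical, and a queue is reduced to a set of message IDs. The idea is that only messages that were already on the queue at the recovery point and are still on it now need to be recorded.

```python
def partial_media_image(at_recovery_point, on_queue_now):
    """Return the message IDs a partial media image would have to record.

    Toy model (an assumption, not MQ internals): a message is recorded only
    if it was on the queue at the recovery point AND is still on the queue.
    Messages put later are replayed from the log; messages already gotten
    are no longer on the queue.
    """
    return set(at_recovery_point) & set(on_queue_now)


# Hypothetical queue content at the recovery point.
before = {"m1", "m2", "m3"}

# Case 1: short-lived messages. Everything from the recovery point has been
# gotten, and m4/m5 were put afterwards, so nothing needs to be recorded.
short_lived = partial_media_image(before, {"m4", "m5"})

# Case 2: queue used as long-term storage. The old messages are still there,
# so all of them must be recorded again.
long_lived = partial_media_image(before, {"m1", "m2", "m3", "m4"})

print(sorted(short_lived))
print(sorted(long_lived))
```

In the second case the image contains every message that was stuck at the recovery point, which is exactly our situation when 1 million persistent messages stay on a queue for hours.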
So please tell us what to do in such situations!
==================================================================================
Lab's answer:
Thank you for the feedback. Please see below the update from the Lab.
+++++
Native HA by design relies on writing media images to the recovery log on a frequent basis to be able to perform rebase and group rebase. Whilst media images are required for damaged object recovery, they are also necessary for a rebase operation, because Native HA replicas share only log data; they do not share queue files.
As with linear logging, on a Native HA queue manager using replicated logs, queues that remain deep for extended periods of time require more log space to record media images than shallow queues do.
By design, it is not currently possible to opt out of recording media images for any object: a rebase requires media images for all queue manager objects.
Adjusting IMGLOGLN and IMGINTVL to record images less frequently will mean fewer images are recorded. Typically this means more log space is required to ensure that a new set of media images is recorded before the older log extents can be reused. Less frequent media images will also mean longer rebase times, which could reduce messaging performance.
Please feel free to raise a request for a future product enhancement for Native HA to better handle deep queues.
https://www.ibm.com/support/pages/welcome-ibm-ideas-portal
++++++
Thank you for using IBM products!
Vinu Kannadath
IBM Software Support (MQ)