The OpenVOS Operating System provides a high-level Application Programming Interface (API) that, on the whole, makes programming the system easy. Sometimes it is too easy: a simple subroutine call to an s$… routine can hide a lot of complexity. Users do not always realize that such a call can trigger a lot of work and cause problems for both the system and the application.
This is the second of an irregular series of posts to call your attention to pitfalls you may avoid in your application designs.
The Evils of list_messages for Server Queues
The list_messages command in OpenVOS is intended to be used with queues that are part of the transaction protection product for OpenVOS. StrataDOC for OpenVOS will show that this command “displays information about the messages in a given queue”. So when queues start backing up in production, you might be tempted to use this command to find out how many messages are on the queue. The -totals_only option looks like exactly what is needed. Once you understand how this command operates, you’ll understand why it is not a good choice.
How the list_messages command Operates
The command attaches a port to the file and opens the port as a SERVER_TYPE (or RECV_ONLY_SRV_TYPE for a one-way server queue), and puts it in no_wait_mode. If the queue is transaction protected, then the command starts a transaction at priority -1, which means it will always lose if there is contention, and when it finishes it calls s$abort_transaction.
Next, it starts at the first record in the queue and calls s$read_msg for each record it is able to receive, counting the record and/or dumping it to the output port.
In order to display information about each message, it has to look up information about the module and process that originated the message.
Finally, it prints totals for the records that it found in the queue.
Problems with that Approach
The command was intended for application developers who need to inspect messages while they are in the queue: to check that a message is formatted properly and contains the proper fields, that it was queued at the right priority, whether it has been received by a server, and so on. When the command is used for other purposes, however, there are many restrictions, and some of the data is hidden or misleading:
- To get information about messages in a queue, you must have execute, read, or write access to it. If you have execute or read access to the queue, you can get information about your messages only. If you have write access, you can get information about all of the messages.
- If the queue is a server queue, list_messages lists only those messages that have not yet been replied to by the server. Messages that have been replied to but have not yet been removed from the queue by s$msg_receive_reply are not listed.
- The queue could be full (max_queue_depth) and not accepting any more new messages, but list_messages might show zero messages in the queue if all the messages are unreceived replies.
- Because the list_messages command appears as a server of the queue while it is retrieving messages, it might deceive clients and other control processes as to the number of servers servicing the queue.
- Retrieving module and process information about the requester processes might require network transactions if the requesters are on a different module.
- Each message requires a trip into the kernel to read the message. If the command is not run from the module that owns the queue, then a network transaction is required to read each message.
- Even if -totals_only is specified, each message must be read to be counted.
- The value displayed by -totals_only is ONLY for messages that are either queued for a server or are being processed by a server. No information is displayed about unreceived replies to two-way server queue messages that are still in the queue, making -totals_only values misleading.
- Data in the messages might be sensitive information (e.g., credit card account numbers).
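The counting behavior described in the list above can be illustrated with a toy model. The sketch below is plain Python, not OpenVOS code: the `Message` class, its state names, and `scan_like_list_messages` are hypothetical and exist only for illustration. It assumes just what the list states, namely that each listed message costs one read and that replied-but-unreceived messages are never listed.

```python
from dataclasses import dataclass

# Hypothetical message states for a two-way server queue (illustrative only).
QUEUED = "queued"    # waiting for a server
BUSY = "busy"        # being processed by a server
REPLIED = "replied"  # replied to, reply not yet received by the requester


@dataclass
class Message:
    state: str


def scan_like_list_messages(queue):
    """Count messages the way a per-message scan would.

    Returns (listed, reads): the total that would be displayed and the
    number of per-message reads needed to produce it.
    """
    listed = 0
    reads = 0
    for msg in queue:
        if msg.state == REPLIED:
            continue  # unreceived replies are invisible to the scan
        reads += 1    # one kernel trip (or network transaction) per message
        listed += 1
    return listed, reads


max_queue_depth = 4
full_of_replies = [Message(REPLIED) for _ in range(max_queue_depth)]
backlog = [Message(QUEUED), Message(QUEUED), Message(BUSY), Message(REPLIED)]

print(scan_like_list_messages(full_of_replies))  # (0, 0)
print(scan_like_list_messages(backlog))          # (3, 3)
```

In the first case the model queue is at its maximum depth and cannot accept new messages, yet the scan reports zero. In the second, the total costs one read per listed message and still undercounts what is actually occupying the queue.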
A Different Method
In 1990, a new API was provided: s$get_server_queue_info. Given the pathname of a queue, this routine returns a structure with
- The kind of server_queue (one-way or two-way)
- The number of messages
- The number of non-busy messages
- The highest number of messages
- The total number of messages that have been processed
- The configured maximum number of messages (max_queue_depth) for this queue
This routine has several advantages over list_messages:
- One trip into the kernel and/or across the network to retrieve totals information.
- Consideration of all kinds of messages, including not-yet-received replies.
- No disruption of the queue. The queue header is locked for microseconds, just long enough to get a consistent set of five counters from the queue header.
- Information can be acquired when the queue is opened as a requester.
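To show what a caller can do with a single snapshot of these counters, here is a minimal Python sketch. It does not call s$get_server_queue_info; the `QueueInfo` class and its field names are hypothetical stand-ins for the returned structure, and the sample values are made up. The real layout and field names are defined by the OpenVOS documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueInfo:
    """Hypothetical mirror of the counters listed above (names invented)."""
    two_way: bool            # kind of server queue: two-way (True) or one-way
    messages: int            # number of messages
    non_busy_messages: int   # number of non-busy messages
    highest_messages: int    # highest number of messages
    total_processed: int     # total number of messages processed
    max_queue_depth: int     # configured maximum number of messages


def fill_percent(info: QueueInfo) -> float:
    """Percentage of the configured maximum currently in use."""
    if info.max_queue_depth <= 0:
        return 0.0
    return 100.0 * info.messages / info.max_queue_depth


def headroom(info: QueueInfo) -> int:
    """How many more messages fit before the queue is full."""
    return max(info.max_queue_depth - info.messages, 0)


# Made-up snapshot; in a real program these values would come from one call.
snapshot = QueueInfo(
    two_way=True,
    messages=150,
    non_busy_messages=120,
    highest_messages=180,
    total_processed=98765,
    max_queue_depth=200,
)

print(fill_percent(snapshot))  # 75.0
print(headroom(snapshot))      # 50
```

Because everything is derived from one consistent set of counters, no message is read and the result already includes replies that have not yet been received.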
check_queue_depth Command
Many years ago I wrote check_queue_depth, a command that uses this interface. You can use it not only to display this information for one or more queues, but also to monitor queues and issue warnings when thresholds are reached.
check_queue_depth queues [-warn_percent number] [-interval number] [-notify user_name] [-brief] [-long] [-ignore_invalid_type] [-syserr] [-exit_after_warning]
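The kind of threshold test that such monitoring relies on can be sketched in a few lines of Python. This is not the command's actual logic: the function name `depth_warning`, the queue name, and the rule "warn when the depth is at or above the given percentage of max_queue_depth" are all assumptions made for illustration. The command's documentation describes what -warn_percent really does.

```python
def depth_warning(queue_name, messages, max_queue_depth, warn_percent):
    """Return a warning string if the queue is at or above the threshold.

    Assumed rule (not taken from the real command): compare the current
    depth, as a percentage of max_queue_depth, against warn_percent.
    Returns None when no warning is due.
    """
    if max_queue_depth <= 0:
        return None  # nothing sensible to compare against
    percent = 100.0 * messages / max_queue_depth
    if percent >= warn_percent:
        return (f"{queue_name}: {messages}/{max_queue_depth} "
                f"messages ({percent:.0f}% full)")
    return None


# Made-up samples for a hypothetical queue named "orders_queue".
print(depth_warning("orders_queue", 100, 200, 80))  # None
print(depth_warning("orders_queue", 170, 200, 80))
# orders_queue: 170/200 messages (85% full)
```

A monitoring loop would fetch the counters once per interval and call a check like this for each queue, which keeps the cost at one request per queue per interval no matter how deep the queue is.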
The source code for the command is available on the Stratus Public FTP Site.
- Documentation http://ftp.stratus.com/vos/perf/check_queue_depth.txt
- Source code http://ftp.stratus.com/vos/perf/check_queue_depth.pl1
Feel free to use it, enhance it, and/or embed it in your own application monitoring commands.