Citrix Load Balancing in a Complex Environment

A recent client’s Millennium® performance issues had me stumped. The organization had a normal two-application and two-database node configuration. It was using very little CPU and had virtually no paging on these nodes. Queuing was pretty good from an MQ and proprietary middleware perspective. There was a lot of interface queuing, but the organization was on an older version of Oracle and we were not going to be able to address this area of their queuing until a database rebuild. So how could I help them right then?

I turned my attention to the Citrix servers — they had more than 100 of them — to check the load balance. The backend gave me little to go on, so I had to look deeper to see how the Citrix servers were connecting with Millennium. In this CPOE shop, Discern Notification alerts clinicians to potential problems with their order deliveries, so I chose to evaluate the Citrix load balancing in this component of Millennium.

When I did, I found that the method of Citrix load balancing being used was causing inconsistent response times for clinicians. Fast application and database nodes could not fix this performance problem. Rather, the team had to use the Citrix load balancing tool on Application User Load to more evenly distribute the core applications of, for instance, PowerChart® and FirstNet®. The better application distribution resulted in more consistent performance for their clinicians.

If you want to investigate whether you have a load balancing issue between Citrix and the application nodes, I’ll take you through the steps to parse your data and then evaluate the results. Start with one of the following select statements (either from CCL or SQL*Plus):

CCL> rdb select * from EKS_NOTIFY_DEST go
SQLPLUS> select * from EKS_NOTIFY_DEST ;

Your output will look something like this (for space reasons, I have eliminated three columns on the right and reduced my example to two rows of data):

The LOCATION column identifies the device location, set either on the fat client or via the WTS location tool. The TCP_IP_PORT is the port the client is listening on. The PERSON_ID is a unique number for the user name in Millennium. The UPDT_DT_TM shows the last time there was any communication between the application nodes and this client. UPDT_ID is a repeat of the PERSON_ID. UPDT_TASK is the task number for the process that stuck the row in the database or changed it. This number should be 3071000, which is the Discern Notification. Not shown here are UPDT_CNT; UPDT_APPLCTX, the application context number for the application updating this table; and APP_VERSION_NBR, the version of Discern Notification that the client is using. The version generally will be 2, unless you are on some very old code.

The TCP_IP_ADDR column is the key to making sense of this gibberish. It is composed of four elements: the TCP/IP address of the Citrix server or fat client connecting to Millennium, the Microsoft user name, the Millennium domain the connection is for, and the Millennium user name. The next step is to parse the TCP_IP_ADDR column into these four elements.

Import the table into Excel to use the Text to Columns function, and insert three columns between TCP_IP_ADDR and TCP_IP_PORT.

To parse the fat clients or older Citrix servers:

  1. Sort the TCP_IP_ADDR column so any cells with a “$” separating the TCP/IP address (first element) from the Microsoft user name (second element) are at the top of the list. Highlight these cells only.
  2. From the Excel Data tab, select the Text to Column function, select Delimited and click Next.
  3. In the next box, uncheck Tab, select Other, type $ in the open cell and click Finish.

To parse the rest of the Citrix servers:

  1. Highlight the other cells in the TCP_IP_ADDR column, which should all have a “!” between the TCP/IP address and Microsoft user name.
  2. With the Text to Column function, select Delimited and click Next.
  3. In the next box, uncheck Tab, select Other, type ! in the open cell and click Finish.

The first part of the parsing is complete. You now have a list of all the TCP/IP addresses that have connected to the application nodes for Discern Notification. Next you want to see all of the Microsoft user names, Millennium domains and Millennium user names that have connected to the application nodes for Discern Notification.

  1. In row 1 column C, which should be blank, type “Microsoft Username:Millennium Domain:Millennium Username”.
  2. Highlight all the rows in column C, which contains the remaining three data elements.
  3. From the Excel Data tab, select the Text to Column function, select Delimited and click Next.
  4. In the next box, uncheck Tab, select Other, type : in the open cell and click Finish.

The next step in getting usable information is to build a Pivot Table (steps are based on Excel 2010) to see the TCP/IP addresses that have the most Discern Notifications:

  1. From the Insert tab, select Pivot Table and verify the entire spreadsheet is highlighted. The dialog box should read, “Table1.” Select OK.
  2. From the Pivot Table Field List, left-click and drag the TCP_IP_ADDR field to the Row Labels box.
  3. Left-click and drag the TCP_IP_ADDR field to the Values box. The Values box will then read, “Count of TCP….” The Table to the left will have two columns.
  4. In cell A3, titled Row Labels, click the down arrow and select More Sort Options.
  5. Select Descending (Z to A) by:, click the down arrow to change the value to Count of TCP_IP_ADDR and select OK.

The resulting table will list the Discern Notifications for each TCP/IP address, as shown in this example:

If your servers are properly balanced, the count should be fairly similar across all the Citrix servers. If you have fat clients, they will have a smaller count. In my example, the server ending in .213 handles 15.63 times more connections than the .107 server. Even if we exclude the first two rows, the .54 server is doing 2.23 times more work than the .107 server. Once this imbalance is corrected, clinician response times will be more consistent.

Prognosis: A little research might uncover Citrix load balancing issues that are causing response time variability or other performance problems for your clinicians.