Traffic Modeling and Resource Allocation in Call Centers
Introduction

Traffic, whether cars, customer lines or telephone calls, shares similar characteristics. Traffic may be very busy and move slowly, or stop and wait, or it may be light and experience no delays. Highways, toll booths, telephone lines and bank tellers may be either underutilized, causing costly idle time, or overloaded, generating long delays and providing poor service. Analysts must find the right number of resources (toll booths, bank tellers, support agents and telephone lines) to provide adequate service at reasonable cost.

Traffic modeling analyzes traffic patterns and determines the resources necessary to handle that traffic. Traffic modeling originated in the telephone industry, and many of the theories still in use today were developed between 1909 and 1917 by the Danish mathematician Agner Krarup Erlang. Erlang's biography and additional traffic modeling bibliography are available at www.pass.maths.org.uk/issue2/erlang.

Traffic Models

Basic Definitions

Sources and Servers

Traffic modeling involves sources that generate service requests and servers that fulfill these requests. In a telephone switch, the sources are callers and the servers are the telephone company's resources that provide the dial tone and route calls to their destination. In a bank, customers are the sources and tellers are the servers.

Traffic modeling assumes that there is a very large number of sources R requesting service, and a limited number of servers N. The number of sources is so much larger than the number of available servers that, in effect, R → ∞. In addition, we assume that:

· The sources generate random service requests independently of each other.
· The average number of service requests per time unit from all sources is constant.
· Service requests arrive at random, following a Poisson process (see below).
· The time required to service a request is distributed exponentially (see below), and is independent of the arrival rate.
· Service is provided on a first-in, first-out (FIFO) basis; that is, service requests are honored in the order in which they arrive.

Traffic Volume and Intensity

The volume of traffic is determined by the number of service requests per time unit and the time that each service request consumes.
For instance, with an arrival rate of 100 calls per hour and each call requiring 9 minutes (0.15 hour) of service, the traffic volume in an 8-hour day is:

100 * 0.15 * 8 = 120 Call Hours (Ch)

Erlang units represent traffic intensity, or load, as traffic volume per time unit. One Erlang equals one Ch/hour, so the traffic load in the previous example is 120 / 8 = 15E. An Erlang may thus be defined as follows: one telephone line carrying traffic continuously for one hour carries one Erlang of traffic.

How Calls Arrive in a Call Center

A naive approach to figuring out the number of agents needed in a call center is to divide the number of calls expected to arrive in an hour by the number of calls one agent can handle in an hour. For example, if 100 calls arrive in one hour, and each call takes, on average, 15 minutes to service, then each agent can take 4 calls per hour. Therefore, it appears that 25 agents and 25 telephone lines should be able to service the anticipated call load.

The flaw in this logic is that service requests do not arrive in an orderly fashion, one right after the other. Like customers at a bank, telephone calls arrive at random times and independently of each other. The average interval between calls in the example above is 36 seconds, but the actual arrival times are distributed randomly: some calls will arrive at the same time, some will come in while other calls are still being served, and during some periods of the day no calls will arrive at all. The probability of call arrival is approximated by a Poisson process.
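In its standard textbook form, the Poisson probability of exactly k calls arriving within an interval of length t can be written as shown here. The symbols λ (average arrival rate), k and t are introduced for illustration and are not used elsewhere in this article.

```latex
\[
  P(k;\,\lambda t) \;=\; \frac{(\lambda t)^{k}\, e^{-\lambda t}}{k!},
  \qquad k = 0, 1, 2, \dots
\]
```

Under the same assumption, the gaps between consecutive calls are exponentially distributed with mean 1/λ.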
A Poisson probability distribution (Figure 1) is similar to a normal, bell-shaped distribution skewed to the right, with the peak of the curve before the mean arrival time. This means that most calls arrive after an interval shorter than the average, and few take much longer than the average time to arrive.

The duration of service requests is not uniform either. Call lengths are distributed exponentially and, as Figure 2 shows, most calls are shorter than the average call, but some are much longer than the average.
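The shape of both distributions can be checked numerically. The following is a minimal Python sketch, not part of the original article: the function names are illustrative, and it assumes the 100 calls per hour and 9-minute average call length used in the earlier example.

```python
import math


def poisson_pmf(k, mean_count):
    """Probability of exactly k arrivals when mean_count arrivals are expected."""
    return math.exp(-mean_count) * mean_count ** k / math.factorial(k)


def exponential_cdf(x, mean):
    """Share of exponentially distributed values that are shorter than x."""
    return 1.0 - math.exp(-x / mean)


# 100 calls per hour is about 1.67 expected calls in any given minute.
calls_per_minute = 100 / 60
p_no_calls = poisson_pmf(0, calls_per_minute)
p_three_or_more = 1.0 - sum(poisson_pmf(k, calls_per_minute) for k in range(3))

# Call lengths with a 9-minute (540-second) average.
share_shorter_than_average = exponential_cdf(540, 540)
share_longer_than_double = 1.0 - exponential_cdf(1080, 540)

print(f"no calls in a given minute:      {p_no_calls:.1%}")
print(f"three or more calls in a minute: {p_three_or_more:.1%}")
print(f"calls shorter than the average:  {share_shorter_than_average:.1%}")
print(f"calls longer than twice average: {share_longer_than_double:.1%}")
```

Under these assumptions, about 19% of minutes see no calls at all while about 23% see three or more, and about 63% of calls are shorter than the average while about 14% run longer than twice the average.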
Erlang B Model

Erlang B is a "blocked calls lost" model in which, when no server is available, the service requester is denied service and must retry the request. This is the situation in a telephone switch, where, when all resources (telephone trunks) are exhausted, the caller receives a busy signal and must hang up and redial repeatedly until a server becomes available.

Erlang B calculates the blocked call probability (loss probability) for a given traffic load and number of servers. PB(N,A) is the probability that a caller will receive a busy signal with a traffic load of A Erlangs and N telephone trunks.
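In its standard textbook form, PB(N,A) = (A^N / N!) / (sum of A^k / k! for k = 0 to N). The following minimal Python sketch evaluates it with the usual numerically stable recurrence instead of computing factorials directly. The function names erlang_b and min_servers are illustrative and are not taken from any particular Erlang tool.

```python
def erlang_b(servers, load):
    """Blocking probability PB(N, A) for N servers and A Erlangs of offered load.

    Uses the recurrence B(0) = 1, B(n) = A * B(n-1) / (n + A * B(n-1)),
    which avoids the large factorials in the closed-form expression.
    """
    blocking = 1.0
    for n in range(1, servers + 1):
        blocking = load * blocking / (n + load * blocking)
    return blocking


def min_servers(load, max_blocking):
    """Smallest number of servers whose blocking probability is within the target."""
    servers = 0
    while erlang_b(servers, load) > max_blocking:
        servers += 1
    return servers


# Offered load of 15 Erlangs (100 calls/hour at 9 minutes each).
print(min_servers(15, 0.02))  # 23
print(min_servers(15, 0.10))  # 18
```

A lookup in a printed Erlang B table answers the same question; the loop just makes the search explicit.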
Calculating blocking probability is straightforward using Erlang B tables. Table 1 shows the traffic loads that 15 to 25 servers can support with loss probabilities of 1%, 2%, 5% and 10%. For instance, if the anticipated load is 15 Erlangs and the desired blocking probability is 2% or better, the number of telephone lines (and agents) needed to support it is 23. If resources are at a premium and a degradation in service to a blocking probability of 10% is acceptable, the number of servers can be reduced to 18.

Erlang B tables are also available in a different arrangement, giving the blocking probabilities for different traffic load

Erlang C Model

Unlike the Erlang B model, in which blocked service requests are considered lost, the Erlang C model delays requests that cannot be satisfied immediately until a server is available. The model defines the probability PC(N,A) that a service request will have to wait for service if N agents are assigned to handle traffic of A Erlangs.
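In its standard textbook form, using the same symbols N and A, the Erlang C probability can be written as shown here. It is defined only for N > A, since with fewer agents than Erlangs of load the queue grows without bound. The second expression, which derives PC from the Erlang B probability PB, is a common computational shortcut and is an addition to the original text.

```latex
\[
  P_C(N, A) \;=\;
  \frac{\dfrac{A^{N}}{N!}\,\dfrac{N}{N - A}}
       {\displaystyle\sum_{k=0}^{N-1} \frac{A^{k}}{k!}
        \;+\; \dfrac{A^{N}}{N!}\,\dfrac{N}{N - A}},
  \qquad N > A
\]
\[
  P_C(N, A) \;=\; \frac{N \, P_B(N, A)}{N - A\,\bigl(1 - P_B(N, A)\bigr)}
\]
```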
Call Center Metrics

Call Load

The volume and intensity of incoming service requests are the key parameters in determining the call center's resource requirements. Call load is measured in Erlang units, as described earlier.

Peak Hour Traffic (PHT), Busy Hour Traffic (BHT)

Peak hour is the busiest one-hour period of the day, when incoming service requests are most likely to be delayed or blocked and turned away. This is the load for which resources are calculated.

While sufficient resources need to be available to handle peak traffic, it is good practice to establish traffic arrival and duration patterns over the course of the entire day and for each day of the week. Daily traffic should be sampled at half-hour or even 15-minute intervals, because the peak hour is unlikely to coincide with a whole-hour sampling interval and would otherwise not be measured correctly. Analyzing resource requirements at different times of the day and on all days of the week allows more specific optimization of daily staff schedules.

Average Handling Time (AHT)

Average handling time (AHT) defines how long an agent is busy providing service to a single customer call. AHT is the time the agent provides service (talk time) plus wrap-up time, that is, the additional activities needed to complete a call and prepare for the next one.

Average Speed of Answer (ASA)

Average speed of answer (ASA) is a commonly used call center metric that defines the average time it takes for a telephone call to be answered. In general, averages are acceptable for estimation and trending, but as we saw earlier they cannot describe traffic patterns accurately because of the natural distribution of call arrival and duration, and many callers will experience delays significantly longer than the average.

For instance, a staff of 12 taking 80 calls per hour with an AHT of 7 minutes can deliver an average speed of answer of 50 seconds.
However, as we will see later, this average figure applies to only 78% of the calls; 22% of the callers will experience longer delays, and some are likely to abandon the queue before their turn arrives.

Grade of Service (GoS)

Instead of targeting the speed of answer as a single figure of merit, a more appropriate and precise method is to set a desired grade of service, which is the percentage of calls that will be answered within a target threshold. For example, a target grade of service may be for 80% of the calls to be answered within 20 seconds, and for the remaining 20% that end up waiting, for the delay to be no longer than 2 minutes.

The model should establish the staffing level and the number of telephone lines required to support that grade of service. Just as importantly, it should ascertain how many callers will miss the 20-second target and what their experience will be: how long they will have to wait to receive service and how many are likely to abandon the call prematurely.

Putting it Together: Determining Resource Levels

In call centers, designers must establish the right number of agents and allocate the right number of telephone lines, balancing the desired level of service against the availability and operational costs of these resources.

Telephone Lines

The computation of the required telephone trunks is based on the Erlang B model. The target blocking probability depends on the service model employed in the call center.

If the call center is designed for a "loss" model, where calls that cannot be serviced immediately are diverted to a voice-mail service or simply receive a busy signal, use Erlang B tables to find the number of telephone trunks that will provide an acceptable level of service. A blocking probability of 5% or better is usually adequate.
Few call centers, however, can afford the staff (and telephone lines) needed to operate in a "blocked calls lost" mode without periodically turning away customers. Most therefore employ a queuing system with enough telephone trunks to allow callers to hold for as long as they wish. In practice, it is impossible to place an infinite number of calls on hold, so the number of lines is set so that only in rare cases will callers be denied the opportunity to wait for service and receive a busy signal. Use Erlang B tables to find the number of lines that provides a sufficiently low blocking probability. The examples in the following discussion were calculated to deliver a loss probability of 0.001 (0.1%), although 1% should suffice in most cases.

Staffing

Since in the Erlang C model some calls are always queued, the first step in determining staffing levels is to establish a target grade of service. Calculating the staffing level that supports that target is an iterative process and is most easily carried out using an Erlang C software program or spreadsheet.

Figure 3 shows the output of an Erlang C calculator. The GoS target is for 80% of calls to be answered within 60 seconds. The maximum allowed wait time is 120 seconds, after which we assume callers will abandon the queue. The expected call volume is 100 calls/hour, and the Average Handling Time is 540 seconds.
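The iteration can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the calculator shown in Figure 3: the function names are illustrative, the service-level expression is the standard Erlang C one, and abandonment (the 120-second maximum wait) is ignored, so a dedicated calculator may report somewhat different figures.

```python
import math


def erlang_c(agents, load):
    """Probability PC(N, A) that a call has to wait, for N agents and A Erlangs."""
    if agents <= load:
        return 1.0  # the queue is unstable, so every call waits
    blocking = 1.0  # Erlang B recurrence, starting from zero servers
    for n in range(1, agents + 1):
        blocking = load * blocking / (n + load * blocking)
    return agents * blocking / (agents - load * (1.0 - blocking))


def service_level(agents, load, aht, threshold):
    """Fraction of calls answered within `threshold` seconds (no abandonment)."""
    if agents <= load:
        return 0.0
    wait_tail = math.exp(-(agents - load) * threshold / aht)
    return 1.0 - erlang_c(agents, load) * wait_tail


def min_agents(calls_per_hour, aht, target, threshold):
    """Add agents one at a time until the grade-of-service target is met."""
    load = calls_per_hour * aht / 3600.0  # offered load in Erlangs
    agents = math.floor(load) + 1         # fewest agents that keep the queue stable
    while service_level(agents, load, aht, threshold) < target:
        agents += 1
    return agents


# 100 calls/hour, AHT 540 s, target: 80% answered within 60 s.
print(min_agents(100, 540, 0.80, 60))  # 19
```

With these inputs the loop stops at 19 agents, where roughly 84% of calls are answered within 60 seconds; 18 agents would reach only about 74%.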
In addition to the staffing level, the Erlang calculator shows the following parameters:

· % Abandoned - the percentage of callers expected to abandon the call while waiting in the queue. This number is calculated based on Queue Time.
· Queue Time - the average time callers will have to wait in the queue to receive service.
· % Queued - the percentage of calls that will not be answered within the target speed of answer and will be queued.
· Queue Depth - the average number of calls waiting in the queue.
· Utilization - the percentage of agents' time spent servicing calls.

If Erlang software is unavailable, similar results, albeit not as detailed, can be computed using Erlang C tables similar to the one shown in Table 3.
1. Calculate the Queue Factor (QF), which is the maximum time a call can spend in the queue and still meet the maximum wait target: QF = MaxWait / AHT (here 120 / 540 = 0.22).
2. Locate the section in the table for a traffic load of 15 Erlangs.
3. Locate the row that has the same or higher QF to find the staffing level. The nearest higher QF is 0.25. Looking across the row, the required staff is 19.
4. Use the selected QF (0.25) and the percentage of queued calls (Q), which in this case is 0.244, to calculate the ASA: ASA = AHT * Q * QF (here 540 * 0.244 * 0.25, or about 33 seconds).

Non-Linearity

A critical factor many back-of-the-envelope analysts miss is that, due to the statistical nature of call arrival and queuing discussed above, changes in resources have a non-linear effect on staffing levels and GoS. Figure 4 shows the non-linear relationship between staffing level and ASA. The more agents the call center employs, the better the average speed of answer it can provide, but at the cost of decreased utilization. The call center designer may have to compromise on the level of service in order to manage with limited resources and maintain reasonable utilization.
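The trade-off between speed of answer and utilization can be tabulated directly. The following is a minimal Python sketch, assuming the 15-Erlang load and 540-second AHT of the staffing example. The function names are illustrative, and the ASA expression PC * AHT / (N - A) is the standard Erlang C result, which matches the table formula ASA = AHT * Q * QF when QF = 1 / (N - A).

```python
def erlang_c(agents, load):
    """Probability PC(N, A) that a call has to wait; requires agents > load."""
    blocking = 1.0  # Erlang B recurrence, starting from zero servers
    for n in range(1, agents + 1):
        blocking = load * blocking / (n + load * blocking)
    return agents * blocking / (agents - load * (1.0 - blocking))


def staffing_row(agents, load, aht):
    """Return (share of calls queued, ASA in seconds, agent utilization)."""
    queued = erlang_c(agents, load)
    asa = queued * aht / (agents - load)
    utilization = load / agents
    return queued, asa, utilization


load, aht = 15.0, 540.0  # Erlangs, seconds
for agents in range(16, 23):  # start above the load so the queue is stable
    queued, asa, utilization = staffing_row(agents, load, aht)
    print(f"{agents:2d} agents  queued {queued:6.1%}  "
          f"ASA {asa:6.1f} s  utilization {utilization:6.1%}")
```

Under these assumptions the ASA drops from about six and a half minutes with 16 agents to about 33 seconds with 19 and about 17 seconds with 20, while utilization falls from about 94% to 79% and 75%. Each additional agent buys less improvement than the one before.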
Telephone trunk requirements are

Number of Queues

It is a common practice in call centers to split incoming calls and route them to specialized groups or individuals, where

Limitations of Erlang Models

The standard Erlang C model assumes certain capabilities and behaviors that are not always met in the real world. For example, the model assumes that callers are willing to wait forever to be answered by an agent. In practice, however, some callers will hang up as soon as they are put on hold, and others will abandon the call after waiting in the queue for some time. Some callers will redial right away in the hope of beating the system. These human behavior patterns change the actual call statistics and the overall performance of the call center.

The standard model also assumes that the call center has unlimited queuing capability. In practice, queuing resources are limited, and when the system is overloaded beyond its queuing resources, callers will receive a busy signal or be connected to a voice-mail service. Automatic Call Distribution (ACD) systems can employ various strategies to lower the probability of this happening, such as overflowing calls to another agent group or implementing a ring delay, where the number of rings before the ACD picks up the line increases in proportion to the number of queued calls.

Other considerations that have a significant impact on resources are the call center's call flow management and its problem resolution capabilities. Employing a dispatching strategy, in which a clerk logs the call and a subject matter analyst later returns it, changes many of the assumptions made by the basic Erlang models. This approach shortens the duration of inbound calls and evens out their distribution, and at the same time generates a significant number of outbound calls.

The Erlang model treats all servers as non-human resources. It assumes that they are always available and work at maximum capacity.
While this is adequate for telephone lines, a reliable call center model must account for vacation and sick time as well as for training, meetings and other work, which may decrease utilization by 15% or more.

Various adaptations of the standard Erlang method exist to account for some of these issues, especially the infinite queuing problem. However, because of the complexity of the subject and the lack of a wide theoretical and practical base, these special versions should be used sensibly. In very large call centers, where approximations and rounding errors may result in significant numbers, simulation can offer a good substitute or a complementary analysis method.
Copyright © 1998-2009 Diagnostic Strategies. All Rights Reserved. Last modified: May 17, 2009.