Blocking Probability Simulations for FDL Feedback Optical Buffers



Introduction
The ultimate capacity of the Internet may be constrained by energy density limitations and heat dissipation considerations rather than by the bandwidth of the physical components. Recent studies have shown that optical packet switches do not appear to offer significant throughput improvements or energy savings compared to electronic packet switches [1]. However, the studies also show that optical switch fabrics generally become more energy efficient as the data rate increases [1]. We believe that the development of optical components could lead to a breakthrough in optical packet switches and that the use of all-optical packet switches, in which optical packets are buffered and routed in optical form, will solve the present Internet problems. Asynchronous optical packet switching appears to be suitable as a transport technology for the next-generation Internet due to the variable lengths of IP packets.
In optical packet switches, optical buffers at the output ports are an integral part of resolving contention by exploiting the time domain. Fiber delay lines (FDLs) are a well-known technique for building optical buffers, since random access memory (RAM) is not achievable with current optical technologies. However, optical buffers behave differently from electronic RAM. FDLs can only delay packets by integer multiples of a discrete amount of time, called the time granularity; the maximum delay is bounded, and a packet is discarded if the maximum delay is not sufficient to avoid contention.
In feedforward (FF) buffers, optical packets are delayed at the output ports by passing through multiple FDLs of step-increasing length to avoid contention [3,5]. The step-increasing-length FDLs introduce delays that are integer multiples of the time granularity, and output-time differences are added to the packets after they pass through the FDLs. There have been many studies on FF buffers [5,9-11] investigating the blocking probabilities and delays for Poisson packet arrivals with generally distributed packet lengths, and highly accurate closed-form expressions for calculating both blocking probabilities and delays have already been obtained [13,16].
In contrast, in feedback (FB) buffers, optical packets that find the output occupied by another packet are delayed by being fed back through re-circulating loop FDLs, after which they arrive again at the input of the switch. On the basis of how the output is reserved for re-circulating packets, two strategies for the FB buffer have been proposed [4]: Pre-Reservation (PreRes) and Post-Reservation (PostRes). In the PreRes scheme, the output is reserved for re-circulating packets ahead of newly arriving packets. In the PostRes scheme, re-circulating packets have no priority at the output over newly arriving packets, and all packets are served under the first-come-first-served (FCFS) queueing discipline.
For a cost-efficient packet switch architecture, a shared-per-node optical FB buffer configuration, in which FDLs are shared among the output ports of a node in a feedback configuration, has been proposed and extensively investigated [8,14,15]. The shared-per-node configuration uses the output buffering strategy in the same way as a shared-per-port optical buffer configuration, where each output port has its own dedicated buffer. Compared with the shared-per-port configuration, the shared-per-node configuration may naturally achieve better cost performance. However, it requires more complex switching control because it has to manage and control all packets passing through all output ports simultaneously.
Generally speaking, there has been insufficient theoretical investigation of both blocking probabilities and delays for FB buffers. Simulations [4,6-8], an approximation for the FB buffer with one FB loop [6], and numerical iterations [12,14,15] have been performed, but the characteristics of FB buffers with multiple re-circulation loops and general packet-length distributions have not yet been clarified. In queueing networks, the existence of feedback loops makes the stochastic processes more complex and difficult to solve [20,21]. In Jackson-type queueing networks, for example, the arrival process for an M/M/1 system with a feedback loop is not Poisson. Consequently, theoretical investigations of the feedback M/M/1 system have relied on approximate analyses or simulations.
In this work, we report the detailed characteristics of optical FB buffers and clarify the superiority of the FB buffers through simulations. We consider optical buffers with multiple re-circulation loops at the output ports of an optical packet switch. The re-circulation loops all have the same time granularity D, and the buffers adopt the PostRes policy. For comparison, we also show the characteristics of feedback re-circulation-loop buffers with step-increasing-length FDLs (FBSI) and feedforward buffers with step-increasing-length FDLs (FF). The metrics used to compare the FB, FBSI, and FF buffers are the blocking probabilities and the delays.
For optical buffers with feedback re-circulating loops, it is known that the number of re-circulations should be limited to avoid optical signal degradation [17-19]. Amplified spontaneous emission (ASE) noise in the optical amplifiers, which are inserted to compensate for the transmission losses in the re-circulation loops, and optical crosstalk noise generated in the optical switches accumulate while the optical signal re-circulates in the loops and may degrade the optical signal-to-noise ratio down to the optical receiver's limit. It has been reported that the maximum number of re-circulations depends mainly on the capabilities of each optical device and on the optical transmission signal rate, and varies widely, from 2 to 30 [18]. Our aim in this work is to determine the ideal characteristics of the feedback loop buffers, and therefore we first remove the re-circulation limits altogether. We then discuss the effect of the re-circulation limits on the blocking probabilities in Section 4. So far, proper values of the maximum number have not been reported, and the reported values vary too widely to be adopted. We believe that developments in optical devices could increase the maximum number of re-circulations, thus enabling us to do away with the limits.

Section 2 presents the optical buffer models for the FB, FBSI, and FF buffers and explains the packet transfer algorithms used in the models. In Section 3, we present the results of simulations carried out with up to $10^{8}$ packets and compare the blocking probabilities and delays of the three optical buffers. Considerations on the blocking probability drops in the deterministic distribution case and on the maximum number of re-circulations are presented in Section 4. We conclude in Section 5 with a brief summary.

Models and Algorithms
We consider optical buffers at the output ports of an optical packet switch, which is shown in Fig. 1(a). This switch has N input and N output ports, functions as an $N \times N$ non-blocking switch, and uses the output buffering strategy. Each output port is equipped with a dedicated buffer containing FDLs, which is modeled as the shared-per-port type optical buffer. A switch model with the shared-per-node type optical buffer in a feedback configuration is also shown in Fig. 1(b). In this paper, we concentrate on the shared-per-port type optical buffer because we want to clarify the detailed characteristics of FB buffers compared with those of FF buffers in simpler models. Figure 2 shows the optical buffer structure models for one output port in the shared-per-port configuration, which is shown in Fig. 1(a).

1) Fixed-length FDL feedback loop buffer (FB buffer)
In the switch, we adopt a first-come-first-served (FCFS) policy without reservation for all packets, including re-circulated packets. This mechanism is the post-reservation (PostRes) scheme [4]. If the output is free, a packet that arrives at the switch input is transmitted immediately. Otherwise, it is transferred to one of the re-circulation loops. If the packet cannot be injected into any loop because the inputs of all $M_{FB}$ loops are occupied by packets, the packet is discarded. All packets are served under the FCFS discipline in the re-circulation loops because the FDLs in the re-circulation loops all have the same delay $D$.
The algorithm in the FB buffer with PostRes is simple: how a packet arriving at the switch input is treated depends on whether the output is available and, if not, on whether one of the re-circulation loops is available at the epoch of the arrival. It is also known that the blocking probability of the PostRes buffer is lower than that of the PreRes buffer [4,6]. At first, we ignore the maximum number of allowable re-circulations, allowing the packets to re-circulate endlessly in the loops. Our aim with these simulations is to present characteristic values for ideal cases of the FB buffer, such as the blocking probabilities, the delays, and the re-circulation numbers. The effect of the re-circulation limits on the blocking probabilities is discussed in Section 4.

Figure 4(a) shows an example in which packets are transferred in the FB buffer with $M_{FB} = 2$. Packet 1 is transmitted from loop 1 to the output at $t_1$. When packet 2 newly arrives at $t_2$, the output is busy and so it enters loop 1. Packet 3 arrives at $t_3$ and passes through loop 2 because both the output and loop 1 are busy. Packet 4 arrives at $t_4$ and is discarded because the output and the loops are all busy.

2) Step-increasing-length FDL feedback loop buffer (FBSI buffer)
The re-circulation loops of the FBSI buffer have step-increasing-length FDLs, so that the $i$-th loop delays a packet by $iD$. The policy for transferring packets in the switch is the same FCFS with PostRes as that of the FB buffer. In the re-circulation loops, however, the step-increasing-length FDLs allow re-circulated packets to be scheduled in time so that they avoid contention with each other at the loop output. Therefore, as shown in Fig. 3, only one port with a multiplexer is needed on the output side of the re-circulation loops.
When a packet that should be transferred to the loops has to wait for at least $w$ units of time, the packet is inserted into the $\lceil w/D \rceil$-th loop, where $\lceil x \rceil$ means the smallest integer greater than or equal to $x$. Figure 4(b) shows an example of packets being transferred in the FBSI buffer. Two re-circulation loops are set: loop 1 with a granularity $D$ and loop 2 with $2D$. Packet 1, arriving at $t_1$, is transmitted at the output immediately because the output is free at $t_1$. Packet 2 arrives at $t_2$ and is inserted into loop 1 because the output is busy at $t_2$ transmitting packet 1. Packet 3, which arrives at $t_3$, is inserted into loop 2 because both the output and loop 1 are busy at $t_3$. Furthermore, a void period $\tau$ is attached to packet 3 because the waiting time $w$ is shorter than the delay $2D$ of loop 2.

Because the loop delays are $D, 2D, \ldots$, the total FDL length of an FBSI buffer with $M$ loops is $(M+1)/2$ times longer than that of the FB buffer with $M$ fixed-length FDLs. We need to take this total length difference into account when comparing packet delays in the buffers.

3) Step-increasing-length FDL feedforward buffer (FF buffer)
The FF buffer has $M_{FF}$ step-increasing-length FDLs, where packets in the $i$-th FDL are delayed by $iD$ ($i = 1, \ldots, M_{FF}$). When the output is busy, we select the FDL by the same procedure as for the FBSI re-circulation loops; i.e., when a packet has to wait for $w$, it is inserted into the $\lceil w/D \rceil$-th FDL and is discarded if $M_{FF} D < w$. Extensive studies have been done on this model [9-13], and a highly accurate approximation method for calculating blocking probabilities and delays has been presented [16].
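The delay-line selection rule shared by the FBSI and FF buffers can be sketched as follows. This is a minimal illustration rather than the simulator used in this work: the function name `select_fdl` and its return convention are our own assumptions, and the void is taken to be the difference between the selected delay and the required waiting time.

```python
import math
from typing import Optional, Tuple


def select_fdl(w: float, D: float, M: int) -> Optional[Tuple[int, float]]:
    """Pick a delay line for a packet that must wait at least w time units.

    Returns (index, void), where index = ceil(w / D) is the 1-based line
    number and void = index * D - w is the idle gap left in front of the
    packet. Returns None when the packet must be discarded (M * D < w).
    The name and return convention are hypothetical, for illustration only.
    """
    if w <= 0:
        return 0, 0.0      # output is free: no delay line is needed
    if M * D < w:
        return None        # even the longest line is too short
    index = math.ceil(w / D)
    return index, index * D - w
```

For example, with $D = 1.0$ and two lines, a packet that must wait $w = 1.3$ is sent to line 2 and carries a void of 0.7, whereas a packet that must wait $w = 2.5$ is discarded.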

Simulation Results
Simulations of the three optical buffer models (the FB, FBSI, and FF buffers shown in Fig. 2) were carried out with up to $10^{8}$ packets. The smallest measurable packet loss probability was therefore $10^{-8}$. We assumed Poisson packet arrivals and used three packet-length distributions: exponential, uniform, and deterministic. The time unit was set to the average packet length, so the traffic load $\rho$ was equal to the packet arrival rate $\lambda$.

Figure 5 shows the blocking probabilities of the FB buffer for the case of $\rho = 0.5$ and $M_{FB} = 10$ against $D$ varying from 0 to 2.0, 3.0, 10, and 100. The circles, triangles, and crosses represent the results for the exponential, uniform, and deterministic packet-length distributions, respectively. The blocking probabilities decreased monotonically as $D$ increased for both the exponential and uniform distribution cases, but saturated at around $5.0 \times 10^{-5}$ for $D > 3.0$. The saturation seems to occur because, as $D$ increases, the packet delay increases in proportion to $D$ while the void period stays at a fixed value. For the deterministic distribution case, on the other hand, the blocking probabilities dropped sharply at $D = 1.0$ and reached nearly $1.0 \times 10^{-5}$.

Figures 6 and 7 show the corresponding blocking probabilities for the FBSI and FF buffers, respectively, with the same parameters as in Fig. 5. In both the FBSI and FF buffers, the blocking probabilities changed gradually with increasing $D$ for both the exponential and uniform distribution cases. For example, in the FBSI buffer, the blocking probabilities saturated at around $2.0 \times 10^{-2}$ near $D = 2.0$ and then increased again for $D > 3.0$. In the FF buffer, the blocking probabilities reached a minimum value near $D = 1.0$ and increased with increasing $D$. For the deterministic distribution case in both the FBSI and FF buffers, however, we found sharp decreases at $D = 1.0$. The thick lines in Fig. 7 show calculations using the fourth-order approximation [16], which was established under the assumption that packet virtual waiting times could be expressed by an exponential function.
The calculations were in good agreement with the simulations except near $D = 1.0$ in the deterministic distribution case. We have not yet obtained an exact theory or a sufficiently accurate approximation for this deterministic distribution case.
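To make the FB transfer rule described under Models and Algorithms concrete, the following event-driven sketch estimates the blocking probability of a fixed-length feedback buffer with PostRes. It is a simplified illustration under our own assumptions, not the simulator used for the figures: a loop input is treated as busy while a packet of length L is entering it, the first free loop is chosen, the uniform distribution is taken on [0, 2] so that its mean is 1, and the function name `simulate_fb` and the optional `max_circ` cap are hypothetical.

```python
import heapq
import random


def simulate_fb(rho, D, M, n_packets, length="exp", max_circ=None, seed=1):
    """Estimate the blocking probability of an FB buffer (PostRes, FCFS).

    Time unit = mean packet length, so the Poisson arrival rate equals rho.
    Each of the M loops delays a packet head by D. max_circ is a
    hypothetical cap on re-circulations (None = unlimited).
    """
    rng = random.Random(seed)

    def draw_length():
        if length == "exp":
            return rng.expovariate(1.0)
        if length == "uniform":
            return rng.uniform(0.0, 2.0)   # assumed range, mean 1
        return 1.0                          # deterministic

    # Events: (time, tie-breaker, packet length, circulations so far).
    events = []
    t = 0.0
    for k in range(n_packets):
        t += rng.expovariate(rho)
        events.append((t, k, draw_length(), 0))
    heapq.heapify(events)

    output_free_at = 0.0
    loop_free_at = [0.0] * M
    seq = n_packets
    dropped = 0
    while events:
        now, _, L, circ = heapq.heappop(events)
        if output_free_at <= now:           # output free: transmit at once
            output_free_at = now + L
            continue
        loop = next((i for i in range(M) if loop_free_at[i] <= now), None)
        if loop is None or (max_circ is not None and circ >= max_circ):
            dropped += 1                    # no loop available: discard
            continue
        loop_free_at[loop] = now + L        # loop input busy while entering
        heapq.heappush(events, (now + D, seq, L, circ + 1))
        seq += 1
    return dropped / n_packets
```

With `M = 0` the model reduces to a single-server loss system, whose blocking probability is $\rho / (1 + \rho)$, which gives a simple sanity check on the sketch. Reproducing the loss levels reported here would require far more packets than is practical in pure Python.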

FDL-granularity D Dependency
The results in Figs. 5-7 suggest that the FB buffer was the best from the viewpoint of blocking probabilities: its minimum blocking probabilities were about two orders of magnitude (a factor of $10^{-2}$) lower than those of the FBSI and FF buffers, and a sharp decrease of the blocking probabilities appeared at $D = 1.0$ for the deterministic distribution case.

Figure 8 shows the average circulation numbers of re-circulating packets, including discarded packets, with the same parameter values as in Figs. 5 and 6. Results for the uniform distribution case were almost equal to those for the exponential case and so have been omitted from the figure. In the FB buffer, the average numbers decreased with $D$ and saturated at about three circulations. The average numbers in the FBSI buffer were under two circulations over a wide range of $D$ and also saturated. Observing that the average circulation numbers in the FB buffer were almost twice those in the FBSI buffer, we found that the buffering capability of the FB buffer was twice that of the FBSI buffer, making the blocking probabilities in the FB buffer much lower than those in the FBSI buffer.

Figure 9 shows the average delays of the output packets, i.e., excluding discarded packets, with the same parameters as in Figs. 5-7. All of the delays were generated in the FDLs. The time unit was set to the time in which a packet of average length passes through the output. For example, the unit is 0.2 µs for 40-Gbps packets with an average packet length of 1000 bytes. If the packets pass through a maximum of 100 switches for end-to-end network transmission, a delay of one time unit per switch adds up to a total of 20 µs. If we require the average delay caused by all switches between network ends to be less than 1 ms, the delay per buffer must therefore be less than 50 time units. As shown in Fig. 9, the average delays for $D < 2.0$ are less than 12 for all the buffers, meaning that there is sufficient margin against the delay requirement.
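The delay budget above follows from simple arithmetic, which can be checked with the short sketch below. The constant names are our own; the numerical inputs are the ones quoted in the text.

```python
PACKET_BYTES = 1000       # average packet length
LINE_RATE_BPS = 40e9      # 40 Gbps
MAX_SWITCHES = 100        # switches on an end-to-end path
DELAY_BUDGET_S = 1e-3     # 1 ms end-to-end requirement

# Time unit: transmission time of an average-length packet (about 0.2 us).
time_unit_s = PACKET_BYTES * 8 / LINE_RATE_BPS

# One time unit of delay in every switch accumulates to about 20 us.
path_delay_per_unit_s = MAX_SWITCHES * time_unit_s

# Largest average delay per buffer, in time units, that meets the budget.
max_units_per_buffer = DELAY_BUDGET_S / path_delay_per_unit_s

print(time_unit_s, path_delay_per_unit_s, max_units_per_buffer)
```

The resulting bound of about 50 time units per buffer leaves roughly a fourfold margin over the average delays of less than 12 observed for $D < 2.0$.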
We found that the average delays increased with increasing D for all the buffers and that the delays of the FB buffer were the least of the three. The reason for the least delays of the FB buffer in spite of its larger average circulation numbers (Fig. 8) might be that the total FDL lengths of the FBSI and FF buffers were 5.5 times longer than that of the FB buffer.
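The factor of 5.5 can be reproduced by comparing total FDL lengths, as in the sketch below. It assumes that both buffers have the same number of lines $M$, that the step-increasing lines have delays $D, 2D, \ldots, MD$, and that the fixed-length loops each have delay $D$; the function name is hypothetical.

```python
def total_length_ratio(M: int) -> float:
    """Total FDL length of a step-increasing buffer (D, 2D, ..., M*D)
    divided by that of M fixed-length loops of delay D."""
    step_increasing = sum(range(1, M + 1))   # in units of D
    fixed = M                                # in units of D
    return step_increasing / fixed


print(total_length_ratio(10))   # M = 10, as in Figs. 5-9
```

The ratio equals $(M + 1)/2$, so it grows with the number of lines: for $M = 40$ it is 20.5.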
Blocking probabilities for the case of $\rho = 0.8$ and a loop (buffer) number of 40 are shown in Fig. 10, where only the results for the deterministic distribution are illustrated for the FBSI and FF buffers in order to compare them with the FB buffer case.
The same characteristics as in Fig. 5 could clearly be observed: 1) for the exponential and uniform distribution cases in the FB buffer, the blocking probabilities decreased with increasing $D$ and saturated for $D > 2.0$; 2) the probabilities dropped sharply at $D = 1.0$ for the deterministic distribution case; and 3) in both the FBSI and FF buffers, the probabilities reached minimum values near $D = 0.25$ and increased gradually for $D > 0.25$. Since void periods were attached to packets in the step-increasing-length FDL buffers, such as FBSI and FF, the equivalent load reached 1.0 at $D = 0.5$ when $\rho = 0.8$ [3,9]. The buffers then failed under excess load conditions for $D > 0.5$. The difference in the blocking probabilities between the FB, FBSI, and FF buffers is clearer in Fig. 10. The minimum values were $2 \times 10^{-2}$ in the FBSI and FF buffers, whereas the values were under $10^{-3}$ for $D > 1.0$ for all the distribution cases in the FB buffer. In particular, the value dropped at $D = 1.0$ and reached $10^{-7}$ for the deterministic distribution case.

Figure 11 shows the average circulation numbers with the same parameters as in Fig. 10. The average number stayed at about 1.5 circulations over a wide range of $D$ in the FBSI buffer, whereas in the FB buffer the number varied widely with $D$ and mostly exceeded 10. Moreover, the number decreased to 5.5 at $D = 1.0$ for the deterministic distribution case. Comparing this with the results in Fig. 10, we found that the blocking probabilities decreased when the circulation number decreased for the deterministic distribution case.

Figure 12 shows the average delays with the same parameters as in Figs. 10 and 11. The delays increased rapidly for $D > 0.5$ in both the FF and FBSI buffers, whereas in the FB buffer the delays increased gradually with $D$ and were less than 20.
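The equivalent-load argument for the step-increasing-length buffers can be illustrated numerically. The sketch below assumes, as a simplification of our own that the cited analyses [3,9] may state differently, that each packet carries a mean void of $D/2$ and that the mean packet length is one time unit; the function names are hypothetical.

```python
def equivalent_load(rho: float, D: float) -> float:
    """Output load when every packet is followed by a void period.

    Assumes (not stated in the text) a mean void of D / 2 per packet
    and a mean packet length of 1 time unit.
    """
    return rho * (1.0 + D / 2.0)


def critical_granularity(rho: float) -> float:
    """Granularity D at which the equivalent load reaches 1.0."""
    return 2.0 * (1.0 - rho) / rho


print(equivalent_load(0.8, 0.5), critical_granularity(0.8))
```

Under this assumption the equivalent load for $\rho = 0.8$ reaches 1.0 at $D = 0.5$, which agrees with the value quoted above.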
The reason the average delays in the FB buffer were about 1/5 of those in the FBSI and FF buffers, even though the circulation number in the FB buffer was almost 10 times that in the other two, may be that the total FDL length of the FB buffer was 1/20 that of the FBSI and FF buffers.

Fig. 11. Average circulation numbers with same parameters as in Fig. 10.
Fig. 12. Average delays with same parameters as in Figs. 10 and 11.

A buffer number seven times higher than the FB buffer was required for the FF buffer.

Loop Number M FB Dependency
The loop number and the buffer number add to the number of switch ports, so a seven times higher buffer number requires a switch with seven times as many ports on each side, i.e., a (7 × ports) × (7 × ports) switch. This makes the cost of the FF buffer switch $7^{2}$ times higher than that of the FB buffer switch.

Blocking Probability Drops at D = 1.0 of the Deterministic Distribution Cases
The simulations reported in Sec. 3 yielded the following.
1) The blocking probabilities in the FB buffer became about a factor of $10^{-2}$ lower than those in the FF buffer, and when equal blocking probabilities are required, the buffer number of the FB buffer can be reduced to between 1/2 ($\rho = 0.5$) and 1/7 ($\rho = 0.8$) of that of the FF buffer. By using the FB buffer structure, the switch port scale can therefore be drastically reduced.
2) The blocking probabilities for the deterministic case in the FB buffer sharply dropped at $D = 1.0$, where the packet length was equal to the FDL loop length.

Of the above, we are particularly interested in the reason for 2). Figure 15 shows the flow of packets in the FB buffer with three re-circulation loops when (a) $D < 1.0$, (b) $D = 1.0$, and (c) $D > 1.0$.

For the case shown in Fig. 15(b), each packet in each loop re-circulates without a void period because the packet length coincides with the FDL loop length. The packet can escape from the loop and go through the output only if its head reaches the output of the loop while the output is free, i.e., when the head falls between busy periods, such as from $t_1$ to $t_2$ in (b). The probability that the packet escapes from the loop is $(1 - \rho_{out})$, where $\rho_{out}$ is the output load. We conclude for the $D = 1.0$ case that each re-circulated packet occupies its own loop and waits to go through the output with probability $(1 - \rho_{out})$ under a random-service rule.

Figure 15(a) shows the packet flow for the $D < 1.0$ case. The packet has to enter another loop after circulating through a first loop because the packet length is longer than the FDL loop length. The head of packet 3 entered loop 2 after circulating through loop 1 and was followed by a void period. Therefore, the blocking probabilities become higher than those in the $D = 1.0$ case. As an exception, in the $D = 0.5$ case, each re-circulating packet occupies exactly two loops and has no void period. This situation is the same as the $D = 1.0$ case in Fig. 15(b), except that the effective loop number is reduced by half. The blocking probabilities therefore take local minimum values there.

The $D > 1.0$ case is shown in Fig. 15(c), where each packet occupies one loop. The number of re-circulating packets is three, the same as for the $D = 1.0$ case in Fig. 15(b), but the probability that the packet escapes from the loop is lower than that for the $D = 1.0$ case.

Re-circulation Number Limits
Considering the optical buffer structures (Fig. 2), each packet that circulates through the loops suffers from transmission losses and optical noise. These are physical layer impairments caused by losses in FDLs, optical switches, and arrayed waveguide gratings (AWGs), by amplified spontaneous emission (ASE) noise in optical amplifiers, and by crosstalk noise in AWGs [14-16]. Wavelength-selective optical switches constructed from many tunable wavelength converters (TWCs) and AWGs, for example, generate both ASE and crosstalk noise. These impairments degrade the signal-to-noise ratio at the optical receivers. To avoid this degradation, it is known that the number of re-circulations should be limited. However, such limits are considered a weakness of the feedback-loop-buffer structures because they increase the blocking probabilities.

For simplicity, only the results for the deterministic packet-length distribution are shown. In the 5-circulation-limit case, the blocking probabilities increase from $10^{-5}$ (in the case without circulation number limits, i.e., the endless-circulation case) to $10^{-2}$ at $D = 1.0$. However, the blocking probabilities in the 20-circulation-limit case show little change from those in the endless-circulation case, especially in the range $D < 1.0$. As shown in Fig. 8, the average circulation number of the re-circulating packets is 2.5 at $D = 1.0$. For the re-circulation number limit to have little influence on the blocking probabilities, it is therefore necessary to set the limit to more than 7 times the average circulation number.
Considering practical FDL loop lengths, a 40-Gbps optical packet with a length of 1000 bytes, for example, needs a transmission time of 0.2 µs, and the FDL length with a time granularity of 0.2 µs is 40 m. With a fiber loss of 0.2 dB/km, the transmission loss in a 40-m-long FDL is only 0.008 dB and can be neglected. On the other hand, the losses in the optical switches are large, and optical amplifiers are indispensable to compensate for them. We therefore conclude that, if the switching losses become lower, the required gains in the optical amplifiers will also become lower and the re-circulation number limits can be relaxed.
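The FDL length and loss figures above can be checked with the short sketch below. The propagation speed in the fiber is an assumption of ours (about $2 \times 10^{8}$ m/s, corresponding to a refractive index near 1.5), chosen because it reproduces the 40-m length quoted in the text; the function names are hypothetical.

```python
FIBER_SPEED_M_PER_S = 2.0e8     # assumed: c / n with refractive index n ~ 1.5
FIBER_LOSS_DB_PER_KM = 0.2      # fiber loss quoted in the text


def fdl_length_m(packet_bytes: float, rate_bps: float) -> float:
    """Fiber length whose delay equals one packet transmission time."""
    transmission_time_s = packet_bytes * 8 / rate_bps
    return transmission_time_s * FIBER_SPEED_M_PER_S


def fdl_loss_db(length_m: float) -> float:
    """Propagation loss of a delay line of the given length."""
    return length_m / 1000.0 * FIBER_LOSS_DB_PER_KM


length = fdl_length_m(1000, 40e9)
print(length, fdl_loss_db(length))
```

Even after many re-circulations, the accumulated fiber loss stays small compared with typical switch losses, which is why the switch, not the FDL, dominates the amplifier gain requirement.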
To relax the re-circulation number limits, it is effective to decrease the number of optical switch ports, since the optical switch losses tend to increase in proportion to the port number [17]. From Figs. 13 and 14, we found that the loop numbers of the FB buffers, i.e., the switch port numbers, can be reduced compared with those of the FF buffers. Moreover, in the deterministic packet-length case, the average circulation numbers have minimum values at $D = 1.0$, as shown in Figs. 8 and 11, so the limits can be relaxed further. In the future, however, we expect the development of lower-loss optical switches to remove the need for re-circulation number limits.

Conclusions
We reported the detailed characteristics of optical FB buffers with the PostRes policy and clarified the superiority of the FB buffers through simulations. For comparison, we also showed the characteristics of FBSI and FF buffers. Our main findings are as follows.
1) The blocking probabilities in the FB buffer became about a factor of $10^{-2}$ lower than those in the FF buffer, and if equal blocking probabilities are required, the buffer number of the FB buffer can be reduced to between 1/2 ($\rho = 0.5$) and 1/7 ($\rho = 0.8$) of that of the FF buffer.
2) The blocking probabilities for the deterministic case in the FB buffer sharply dropped at $D = 1.0$, where the packet length was equal to the FDL loop length. This sharp drop is likely because re-circulating packets have no void period in the loops and can occupy their own loops until they are transmitted through the output.

In this work, we carried out simulations with $10^{8}$ packets. The results can be applied to the design of WDM optical packet switches and networks with the maximum throughput. Our future work is to perform theoretical investigations to reinforce the simulation results.