Delay Based Auto-Calibrated PVT Monitor System and Method

Abstract: In this paper, an auto-calibrated PVT (process, voltage, temperature) monitoring system based on delay chains and flip-flops is presented. The system and method are intended for IP (intellectual property) blocks that need to monitor PVT conditions during operation and to be configurable or adaptive depending on the detected changes. The methodology is based on embedded PVT monitors that sense when the propagation delay variation in standard cells reaches a certain threshold. The system is intended to be implemented starting at the RTL (register transfer level) design stage to avoid or reduce full custom design effort. The PVT monitors are built using buffers from a technology design kit. The information from the PVT monitors is sent to a logic module that calibrates the monitors to choose the best monitoring option depending on the PVT corner, available clock


Introduction
With the increase in operating speed and the area reduction of modern VLSI circuits, improved methodologies that allow circuits to operate at higher frequencies and improve power and timing management have become an important area of study. The impact of PVT variations in CMOS circuits increases with technology scaling, making reliability one of the most important challenges facing nanoscale system design. Adaptive solutions have been proposed to minimize the performance loss due to PVT variations, allowing systems to tolerate worst-case scenarios by reducing the delay and power impact under normal operation [1][2][3][4][5][6][7][8]. For adaptive methodologies, voltage, frequency, current, power, and activity monitors are used to control the circuit behavior or performance [9][10][11]. Many of these methodologies require separate monitors to control each of the circuit variables. One of the main challenges for these control or adaptive systems is that all the variables combined impact the circuit performance in a nonlinear way, and differently than when only one variable is varied at a time. When the different variables are sensed separately, the control circuits must consider all the variations to determine the overall impact. Many solutions have been proposed to detect PVT variations, but they require full custom or analog circuits and hence more area or design effort [12][13][14][15][16]. Solutions include critical path, delay synthesizer, RC, pulse generation and detection circuits, bias voltage generators and detectors, and delay lines, among others [17][18][19]. The main idea of the current proposal is to achieve PVT detection by considering the combined impact of the variables as reflected in the propagation delay. A PVT detector presented in [20] senses the change in PVT conditions using a delay line or delay chain (Figure 1). This delay line uses a logic inverter chain and D flip-flops (FF1, FF2).
The delay line is the element that senses the delay impact on the digital logic due to PVT variations. The CK signal is connected to the CK1 and CK2 inputs and is also passed through the delay line to produce a delayed version at the D1 input. When the delay is small, a logic 0 is present at D1 at the CK positive edge; this value is captured and propagated to the Q1 output, indicating that no high-delay condition is detected, which is equivalent to a non-significant PVT change. By contrast, when the delay increases significantly due to PVT changes, a logic 1 can be present at D1 at a CK positive edge. In this case, FF1 captures a logic 1 and propagates it to the Q1 output. FF2 is used as a synchronizer flip-flop to avoid metastability, so the Q1 value is propagated to Q2 in the next clock cycle. A PVT monitor of this type can be designed to detect different delay levels. A "Multi-level" PVT detector presented in [20] is shown in Figure 2. This monitor uses the same delay detector approach already explained, but with more flip-flops connected to different points of the delay line in order to detect different delay levels. The different delay levels are represented by the Q outputs of the flip-flops taken together as a binary value (Q0 to Qn). In summary, the basic functionality of this PVT monitor is based on delaying a clock signal and using the non-delayed and delayed versions as the clock and D inputs of a D flip-flop. Depending on the propagation delay of the delayed clock and the setup time of the flip-flop, the Q output will be low or high, and this can be extended to detect different delay levels using more than one delay chain and multiple flip-flops. In this work, a complete PVT system is proposed that is multi-line, multi-threshold, and multi-clock. This system is intended to be implemented with standard cells.
The advantages of using standard cells are that the design of the PVT monitors can be calculated from the standard cell information, and that the size and connectivity of the delay chains can be modified faster than in a full custom design, since the implementation is done at the RTL level. The multi-clock capability allows the monitors to be adjusted when the desired clock speed is not available, and when the relationship between the clock frequency and the standard cell delay would reduce the granularity or range of the measurements. Figure 3 shows a top-level diagram of the proposed general PVT monitor system. As observed in the block diagram, the system is intended to have n PVT monitors inside an IP. The system includes a module to collect the PVT monitor data (PVT Monitors output data control) and a module dedicated to the calibration (PVT Monitors auto-calibration logic). The clock used by the PVT monitors ("PVT monitors CK") can be an external input to the IP ("External CK") or can be generated inside the IP ("Internal IP CK"). Additionally, the system has the capability to override the calibration and select the delay chains and clocks from outside the IP. The delay chain and clock selection outputs from the auto-calibration logic are sent to the PVT monitors. Each PVT monitor sends back the delay level to the auto-calibration logic according to the settings. After the calibration is done, this module sends out a monitor ID, the delay level, and the monitor settings of each of the n PVT monitors ("CK sel" and "Delay chain sel" values). This information is received by the output data control, which generates the serialized data and sends the parallel or serial outputs inside or outside the IP. Figure 4 shows the connectivity between the IP containing the PVT monitoring system and an external receiver device (not considered part of this work) containing a control system for the IP.
Figure 5 shows the connectivity example using the parallel output data for a control system located inside the same IP.
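The delay detector behavior described above can be illustrated with a small behavioral model. This is only a sketch under idealized assumptions that are not taken from [20]: a 50% duty-cycle clock, a non-inverting delay line, and zero setup and hold times. The function names and the tap delays used in the usage note are hypothetical.

```python
def sample_delayed_clock(delay_ns, period_ns):
    """Value captured by FF1 when D1 is the clock delayed by delay_ns.

    Idealized assumptions: 50% duty cycle, non-inverting delay line,
    zero setup/hold time.
    """
    # At a rising edge, D1 shows the clock as it was delay_ns earlier.
    # For a small delay that is the preceding low phase (captures 0);
    # once the delay exceeds half a period it is the preceding high
    # phase (captures 1). Delays beyond one period wrap around.
    return 1 if (delay_ns % period_ns) > period_ns / 2 else 0


def multi_level_outputs(tap_delays_ns, period_ns):
    """Q outputs of a multi-level detector, one flip-flop per delay-line tap."""
    return [sample_delayed_clock(d, period_ns) for d in tap_delays_ns]
```

With hypothetical tap delays of 0.15, 0.30, 0.45, and 0.60 ns and a 1 ns clock period, only the longest tap captures a logic 1. If a slower corner stretches every delay by 40%, the two longest taps capture a logic 1, so the number of asserted outputs grows with the propagation delay. The model also shows why a slower clock (longer period) needs a longer delay chain to reach the threshold.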

PVT Monitors
The PVT monitor block shown in Figure 6 receives the PVT monitors clock and uses clock dividers to generate n clock signals by dividing the clock successively. The diagram shows n delay chains, indicated as "Delay chain X1 to Xn". Each delay chain has n delay blocks (DBXn) built from standard cell buffers. Figure 7 shows the content of the delay blocks of different delay chains implemented with buffers. The "CK sel" input selects the divided or undivided clock to be used by the delay chains and the "Delay threshold detector" blocks. The internal clock selected inside the monitor is "CKi". The output of each of the n delay blocks inside a delay chain is connected to one of the n delay threshold detectors. For this proposal, the delay threshold detector is implemented with the two D flip-flops explained in the Introduction and is shown as a single element in Figure 8. Nevertheless, a different delay detector can be used in the system. For each delay chain, the delay threshold detector outputs together represent the delay level as a thermometer code. To reduce the number of bits needed to represent this value, and to make it easier for the calibration and other logic to interpret, a priority encoder is used. The priority encoder outputs, which indicate the delay level of each delay chain, are connected to a multiplexer that selects the data from one delay chain depending on the "Delay chain sel" value.
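The data path of one monitor (clock selection, priority encoding, and delay chain multiplexing) can be sketched behaviorally as follows. Several details are assumptions made for illustration only: each divider stage is assumed to divide by 2, detector output 0 is assumed to be the first to assert as the delay grows, and the encoder is assumed to report 1 plus the index of the highest asserted output (0 when none is asserted). All function names are hypothetical.

```python
def divided_clock_mhz(base_mhz, ck_sel):
    """Frequency selected by "CK sel", assuming successive divide-by-2 stages."""
    return base_mhz / (2 ** ck_sel)


def priority_encode(detector_outputs):
    """Encode threshold detector outputs (index 0 first) into a delay level.

    Assumed mapping: level = 1 + index of the highest asserted output,
    and level 0 when no output is asserted.
    """
    for index in range(len(detector_outputs) - 1, -1, -1):
        if detector_outputs[index]:
            return index + 1
    return 0


def monitor_output(chain_outputs, delay_chain_sel):
    """Multiplexer: return the encoded level of the selected delay chain."""
    levels = [priority_encode(outputs) for outputs in chain_outputs]
    return levels[delay_chain_sel]
```

Under these assumptions, a thermometer code with k asserted outputs encodes to level k, so the multiplexer output is a compact binary value that the calibration logic can compare directly.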

Autocalibration Logic
The inputs and outputs of the PVT monitor auto-calibration logic are connected in a feedback configuration to the different PVT monitors inside the IP. The calibration logic sends control signals to the monitors to try different settings, receives back the delay levels detected by the monitors, and saves the values obtained. Then, the calibration logic selects the most suitable values and sends them to the output data control. The purpose of the system is to provide at the output of the IP a value that corresponds to the PVT conditions monitored inside the IP, and to use this information for adaptive systems and control. The system is not intended to detect changes in PVT conditions at each clock cycle or at high speed. For this reason, the calibration does not need to be a search that takes the smallest possible number of steps. Since the different delay chains in the PVT monitors sense simultaneously, the data is always present. Hence, the calibration algorithm only checks which value is most centered, selects it, and propagates it to the IP output. Figure 9 shows the diagram of the calibration flow. The goal of the calibration is to keep the delay level output data as centered as possible. For instance, for a 16-bit output (4 bits after encoding), the sensed level would be kept near the mid-level decimal value 8. As a first step, the calibration logic sets the delay level midpoint corresponding to the delay level output. Then, the calibration tries the available clocks one by one on the same delay chain. Each iteration performs the measurement, computes the difference from the midpoint, and saves the delay level values corresponding to the different clocks. Once all the clocks have been tried, the next available delay chain is tried with all the available clocks, and so on until all the delay chains have been tried.
In summary, the calibration tries all the clock and delay chain combinations in a nested loop. When all the available clocks and delay chains have been tried and the delay level values saved, the calibration selects the settings that provide the delay level value with the smallest difference from the defined midpoint. This value is the one sent from each PVT monitor to the output. The calibration continues sending the data to the outputs for a defined time and then restarts. The calibration logic saves the delay level and settings combinations from each PVT monitor with a corresponding ID. This information is sent to the output data control to be serialized. One of the most important features of this system is that it is not designed for a specific system clock speed, which makes the calibration both necessary and useful.
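The nested-loop search described above can be sketched as follows. This is a behavioral illustration, not the RTL of the calibration logic: `measure` is a hypothetical callback standing in for one PVT monitor measurement, and the tie-breaking rule (first combination tried wins) is an assumption, since the text does not specify one.

```python
def calibrate(measure, num_chains, num_clocks, midpoint):
    """Try every (delay chain, clock) pair and keep the most centered one.

    measure(chain_sel, ck_sel) is a hypothetical callback returning the
    encoded delay level reported by one PVT monitor for those settings.
    """
    results = []
    for chain_sel in range(num_chains):       # outer loop: delay chains
        for ck_sel in range(num_clocks):      # inner loop: available clocks
            level = measure(chain_sel, ck_sel)
            results.append((abs(level - midpoint), chain_sel, ck_sel, level))
    # The smallest distance to the midpoint wins. Ties resolve to the first
    # combination tried (an assumption; the text does not specify this).
    _, chain_sel, ck_sel, level = min(results, key=lambda r: r[0])
    return chain_sel, ck_sel, level
```

The function returns the selected "Delay chain sel" and "CK sel" values together with the corresponding delay level. Because the search is exhaustive, its cost grows with the product of the number of chains and clocks, which is acceptable here because the system is not meant to track PVT changes at every clock cycle.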

Output Data Control
The PVT monitor output data control logic module collects the PVT monitor data from the auto-calibration logic and generates a serial data code with the monitor information. The proposed serial data code is shown in Figure 10. The signal "pvt_monitors_data" carries the data packet, which includes the monitor ID, the monitor settings, and the delay level value (Monitor value). The monitor settings indicate which clock and which delay chain are being used ("CK sel" and "Delay chain sel"). The "data_valid" signal is a synchronization signal that, at logic high, indicates that the data packet is valid. In this proposed code, there is one clock cycle between data packets to indicate that one packet has finished and another will start, which synchronizes the decoding. The serial data code cycles constantly to present the data of the different monitors one after another.
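A serializer of this kind can be sketched as a generator that emits one ("data_valid", "pvt_monitors_data") pair per clock cycle. The field widths, the field order within the settings, and the MSB-first bit order are assumptions chosen for illustration; the actual packet layout is the one defined in Figure 10. The function names are hypothetical.

```python
def to_bits(value, width):
    """Return value as a list of bits, most significant bit first."""
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def serialize(monitors, id_bits=4, ck_bits=2, chain_bits=2, level_bits=4):
    """Yield (data_valid, pvt_monitors_data) pairs, one per clock cycle.

    monitors is an iterable of (monitor_id, ck_sel, chain_sel, level).
    All field widths are assumed values, not taken from the paper.
    """
    for monitor_id, ck_sel, chain_sel, level in monitors:
        packet = (to_bits(monitor_id, id_bits) + to_bits(ck_sel, ck_bits)
                  + to_bits(chain_sel, chain_bits) + to_bits(level, level_bits))
        for bit in packet:
            yield 1, bit          # data_valid high while the packet is sent
        yield 0, 0                # one idle cycle separates consecutive packets
```

The generator makes a single pass over the monitors; the constantly cycling behavior described above would correspond to calling it repeatedly over the same list.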

Method
The goal of this methodology is to provide, inside or outside the chip, continuous information about how PVT is affecting the internal delay. One of the main ideas behind this methodology is to have a standard procedure that can be quickly included and implemented in an IP without investing effort in full custom design. Having clock and chain selection allows the system to cover a broader detection range when the expected delay detection conditions or high-speed clocks are not available. The methodology is not intended to report the voltage, temperature, or process separately, or to identify exactly which corner case is occurring. To shorten the design cycle that a full custom design would require, the proposed method uses standard cells and the RTL design flow. Nevertheless, there is no restriction against using full custom cells. The use of custom cells could provide a more linear or optimized design, but with increased design time and without completely removing the impact of the PVT variables. For this method, it is recommended to use only standard cells provided by the technology vendor, and buffers with the smallest propagation delay and size. The buffers should have enough strength to drive a same-size buffer plus a flip-flop. This provides the finest granularity. Depending on the number of flip-flops and delay chains used in the monitor, stronger buffers should be used to drive the clock signals. Although this method does not necessarily require a pre-silicon full custom design, the proposal is to perform a pre-silicon characterization by simulating the circuit, or to calculate the expected results based on the standard cell data sheets. The method also recommends a post-silicon PVT characterization in order to correlate the results with pre-silicon data and calibrate the receiver module or device.
It is assumed that the control system (not part of this work) already defines how to respond to the delay levels, delay chain, and clock selection values provided by the PVT monitors.
As an implementation case, a PVT monitor with 16 levels of delay, 4 different clock speeds, and 4 different delay chains is shown in Figure 11. This PVT monitor is implemented at the transistor level using 130 nm technology standard cells and characterized through analog simulations. The PVT monitor can select among 4 different delay chains and 4 different clocks. One clock is the system clock and the other three are obtained by successively dividing the system clock by 2. The number of delay threshold detectors (pairs of flip-flops) connected to the different nodes of the delay chains is determined to be 14 to generate a 16-bit output representing the delay level. The 16-bit data is translated by the priority encoder into a 4-bit delay level signal. The priority encoder truth table is shown in Table 1, where delay level 0 means that none of the delay chains detected a PVT change. Delay levels 1 to 15 represent the detection given by a logic 1 present in threshold detector outputs 0 to 14. The specification of the design kit used indicates that the minimum and maximum functional voltage and temperature ranges are 1.08 V to 1.32 V and -40°C to 125°C, respectively. The typical functional conditions are 1.2 V and 25°C. For this implementation, a 1 GHz clock was considered as the highest clock speed, since it is the maximum speed supported by this technology. The results of the measurements for 27 PVT corners, the 4 clock frequencies, and the 4 different delay chains are presented in Figures 12 to 15. The "Y" axis indicates the "Delay level output" decimal value detected by each delay chain.
Figure 11. PVT monitor system current implementation.
It is observed that, as expected, the response is not completely linear, but in general it shows the same trend across the different PVT corners.
There are a few corner cases showing nonlinearities of 1 or 2 delay levels from the expected trend line, caused by temperature variations (an example is indicated in Figure 12 as "Nonlinearity"). These are corner cases that do not have the same impact on the delay in some chains, but this does not mean that the detection is not happening. The effect appears this way because all the characterization results are compared together with the same PVT order on the "X" axis. It is important to note that not all the corners are detected by some of the delay chains (out of range). This demonstrates one of the reasons why the calibration feature can help to find or expand the detection range. The detection range can be increased by design, by increasing the number of delay elements in a delay chain and, with this, the number of bits used to represent the delay level. Nevertheless, it is important to consider that this would increase area and power and does not guarantee coverage of all the PVT ranges when the expected clock or PVT conditions are not available. Although the PVT level information is not intended to identify precisely which PVT combination is present, different PVT zones can be estimated from the monitor information. Figure 16 shows the different voltage and process ranges, marked and ordered based on the characterization of the 4 delay chains in this implementation. It can be observed that, for this technology, process and voltage influence the delay level change more than temperature does. The system information can be used in combination with voltage or temperature measurements to provide more precise information.

Calibration Example
To illustrate how the system works, the following example is presented. In this case, the IP is considered to be operating under low voltage, fast-fast process, and high temperature, corresponding to 1.08 V, ff, and 25°C. Figure 17 shows the characterization results for only those PVT monitors that detect a level different from zero and near the mid-point for this corner. The system calibration selects the monitor combination that corresponds to the level closest to the mid-point, which in this case is 8. After calibrating, the system selects delay chain X1 using a 1 GHz clock, since it is nearer to the mid-point than X2 at 500 MHz. As a variation of this example, suppose that the 1 GHz and 500 MHz clocks are not available and only a clock with a maximum speed of 250 MHz can be used. In this case, delay chains X1 and X2 are not capable of detecting this corner (see Figure 12 and Figure 13, respectively). Thus, the system will calibrate to choose delay chain X4, since at 250 MHz it is only 2 levels below the mid-point value (level 6). For this example and characterization, the clocks were established as divisions of 1 GHz; nevertheless, the system is designed to calibrate with any clock value that the system can provide. Calibrating at the mid-point allows the system to remain stable or react faster as PVT conditions change.

Conclusion
In this work, a PVT monitoring system and methodology intended to be used inside an IP were presented. The proposed PVT monitors were implemented with delay chains and flip-flops. The system has the capability to search for the most suitable clock and delay chain available in order to center the monitor data within the detection range. Logic modules inside the system constantly collect data from the different monitors and gather the information to provide it at the outputs in parallel and serial modes. An output serial data code containing the information from the monitors, provided internally and externally to the IP, was presented and explained.

The control flow diagram and the algorithm used for the calibration were shown. The monitor implementation was done at the transistor level and characterized through analog simulations. The characterization results were used to explain examples of the calibration. The results show the advantages that the calibration can provide and the PVT ranges that can be covered with the current implementation. The goals of this methodology were described and clarified, the main one being to detect the variation of the voltage, process, and temperature variables together as a propagation delay impact inside the IP. The characterization data demonstrate that this system can be an option for detecting the PVT impact inside an IP and for use in control systems. The main advantages are the reduced implementation effort and the reduced area compared with other proposals based on full custom design and monitoring of individual PVT variables.