

## Enhancing Sensor Readout Efficiency: Innovations and Challenges




PIXEL 2024 18 - 22 NOV 2024 STRASBOURG, FRANCE







problem known since ALOHA



1-to-1 connections ⇒ impossibility of routing interconnects



https://www-sk.icrr.utokyo.ac.jp/en/hk/about/detector



AI-generated image by ChatGPT



- no pixel above was first to talk?
- no pixel below is already talking?
- I start talking and tell all pixels above that I'm talking!
- This should prevent simultaneous talking, but information has some latency, and clocks (CKs) are not the same in all pixels
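The latency problem in the last bullet can be illustrated with a small sketch. It is a simplified model, not the circuit from the slide: the function name `talkers`, the per-pixel notice delay `latency_per_pixel`, and the assumption that a talker keeps talking for the whole window are all hypothetical.

```python
def talkers(hit_times, latency_per_pixel):
    """Return indices of pixels that start talking.

    hit_times[i] is the time pixel i wants to talk (None = no hit).
    A pixel starts talking unless the "I am talking" notice from an
    earlier talker has already reached it. The notice from pixel j is
    assumed to reach pixel i at t_j + latency_per_pixel * |i - j|.
    """
    order = sorted((t, i) for i, t in enumerate(hit_times) if t is not None)
    talking = []  # (start_time, index) of pixels that started talking
    for t, i in order:
        blocked = any(tj + latency_per_pixel * abs(i - j) <= t
                      for tj, j in talking)
        if not blocked:
            talking.append((t, i))
    return [i for _, i in talking]


hits = [None, 0.0, None, None, 1.0]
fast = talkers(hits, latency_per_pixel=0.1)   # notice arrives in time
slow = talkers(hits, latency_per_pixel=0.5)   # notice arrives too late
```

With the short latency only one pixel talks; with the long latency the second pixel has not yet heard the notice and both talk at once, which is the collision the slide warns about.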



neighbors' states awareness, who is first?
Inspired by: "X-ray Detectors for LCLS-II with real-time information extraction: the Spark Div



every pixel has



280.00 mm

- isochronicity of clocks
- distribution of reference clock for in-pixel TDCs is OK, as delays can be calibrated out, but wrongly resolved concurrency leads to corrupted data






more activity → more dissipated heat and more interference

Brookhaven National Laboratory



#### power and interferences

pp. 663-674, Feb. 2014, doi: 10.1109/TNS.2013.2294673 and T. Poikela, "Readout Architecture for Hybrid Pixel Readout Chips", PhD thesis, 2015



## **2 Steps in Data Flow**




#### two steps:

- get data from pixels to periphery (buffers)
- organize output awaiting trigger arrival

focus on extracting data from within pixel matrix to peripherals



HARP : schematic overview



# **Design with Selected Readout Platform**



common approach of building

- super pixel 2×4, 4×4 and 4×4
- analog islands encircled by logic.

Digital pixel front-end



Bump pads









logic (arbitration tree) placed in space between channel cores (e.g. AFE)

T. Poikela, "Readout Architecture for Hybrid Pixel Readout Chips", PhD thesis, 2015

# Summary of Solutions - 1


#### Frame-less POLLING (DATA-DRIVEN):

- typically, circulating token(s) query each pixel's state for data to transmit
- unnecessary transfers removed
- varying latency or dead time introduced (propagation of token)
- continuous activity (token and strobes need to be continuously repeated)
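A minimal sketch of token polling, under assumed names (`poll_with_token`, `has_data`, `start` are illustrative, not from the slides). It shows the two costs listed above: the wait of a pixel depends on where the token happens to be, and the token makes a full lap even when nothing is pending.

```python
def poll_with_token(has_data, start=0):
    """Circulate a token once around the pixels, starting at `start`.

    Returns (readout, hops): `readout` lists (pixel, hops_waited) for every
    pixel that had data, `hops` is the number of token moves in the lap.
    """
    n = len(has_data)
    readout = []
    for hop in range(n):
        pixel = (start + hop) % n
        if has_data[pixel]:              # only pixels with data are transferred
            readout.append((pixel, hop))
    return readout, n


flags = [False, True, False, False, True, False]
readout, hops = poll_with_token(flags, start=2)
idle_readout, idle_hops = poll_with_token([False] * 6)
```

In the idle case the readout is empty but the number of hops is unchanged, which is the continuous activity that polling cannot avoid.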

#### Frame-readout:

- all pixels are read out sequentially
- predefined order of data retrieval
- large amounts of dubious information
- no timing from readout


#### Frame-less EVENT-DRIVEN:

- pixels independently signal themselves for readout
- unnecessary transfers removed,
- fixed or no latency or dead time
- ZERO activity until anything to read
- automated CAD/EDA circuital implementation possible

#### Frame-based sparsified:

- snapshot taken first
- static arbitration predefined order
- pixels fished out with:
  - priority encoder
  - token passing
- zero-suppressed
- no timing from readout
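The frame-based sparsified scheme can be sketched as follows. This is an illustrative model only: `sparsified_frame` is a hypothetical name, and the choice that the lowest index has the highest priority is an assumption.

```python
def sparsified_frame(snapshot):
    """Read out a frozen snapshot with a priority encoder.

    `snapshot` is a list of 0/1 hit flags latched at the snapshot clock edge.
    The lowest index is assumed to have the highest priority; each step emits
    one address and clears that flag, so empty pixels are never transferred.
    """
    pending = list(snapshot)          # copy: hits arriving later cannot alter it
    addresses = []
    while any(pending):
        winner = pending.index(1)     # priority encoder: first set bit
        addresses.append(winner)
        pending[winner] = 0           # reset the pixel that was read
    return addresses


addresses = sparsified_frame([0, 1, 0, 0, 1, 1, 0, 0])
```

The output order is fixed by geometry, not by arrival time, which is why this kind of readout carries no timing information of its own.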

# **Summary of Solution - 2**



### Frame readout

typically snake-style shift registers:

- data out (typically count values in SPC mode)
- data in (configuration)
- data accumulated during the exposure window moves along shift registers
- shift registers are often reconfigured counters, for real-estate efficiency
- many detectors developed for Photon Science operate in SPC mode
- very simple ⇒ will not spend time on this type



the threshold for using it is occupancy
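One way to read the occupancy remark is as a break-even point in data volume. The sketch below assumes that a full frame costs a fixed number of bits per pixel while a sparsified readout costs an address plus the same payload per hit; the function names and the 256×256, 12-bit example are illustrative assumptions.

```python
def address_bits(n_pixels):
    """Bits needed to address one of n_pixels pixels."""
    return (n_pixels - 1).bit_length()


def frame_bits(n_pixels, bits_per_pixel):
    """Data volume of a full frame: every pixel is transferred."""
    return n_pixels * bits_per_pixel


def sparsified_bits(n_pixels, bits_per_pixel, occupancy):
    """Data volume when only hit pixels are sent, each with its address."""
    hits = round(n_pixels * occupancy)
    return hits * (address_bits(n_pixels) + bits_per_pixel)


def break_even_occupancy(n_pixels, bits_per_pixel):
    """Occupancy above which the full frame is the smaller transfer."""
    return bits_per_pixel / (address_bits(n_pixels) + bits_per_pixel)


n, b = 256 * 256, 12
threshold = break_even_occupancy(n, b)
```

Below the threshold the sparsified transfer is smaller; above it, sending the whole frame costs fewer bits.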

# **Summary of Solution - 3**

### Comparison



sparsified / address event encoding

frame-less: polling, event-driven





# **Arbitration methods**

### **Static Arbitration:**

### with combinational logic (priority encoder / token passing)

- the state of the matrix is snapshotted before readout to prevent corruption if requests with higher priority arrive
- data sent in packages (with period of snapshotting clock)





limited capacity and resolution

### **Dynamic Arbitration:**

### true event-driven with sequential logic (queuing with memory elements)

- arbiter cell stops and queues requests clearing them one by one
- data points sent one by one (time resolution depends on event rate)
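The queuing behavior of dynamic arbitration can be sketched with a simple first-come, first-served model. The name `dynamic_arbiter` and the fixed `service_time` per data point are assumptions made for illustration.

```python
from collections import deque


def dynamic_arbiter(requests, service_time):
    """Serve requests one by one in order of arrival.

    `requests` is a list of (arrival_time, pixel). Returns a list of
    (pixel, arrival_time, sent_time), where sent_time is the moment the
    data point leaves the arbiter.
    """
    queue = deque(sorted(requests))
    free_at = 0.0
    sent = []
    while queue:
        arrival, pixel = queue.popleft()
        start = max(arrival, free_at)   # wait while the arbiter is busy
        free_at = start + service_time
        sent.append((pixel, arrival, free_at))
    return sent


busy = dynamic_arbiter([(0.0, 7), (1.0, 3), (1.5, 9)], service_time=2.0)
quiet = dynamic_arbiter([(0.0, 7), (10.0, 3)], service_time=2.0)
```

When requests are sparse each data point leaves one service time after it arrives; when they bunch up, later requests wait in the queue, so the timing read from the output degrades with event rate.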





# **Arbitration methods - examples**



- P. Fischer, Nuclear Instruments and Methods in Physics Research A 461 (2001) 499–504
- P. Yang et al., Nuclear Instruments and Methods in Physics Research A 785 (2015) 61–69
- G. W. Deptuch, https://lss.fnal.gov/archive/test-tm/2000/fermilab-tm-2709-ppd.pdf
- T. Poikela et al., 2014 JINST 9 C01007

### Dynamic Arbitration: GALS vs. LAGS - 1




FIGURE 1: Processor (P) Storage (S) Crosspoint Structure.

FIGURE 2: Symbol for Self-Timed Arbiter.

An arbiter is a mechanism that governs the sharing of a resource among a number of processes. An everyday example is a stoplight at an intersection. It is intended to allow the street crossing to be shared safely between two traffic flows. The traffic-actuated type of stoplight is considered to be superior to the clock-cycled type, because it does a better job of keeping the resource (the street crossing) active and the traffic flowing. Using a mechanism that allocates the resource dynamically in response to the demand achieves a better use of the resource.

One of the most common examples of arbitration in digital systems is the sharing of the main (random-access) storage of a computer system amongst a number of processors: instruction processors, peripheral processors, data channels, and so forth. Figure 1 shows a crosspoint structure that allows multiple processors to make concurrent accesses to main storage distributed across several independent boxes. The systems within the dashed lines are multiport stores, each of which covers a different range of addresses and can operate independently of the other storage boxes. One might think of processors as placing an address and a request for a storage cycle on the (horizontal) paths that thread through the storage boxes. Each storage box can detect the request and whether the address is one of its own. If more than one request appears on a storage box's ports, how does the store determine which request to service next?

The required arbitration is sometimes accomplished by a fixed allocation of the store to different processors on different periodically repeated time slots. This approach is not unlike the clock-cycled stoplight, and is guaranteed to leave a lot of bus and storage cycles unused unless the load is perfectly balanced. Dynamic allocation of time slots that are fixed within a synchronous communication discipline is a workable scheme whose implementation is straightforward. The only disadvantages of this scheme are the usual problems inherent to synchronous systems in (1) clock distribution and (2)

LAMBDA, First Quarter 1980, p. 10



### functionally, Seitz's arbiters are metastability filters

"ghost paper": everyone cites it, but nobody can find it. If you are interested, I can send a copy of this paper!





### Dynamic Arbitration: GALS vs. LAGS - 2



D.M. Chapiro, PhD thesis, https://apps.dtic.mil/sti/pdfs/ADA154624.pdf

- in inter-processor networks (Caltech), sources are synchronous and acquisition is asynchronous (Globally Asynchronous, Locally Synchronous: GALS)
- arbiters with Muller C-gates, where:
  - access granted can only be canceled by source, and
  - communication with source is impossible after Muller C-gate is set
- each source has to time its access to medium
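For readers unfamiliar with the C-gate mentioned above, here is a behavioral sketch of a two-input Muller C-element: the output changes only when both inputs agree and otherwise holds its value. The class name `CElement` is illustrative, and the model ignores all timing and metastability.

```python
class CElement:
    """Behavioral model of a two-input Muller C-element (no timing)."""

    def __init__(self, state=0):
        self.state = state

    def update(self, a, b):
        if a == b:          # both inputs agree: output follows them
            self.state = a
        return self.state   # inputs differ: output holds its previous value


gate = CElement()
trace = [gate.update(a, b) for a, b in [(1, 0), (1, 1), (0, 1), (0, 0)]]
```

The hold behavior is the memory that makes a set C-gate insensitive to one side until both inputs have deactivated.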



but pixel systems are LAGS!

- Jongkil Park, PhD thesis, https://escholarship.org/uc/item/0sc4s9v7
- Shih-Chii Liu et al., "Event-Based Neuromorphic Systems", Wiley, 2015

memory is needed to avoid switching

### Dynamic Arbitration: GALS vs. LAGS - 3



once the C-gate is set, both inputs must deactivate, but interaction with other side is blocked

stage with Seitz's arbiter (schematic labels: rq1, ack1, rq2, ack2, request out, ack in)

who was first? (glitch-less)

Who is first, req or ack? (req is blocked if ack goes high first)

- req goes up if any req passed through (while ack is low)
- if req passed through, any ack change will pass down!
- this arbiter disallows any rq1/rq2 grappling (a request gone and appearing again faster than the deactivation of ack, etc.)
- interaction with the other side is unblocked
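The "who was first?" decision can be sketched as a behavioral mutual-exclusion element. This is not the circuit on the slide: the class name `MutexElement` is hypothetical, and a simultaneous arrival is resolved here by simply letting request 1 win, whereas a real arbiter settles such a tie only after any metastability has died out.

```python
class MutexElement:
    """Behavioral two-way arbiter: at most one grant is active at a time."""

    def __init__(self):
        self.owner = None   # 1 or 2 while a grant is held, else None

    def update(self, rq1, rq2):
        requests = {1: rq1, 2: rq2}
        if self.owner is not None and not requests[self.owner]:
            self.owner = None            # the owner withdrew its request
        if self.owner is None:
            if rq1:                      # assumption: request 1 wins a tie
                self.owner = 1
            elif rq2:
                self.owner = 2
        return (self.owner == 1, self.owner == 2)


arbiter = MutexElement()
grants = [arbiter.update(r1, r2) for r1, r2 in [(0, 1), (1, 1), (1, 0), (0, 0)]]
```

A request that arrives while the other side holds the grant is kept waiting and is served only after the holder releases, so the two grants never overlap.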


# **Non-Greedy and Greedy Arbitration**



- non-greedy: search for a new data source always resets; randomizes/democratizes data retrieval but disrupts geometrical associations
- greedy: search favors the same or neighboring source; may show preference for a certain part of the detector where rates of requests are higher due to e.g. noise
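The difference in service order can be shown with a toy model. It makes strong assumptions that are not from the slides: all requests are already pending, the non-greedy search serves them in order of arrival, and the greedy search picks the pending pixel closest in index to the one just read. The name `readout_order` is hypothetical.

```python
def readout_order(pending, greedy):
    """Return the order in which pending pixels are served.

    `pending` lists pixel indices in order of request arrival.
    Non-greedy: every search restarts and takes the oldest request.
    Greedy: after the first read, take the pending pixel nearest to the
    last one served (ties go to the older request).
    """
    waiting = list(pending)
    order = []
    last = None
    while waiting:
        if greedy and last is not None:
            nxt = min(waiting, key=lambda p: abs(p - last))
        else:
            nxt = waiting[0]
        waiting.remove(nxt)
        order.append(nxt)
        last = nxt
    return order


requests = [12, 3, 13, 2, 11]
fair = readout_order(requests, greedy=False)
local = readout_order(requests, greedy=True)
```

In this toy case the greedy order finishes the cluster around pixel 12 before moving to pixels 2 and 3, while the non-greedy order jumps between the two regions.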



preference? not clear

N. Bingham and R. Manohar, in IEEE Trans. on Circuits and Systems I: Regular Papers, vol. 67, no. 12, pp. 4960-4969, Dec. 2020, doi: 10.1109/TCSI.2020.3011552.

### **Non-Greedy EDWARD**



handling of requests and responding with acknowledges

Figure: arbitration tree with request/acknowledge pairs (rq5/ack5, rq6/ack6, …, rqn/ackn)

fastest, selective, dead-timeless readout

- without snapshotting in readout frames
- no:
  - built-in geo-priority
  - timing circuitry in pixels (unlike GALS)
- automatic synchronization with data acquisition (LAGS)

### **Greedy EDWARD**



with acknowledges



fastest, selective, dead-timeless readout

- being read out pixel;
- No:
  - built-in geo-priority;
  - timing circuitry in pixels (not like GALS);
- extracted in phases
- Automatic synchronization with data acquisition (LAGS)

# Data Flow in Non-Greedy EDWARD





data can flow without any dead time

# **Readout in Non-Greedy EDWARD**





# **Time Instead of Voltage/Current - 1**



- L. Cecconi et al., 2023 JINST 18 C02025
- G. Aglieri Rinella et al., Nuclear Instruments and Methods in Physics Research A 1056 (2023) 168589



reading using a single line, relying on the time interval between pulses and the position of the pixel being read

# **Time Instead of Voltage/Current - 2**

challenges of the method:

- dependence of pulse propagation speed on power supplies and biases
- dependence on production process variations and measurement conditions
- dependence on temperature
- how to provide continuous calibration?
- time measurement needed to decode positions
- CAD/EDA tools do not provide an equivalent of STA (timing closure) for time-domain circuit design
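Why calibration matters can be seen in a toy model of time-domain position encoding. The encoding used here, an interval of `(position + 1)` unit delays between two pulses, is a hypothetical stand-in chosen for simplicity and is not the scheme of the cited works; `encode_interval` and `decode_position` are illustrative names.

```python
def encode_interval(position, unit_delay):
    """Interval between two pulses for a pixel at `position` (toy encoding)."""
    return (position + 1) * unit_delay


def decode_position(interval, calibrated_delay):
    """Recover the position from a measured interval and a calibrated delay."""
    return round(interval / calibrated_delay) - 1


nominal = 1.0          # unit delay assumed by the decoder
drifted = 1.05         # actual unit delay after, e.g., a temperature change

near = decode_position(encode_interval(3, drifted), nominal)
far = decode_position(encode_interval(40, drifted), nominal)
recalibrated = decode_position(encode_interval(40, drifted), drifted)
```

With a 5% drift the nearby pixel still decodes correctly, but the error accumulates with distance and the far pixel is assigned a wrong position until the decoder is recalibrated.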

## promising and inspirational!





# **Intrinsic Timing Resolution**

### **EDWARD**



## **Future?**





you, work with us!

light output

## Summary

- Optimizing Readout of segmented radiation detectors:
  - Goal: data integrity at minimal resource overhead
  - Challenges: balancing speed and power consumption
  - Focus: efficient data acquisition for radiation detectors
- Evolutionary path for zero-suppressed readouts, from:
  - X-Y coordinate signaling
  - token passing covering large areas
  - static arbitration
  - data polling
- to event-driven readout, with no clock/strobe distributed or broadcast and total silence before a request:
  - voltage domain (EDWARD)
    - parameterized and scalable RTL code for implementation, including individual pixel configuration, is available upon request to accelerate design
  - time domain (DPTS)
    - very compact implementation
    - challenging calibration



### Acknowledgements

- Thanks to all who provided information for putting together my presentation
- This work has been authored by employees of Brookhaven Science Associates, LLC under Contract No. DE-SC0012704 with the U.S. Department of Energy.
- Development of EDWARD directly or indirectly has been supported by:
  - BES B&R 456165021
  - NASA Grant NNR16AC42G
  - LDRD-A 21-020, LDRD-A 24-054, B&R Code: YN0100000
  - DOE Office of Science, DE-SC0012704, KA2501032/FWP# P0024
  - TM24-01 ATRO/10 FY24TMALG

