Deep neural network (DNN) models are increasingly deployed in real-time, safety-critical systems such as autonomous vehicles, driving the need for specialized AI accelerators. However, most existing accelerators support only non-preemptive execution or limited preemptive scheduling at the coarse granularity of DNN layers. This restriction leads to frequent priority inversion due to the scarcity of preemption points, resulting in unpredictable execution behavior and, ultimately, system failure.
To address these limitations and improve the real-time performance of AI accelerators, we propose CLARERT, a novel accelerator architecture that supports fine-grained, intra-layer flexible preemptive scheduling with cycle-level determinism. CLARERT incorporates an on-chip Earliest Deadline First (EDF) scheduler to reduce both scheduling latency and variance, along with a customized dataflow design that enables intra-layer preemption points (PPs) while minimizing preemption overhead. Leveraging the limited preemptive task model, we perform a comprehensive predictability analysis of CLARERT, enabling formal schedulability analysis and optimized PP placement within the constraints of limited preemptive scheduling. We implement CLARERT on the AMD ACAP VCK190 reconfigurable platform. Experimental results show that CLARERT outperforms state-of-the-art designs using non-preemptive and layerwise-preemptive dataflows, with less than 5% worst-case execution time (WCET) overhead and only 6% additional resource utilization.
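The scheduling policy described above can be pictured with a small software model. The sketch below is illustrative only, not CLARERT's on-chip scheduler or hardware dataflow: it models limited preemptive EDF in Python, where each job is split into non-preemptive segments separated by preemption points, and an earlier-deadline arrival can take over the accelerator only at a segment boundary. The Job structure, segment lists, and function name are hypothetical.

    import heapq
    from dataclasses import dataclass, field

    @dataclass(order=True)
    class Job:
        deadline: int                           # absolute deadline (EDF priority key)
        name: str = field(compare=False)
        segments: list = field(compare=False)   # non-preemptive region lengths (cycles), split by PPs

    def edf_limited_preemptive(releases):
        """Simulate EDF where preemption is allowed only at preemption points,
        i.e., at segment boundaries. `releases` maps release time -> list of Jobs."""
        t, ready, trace = 0, [], []
        pending = sorted(releases.items())
        while pending or ready:
            while pending and pending[0][0] <= t:   # admit jobs released by time t
                for job in pending.pop(0)[1]:
                    heapq.heappush(ready, job)
            if not ready:                           # idle until the next release
                t = pending[0][0]
                continue
            job = heapq.heappop(ready)              # earliest absolute deadline wins
            seg = job.segments.pop(0)               # next non-preemptive region
            trace.append((t, job.name, seg))
            t += seg                                # region runs to completion, no preemption
            if job.segments:                        # PP reached: job re-competes under EDF
                heapq.heappush(ready, job)
        return trace

Under this model, the worst-case blocking an urgent job can suffer is bounded by the longest non-preemptive segment of any other job, which is why the placement (and hence length) of intra-layer PPs is central to the schedulability analysis summarized in the abstract.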