Abstract: To achieve both high performance and energy efficiency for deep learning models, the DLAU is a scalable deep learning accelerator designed to run on FPGAs. The architecture exploits the parallelism and configurability offered by FPGAs to enable high-throughput processing at a lower power budget than traditional processors. DLAU accelerates training and inference workloads for a wide range of deep learning frameworks using a flexible interconnect combined with purpose-built processing units. Owing to its scalability, it can be adapted to a large number of application domains, providing edge devices and cloud-based systems with a high-performance solution that balances energy efficiency and computing resources against workload needs. The DLAU accelerator employs tiling techniques to exploit locality in deep-learning workloads and uses three pipelined processing units to maximize throughput. Experimental results on a state-of-the-art Xilinx FPGA board demonstrate that the DLAU accelerator delivers a speedup of up to 36.1x over Intel Core2 processors at a power consumption of 234 mW.
Keywords: FPGA acceleration, deep learning, DLAU architecture, neural network processing, hardware accelerator, energy efficiency, high-throughput computing, scalable architecture, tiling technique, pipelined processing, edge computing, cloud-based AI systems, reconfigurable hardware, low power consumption, Xilinx FPGA, parallel processing, deep neural networks (DNN), AI hardware optimization, inference acceleration, training acceleration
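The tiling idea the abstract refers to can be illustrated in software. Below is a minimal sketch of a tiled matrix-vector product, the core operation of a fully connected DNN layer; the tile size and the `tiled_matvec` helper are illustrative assumptions, not details taken from the paper, but they show why tiling exposes locality: only one block of the weight matrix and one slice of the input need to be resident in fast on-chip buffers at a time.

```python
TILE = 32  # illustrative tile size; on hardware it is sized to fit on-chip BRAM

def tiled_matvec(W, x, tile=TILE):
    """Matrix-vector product y = W @ x computed block by block.

    Each tile x tile block of W and the matching slice of x are
    processed together, mimicking how an accelerator streams one
    tile into local buffers before computing on it.
    """
    rows, cols = len(W), len(W[0])
    y = [0.0] * rows
    for i0 in range(0, rows, tile):        # iterate over row blocks
        for j0 in range(0, cols, tile):    # iterate over column blocks
            # only this block of W and this slice of x are "on chip" now
            for i in range(i0, min(i0 + tile, rows)):
                for j in range(j0, min(j0 + tile, cols)):
                    y[i] += W[i][j] * x[j]
    return y

# small sanity check: result matches the untiled product
W = [[1.0, 2.0], [3.0, 4.0]]
x = [1.0, 1.0]
assert tiled_matvec(W, x, tile=1) == [3.0, 7.0]
```

The reordering changes only the traversal order, not the arithmetic, so the result is identical to the naive product while each inner loop touches a bounded working set.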
Published: 06-11-2025 | Issue: Vol. 25 No. 11 (2025) | Page Nos: 54-62 | Section: Articles | License: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.