Hardware Architecture Papers
@WWVY
New submissions to http://arxiv.org on systems organization and hardware architecture (not affiliated with http://arxiv.org)
A 55-nm SRAM Chip Scanning Errors Every 125 ns for Event-Wise Soft Error Measurement. arxiv.org/abs/2504.08305
The New LLM Bottleneck: A Systems Perspective on Latent Attention and Mixture-of-Experts. arxiv.org/abs/2507.15465
GNN-ACLP: Graph Neural Networks Based Analog Circuit Link Prediction. arxiv.org/abs/2504.10240
DiP: A Scalable, Energy-Efficient Systolic Array for Matrix Multiplication Acceleration. arxiv.org/abs/2412.09709
Real-Time Object Detection and Classification using YOLO for Edge FPGAs. arxiv.org/abs/2507.18174
PRACtical: Subarray-Level Counter Update and Bank-Level Recovery Isolation for Efficient PRAC Rowhammer Mitigation. arxiv.org/abs/2507.18581
Sandwich: Separating Prefill-Decode Compilation for Efficient CPU LLM Serving. arxiv.org/abs/2507.18454
Designing High-Performance and Thermally Feasible Multi-Chiplet Architectures enabled by Non-bendable Glass Interposer. arxiv.org/abs/2507.18040
Ironman: Accelerating Oblivious Transfer Extension for Privacy-Preserving AI with Near-Memory Processing. arxiv.org/abs/2507.16391
Efficient Precision-Scalable Hardware for Microscaling (MX) Processing in Robotics Learning. arxiv.org/abs/2505.22404
Enabling Efficient Transaction Processing on CXL-Based Memory Sharing. arxiv.org/abs/2502.11046
Hardware-Efficient Photonic Tensor Core: Accelerating Deep Neural Networks with Structured Compression. arxiv.org/abs/2502.01670
Balancing Robustness and Efficiency in Embedded DNNs Through Activation Function Selection. arxiv.org/abs/2504.05119
GCC: A 3DGS Inference Architecture with Gaussian-Wise and Cross-Stage Conditional Processing. arxiv.org/abs/2507.15300
Per-Bank Bandwidth Regulation of Shared Last-Level Cache for Real-Time Systems. arxiv.org/abs/2410.14003
Custom Algorithm-based Fault Tolerance for Attention Layers in Transformers. arxiv.org/abs/2507.16676
Optimization of DNN-based HSI Segmentation FPGA-based SoC for ADS: A Practical Approach. arxiv.org/abs/2507.16556
SVAgent: AI Agent for Hardware Security Verification Assertion. arxiv.org/abs/2507.16203
RealBench: Benchmarking Verilog Generation Models with Real-World IP Designs. arxiv.org/abs/2507.16200
MTU: The Multifunction Tree Unit in zkSpeed for Accelerating HyperPlonk. arxiv.org/abs/2507.16793