Optimise ML inference performance on custom AI chips. Covers memory hierarchy design for ML (on-chip SRAM, HBM, DRAM), weight compression and caching strategies, operator fusion, pipeline utilisation, data reuse analysis, and power-performance trade-offs on NPU simulators.
₹50,000
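As a taste of the data-reuse and memory-bandwidth analysis this course covers, below is a minimal Python sketch of a roofline-style check for whether a GEMM layer is compute-bound or bandwidth-bound. The hardware figures (peak_tflops, hbm_gbps) and the layer shape are illustrative assumptions, not figures from the course or from any real NPU.

```python
# Minimal sketch: roofline-style data-reuse check for a dense GEMM layer.
# All hardware numbers are assumed placeholders, not real chip specs.

def arithmetic_intensity(m, n, k, bytes_per_elem=2):
    """FLOPs per byte of off-chip traffic for an M x K @ K x N GEMM,
    assuming each operand and the output cross HBM exactly once (ideal reuse)."""
    flops = 2 * m * n * k                                   # one multiply-accumulate = 2 FLOPs
    traffic = bytes_per_elem * (m * k + k * n + m * n)      # A, B, and C moved once
    return flops / traffic

def bound_by(m, n, k, peak_tflops=100.0, hbm_gbps=800.0):
    """Compare the layer's intensity with the machine balance point (roofline knee)."""
    intensity = arithmetic_intensity(m, n, k)
    balance = (peak_tflops * 1e12) / (hbm_gbps * 1e9)       # FLOPs per byte the chip can sustain
    return ("compute-bound" if intensity >= balance else "memory-bound",
            intensity, balance)

# Example: batch-1 inference, a 1x4096 activation against a 4096x4096 weight matrix.
# Tiny M means almost no weight reuse, so the layer lands memory-bound.
print(bound_by(1, 4096, 4096))
```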
Architect AI accelerators (NPU/TPU) for ML inference and training. Covers systolic array architecture, dataflow analysis (weight stationary, output stationary), GEMM/convolution hardware mapping, on-chip SRAM sizing, memory bandwidth analysis, and tiling strategies. Projects design and simulate a si…
₹50,000
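To illustrate the kind of SRAM-sizing and tiling question this course works through, here is a small Python sketch that estimates the on-chip buffer footprint of one tile and the resulting DRAM traffic for a weight-stationary GEMM mapping. The tile shapes, byte widths, and the simple re-fetch schedule are assumptions made for the example, not the course's reference design.

```python
# Minimal sketch: tile footprint and DRAM traffic for a weight-stationary GEMM tiling.
# All sizes below are illustrative assumptions, not real silicon parameters.
import math

def tile_footprint_bytes(tm, tn, tk, bytes_per_elem=2):
    """On-chip storage for one tile: weights held stationary in the array,
    an activation tile streamed in, and partial sums accumulated locally."""
    weights = tk * tn
    activations = tm * tk
    partial_sums = tm * tn * 2          # accumulators assumed twice as wide
    return bytes_per_elem * (weights + activations + partial_sums)

def dram_traffic_bytes(M, N, K, tm, tn, tk, bytes_per_elem=2):
    """Total off-chip traffic for the full GEMM under a simple weight-stationary
    schedule: weights fetched once, activations re-read for every N-tile."""
    n_tiles_n = math.ceil(N / tn)
    weight_traffic = K * N              # each weight element fetched once
    act_traffic = M * K * n_tiles_n     # activations re-streamed per weight-column tile
    out_traffic = M * N                 # outputs written once
    return bytes_per_elem * (weight_traffic + act_traffic + out_traffic)

# Example: 1024^3 GEMM with 128x128 weight tiles and 64-row activation tiles.
print(tile_footprint_bytes(64, 128, 128))               # ~80 KiB: must fit in on-chip SRAM
print(dram_traffic_bytes(1024, 1024, 1024, 64, 128, 128))  # ~20 MiB of HBM/DRAM traffic
```

Shrinking tn raises the number of N-tiles and therefore the activation re-fetch traffic, which is exactly the bandwidth-versus-SRAM trade-off the tiling exercises explore.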