Optimise ML inference performance on custom AI chips. Covers memory hierarchy design for ML (on-chip SRAM, HBM, DRAM), weight compression and caching strategies, operator fusion, pipeline utilisation, data reuse analysis, and power-performance trade-offs on NPU simulators.
| Field | Value |
|---|---|
| Expiry period | Lifetime |
| Language | English |
| Last updated | Apr 2026 |
| Level | |
| Total lectures | 0 |
| Total quizzes | 0 |
| Total duration | |
| Total enrolment | 0 |
| Number of reviews | 753 |
| Avg rating | |
| Outcomes | |
| Requirements | |