The course is organized in three parts:
- Introductory lectures that introduce the challenges and the research reports that discuss these challenges and their solutions. This part of the course involves 4-5 lectures, spread over two weeks. After every lecture, students are assigned reading material and the specific questions they are expected to answer in their reports and presentations. N.B.: Students do not give presentations or submit reports during the first two weeks, but they start working on them. These lectures cover a) Requirements Analysis, b) Storage and Interconnect technology and trade-offs, and c) Arithmetic Implementation Options.
- Student presentations and group discussions on state-of-the-art solutions. Students present during the next 3 weeks: 4 x 30-minute presentations each, with an additional 15 minutes reserved for discussion. Each presentation involves one presenter and two active listeners. Students prepare a report based on their presentation and on the feedback they receive from the teacher and the active listeners. In the 3rd and 4th weeks, the projects are introduced and assigned, along with the format in which students submit their project reports.
- Student presentations of their projects. During the 5th and 6th weeks, students work on their projects and prepare the project presentation and report. The teacher is available on demand for discussions. In weeks 7 and 8, the project presentations are given and the reports are submitted.
The course consists of the following two modules:
Requirements Analysis
In this module, we study how to systematically extract requirements in terms of computational operations and their types, interconnect, and storage. These requirements are logical and independent of the implementation style. Many real-life examples will be discussed in class, and students will be assigned problems for hands-on experience.
Being able to understand the energy requirements is the first step in creating low-energy and thus sustainable solutions.
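As an illustration of this kind of requirements extraction, the following minimal sketch counts the multiply-accumulate operations and storage of a single convolutional layer. The `ConvLayer` class, its field names, and the example dimensions are hypothetical and chosen only for illustration; they are not taken from the course material.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical description of one convolutional layer (stride and padding
    are assumed to be already reflected in the output dimensions)."""
    in_channels: int
    out_channels: int
    kernel: int        # square kernel of size kernel x kernel
    out_height: int
    out_width: int

    def macs(self) -> int:
        # One multiply-accumulate per kernel element, per input channel,
        # for every output element.
        return (self.out_height * self.out_width * self.out_channels
                * self.in_channels * self.kernel * self.kernel)

    def weight_count(self) -> int:
        # Number of weights to store (biases ignored in this sketch).
        return self.out_channels * self.in_channels * self.kernel * self.kernel

    def output_count(self) -> int:
        # Number of output activations produced by the layer.
        return self.out_height * self.out_width * self.out_channels


if __name__ == "__main__":
    layer = ConvLayer(in_channels=2, out_channels=4, kernel=3,
                      out_height=8, out_width=8)
    print("MACs:   ", layer.macs())
    print("Weights:", layer.weight_count())
    print("Outputs:", layer.output_count())
```

These counts are logical quantities: they depend only on the layer shape and not on how the layer is eventually implemented in hardware.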
Architecting AI Hardware and Understanding Technology and Architectural Trade-offs
In this module, we study the architectural trade-offs involved in implementing AI hardware. We go into the details of the memory hierarchy and its technology options. Memory is the dominant cost component, and we study how to exploit temporal locality to minimize the cost of memory storage and memory access.
Next to memory, interconnect is the biggest challenge. Wires are the worst-scaling aspect of technology today. For instance, moving data by 1 mm on a chip is comparable in energy cost to a single-precision floating-point operation. Besides energy cost, interconnect also plays a strong role in architectural decisions. For instance, it is a common mistake to increase parallelism in computation without increasing the parallelism in access to memory. We show how to architect designs in which an increase in computation is matched by an increase in memory bandwidth.
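The mismatch between compute parallelism and memory parallelism can be illustrated with a simple roofline-style bound. This is a minimal sketch using the generic roofline model, not necessarily the method taught in the course; the function name `attainable_throughput` and all numbers are hypothetical.

```python
def attainable_throughput(peak_ops_per_s: float,
                          mem_bandwidth_bytes_per_s: float,
                          ops_per_byte: float) -> float:
    """Upper bound on throughput: limited either by the compute units or by
    how fast memory can feed them (bandwidth times operations per byte)."""
    return min(peak_ops_per_s, mem_bandwidth_bytes_per_s * ops_per_byte)


if __name__ == "__main__":
    # Hypothetical baseline design: 1 Tops/s compute, 100 GB/s memory
    # bandwidth, and a workload performing 2 operations per byte fetched.
    base = attainable_throughput(1e12, 1e11, 2.0)

    # Doubling only the compute units does not help a memory-bound design.
    more_compute = attainable_throughput(2e12, 1e11, 2.0)

    # Doubling the memory bandwidth as well lifts the bound.
    matched = attainable_throughput(2e12, 2e11, 2.0)

    print(base, more_compute, matched)
```

In this sketch the baseline is memory-bound, so adding compute units alone leaves the bound unchanged, while scaling bandwidth together with compute raises it.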
Finally, we study the options for implementing the arithmetic operations in neural networks. We also study how to trade off accuracy against implementation cost, with the help of a concrete case study from the field of bacterial genome recognition.
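One common way to explore the accuracy versus cost trade-off is to reduce the bit width of the stored values. The sketch below applies generic uniform symmetric quantization to a list of example weights and reports the resulting error; it is not the course's case study, and the function names and weight values are hypothetical.

```python
def quantize(values, bits):
    """Uniform symmetric quantization of a list of floats to a signed grid
    with the given number of bits (assumes bits >= 2)."""
    if bits < 2:
        raise ValueError("bits must be at least 2")
    levels = 2 ** (bits - 1) - 1          # largest positive integer code
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return list(values)
    scale = max_abs / levels              # step size of the grid
    return [round(v / scale) * scale for v in values]


def mean_abs_error(reference, approx):
    """Average absolute difference between two equally long lists."""
    return sum(abs(r - a) for r, a in zip(reference, approx)) / len(reference)


if __name__ == "__main__":
    weights = [0.1, -0.5, 0.33, 0.9, -0.77]   # hypothetical example weights
    for bits in (8, 4):
        err = mean_abs_error(weights, quantize(weights, bits))
        print(f"{bits}-bit mean absolute error: {err:.5f}")
```

Fewer bits reduce storage and arithmetic cost but increase the quantization error; how much error an application tolerates has to be evaluated on the application itself.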
Knowing these architectural and technological options to reduce energy will contribute to sustainable AI solutions.
After passing this course, the students will be able to
- Analyze the requirements of a real-life machine learning problem in terms of storage, computation and power,
- Make informed decisions, based on available technology, architectural options, and accurate estimates of area, performance and energy, that best meet the targets of the machine learning problem,
- Create low-energy custom AI solutions that contribute to sustainable development,
- Evaluate major research trends and understand the open challenges that the community is focusing on.