Computational and Applied Mathematics Seminar
Oct 8, 2025

Description
Quantization compresses neural networks by representing their weights and activations with a small number of bits, reducing memory, computation time, and energy consumption while preserving inference accuracy.
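To make the basic operation concrete, here is a minimal sketch of round-to-nearest uniform quantization of a weight matrix. The 4-bit setting, the max-abs scale, and the name quantize_uniform are illustrative assumptions, not details from the talk.

import numpy as np

def quantize_uniform(W, bits=4):
    # Round each weight to the nearest level of a symmetric uniform grid.
    # Per-tensor max-abs scaling is an assumption made for illustration.
    levels = 2 ** bits
    scale = np.max(np.abs(W)) / (levels / 2 - 1)            # grid spacing
    Q = np.clip(np.round(W / scale), -(levels // 2), levels // 2 - 1)
    return Q * scale                                         # dequantized weights

W = np.random.randn(256, 256)
W_hat = quantize_uniform(W, bits=4)
print("relative error:", np.linalg.norm(W - W_hat) / np.linalg.norm(W))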
In general, though, the underlying optimization problems are NP-hard, so one must settle for computationally efficient approximate solutions, ideally ones with theoretical error guarantees. We analyze OPTQ, a widely used quantization algorithm, and provide new theory: an error-evolution identity, layerwise error bounds, and theoretical justification for heuristics used in practice, including feature ordering, regularization, and alphabet size. We further study a stochastic variant that yields entrywise control of the error. With these results in hand, we introduce Qronos, a new related algorithm that first corrects errors inherited from previous layers and thus attains stronger guarantees. We conclude with numerical results on modern language models.
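For flavor, the sketch below quantizes a single neuron's weights sequentially, using calibration data to feed the accumulated output error back into each rounding decision. It is in the spirit of the sequential, error-correcting algorithms discussed in the talk, but the specific update rule, the 3-bit alphabet, and the names greedy_quantize_layer, X, and w are assumptions; it is not the exact OPTQ or Qronos procedure.

import numpy as np

def greedy_quantize_layer(X, w, alphabet):
    # Quantize one neuron's weights w using calibration data X (n x d).
    # Each weight is rounded so that the running output residual
    # u = X @ w_seen - X @ q_seen stays small (error feedback).
    n, d = X.shape
    q = np.zeros(d)
    u = np.zeros(n)                                  # running output residual
    for t in range(d):
        u += X[:, t] * w[t]                          # add true contribution
        c = X[:, t] @ u / (X[:, t] @ X[:, t] + 1e-12)
        q[t] = alphabet[np.argmin(np.abs(alphabet - c))]   # nearest level
        u -= X[:, t] * q[t]                          # subtract quantized contribution
    return q, np.linalg.norm(u)                      # quantized weights, output error

# Tiny usage example with a 3-bit symmetric alphabet (all values assumed).
rng = np.random.default_rng(0)
X = rng.standard_normal((512, 64))
w = rng.standard_normal(64) / 8
alphabet = 0.25 * np.arange(-4, 4)
q, err = greedy_quantize_layer(X, w, alphabet)
print("output error ||Xw - Xq||:", err)

The point this sketch is meant to illustrate is that feeding earlier rounding errors into later decisions is what lets sequential schemes outperform plain round-to-nearest.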