20th IEEE International Conference on Data Mining (ICDM 2020)
Language of the conference title:
Associative memories are among the earliest artificial neural models, dating back to the 1960s and 1970s. The best known are Hopfield networks, presented by John Hopfield in 1982. Recently, modern Hopfield networks have been introduced, which greatly increase storage capacity and converge extremely fast. We generalize the energy function of modern Hopfield networks to continuous patterns and propose a new update rule. The new Hopfield network has exponential storage capacity. Its update rule ensures global convergence to energy minima and converges in one update step with exponentially small error. The new Hopfield network has three types of energy minima (fixed points of the update): (1) a global fixed point averaging over all patterns, (2) metastable states averaging over a subset of patterns, and (3) fixed points that store a single pattern. Surprisingly, the transformer attention mechanism is equivalent to the update rule of our new modern Hopfield network with continuous states. In their first layers, transformer and BERT models operate preferably in the global averaging regime, while in higher layers they operate in metastable states. We provide a new PyTorch layer called "Hopfield", which allows equipping deep learning architectures with modern Hopfield networks as a powerful new concept comprising pooling, memory, and attention. The layer serves applications such as multiple instance learning, set-based and permutation-invariant learning, associative learning, and many more. We show several tasks for which integrating the new Hopfield layer into a deep learning architecture increased performance.
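The abstract does not spell out the update rule, so the following is only a minimal NumPy sketch under stated assumptions: the stored patterns are the columns of a matrix X, the state is a vector xi, and one update step has the softmax form xi_new = X softmax(beta * X^T xi), which has the same shape as transformer attention softmax(beta * Q K^T) V. The function names, the pattern dimensions, and the values of beta are illustrative choices, and this is not the "Hopfield" PyTorch layer mentioned above.

```python
import numpy as np

def softmax(z):
    # Subtract the maximum before exponentiating for numerical stability.
    e = np.exp(z - np.max(z))
    return e / e.sum()

def hopfield_update(X, xi, beta):
    """One update step (assumed form): xi_new = X softmax(beta * X^T xi).

    X    : (d, N) array, stored patterns as columns
    xi   : (d,) array, current state (query) pattern
    beta : inverse temperature; larger values sharpen the softmax
    """
    return X @ softmax(beta * (X.T @ xi))

rng = np.random.default_rng(0)
d, N = 64, 10                      # hypothetical sizes for illustration
X = rng.standard_normal((d, N))
X /= np.linalg.norm(X, axis=0)     # normalise each stored pattern

# Noisy version of stored pattern 3 as the query.
query = X[:, 3] + 0.1 * rng.standard_normal(d)

# Large beta: a single step lands close to one stored pattern.
retrieved = hopfield_update(X, query, beta=32.0)
nearest = int(np.argmax(X.T @ retrieved))

# beta = 0: the softmax is uniform, so the step averages over all patterns.
averaged = hopfield_update(X, query, beta=0.0)
```

In this sketch, a large beta corresponds to regime (3), retrieval of a single pattern, while beta = 0 reproduces regime (1), the global average; intermediate values of beta with clustered patterns would give averages over subsets, as in regime (2).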
Language of the abstract:
Keynote / invited talk at a conference