2017
- The Brain’s Computational Efficiency derives from using Sparse Distributed Representations. Rejected from Cognitive Computational Neuroscience 2017.
Abstract: Machine learning (ML) representation formats have been dominated by: a) localism, wherein individual items are represented by single units, e.g., Bayes Nets, HMMs; and b) fully distributed representations (FDR), wherein items are represented by unique activation patterns over all the units, e.g., Deep Learning (DL) and its progenitors. DL has had great success vis-a-vis classification accuracy and learning complex mappings (e.g., AlphaGo). But without massive machine parallelism (MP), e.g., GPUs, TPUs, and thus high power, DL learning is intractably slow. The brain is also massively parallel, but uses only 20 watts; moreover, the forms of MP used in DL, model/data parallelism and shared parameters, are patently non-biological, suggesting DL's core principles do not emulate biological intelligence. We claim that the basic disconnect between DL/ML and biology, and the key to biological intelligence, is that instead of FDR or localism, the brain uses sparse distributed representations (SDR), i.e., “cell assemblies”, wherein items are represented by small sets of binary units, which may overlap, and where the pattern of overlaps embeds the similarity/statistical structure (generative model) of the domain. We have previously described an SDR-based, extremely efficient, one-shot learning algorithm in which the primary operation is permanent storage of experienced events based on single trials (episodic memory), but in which the generative model (semantic memory, classification) emerges automatically, as a side effect of the episodic storage process that is computationally free in terms of both time and power. Here, we discuss fundamental differences between the mainstream localist/FDR-based and our SDR-based approaches.
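To illustrate the central claim above, that the pattern of overlaps among sparse binary codes can itself carry the similarity structure of a domain, here is a minimal, hypothetical sketch. It is not the paper's algorithm; the pool size, code size, and helper functions are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000   # total units in the pool (illustrative size)
K = 20     # active units per code, i.e., a sparse code

def make_code():
    """A random sparse code: a small set of K active units out of N."""
    return set(rng.choice(N, size=K, replace=False).tolist())

def similar_code(base, n_shared):
    """Hypothetical helper: derive a new code sharing n_shared units with `base`."""
    shared = set(list(base)[:n_shared])
    remaining_pool = list(set(range(N)) - base)
    rest = set(rng.choice(remaining_pool, size=K - n_shared, replace=False).tolist())
    return shared | rest

cat = make_code()
car = similar_code(cat, n_shared=15)   # similar item -> large overlap with `cat`
dog = make_code()                      # unrelated item -> chance-level overlap

# Similarity is read out directly as intersection size, with no extra computation.
print(len(cat & car))   # 15
print(len(cat & dog))   # expected value ~ K*K/N = 0.4
```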
- A Radically new Theory of How the Brain Represents and Computes with Probabilities. (arXiv)
Abstract: The brain is believed to implement probabilistic reasoning and to represent information via population, or distributed, coding. Most previous population-based probabilistic (PPC) theories share several basic properties: 1) continuous-valued neurons; 2) fully(densely)-distributed codes, i.e., all(most) units participate in every code; 3) graded synapses; 4) rate coding; 5) units have innate unimodal tuning functions (TFs); 6) intrinsically noisy units; and 7) noise/correlation is considered harmful. We present a radically different theory that assumes: 1) binary units; 2) only a small subset of units, i.e., a sparse distributed code (SDC) (cell assembly, ensemble), comprises any individual code; 3) binary synapses; 4) signaling formally requires only single (first) spikes; 5) units initially have completely flat TFs (all weights zero); 6) units are not inherently noisy; but rather 7) noise is a resource generated/used to cause similar inputs to map to similar codes, controlling a tradeoff between storage capacity and embedding the input space statistics in the pattern of intersections over stored codes, indirectly yielding correlation patterns. The theory, Sparsey, was introduced 20 years ago as a canonical cortical circuit/algorithm model, but not elaborated as an alternative to PPC theories. Here, we show that the active SDC simultaneously represents both the most similar/likely input and the coarsely-ranked distribution over all stored inputs (hypotheses). Crucially, Sparsey's code selection algorithm (CSA), used for both learning and inference, achieves this with a single pass over the weights for each successive item of a sequence, thus performing spatiotemporal pattern learning/inference with a number of steps that remains constant as the number of stored items increases. We also discuss our approach as a radically new implementation of graphical probability modeling.
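Point 7 above, noise used as a resource that trades storage capacity against preservation of input similarity, can be caricatured in a few lines. This is a hedged sketch of the general idea only, not the published CSA: the cluster sizes, the scalar `novelty` signal, and the softmax form are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
Q, K = 8, 10   # Q winner-take-all clusters of K binary units each (assumed sizes)

def choose_code(support, novelty):
    """support: (Q, K) bottom-up evidence per unit; novelty in [0, 1], 0 = fully familiar."""
    code = np.zeros(Q, dtype=int)
    for q in range(Q):
        # Soft winner-take-all per cluster: low novelty -> near-deterministic argmax
        # (reactivate the best-matching stored code); high novelty -> near-uniform
        # random winners (a new, largely non-overlapping code).
        beta = 10.0 * (1.0 - novelty)
        p = np.exp(beta * support[q])
        p /= p.sum()
        code[q] = rng.choice(K, p=p)
    return code

support = rng.random((Q, K))
print(choose_code(support, novelty=0.0))   # essentially the per-cluster argmax
print(choose_code(support, novelty=1.0))   # uniform random winners
```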
2014
- Sparsey™: Event recognition via deep hierarchical sparse distributed codes. (In Review) Frontiers in Computational Neuroscience.
- Sparse Distributed Coding & Hierarchy: The Keys to Scalable Machine Intelligence. DARPA UPSIDE Year 1 Review Presentation. 3/11/14. (PPT)
2013
- A cortical theory of super-efficient probabilistic inference based on sparse distributed representations. 22nd Annual Computational Neuroscience Meeting, Paris, July 13-18. BMC Neuroscience 2013, 14(Suppl 1):P324 (Abstract)
- Constant-Time Probabilistic Learning & Inference in Hierarchical Sparse Distributed Representations, Invited Talk at the Neuro-Inspired Computational Elements (NICE) Workshop, Sandia Labs, Albuquerque, NM, Feb 2013.
2012
- Probabilistic Computing via Sparse Distributed Representations. Invited Talk at Lyric Semiconductor Theory Seminar, Dec. 14, 2012.
- Quantum Computation via Sparse Distributed Representation. (2012) Gerard Rinkus. NeuroQuantology 10(2) 311-315.
Abstract: Quantum superposition states that any physical system simultaneously exists in all of its possible states, the number of which is exponential in the number of entities composing the system. The strength of presence of each possible state in the superposition—i.e., the probability with which it would be observed if measured—is represented by its probability amplitude coefficient. The assumption that these coefficients must be represented physically disjointly from each other, i.e., localistically, is nearly universal in the quantum theory/computing literature. Alternatively, these coefficients can be represented using sparse distributed representations (SDR), wherein each coefficient is represented by a small subset of an overall population of representational units and the subsets can overlap. Specifically, I consider an SDR model in which the overall population consists of Q clusters, each having K binary units, so that each coefficient is represented by a set of Q units, one per cluster. Thus, K^Q coefficients can be represented with KQ units. We can then consider the particular world state, X, whose coefficient’s representation, R(X), is the set of Q units active at time t to have the maximal probability and the probabilities of all other states, Y, to correspond to the size of the intersection of R(Y) and R(X). Thus, R(X) simultaneously serves both as the representation of the particular state, X, and as a probability distribution over all states. Thus, set intersection may be used to classically implement quantum superposition. If algorithms exist for which the time it takes to store (learn) new representations and to find the closest-matching stored representation (probabilistic inference) remains constant as additional representations are stored, this would meet the criterion of quantum computing. Such algorithms, based on SDR, have already been described. They achieve this "quantum speed-up" with no new esoteric technology, and in fact, on a single-processor, classical (Von Neumann) computer.
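A small sketch may help make the representation above concrete: Q clusters of K binary units, one winner per cluster, so K^Q codes are representable over only K*Q units, and the intersection size between a stored code and the currently active code is read out as that state's strength of presence. The explicit normalization step is an illustrative addition, not part of the abstract.

```python
import numpy as np

rng = np.random.default_rng(2)
Q, K = 6, 8   # K**Q = 262144 representable codes over only K*Q = 48 units

def random_code():
    """One stored state's code: a choice of one winning unit in each of the Q clusters."""
    return rng.integers(K, size=Q)

stored = {name: random_code() for name in ["X", "Y", "Z"]}
active = stored["X"]   # R(X), the code active at time t

# Each stored state's (unnormalized) probability = intersection size with the active code,
# i.e., the number of clusters whose winner matches.
overlaps = {name: int(np.sum(code == active)) for name, code in stored.items()}
total = sum(overlaps.values())
probs = {name: ov / total for name, ov in overlaps.items()}
print(overlaps)   # X overlaps itself in all Q clusters; Y, Z graded by similarity
print(probs)
```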
2010
- A cortical sparse distributed coding model linking mini- and macrocolumn-scale functionality. (2010) Gerard Rinkus. Frontiers in Neuroanatomy 4:17. doi:10.3389/fnana.2010.00017
2009
- Familiarity-Contingent Probabilistic Sparse Distributed Code Selection in Cortex. (in prep, also see this page)
- Overcoding-and-Pruning: A Novel Neural Model of Temporal Chunking and Short-term Memory. (2009) Gerard Rinkus. Invited Talk in Gabriel Kreiman Lab, Dept. of Ophthalmology and Neuroscience, Children's Hospital, Boston, July 31, 2009.
- Overcoding-and-paring: a bufferless neural chunking model. (2009) Gerard Rinkus. Frontiers in Computational Neuroscience. Conference Abstract: Computational and systems neuroscience. (COSYNE '09) doi: 10.3389/conf.neuro.10.2009.03.292
2008
- Population Coding Using Familiarity-Contingent Noise. (abstract/poster) AREADNE 2008: Research in Encoding And Decoding of Neural Ensembles, Santorini, Greece, June 26-29. (abstract) (poster)
- Overcoding-and-pruning: A novel neural model of sequence chunking. (manuscript in prep; patent pending)
- A Cortex-inspired Associative Memory with O(1) Learning, Recall, and Recognition of Sequences. (manuscript in prep)
Abstract: We present a radically new model of chunking, the process by which a monolithic representation emerges for a sequence of items, called overcoding-and-pruning (OP). Its core insight is this: if a sizable population of neurons is assigned to represent an ensuing sequence immediately, at sequence start, it can then be repeatedly pruned as a function of each successive item. This solves the problem of assigning unique chunk representations to sequences that start in the same way, e.g., "CAT" and "CAR", without requiring temporary buffering of the items' representations. OP rests on two well-supported assumptions: 1) information is represented in cortex by sparse distributed representations; and 2) neurons at progressively higher cortical stages have progressively longer activation durations, or persistence. We believe that this type of mechanism has been missed so far due to the historical bias of thinking in terms of localist representations, which cannot support it since pruning cannot be applied to a single representational unit.
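The pruning mechanism described above can be sketched in a few lines of code. This is illustrative only, not the model's actual circuitry: the population size, the item-specific masks, and the 50% keep rate are invented for the example.

```python
import numpy as np

N = 4096   # size of the chunking population (invented for the example)

def item_mask(item, keep=0.5):
    """Hypothetical item-specific pruning mask; each item keeps ~half the units
    (deterministic within a run via a hash-seeded generator)."""
    r = np.random.default_rng(abs(hash(item)) % (2**32))
    return r.random(N) < keep

def chunk_code(sequence):
    active = np.ones(N, dtype=bool)   # overcode: the whole population is assigned at sequence start
    for item in sequence:
        active &= item_mask(item)     # prune as a function of each successive item
    return np.flatnonzero(active)     # the surviving units form the chunk code

cat, car = chunk_code("CAT"), chunk_code("CAR")
print(len(cat), len(car))                 # roughly N/8 units survive three prunings
print(len(np.intersect1d(cat, car)))      # shared "CA" prefix -> overlap, yet distinct final codes
```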
2007
- A Functional Role for the Minicolumn in Cortical Population Coding. Invited Talk at Cortical Modularity and Autism, University of Louisville, Louisville, KY, Oct 12-14, 2007. A revised/corrected version of the talk: (PPT) (PDF; animations do not show in the PDF version)
2006
- Hierarchical Sparse Distributed Representations of Sequence Recall and Recognition. Presentation given at The Redwood Center for Theoretical Neuroscience (University of California, Berkeley) on Feb 22, 2006: powerpoint (~6 meg). (video of talk) (Note: the ppt presentation uses a lot of animations so you probably need a very up-to-date version of ppt to view it correctly.)
2005
- Time-Invariant Recognition of Spatiotemporal Patterns in a Hierarchical Cortical Model with a Caudal-Rostral Persistence Gradient. (abstract & poster) (2005) Rinkus, G. J. & Lisman, J. Society for Neuroscience Annual Meeting, 2005, Washington, DC, Nov 12-16. Note that this poster is almost identical to the one presented at the First Annual Computational Cognitive Neuroscience Conference.
- A Neural Network Model of Time-Invariant Spatiotemporal Pattern Recognition (abstract & poster) (2005) Rinkus, G. J. First Annual Computational Cognitive Neuroscience Conference, Washington, DC, Nov. 10-11.
2004 and earlier
- A Neural Model of Episodic and Semantic Spatiotemporal Memory. (2004) Rinkus, G.J. Proceedings of the 26th Annual Conference of the Cognitive Science Society. Kenneth Forbus, Dedre Gentner & Terry Regier, Eds. LEA, NJ. 1155-1160. Chicago, Ill.
A Quicktime animation that walks you through the example in Figure 4 of the paper.
- Software tools for emulation and analysis of augmented communication. (2003) Lesher, G.W., Moulton, B.J., Rinkus, G. & Higginbotham, D.J. CSUN 2003, California State University, Northridge.
- Adaptive Pilot-Vehicle Interfaces for the Tactical Air Environment. (2001) Mulgund, S.S., Zacharias, G.L., & Rinkus, G.J. in Psychological Issues in the Design and Use of Virtual Adaptive Environments. Hettinger, L.J. & Haas, M. (Eds.) LEA, NJ. 483-524.
- Leveraging word prediction to improve character prediction in a scanning configuration. (2002) Lesher, G.W. & Rinkus, G.J. Proceedings of the RESNA 2002 Annual Conference. Reno.
- Domain-specific word prediction for augmentative communications. (2001) Lesher, G.W. & Rinkus, G.J. Proceedings of the RESNA 2002 Annual Conference, Reno.
- Logging and analysis of augmentative communication. (2000) Lesher, G.W., Rinkus, G.J., Moulton, B.J., & Higginbotham, D.J. Proc. of the RESNA 2000 Annual Conference, Reno. 82-85.
- Intelligent fusion and asset manager processor (IFAMP). (1998) Gonsalves, P.G. & Rinkus, G.J. Proc. of the IEEE Information Technology Conference (Syracuse, NY) 15-18.
- A Monolithic Distributed Representation Supporting Multi-Scale Spatio-Temporal Pattern Recognition. (abstract & poster) (1997) Int'l Conf. on Vision, Recognition, Action: Neural Models of Mind and Machine, Boston University, Boston, MA, May 29-31.
- Situation Awareness Modeling and Pilot State Estimation for Tactical Cockpit Interfaces. (1997) Mulgund, S., Rinkus, G., Illgen, C. & Zacharias, G. Presented at HCI International, San Francisco, CA, August.
- OLIPSA: On-Line Intelligent Processor for Situation Assessment. (1997) S. Mulgund, G. Rinkus, C. Illgen & J. Friskie. Second Annual Symposium and Exhibition on Situational Awareness in the Tactical Air Environment, Patuxent River, MD.
- A Neural Network Based Diagnostic Test System for Armored Vehicle Shock Absorbers. (1996) Sincebaugh, P., Green, W. & Rinkus, G. Expert Systems with Applications, 11(2), 237-244.
- A Combinatorial Neural Network Exhibiting Episodic and Semantic Memory Properties for Spatio-Temporal Patterns. (1996) G. J. Rinkus. Doctoral Thesis. Boston University. Boston, MA.
- TEMECOR: An Associative, Spatiotemporal Pattern Memory for Complex State Sequences. (1995) Proceedings of the 1995 World Congress on Neural Networks. LEA and INNS Press. 442-448.
- Context-sensitive spatio-temporal memory. (1993) Proceedings of World Congress On Neural Networks. LEA. v.2, 344-347.
- Context-sensitive Spatio-temporal Memory. (1993) Technical Report CAS/CNS-93-031, Boston University Dept. of Cognitive and Neural Systems. Boston, MA.
- A Neural Model for Spatio-temporal Pattern Memory. (abstract & poster) Proceedings of the Wang Conference: Neural Networks for Learning, Recognition, and Control, Boston University, Boston, MA 1992.
- Learning as Natural Selection in a Sensori-Motor Being. (abstract & poster) Proceedings of the 1st Annual Conference of the Neural Network Society, Boston, MA 1988.
- Learning as Natural Selection in a Sensori-Motor Being. (1986) G.J.Rinkus. Master's Thesis. Hofstra University, Hempstead, NY.