Hyperdimensional (HD) computing is a cognitive computing model based on the high-dimensional properties of neural circuits in the brain. It is an alternative paradigm that mimics important brain functionalities to achieve high-efficiency, noise-tolerant computation. It is motivated by the understanding that the cerebellar cortex operates on high-dimensional representations of data originating from large brain circuits. HD computing therefore models human memory using random vectors in a high-dimensional space, called hypervectors, and incorporates learning capability alongside the typical memory functions of storing and loading information.
It mimics important functionalities of the human memory model with vector operations, which are computationally tractable and mathematically rigorous in describing human cognition, and it relies on the algebraic properties of its key operations to combine the advantages of structured symbolic representations and distributed vector representations. HD computing operates over a well-defined and hardware-friendly set of mathematics:
- Binding is well suited for associating two hypervectors and is used for key-value association and variable binding.
- Bundling is a memorization operator that superposes the information of input data into one vector.
- Permutation represents sequences by transforming a hypervector into one that is near-orthogonal to the original while remaining reversible.
- Reasoning is done by measuring the similarity of hypervectors.
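These four operations can be sketched in a few lines of NumPy. This is a minimal illustration using bipolar hypervectors; the dimensionality and the specific choices of multiply-for-binding, majority-for-bundling, and cyclic-shift-for-permutation are common conventions in the HD computing literature, not a particular implementation of ours.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # hypervector dimensionality

def hv():
    # Random bipolar hypervectors are near-orthogonal in high dimensions.
    return rng.choice([-1, 1], size=D)

def bind(a, b):
    # Element-wise multiplication: key-value association, variable binding.
    return a * b

def bundle(*vs):
    # Element-wise majority: superposes several hypervectors into one.
    return np.sign(np.sum(vs, axis=0))

def permute(a, n=1):
    # Cyclic shift: encodes position in a sequence, and is reversible.
    return np.roll(a, n)

def sim(a, b):
    # Normalized dot product: the similarity check used for reasoning.
    return a @ b / D

key, value = hv(), hv()
bound = bind(key, value)
# Binding with a bipolar vector is its own inverse: re-binding with the
# key recovers the value exactly, while the bound pair itself stays
# dissimilar to both of its constituents.
recovered = bind(bound, key)
print(sim(recovered, value))        # exactly 1.0
print(sim(bound, value))            # close to 0.0
print(sim(permute(value), value))   # close to 0.0: near-orthogonal

# Bundling memorizes: the result stays noticeably similar to each input.
memory = bundle(hv(), hv(), value)
print(sim(memory, value))
```

The same similarity check drives both memory lookup (which stored item does a query resemble?) and classification (which class prototype does an encoded sample resemble?).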
Privacy and Security
Cybersecurity has never been more important than it is today. Security issues are a day-to-day struggle for businesses and the Department of Defense, and global cybercrime costs are on the rise. According to a recent report, the global cybersecurity market is expected to grow from $176.6 billion in 2020 to $398.3 billion by 2026. Cybercrime also rose by 600% during the pandemic, leaving businesses more vulnerable than ever to the financial and reputational repercussions of cyber-attacks.
We aim to address security issues from multiple perspectives:
- Security monitoring with human-interpretable reasoning: To extract useful information, security monitoring systems need to rely on sophisticated and costly machine learning and artificial intelligence algorithms. At the same time, it is difficult to deploy machine learning effectively without a comprehensive, rich, and complete approach to the underlying data. We plan to develop a robust, real-time, and transparent security monitoring system. Our framework first represents security-related data in a holographic space to abstract the knowledge; we then develop cognitive learning algorithms capable of security monitoring and human-like reasoning.
- Crypto-based AI computing on the edge: Edge devices are becoming increasingly pervasive in everyday life, and there is a crucial need for protecting data and security at the edge. Secure computation relies on costly cryptographic methods, which add a significant computing burden to edge devices. To ensure theoretical security support for HD computing, we introduce a novel framework that enhances HD encoding with state-of-the-art encryption methods. Our solution ensures ultra-lightweight encryption as well as hardware-friendly cognitive operations over encrypted data, thus enabling secure learning at the edge.
- Privacy in AI computation: Privacy is one of the key challenges of machine learning algorithms. The lack of trustworthy learning limits machine learning applications in real-world IoT systems. We aim to address two important privacy challenges: data privacy and model privacy. Although many solutions exist, our focus is more foundational: we study brain-inspired and symbolic AI models with computational transparency. We exploit this feature to develop a fully mathematical model that not only bounds information leakage (the privacy level) mathematically but also provides a theoretical analysis of information loss.
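A small NumPy experiment hints at why cognitive operations over protected hypervectors are feasible. This is a toy property of bipolar binding, shown for illustration only and not the actual encryption framework sketched above: binding every hypervector with a secret key vector scrambles its content, yet exactly preserves pairwise similarities among vectors protected with the same key, so similarity-based learning can proceed on the protected data.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 10_000
key = rng.choice([-1, 1], size=D)  # secret key hypervector (illustrative)

def protect(h):
    # Element-wise multiplication with the key: without the key, the
    # result looks like a fresh random hypervector.
    return h * key

a = rng.choice([-1, 1], size=D)
mask = rng.random(D) < 0.25
b = np.where(mask, -a, a)  # b agrees with a on ~75% of positions

# (a*key) . (b*key) = sum_i a_i b_i key_i^2 = a . b, so the similarity
# between protected vectors equals the similarity between the originals.
print((protect(a) @ protect(b)) / D)  # equals (a @ b) / D, around 0.5
print((protect(a) @ b) / D)           # near 0: key mismatch destroys it
```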
Running data/memory-intensive workloads on traditional cores results in high energy consumption and slow processing, primarily due to the large amount of data movement between memory and processing units. Our group has designed a digital-based processing in-memory (PIM) platform capable of accelerating fundamental big data applications in real time with orders of magnitude higher energy efficiency [ISCA’19, HPCA’20, DAC’17]. This design accelerates entire applications directly in storage-class memory without using extra processing cores.
This platform opened a new direction toward making PIM technology practical. In contrast to prior methods that enable PIM functionality in the analog domain, we designed the first digital-based PIM architecture that:
- works on digital data, eliminating the ADC/DAC blocks that dominate chip area;
- addresses internal data movement by enabling in-place computation where the big data is stored;
- natively supports floating-point precision, which is essential for many scientific applications;
- is compatible with any bipolar memory technology, including Intel 3D XPoint.
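A toy software model illustrates the digital in-place computing idea. Digital PIM designs in the literature commonly build all logic from a row-parallel NOR primitive executed inside the memory array; the sketch below is our own minimal simulation of that principle, not the actual architecture, showing how NOT, AND, and XOR (and hence arbitrary logic) compose from NOR applied to whole rows at once.

```python
import numpy as np

# Model memory rows as bit vectors. A single in-array "instruction"
# applies NOR to two rows across every column in parallel, so results
# are computed where the data already resides, with no data movement.
def nor(a, b):
    return 1 - (a | b)

def not_(a):
    return nor(a, a)

def and_(a, b):
    return nor(not_(a), not_(b))

def xor(a, b):
    return nor(and_(a, b), nor(a, b))

rng = np.random.default_rng(2)
row_a = rng.integers(0, 2, size=16)
row_b = rng.integers(0, 2, size=16)

print(xor(row_a, row_b))  # matches row_a ^ row_b, computed via NOR only
```

Because NOR is functionally complete, chaining such row-wide operations is enough to express bitwise arithmetic (and, bit-serially, wider operations such as addition) entirely inside the array.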
Our proposed platform can also accelerate a wide range of big data applications including machine learning [ISCA’19, HPCA’20, TC’19], query processing [TCAD’18], graph processing [ISLPED’18], and bioinformatics [ISLPED’19]. One particularly successful application of our design is the FloatPIM architecture [ISCA’19], which significantly accelerates state-of-the-art Convolutional Neural Networks (CNNs).
Machine learning methods have been widely utilized to provide high quality of service for many cognitive tasks. Running sophisticated learning tasks incurs high computational costs to process large amounts of learning data. A common solution is to use clouds and data centers as the main central computing units. However, with the emergence of the Internet of Things (IoT), this centralized approach faces several scalability challenges. In IoT systems, a large number of embedded devices are deployed to collect data from the environment and produce information.
The partial data must be aggregated to perform the target learning task in IoT networks at home or even city scale, which leads to significant communication cost and high latency when transferring all data points to a centralized cloud.
However, effective learning in the IoT hierarchy is still an open question. We recognize the following technical challenges in scaling learning tasks across the IoT hierarchy:
- In reality, each IoT device has different types of sensors that generate heterogeneous features, and the edge devices often lack sufficient resources for online processing with sophisticated learning algorithms.
- To train and infer in a centralized fashion, communication may dominate the total computing cost as the amount of data generated by the swarm of IoT devices increases. Even if the learning tasks could be distributed to the edge devices by deploying costly hardware accelerators, a large amount of data must be transferred between nodes during model training, e.g., the inputs and outputs of neurons in DNN models.
In addition, reliable communication is not guaranteed; IoT networks are often deployed under harsh network conditions.
We address these challenges along two directions:
- Distributed learning, beyond federated learning: In this work, we seek to enable distributed learning using the data that the heterogeneous sensors on each IoT device generate on the fly. We accelerate learning by utilizing the IoT devices as federated computing units, i.e., the learning tasks are processed on the local embedded devices located throughout the hierarchy.
- Novel communication protocols: To ensure end-to-end system efficiency, the communication and computation systems need to be integrated. This requires novel communication protocols that are compatible with machine learning algorithms. We also need to design machine learning and network protocols that are naturally robust to information loss.
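The distributed-learning direction can be pictured with a short sketch. Everything here, including the random-projection encoder, the toy data, and the function names, is an illustrative assumption rather than our actual protocol: each device bundles its locally encoded samples into fixed-size class hypervectors, only those model vectors (never raw data) cross the network, and the aggregator merges them by bundling again.

```python
import numpy as np

rng = np.random.default_rng(3)
D, F, CLASSES = 10_000, 32, 2  # hypervector dim, feature count, classes

# Shared random projection: a simple, illustrative HD encoder.
proj = rng.choice([-1, 1], size=(F, D))

def encode(x):
    return np.sign(x @ proj)

def local_train(xs, ys):
    # Each device bundles its encoded samples into one prototype per
    # class; the model size is fixed regardless of the dataset size.
    model = np.zeros((CLASSES, D))
    for x, y in zip(xs, ys):
        model[y] += encode(x)
    return model

def aggregate(models):
    # Server-side bundling: element-wise sum of per-device prototypes.
    return np.sum(models, axis=0)

def predict(model, x):
    # Reasoning by similarity: pick the most similar class prototype.
    return int(np.argmax(model @ encode(x)))

def make_data(n):
    # Toy task: the class label is the sign of the feature sum.
    xs = rng.normal(size=(n, F))
    ys = (xs.sum(axis=1) > 0).astype(int)
    return xs, ys

# Two devices train locally; only CLASSES * D values leave each device.
models = [local_train(*make_data(200)) for _ in range(2)]
global_model = aggregate(models)
xs, ys = make_data(100)
acc = np.mean([predict(global_model, x) == y for x, y in zip(xs, ys)])
print(acc)
```

Because each device transmits only its class prototypes, the communication cost is independent of the number of local samples, which is the property that makes learning in a home- or city-scale hierarchy tractable.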