Memory Fabric Forum

Memory Fabric Forum at ISC 2023

Compute Express Link (CXL) technology promises to make memory a first-class citizen in the data center, with open standards for pooling and composing petabytes of memory. That’s why MemVerge is a member of the CXL Consortium and is integrating its software with a long list of ecosystem partners.

MemVerge hosted a full-day forum at ISC 2023 titled “Start CXL Here.” The goals of the forum were to:

    • Provide a 360º view of development activity in the CXL ecosystem
    • Accelerate collaboration by sharing information about vendor technology partner programs

Leading hardware and software developers contributed presentations highlighting:

    • Their CXL-compatible technology
    • Its impact on application use cases
    • How other vendors can work with them to integrate that technology

CXL Forum Speakers and Companies

Sid Karkare | Director – HPC Business Development | AMD

Siddhartha Karkare has over 25 years of technical and business experience in the systems and semiconductor industry. He is responsible for the strategic business development and workload exploration with new technologies for the Manufacturing, Energy and EDA verticals in the Server BU at AMD. He has a BSEE from BITS Pilani, an MSEE from Santa Clara University, an MBA from UC Davis, and is based in Santa Clara.

CXL-attached tiered and pooled memory

Memory composability has advanced with the advent of CXL 2.0: the ability to create tiers of memory with different capacities, latencies, and price points lends itself to crafting the right TCO models for both cloud deployments and on-prem server configurations. AMD’s multi-quarter collaboration with MemVerge to enable a software framework for page migration is the right area of focus for the CXL ecosystem as we look forward to node- and rack-level volume deployments of CXL-based architectures.
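The page-migration idea behind such a framework can be sketched as a toy hotness policy: an illustration only, not AMD’s or MemVerge’s actual implementation. Pages are counted per access, and the hottest ones are promoted from the far (CXL) tier into the capacity-limited near (DRAM) tier.

```python
import collections

# Illustrative two-tier page-migration policy (a sketch, not a real
# tiering engine): track per-page access counts and keep the hottest
# pages in the small fast tier, demoting the rest to the CXL tier.
NEAR_CAPACITY = 2  # pages that fit in the fast tier (toy number)

access_counts = collections.Counter()
near_tier = set()                 # pages currently in local DRAM
far_tier = {"a", "b", "c", "d"}   # pages currently in CXL-attached memory

def touch(page):
    """Record an access, then rebalance the tiers by hotness."""
    access_counts[page] += 1
    hottest = {p for p, _ in access_counts.most_common(NEAR_CAPACITY)}
    # Demote pages that fell out of the hot set.
    for p in list(near_tier - hottest):
        near_tier.discard(p)
        far_tier.add(p)
    # Promote newly hot pages out of the far tier.
    for p in hottest:
        if p in far_tier:
            far_tier.discard(p)
            near_tier.add(p)

for p in ["a", "a", "b", "a", "c", "b"]:
    touch(p)
```

Real tiering systems work from hardware access information (e.g. page faults or access-bit scanning) rather than explicit `touch()` calls, and migrate asynchronously, but the promote/demote decision has this shape.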

Antonio Peña | Leading Researcher and Group Manager | Barcelona Supercomputing Center (BSC)

Antonio J. Peña is a Leading Researcher at the Barcelona Supercomputing Center (BSC), where he leads the “Accelerators and Communications for HPC” group. He is a Ramón y Cajal Fellow and a former Marie Sklodowska-Curie Individual Fellow and Juan de la Cierva Fellow. Among other distinctions, Antonio is an ERC Consolidator Laureate, a recipient of the 2017 IEEE TCHPC Award for Excellence for Early Career Researchers in High Performance Computing, and an ACM and IEEE Senior Member. He is involved in the organization and steering committees of several conferences and workshops, such as SC and IEEE Cluster. Since starting his PhD, Antonio has worked on system software support for hardware heterogeneity in HPC, covering topics such as remote GPGPU virtualization and heterogeneous memory systems.

Two use cases for large CXL memories at BSC: ecoHMEM and HomE

In this talk I will review two use cases in my group for which we look forward to getting CXL memory expansion as a replacement for persistent memory. The first is ecoHMEM, a recently released framework for automatic data placement in heterogeneous memory systems, proven successful on KNL- and Optane-based systems. The second is HomE, homomorphically encrypted deep learning inference, a recently started ERC grant in which we are looking for the best hardware and software technology to execute production-sized neural networks on remote servers at reasonable speeds, with the security guarantees of memory-hungry homomorphic encryption.

Shrijeet Mukherjee | Co-founder & CDO | Enfabrica

Shrijeet is Co-Founder and Chief Development Officer of Enfabrica. Prior to founding Enfabrica, he was an architect in Google’s infrastructure group. Previously he was VP of Engineering at Cumulus Networks, shepherding Linux through the Open Networking revolution. Before that, he was part of the core team for the Cisco UCS system, running the NIC and virtualization systems that became known as DPUs. At SGI, he was part of the Advanced Graphics team that invented the floating-point framebuffers and programmable shaders that underpin today’s machine learning accelerators. Shrijeet is on the Linux NetDev Society Board of Directors and has over 40 patents. He holds an MS in Computer Science from the University of Oregon.

Scaling CXL memory using high speed networking

Memory access using the standardized CXL.mem interface is a game changer in the data center. While NUMA-based memory scaling has been attempted by various companies in the past, the ubiquity of this interface can shift data center architectures forever. However, computer systems have also evolved away from scale-up designs toward loosely coupled, unshared-fate systems built around networking as the way to communicate, with RDMA as the way to share memory. In this talk we will explore the implications of coupling massive amounts of networking bandwidth with CXL memory hierarchies, and the designs that yield the best overall characteristics.

Alan Benjamin | CEO and President | GigaIO

Alan is one of the visionaries behind GigaIO’s innovative solution. He was most recently COO of Pulse Electronics, an $800M communication components and subsystem supplier, and previously CEO of Excelsus Technologies. Earlier he helped lead PDP, a true start-up, to a successful acquisition by a strategic buyer for $80M in year three. He started his career at Hewlett-Packard in sales support and quickly moved into management positions in product marketing and R&D. Alan graduated from Duke University with a BSEE and attended the Harvard Business School AMP program as well as the UCSD LAMP program.

The March of Composability – Onward to Memory with CXL

The vision of enabling data centers to re-architect from static, defined-once infrastructure into disaggregated pools of resources that can be dynamically composed (and recomposed) into application-specific systems is well underway. The last element in this transition is full memory composition, which has only limited capability today. CXL will make memory as easy to compose as accelerators, networking, processors, and storage are today. Full composability will be all the more useful if the fabric used for composition can also handle the other compute traffic that urgently needs low latency, mostly MPI and GPU RDMA, in order to avoid adding a third network in the rack.

Brian Pan | CEO and Founder | H3 Platform

Brian Pan is CEO and founder of H3 Platform. He has been developing composable disaggregated PCIe solutions since 2016 and has also been involved in composable memory and GPU solution development. In addition, Brian has rich experience designing composable solutions with tier-1 cloud service providers and data centers.

CXL Pooling Solution and Fabric Manager

This is the first external CXL switch implementation for composable memory. The 256-lane CXL switch connects hosts and CXL memory to form a composable memory cluster. Topics covered in this presentation include system topology, use cases (memory scale-up, memory pooling, and memory sharing), the fabric management API and GUI (which APIs users interact with), performance results (latency and throughput), and lessons learned from the implementation.

Mark Nossokoff | Research Director | Hyperion Research

Mark Nossokoff is Research Director at Hyperion Research, an industry analyst, market research, and consulting firm specializing in HPC and AI markets and technologies. He has over 30 years of experience in the data storage technology field and directs his research and consulting efforts in associated areas. Mr. Nossokoff’s experience spans hardware design, application engineering, product marketing, product management, and strategic planning, and he is co-inventor of four patents. Before joining Hyperion Research, Mr. Nossokoff held various strategic planning and product management roles with Seagate, Dot Hill, LSI, NetApp, Engenio, Symbios Logic, and NCR. He holds degrees from Purdue University (BSEE) and Wichita State University (MBA).

Perspectives on Composability and HPC Architectures

Traditional HPC architectures have been designed to address either homogeneous workloads (such as physics-based modeling and simulation) with similar and, perhaps more importantly, fixed compute, memory, and I/O requirements or, more recently, heterogeneous workloads with a diverse range of compute, memory, and I/O requirements. Most HPC data center planners and operators, however, don’t have the luxury of focusing on one main type of workload; they typically must support a large number of HPC users and their associated workloads, sporting a wide range of compute, memory, and I/O profiles. The ensuing architectures typically consist of a fixed set of resources, resulting in an underutilized system with expensive elements sitting idle for a costly and unacceptable amount of time. Composable systems are one approach being explored to increase system utilization by exposing resources that would otherwise sit idle to appropriately matched jobs waiting in the queue.

Dr. Charles Fan | CEO and Co-founder | MemVerge

Charles Fan is co-founder and CEO of MemVerge. Prior to MemVerge, Charles was the CTO of Cheetah Mobile leading its global technology teams, and an SVP/GM at VMware, founding the storage business unit that developed the Virtual SAN product. Charles also worked at EMC and was the founder of the EMC China R&D Center. Charles joined EMC via the acquisition of Rainfinity, where he was a co-founder and CTO. Charles received his Ph.D. and M.S. in Electrical Engineering from the California Institute of Technology, and his B.E. in Electrical Engineering from the Cooper Union.

Introducing CXL Shared Memory Object SDK

MemVerge Shared Memory Object SDK is a software development kit that enables the creation of high-performance applications using shared memory. The SDK is based on Compute Express Link (CXL) technology, which provides high-speed, cache-coherent communication between CPUs, accelerators, and memory devices, making it ideal for demanding workloads such as artificial intelligence and data analytics. With the SDK, developers can create shared memory objects that are accessible by multiple processes on different computers, greatly reducing the need for data movement and improving overall system performance. This technology is the first of its kind, making it a game changer for developers building high-performance, data-intensive applications.
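The SDK’s actual API is not shown in this abstract, but the programming model it describes, named objects that multiple processes attach to by name instead of copying data, has a familiar single-host analogue in Python’s `multiprocessing.shared_memory`. The sketch below uses that standard-library module purely as an illustration; CXL shared fabric-attached memory extends the same object model across hosts.

```python
from multiprocessing import shared_memory

# Create a named shared-memory object. The name "cxl_demo_obj" is an
# arbitrary example; any process on this host can attach to it by name.
shm = shared_memory.SharedMemory(name="cxl_demo_obj", create=True, size=64)
shm.buf[:5] = b"hello"

# A second process would attach with the same name; here we open a
# second handle in-process to show the attach-by-name flow.
peer = shared_memory.SharedMemory(name="cxl_demo_obj")
data = bytes(peer.buf[:5])  # reads the bytes the "producer" wrote

peer.close()
shm.close()
shm.unlink()  # release the named object
```

The point of the model is that the consumer never receives a copy over a socket or file: both handles map the same physical pages, which is exactly the property CXL shared memory offers between machines.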

Gregory Price | Senior Software Engineer | MemVerge

Gregory is a cybersecurity professional with diverse knowledge of software engineering and reverse engineering, specializing in virtualization and emulation technologies. He is currently MemVerge’s Compute Express Link (CXL) lead investigator for memory pooling, dynamic capacity, and fabric management.

Endless Memory: Introducing the Elastic Memory Service

Memory exhaustion is the most common and frustrating issue for data-intensive applications, leading either to out-of-memory (OOM) crashes or to poor performance due to swap usage. The problem is compounded in clustered environments where memory usage is not uniform across nodes. To address this challenge, MemVerge has created an Elastic Memory Service that leverages emerging CXL hardware to provide an “Endless Memory” solution. With this technology, hosts can dynamically allocate memory as needed, mitigating OOM errors and improving application performance. This talk presents a technology preview of MemVerge’s CXL memory pooling and tiering technologies running on real CXL memory pooling hardware from a variety of platform and CXL memory device vendors.
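The elasticity idea can be modeled in a few lines: a toy allocator (an illustration of the concept, not MemVerge’s implementation) that serves requests from local DRAM first and, instead of failing with an OOM error, borrows capacity from a shared CXL pool.

```python
# Toy model of the "Endless Memory" idea: spill allocations to a
# shared CXL pool instead of raising an out-of-memory error.
# All names and sizes here are hypothetical.
class ElasticAllocator:
    def __init__(self, local_gb, pool):
        self.local_free = local_gb
        self.pool = pool  # shared state: {"free": remaining pooled GB}

    def alloc(self, gb):
        """Return the tier that served the request."""
        if self.local_free >= gb:
            self.local_free -= gb
            return "local"
        if self.pool["free"] >= gb:
            self.pool["free"] -= gb  # dynamically borrow pooled capacity
            return "cxl-pool"
        raise MemoryError("local DRAM and CXL pool both exhausted")

pool = {"free": 256}                       # pooled capacity shared by hosts
host = ElasticAllocator(local_gb=64, pool=pool)

# The third 32 GB request exceeds local DRAM and spills to the pool
# rather than crashing the application.
tiers = [host.alloc(32), host.alloc(32), host.alloc(32)]
```

A real service must additionally handle returning capacity to the pool, fairness across hosts, and placement (tiering) of hot versus cold data, but the OOM-avoidance mechanism is this fallback path.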

Tony Brewer | Chief Architect, Scalable Memory Systems Path Finding Group | Micron

Tony Brewer is currently the chief architect in Micron’s Scalable Memory Systems pathfinding group. He is the principal investigator on multiple government contracts and manages a team of architects and researchers focused on various processing-in-memory and near-memory architectures. His career has focused on system architecture in both the high-performance computing and telecommunications industries. Prior to joining Micron in 2015, Tony was a co-founder and Chief Technology Officer of Convey Computer. Former employers include Convex Computer and Hewlett-Packard. Tony received his MS and BS degrees in Computer Engineering from Purdue University and has over 200 filed patents.

CXL 3.0 Shared Memory for a New Class of Applications

Modern big data applications execute on many servers, with a data lake for pipelined application storage. System architectures need to be rethought to address this class of application, and CXL 3.0’s shared memory capability is ideally suited to it.

Olivier Deprez | Principal SW Architect | SiPearl

An expert in system architecture and software design, Olivier has worked on multiple projects in the digital TV area, spanning the whole value chain: TV operators (Canal+, Orange), a security provider (Nagra), middleware providers (Logiways, SoftAtHome), and a device manufacturer (SmarDTV). Olivier also has a solid background in standardization activities and was involved in numerous bodies such as ETSI, DVB, and HbbTV, and he chaired the CI Plus Technical Group for a decade. Before joining SiPearl, Olivier was head of engineering at Edgeway Vision, a startup in cloud virtualization for video surveillance cameras. Olivier has an engineering degree from INPG with a specialization in computer science and applied mathematics.

Enabling CXL within HPC Centers

A general-purpose HPC SoC vendor’s vision of how CXL could address server manufacturers’ need to ensure “seamless” interoperability between hosts and accelerators, and thereby meet the disaggregation needs of their users (governments, supercomputing centres, academics, industries, etc.).

Ting Wang | CEO | Timeplus

Ting Wang is the Co-founder and CEO of Timeplus, a next-generation SQL streaming database that unifies streaming and historical data processing for real-time analytics. Prior to Timeplus, Ting served as the VP of Engineering at Splunk and SAP, where he led product and engineering teams to develop several industry-leading operational analytics and business intelligence platforms.

Empowering high-performance streaming analytics with shared big memory

Data-intensive systems, particularly stateful, long-running, low-latency, large-scale stream processing systems, can see significant impacts on latency and throughput due to limited memory bandwidth and capacity. In this session, we will explore the core stream processing technologies used at Timeplus and the challenges involved in building a distributed core engine. We will also discuss how CXL memory can simplify the design of such systems, reduce system latency, and improve overall throughput. Finally, we will share two real-life use cases where stream processing with shared big memory can deliver unparalleled performance: real-time trading monitoring and real-time vector search.

Dr. Jianping Jiang | VP of Product Marketing and Business Operations | Xconn Technologies

Jianping (JP) Jiang is the VP of Product Marketing and Business Operations at Xconn Technologies, a Silicon Valley startup pioneering CXL switch ICs. At Xconn, he is in charge of CXL ecosystem partner relationships, CXL product marketing, business development, corporate strategy, and operations. Before joining Xconn, JP held various leadership positions at several large-scale semiconductor companies, focusing on product planning and roadmaps, product marketing, and business development. In these roles, he developed competitive and differentiated product strategies, leading to successful product lines that generate billions of dollars in revenue annually. JP has a Ph.D. in computer science from the Ohio State University.

CXL Switch: enabler of more advanced HPC and AI/ML cloud computing

CXL switches will enable scalable memory expansion and memory pooling. These two functions will make big data and big-memory computing more accessible while keeping memory utilization high, which is much needed for HPC and hyperscaler applications. In addition, as the CXL ecosystem matures, CXL switches will enable AI/ML computing to be built on an advanced architecture in which CPUs, accelerators, and memory are interconnected through a fabric network formed by CXL switches.