Low-Latency AI Inference for IBM z17 and LinuxONE Systems
IBM has announced the release of the IBM Spyre Accelerator, which supports generative and agentic AI through low-latency inferencing. Drawing on its legacy of innovation in mainframe computing, IBM has made the Spyre Accelerator available to meet the need to run AI models on IBM Z and LinuxONE. Companies can now scale their AI programs while keeping sensitive data on premises with IBM Z, rather than relying on expensive hyperscalers with usage-based pricing models.
The release of IBM Spyre Accelerator represents a major step in the evolution of mainframe AI by enabling users to deploy AI models while prioritizing security, compliance, and resilience.
Understanding the IBM Spyre Accelerator
IBM Spyre Accelerator, the IBM Z Systems AI card, powers real-time inferencing directly on the mainframe, creating a low-latency, intelligent, and responsive infrastructure. As part of IBM Z generative AI mainframe advancements, Spyre bridges modern AI and traditional reliability.
IBM Spyre Accelerator is designed to run AI workloads faster by enabling low-latency inference for the processing of data. Companies can use Spyre cards to implement complex I/O protocols to reduce latency and achieve high transaction rate processing power.
Integration with Telum II Processor
Spyre works with the AI core already embedded in the Telum II processor found in IBM z17 and LinuxONE 5. By integrating with the System Z Linux capabilities, Spyre optimizes the processing of AI and machine learning (ML) workloads.
A powerful, PCIe-attached AI card with 32 AI-optimized cores, Spyre is engineered to handle larger, more flexible inference workloads, including large language models (LLMs) and complex generative AI. It takes on the more intensive AI tasks that the Telum II processor may hand off.
Using Spyre lowers latency and reduces security risk by eliminating the need to send information to a different system or across external networks. According to IBM, approximately 70% of the world’s transactions, including those for finance, are run through IBM Z. The low-latency inference provided by Spyre allows mainframe users to bring AI to mission-critical processes in fast-paced, highly regulated industries such as finance. With Spyre, Z Systems can handle in-transaction AI for millions of transactions per second.
Benefits of IBM Spyre Accelerator
With the release of IBM Spyre Accelerator, IBM z17 and LinuxONE 5 users in highly regulated industries can take advantage of several benefits that optimize AI capabilities.
- Keep data secure and local
Spyre eliminates the need to move sensitive data to the cloud for AI processing, helping financial institutions and other regulated industries stay compliant with privacy and data protection regulations by running AI right next to the data. The mainframe inferencing accelerator ensures that data never leaves the trusted IBM Z environment, supporting compliance and performance.
- Reduce latency for real-time decisions
By performing AI inference directly on the IBM server, Spyre avoids delays caused by transferring data off system, enabling lightning-fast AI-driven decisions, such as real-time fraud detection, in milliseconds.
- Enhance existing applications with generative AI
With IBM Z generative AI capabilities, organizations can also modernize legacy code through AI-driven automation. The Accelerator enables new use cases, such as using generative AI for application modernization through the automation of code refactoring or generation of documentation for legacy code.
- Scale AI with enterprise demand
Pairing IBM Spyre Accelerator with PSR mainframe services enables businesses to scale AI workloads efficiently while maintaining uptime and security. IBM z17 and LinuxONE 5 systems can be equipped with up to 48 Spyre cards, allowing organizations to incrementally scale their AI capabilities. This modular design helps meet growing AI demands without a complete infrastructure overhaul.
- Empower developer productivity
New tools and software, such as the AI Toolkit for IBM Z and LinuxONE and watsonx Assistant for Z, empower developers to leverage generative AI to automate tasks, generate code, and accelerate mainframe application modernization efforts.
- Complement the existing AI capabilities of Telum II
For complex AI tasks, the system can use Telum II’s integrated AI for initial inference and then escalate lower-confidence predictions to the more powerful Spyre accelerator for more comprehensive analysis.
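The escalation pattern described for Telum II and Spyre, where a fast first model handles most requests and only lower-confidence predictions go to a larger model, can be illustrated with a minimal Python sketch. This is not IBM code and does not call any IBM API. The names `cascade`, `toy_first_tier`, `toy_second_tier`, and the 0.9 confidence threshold are hypothetical, chosen only to show the control flow.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# A scorer takes a transaction (a dict of features) and returns
# (fraud_probability, confidence), both in the range [0, 1].
Scorer = Callable[[dict], Tuple[float, float]]


@dataclass
class Decision:
    fraud_probability: float
    tier: str  # which model produced the final score


def cascade(txn: dict, first_tier: Scorer, second_tier: Scorer,
            min_confidence: float = 0.9) -> Decision:
    """Score with the fast model; escalate only if it is not confident."""
    probability, confidence = first_tier(txn)
    if confidence >= min_confidence:
        return Decision(probability, "first-tier")
    # Lower-confidence prediction: hand off to the larger model.
    probability, _ = second_tier(txn)
    return Decision(probability, "second-tier")


# Hypothetical stand-ins for the two models (assumptions, not real models).
def toy_first_tier(txn: dict) -> Tuple[float, float]:
    # Pretend the small model is only confident about small amounts.
    if txn["amount"] < 100:
        return (0.01, 0.99)
    return (0.5, 0.4)


def toy_second_tier(txn: dict) -> Tuple[float, float]:
    # Pretend the large model flags very large amounts as likely fraud.
    return (0.8 if txn["amount"] > 10_000 else 0.1, 0.95)


if __name__ == "__main__":
    for amount in (50, 500, 50_000):
        decision = cascade({"amount": amount}, toy_first_tier, toy_second_tier)
        print(amount, decision)
```

In a design like this, the threshold is the main tuning knob: raising it sends more traffic to the larger model, which trades throughput for more thorough analysis.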
Use Cases for Spyre
Spyre Accelerator supports many potential AI use cases for IBM Z and LinuxONE 5, including in-transaction AI inference on IBM Z for finance. An IBM Z equipped with a Spyre cluster and Integrated Facilities for Linux (IFLs) goes beyond detecting fraud in financial transactions to identifying complex fraud patterns that may be overlooked by less sophisticated models. Along with fraud detection, Spyre enhances cybersecurity and automation by supporting more complex AI models.
Spyre also opens possibilities for using the IBM watsonx Platform to harness GenAI capabilities. With Spyre, z17 or LinuxONE 5 users can run watsonx Assistant for Z to take advantage of conversational AI and low-code, customized automation.
Companies can use the IBM Z AI card to support GenAI for more effective modernization of application code on the mainframe. GenAI can be used to understand how code functions in an application and what needs to be updated or changed.
How to Take Advantage of IBM Spyre Accelerator for Mainframe
To prepare for adopting IBM Spyre Accelerator now that it is available for IBM Z and LinuxONE 5, companies should assess their current workloads and identify new opportunities for AI, such as advanced fraud detection, anti-money laundering, credit scoring, inventory forecasting, healthcare diagnostics, AI chatbots, cyber security, and GenAI for application code modernization. Some tasks may benefit from Telum II predictive AI, while others may require the inference power of Spyre.
Working with PSR and IBM to determine the optimal configuration for your needs will allow you to take full advantage of Spyre Accelerator. As a trusted IBM Platinum Business Partner dedicated to Z Systems and LinuxONE, PSR is the ideal technology provider to work with when transitioning to the Spyre Accelerator for your mainframe. We offer deep expertise in systems programming for IBM Z, along with IBM Z hardware and software support, ensuring successful adoption of the IBM Spyre Accelerator.
Reach out today to learn how our PSR mainframe services can help your organization integrate the IBM Spyre Accelerator mainframe solution into your IBM Z environment with confidence.
