Tom’s Hardware
Technology
Anton Shilov

Tachyum releases a 1,600-page performance optimization manual despite continued tape-out delays and no actual silicon


Tachyum has released a 1,600-page guide for optimizing the performance of its Prodigy Universal Processor FPGA hardware. The company has yet to tape out its Prodigy processors after years of delays, so the manual, which covers the chips' unique instruction set architecture and optimization strategies, arrives well before actual products start sampling or hit the market.

The Prodigy universal processor has faced repeated delays since its initial timeline. Originally planned for a 2019 tape-out and a 2020 launch, the schedule shifted multiple times: from 2021 to 2022, then to 2023, and then to 2024. Earlier this year, Tachyum once again updated its plans, saying it would tape out the chip in 2025, thus delaying the sampling of reference servers set for the first quarter of next year. Formally, the company still plans to initiate mass production of its Prodigy processors in 2025, but it remains to be seen whether it can complete all the necessary milestones (tape-out, debugging, sampling, mass production start) in just one year.

Tachyum's Prodigy design features 192 custom 64-bit compute cores based on an all-new microarchitecture that is said to be equally good for general-purpose computing and for highly parallel AI and HPC workloads. In particular, the ISA incorporates extensive vector and matrix instructions to address artificial intelligence and supercomputing applications, and the new performance optimization guide includes design guidelines for the development of AI and HPC software.

The Prodigy instruction set architecture (ISA) combines elements of both RISC and CISC designs; according to Tachyum, the ISA avoids the complex, lengthy, and inefficient variable-length instructions commonly found in traditional CISC processors. All instructions are standardized to 32 or 64 bits, with some incorporating memory access features to boost performance further.  

Tachyum's Prodigy FPGA features built-in performance counters that enable real-time monitoring and analysis of runtime events. The company says these tools allow programmers and engineers to identify bottlenecks and optimize code for greater efficiency, making the processor ideal for demanding computational tasks.

The manual provides specific optimization techniques, including managing dispatch limitations, improving memory routines, aligning branches and instructions, and mitigating register forwarding challenges. In addition, it offers guidance for handling cache operations, load/store alignment, and accessing special registers, ensuring developers can fine-tune software for peak performance. 

"Software programmers, test engineers, compiler developers, and systems and solutions engineers will appreciate the opportunity to take this deep dive into how Prodigy offers inherent performance benefits for efficient processing of AI, cloud, and HPC workloads," said Dr. Radoslav Danilak, founder and CEO of Tachyum. "Prodigy's integrated features will help users achieve industry-leading compute efficiency to derive insights faster, to perform research faster, to generate results faster."

As always, the proof is in the shipping silicon, and Tachyum has yet to even tape out a chip.
