
Epiphany Architecture Promises Massive Floating-Point Gains


Earlier this month BittWare announced “A New Approach to Floating Point DSP” with the release of the Anemone Co-Processor. The Anemone floating point co-processor chip is designed for use with Altera’s high performance FPGAs. OEM’d from Adapteva’s new Epiphany architecture, BittWare’s Anemone chip is a scalable, true C-programmable, floating point engine that enables novel solutions for complex and evolving signal processing applications.

The Epiphany architecture has already caught the eye of FPGA and DSP specialist BittWare, which has licensed Adapteva’s technology for use alongside its own Altera-based FPGA products under the name Anemone. “We believe that Adapteva’s Epiphany architecture represents a flash of brilliance that reintroduces some much needed creativity into the embedded signal processing world,” BittWare’s chief Jeff Milrod crowed at the launch of Anemone. “The resulting Anemone chip, combined with Altera’s family of FPGAs, creates an insanely cool solution for complex signal processing tasks that can be optimised for power, performance, and productivity.”

So what is Epiphany? Here’s the opening section of THINQ’s interview with Andreas Olofsson of Adapteva:

While chip giants ARM, Intel, and AMD battle for control of your smartphones and PCs, a small company in Massachusetts called Adapteva is starting a revolution: many-core processors that offer a significant performance boost over anything currently on the market. We chat to its founder Andreas Olofsson to find out what’s going on.

Initially, it’s easy to dismiss Adapteva’s chances of success as slim to none: in markets where a single processor architecture holds overwhelming dominance, such as x86 in mainstream computing and ARM in the world of smartphones, Olofsson has decided to develop an entirely new architecture.

Olofsson readily admits that history is littered with companies who have tried the same approach and failed. “In the mobile space there’s MIPS, for example,” he explained to thinq_ during our interview. “That’s another architecture, which has been around for 25, 30 years now – you know, it’s a great architecture, and yet they have very low traction. Everybody – 98 per cent of the market – uses ARM.”

To avoid falling into the same trap, Olofsson has a new idea – or, specifically, a variation on an old one. In the early days of personal computing, it was common for a central processor to have a ‘math co-processor’ chip alongside it – a secondary processor which was designed specifically to carry out floating point arithmetic at speeds significantly faster than the main processor. Intel had its 8087, Motorola its 68881, and AMD the stand-alone 9511.

Over time, however, these chips became integrated into the processor itself – evolving into the high-performance floating-point units, or FPUs, that are a feature of all modern central processors. Olofsson’s big idea, he explained, is to bring the days of the math co-processor back – and promises some major performance-per-watt gains for those making the step.

Read more / Source – THINQ

See Also – Wall Street Journal, BittWare Anemone, Adapteva

Anemone Co-Processor for FPGAs, BittWare Inc


SAN JOSE, CALIFORNIA, May 3, 2011 – BittWare announced today, at the Embedded Systems Conference, the Anemone floating point co-processor chip for use with Altera’s high performance FPGAs.

Read the Full Press Release here: http://www.bittware.com/media/press/pr.cfm?id=62

BittWare yesterday introduced Anemone at the Embedded Systems Conference. What is it?

It’s a floating point co-processor for FPGAs. Anemone represents a new hybrid approach for floating point signal processing that adds a low-power, C-programmable compute engine to world-class FPGA technology from Altera.

Features

  • 16 independent floating point cores
  • 32 GFLOPS of floating point processing
  • 2 Watts total chip power
  • ANSI C-programmable
  • IEEE Floating Point
  • Shared memory architecture
  • External I/O via memory-mapped links
  • Scale multiple chips up to 8 TFLOPS
  • High throughput mesh network
  • Standard GNU/Eclipse Development Tools
  • Available from BittWare on standard board formats

For more information, visit the Anemone page on BittWare’s website: http://www.bittware.com/products/anemone_prod_desc.cfm?ProdShrtName=AN104

Traditional floating point DSPs, while excellent at complex processing tasks, have limitations when it comes to chip real estate and power efficiency that have caused them to become an endangered species. And FPGAs, while superior for versatility and configurability, can be difficult to use for complex and evolving applications. The BittWare Anemone, featuring the Epiphany architecture from Adapteva, enables the best assets of both to be combined, thereby offering a completely new approach to floating point digital signal processing. This hybrid solution provides a standard processor software development environment working in conjunction with a world-class FPGA platform, allowing users to optimally partition their algorithms into hardware and software. The result is superior development productivity and unmatched system size, weight, and power.

Focus on power & efficiency

Anemone is a truly C-programmable floating point compute engine. It is unique in that it achieves superior power efficiency and processing performance because it is designed to work alongside an FPGA as a co-processor. The FPGA handles all the memory, I/O interfacing, protocol processing, and special functions in addition to any computational tasks it may perform, leaving the Anemone free to efficiently perform the complex processing tasks that DSPs are ideal for. This allows Anemone to be an extremely efficient chip – as compared with traditional floating point DSPs that may only use 5% of the silicon area for processing.

Simple, elegantly designed floating point cores

The Anemone is a completely scalable 1 GHz multicore processor with 16 eCores that provide a total sustained performance of 32 GFLOPS while consuming only 2 Watts of total chip power. Each eCore features a compact, general-purpose instruction set that requires no instruction level parallelism and provides high program efficiency. All floating point computations are performed as single-precision IEEE 754; hardware looping is also supported. Anemone offers distributed and segmented memory, and large uniform register files. On-chip distributed shared memory totals 4 Mbit (32 KBytes per eCore), with 32 GBytes/sec of sustained memory bandwidth within each eCore. The cache-less shared memory architecture is extended off-chip via I/O links.
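
The headline figures in this paragraph can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the quoted numbers (16 eCores, 1 GHz, 32 GFLOPS, 2 Watts, 32 KBytes per eCore); the function names are made up for illustration and are not part of any BittWare or Adapteva tool.

```c
#include <assert.h>

/* FLOPs each core must complete per clock cycle to reach a quoted chip total. */
double flops_per_cycle_per_core(double chip_gflops, int cores, double clock_ghz)
{
    assert(cores > 0 && clock_ghz > 0.0);
    return chip_gflops / (cores * clock_ghz);
}

/* Power efficiency in GFLOPS per Watt. */
double gflops_per_watt(double chip_gflops, double chip_watts)
{
    assert(chip_watts > 0.0);
    return chip_gflops / chip_watts;
}

/* Total on-chip memory in KBytes, given the per-core figure. */
int total_memory_kbytes(int cores, int kbytes_per_core)
{
    assert(cores > 0 && kbytes_per_core > 0);
    return cores * kbytes_per_core;
}

/* The same total expressed in megabits (1 Mbit = 1024 Kbit here). */
int total_memory_mbits(int cores, int kbytes_per_core)
{
    return total_memory_kbytes(cores, kbytes_per_core) * 8 / 1024;
}
```

With the quoted figures, `flops_per_cycle_per_core(32.0, 16, 1.0)` works out to 2 floating point operations per core per cycle, `gflops_per_watt(32.0, 2.0)` to 16 GFLOPS per Watt, and `total_memory_kbytes(16, 32)` to 512 KBytes, which is the 4 Mbit total.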

High-throughput eMesh network

The Anemone features an internal high-throughput mesh network, with separate data paths for on-chip and off-chip communications. Each eCore has a multi-channel DMA engine to support background data movement over the mesh. Total on-chip, inter-core bandwidth is 128 GBytes/sec full duplex, with an additional 8 GBytes/sec of off-chip bandwidth. Each router node can simultaneously sustain full-duplex transfers on all ports, with automatic routing based on global addressing.
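
To make “automatic routing based on global addressing” more concrete, here is a minimal sketch of how a mesh router could pick the next hop from a destination address alone. The address layout (6 bits of row, 6 bits of column, 20 bits of local offset) and the column-first routing order are assumptions chosen for illustration; the press material does not specify either, and all names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMED layout of a 32-bit global address: [row:6][col:6][offset:20]. */
typedef struct {
    unsigned row;
    unsigned col;
    uint32_t offset;
} mesh_addr;

typedef enum { HOP_LOCAL, HOP_EAST, HOP_WEST, HOP_NORTH, HOP_SOUTH } hop;

/* Pack mesh coordinates and a local offset into one global address. */
uint32_t encode_addr(unsigned row, unsigned col, uint32_t offset)
{
    assert(row < 64u && col < 64u && offset < (1u << 20));
    return ((uint32_t)row << 26) | ((uint32_t)col << 20) | offset;
}

/* Split a global address back into coordinates and local offset. */
mesh_addr decode_addr(uint32_t addr)
{
    mesh_addr a;
    a.row = (unsigned)((addr >> 26) & 0x3Fu);
    a.col = (unsigned)((addr >> 20) & 0x3Fu);
    a.offset = addr & 0xFFFFFu;
    return a;
}

/* Dimension-order routing (ASSUMED order): fix the column first, then the row.
 * Rows are taken to increase towards the south. */
hop next_hop(unsigned my_row, unsigned my_col, uint32_t dest_addr)
{
    mesh_addr d = decode_addr(dest_addr);
    if (d.col > my_col) return HOP_EAST;
    if (d.col < my_col) return HOP_WEST;
    if (d.row > my_row) return HOP_SOUTH;
    if (d.row < my_row) return HOP_NORTH;
    return HOP_LOCAL; /* address belongs to this node's own memory */
}
```

The point of the sketch is that a write to an ordinary memory address carries everything a router needs, so software never has to name a route explicitly.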

I/O via memory-mapped high-speed links

The Anemone provides a flexible low-overhead external interconnect scheme that supports memory-mapped direct connection of multiple Anemones and is compatible with any LVDS capable FPGA. This is achieved via four links that are full-duplex 8-bit LVDS data ports @ 500 MHz DDR, each simultaneously providing 1 GByte/sec in each direction for a total off-chip bandwidth of 8 GBytes/sec. Its FPGA co-processor use model provides the ultimate flexibility: since all external I/O goes through an FPGA, system designers can customize the I/O to their application’s specific requirements.
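
The off-chip bandwidth figure follows directly from the link parameters quoted above. The sketch below reproduces that arithmetic; the function names are illustrative only, and it assumes 1 GByte = 1000 MBytes, as the round numbers in the text suggest.

```c
#include <assert.h>

/* One-way throughput of a parallel DDR link, in GBytes/sec. */
double link_gbytes_per_sec(int width_bits, double clock_mhz)
{
    double mbits_per_sec;
    assert(width_bits > 0 && clock_mhz > 0.0);
    /* DDR moves data on both clock edges: two transfers per cycle. */
    mbits_per_sec = width_bits * clock_mhz * 2.0;
    return mbits_per_sec / 8.0 / 1000.0;
}

/* Aggregate over all links, counting both directions of each full-duplex link. */
double total_offchip_gbytes_per_sec(int links, int width_bits, double clock_mhz)
{
    assert(links > 0);
    return links * 2 * link_gbytes_per_sec(width_bits, clock_mhz);
}
```

For an 8-bit port at 500 MHz DDR, `link_gbytes_per_sec(8, 500.0)` gives 1 GByte/sec per direction, and `total_offchip_gbytes_per_sec(4, 8, 500.0)` gives the quoted 8 GBytes/sec across four full-duplex links.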

ANSI C-programmable; Standard GNU development tools

The Anemone reduces system development cost by enabling out-of-the-box execution of applications written in regular ANSI-C. It does not require any C-subset, language extensions, or SIMD. Standard GNU development tools are supported including an optimizing C compiler, simulator, GDB debugger, and Eclipse multi-core IDE.
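
As an illustration of what “regular ANSI-C” signal processing code looks like, here is a minimal single-precision FIR filter that uses no intrinsics, pragmas, or vector types. It is a generic sketch, not code from BittWare or Adapteva, and nothing here shows how the Anemone toolchain is actually invoked; the function name and signature are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Direct-form FIR filter: y[n] = sum over k of taps[k] * x[n - k].
 * Only outputs with a full window of input are produced, so `out`
 * must hold n_in - n_taps + 1 samples. */
void fir_filter(const float *in, size_t n_in,
                const float *taps, size_t n_taps,
                float *out)
{
    size_t n, k;
    assert(n_taps > 0 && n_in >= n_taps);
    for (n = n_taps - 1; n < n_in; ++n) {
        float acc = 0.0f;
        for (k = 0; k < n_taps; ++k) {
            acc += taps[k] * in[n - k]; /* one multiply-accumulate per tap */
        }
        out[n - (n_taps - 1)] = acc;
    }
}
```

Called with the input `{1, 2, 3, 4}` and the two taps `{0.5, 0.5}`, it produces the three-sample moving average `{1.5, 2.5, 3.5}`.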
