#### Survey of Domain-Specific Languages for FPGA Computing

Nachiket Kapre nachiket@ieee.org

[Figure: plot of "some goodness metric" against "expressiveness (freedom)"]

#### Singapore's contempt of court bill

Singapore: Contempt of court bill is a threat to freedom of expression

https://www.amnesty.org/en/latest/news/2016/08/singapore-contempt-of-court-law/

https://twitter.com/amnesty/status/674053786520915969

Donald Trump's hate-filled rhetoric & bigoted scapegoating flies in the face of equality & MUST be rejected.

#### Trump's attack on judge



[Figure: plot of "some goodness metric" against "expressiveness (freedom)", annotated "(Car)"]





ROBERT MCMILLAN BUSINESS 06.16.14 6:30 AM

#### Microsoft Supercharges Bing Search With Programmable Chips



#### Intel unveils new Xeon chip with integrated FPGA, touts 20x performance boost

By Sebastian Anthony on June 19, 2014 at 1:19 pm


#### Intel will acquire FPGA maker Altera for \$16.7 billion


Consolidation continues in the semi industry with Avago-Broadcom, now this.

by Sebastian Anthony (UK) - Jun 1, 2015 9:28pm CST


## Outline

- Review of FPGA Design Flow
  - Where do we stand?
  - Need for DSLs
- Classification of DSLs
- Code Vignettes
- Experimental Results


## FPGA flow



- FPGA flow is longer and more complex
- Problem 1: Write low-level Verilog code
- Problem 2: Wait hours to compile (adds insult to injury)
- Problem 3: Long verification feedback cycles

#### Example code sketches

endmodule

```
void poly(int x, int* y) {
    int a=3, b=2, c=1;
    *y = a*x*x + b*x + c;
}
```

## What's different?

```
void poly(int x, int* y) {
    int a=3, b=2, c=1;
    *y = a*x*x + b*x + c;
}
```

- What makes the C code smaller? It leaves out:
  - Clocking/reset
  - Explicit pipelining
  - Type information (registers, wires, number of bits)
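To make the bullets above concrete, here is a minimal C sketch of what an HDL description has to spell out and the one-line C function leaves implicit: fixed bit widths, pipeline registers, and a clock and reset. It is a cycle-level software model, not synthesizable code. The three-stage split and the names `poly_pipe`, `poly_pipe_reset`, and `poly_pipe_clock` are assumptions made for illustration and do not come from any tool's output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cycle-level model of the polynomial as a pipeline.
 * Every field is state that an HDL design would declare as a register
 * with an explicit width. */
typedef struct {
    uint32_t x1, xx1;    /* stage 1 registers: x and x*x          */
    uint32_t ax2, bxc;   /* stage 2 registers: a*x*x and b*x + c  */
    int valid1, valid2;  /* valid bits travelling with the data   */
} poly_pipe;

/* Reset: clear every register, as a reset signal would. */
void poly_pipe_reset(poly_pipe *p) {
    assert(p != NULL);
    p->x1 = p->xx1 = p->ax2 = p->bxc = 0;
    p->valid1 = p->valid2 = 0;
}

/* One clock tick. Returns 1 and writes *y when a result leaves the pipeline. */
int poly_pipe_clock(poly_pipe *p, int in_valid, uint32_t x, uint32_t *y) {
    const uint32_t a = 3, b = 2, c = 1;
    assert(p != NULL && y != NULL);

    /* Output: the final adder reads the stage 2 registers. */
    int out_valid = p->valid2;
    if (out_valid) *y = p->ax2 + p->bxc;

    /* Stage 2 latches values computed from the old stage 1 registers. */
    p->ax2 = a * p->xx1;
    p->bxc = b * p->x1 + c;
    p->valid2 = p->valid1;

    /* Stage 1 latches the new input. */
    p->x1 = x;
    p->xx1 = x * x;
    p->valid1 = in_valid;
    return out_valid;
}
```

Feeding `x = 2` on one tick and idling for two more makes the third call return 1 with `*y == 17`. The plain C `poly` hides that latency, along with the register widths and the reset behaviour.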

#### Simple forms of parallelism




## Limits of OpenCL/HLS

- One alternative to HDLs: the OpenCL/HLS flow
- Restricted subset of C (no pointers, no complex data sharing) → sacrifice freedom for speed
- Drawbacks:
  - Overheads due to implicit assumptions
    - more area, slower design, not fully optimised
  - Only really addresses time-to-compilation
    - still need to do synthesis and place-and-route (P&R)
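As a rough illustration of the restricted style, the sketch below applies the polynomial to a fixed-size array using only statically sized buffers and a loop with a compile-time bound. The function name `poly_array` and the size `N = 1024` are assumptions. Which constructs a given OpenCL or HLS tool accepts varies by tool and version, so this shows the style only and says nothing about any particular compiler.

```c
#include <stdint.h>

#define N 1024  /* assumed fixed problem size, known at compile time */

/* Apply y = a*x^2 + b*x + c element by element.
 * No dynamic allocation and no explicit pointer manipulation: the loop
 * bound and the array sizes are static, and input and output live in
 * separate buffers. */
void poly_array(const uint32_t x[N], uint32_t y[N]) {
    const uint32_t a = 3, b = 2, c = 1;
    for (int i = 0; i < N; i++) {
        y[i] = a * x[i] * x[i] + b * x[i] + c;
    }
}
```

The trade-off named above shows up directly: the code is short and analysable, but it cannot express data structures whose shape is only known at run time.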


#### Domain-Specific Languages

- "Beauty lies in the eye of the beholder"
- Conventional "application-domain" view
  - finance, HPC, radio, multimedia, networking, databases, security
- This paper suggests two alternative views

#### Axes of classification

- (1) Conventional "application-domain" view: focus on the end user of FPGA technology
- (2) "Compute-model" view
  - analogous to Berkeley's Ptolemy classification
- (3) "Design" view: behind-the-scenes tinkerers, library developers, system builders, academics





















#### Matlab HDL Coder

#### Maxeler

```
class Poly extends Kernel {
    Poly(KernelParameters parameters) {
        super(parameters);
        // 32-bit unsigned input stream
        DFEVar x = io.input("x", dfeUInt(32));
        int a = 3, b = 2, c = 1;
        DFEVar y = a*x*x + b*x + c;
        // 32-bit unsigned output stream
        io.output("y", y, dfeUInt(32));
    }
}
```

## SCORE

```
poly(input unsigned[32] x,
     output unsigned[32] y)
{
    unsigned[32] a=3, b=2, c=1;
    state always (x):
        y = a*x*x + b*x + c;
}
```

## MSR Accelerator C#

```
using PA = Microsoft.ParallelArrays.ParallelArrays;

namespace Poly
{
    class Program
    {
        static void Main(string[] args)
        {
            int N = 1024;
            int a = 3, b = 2, c = 1;
            int[] xArr = new int[N];
            int[] yArr = new int[N];
            FPGATarget t = new FPGATarget();
            PA x = new PA(xArr);
            // Build a*x*x + b*x + c as a chain of data-parallel operations
            PA t1 = PA.Multiply(a, x);
            PA t2 = PA.Multiply(t1, x);
            PA t3 = PA.Multiply(b, x);
            PA t4 = PA.Add(t3, t2);
            PA t5 = PA.Add(t4, c);
            // Evaluate the expression on the FPGA target
            yArr = t.ToArray1D(t5);
        }
    }
}
```

#### JHDL

```
public class Poly extends Logic {
    // Interface
    public static CellInterface[] cif = {
        in("x", 18), out("y", 36),
    };

    // Constructor
    public Poly(Node parent, Wire y, Wire x) {
        // Connect wires
        connect("y", y);
        connect("x", x);
        // Build our logic
        new mult18x18(this, x, x, t1);
        new mult18x18(this, t1, a, t2);
        new mult18x18(this, b, x, t3);
        new adder(this, t2, t3, cin, t4, cout);
        new adder(this, t4, c, cin, y, cout);
    }
}
```

### CHISEL

```
class Poly extends Component {
    val io = new Bundle {
        val a = Bits(32, INPUT)
        val b = Bits(32, INPUT)
        val c = Bits(32, INPUT)
        val x = Bits(32, INPUT)
        val y = Bits(32, OUTPUT)
    }
    io.y := io.a * io.x * io.x +
        io.b * io.x + io.c
}
```


# Experimental Evaluation

- NTU MSc Embedded Systems cohort
  - Class of 2014-15
  - ~25-30 students
- 3-4 students per DSL
- One 4-hour lab session devoted to working on the $ax^2 + bx + c$ mapping example

| DSL                               | Dev. Time      | Lines of Code (DSL) | Lines of Code (RTL) | LUTs         | FFs          | DSPs   | Freq. (MHz) |
|-----------------------------------|----------------|---------------------|---------------------|--------------|--------------|--------|-------------|
| Flopoco <sup>1</sup>              | 30m            | 2                   | 1702                | 1679         | 1288         | 0      | 91          |
| Maxeler<br>(baseline)             | 30m<br>30m     | 15                  | NA <sup>2</sup>     | 6036<br>5837 | 5391<br>5364 | 3<br>0 | 120         |
| Vivado HLS                        | 1h             | 4                   | 92                  | 53           | 71           | 3      | 117         |
| Lime<br>(baseline)                | 2h30m<br>2h30m | 22                  | 111                 | 245<br>189   | 284<br>209   | 2<br>1 | 160         |
| OpenCL <sup>3</sup><br>(baseline) | 2h30m<br>2h30m | 4                   | 1262                | 3281<br>3230 | 4443<br>4192 | 2<br>0 | 267         |
| Chisel                            | 3h             | 25                  | 39                  | 129          | 64           | 10     | 66          |
| OpenDF                            | 3h30m          | 26                  | 689                 | 171          | 305          | 9      | 120         |
| JHDL                              | 4h             | 40                  | 2529 <sup>4</sup>   | 41           | 90           | 3      | 84          |
| SCORE                             | 4h             | 7                   | 111                 | 139          | 245          | 2      | 74          |

TABLE I: Comparing DSLs with $ax^2 + bx + c$ mapping

<sup>1</sup> Flopoco only provides floating-point support for these expressions.

<sup>2</sup> MaxCompiler does not produce any intermediate RTL; it directly generates executable bitstreams.

<sup>3</sup> Altera resources are measured in LEs instead of LUTs; Altera 18×18 DSPs are also different from Xilinx 25×18 DSPs.

<sup>4</sup> JHDL directly generates a circuit netlist in EDIF format instead of generating RTL.


TABLE I: Comparing DSLs with $ax^2 + bx + c$ mapping

<sup>1</sup>Flopoco only provides floating-point support for these expressions.

<sup>2</sup>MaxCompiler does not produce any intermediate RTL; it directly generates executable bitstreams.

<sup>3</sup>Altera resources are measured in LEs instead of LUTs; Altera 18×18 DSPs are also different from Xilinx 25×18 DSPs.

<sup>4</sup>JHDL directly generates a circuit netlist in EDIF format instead of generating RTL.
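One way to read the "Lines of Code" columns in Table I is as an expansion ratio: how many generated lines each hand-written DSL line stands in for. The C sketch below computes that ratio from the table's numbers. The metric itself, the `dsl_row` struct, and the function names are illustrative assumptions rather than anything defined in the slides, and line counts are only a rough proxy for design effort.

```c
#include <assert.h>
#include <stddef.h>

/* One row of Table I, reduced to the two line-count columns. */
struct dsl_row {
    const char *name;
    int dsl_lines; /* lines written by the developer in the DSL */
    int rtl_lines; /* lines of generated RTL (an EDIF netlist for JHDL) */
};

/* Values copied from Table I. Maxeler is left out because MaxCompiler
 * emits no intermediate RTL (footnote 2). */
const struct dsl_row table1[] = {
    {"Flopoco",     2, 1702},
    {"Vivado HLS",  4,   92},
    {"Lime",       22,  111},
    {"OpenCL",      4, 1262},
    {"Chisel",     25,   39},
    {"OpenDF",     26,  689},
    {"JHDL",       40, 2529},
    {"SCORE",       7,  111},
};
const size_t table1_len = sizeof table1 / sizeof table1[0];

/* Illustrative metric (an assumption, not from the slides):
 * generated lines per hand-written line. */
double expansion_ratio(const struct dsl_row *row)
{
    assert(row != NULL && row->dsl_lines > 0);
    return (double)row->rtl_lines / (double)row->dsl_lines;
}

/* Index of the row with the largest expansion ratio. */
size_t max_expansion_index(void)
{
    size_t best = 0;
    for (size_t i = 1; i < table1_len; i++) {
        if (expansion_ratio(&table1[i]) > expansion_ratio(&table1[best]))
            best = i;
    }
    return best;
}
```

For example, `expansion_ratio(&table1[1])` gives 23.0 for Vivado HLS (92 / 4), while Chisel comes out at 1.56 (39 / 25). A high ratio does not by itself mean a better tool: the resource and frequency columns still have to be read alongside it.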

### Conclusions

- Summary
  - Vast space of DSLs
  - Many in various states of rot: unmaintained projects
- How to navigate?
  - First attempt: does HLS/OpenCL work for you?
  - Next try: well-supported tools such as MATLAB HDL Coder, LabVIEW FPGA, Maxeler Dataflow
  - Finally: check amongst the DSLs, or write your own