Re: [math-fun] Intel to incorporate FPGA in new Xeon processor


Warren D Smith

23 Jun 2014

11:18 a.m.

I think this is a superb development, I've felt that should happen for a long time. It also will address a lot of issues that others (e.g. Jorg Arndt) have been complaining about for years, e.g. "why doesn't my processor have a 'reverse bit order' instruction?" This will be a hacker renaissance.

It does not bother me the slightest bit that there is a big pre-wired processor accompanying the FPGA. The stuff lots of users often want, should be available hardwired for speed, not programmable for slowness.

Furthermore, in the event that some FPGA use becomes commonplace idiom, that will hopefully inspire the hardware guys to make it available without the FPGA.

My question would be: how should a high level language take advantage of this new hardware capability?


Tom Rokicki

23 Jun

11:28 a.m.


I suspect the latency of accessing the FPGA may preclude use for single-instruction-type things (like finding the index of the 3rd set bit in a 64-bit word, often used in succinct data structures). I believe the FPGA will be more useful in a coprocessor sense.

It will be interesting to see how the FPGA ties in to the memory hierarchy, if it does at all.
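For readers unfamiliar with the "index of the 3rd set bit" operation mentioned in this message: it is usually called "select" in the succinct data structures literature. A minimal software sketch in Python follows, assuming bits are counted from the least significant end and that k = 1 means the lowest set bit. The name select_bit and the -1 return convention are illustrative assumptions, not anything from the thread or from a particular library.

```python
def select_bit(word: int, k: int) -> int:
    """Return the 0-based index of the k-th set bit of a 64-bit word.

    k = 1 selects the lowest set bit. Returns -1 if the word has fewer
    than k set bits or if k < 1 (an assumed convention for this sketch).
    """
    word &= (1 << 64) - 1          # treat the input as a 64-bit word
    if k < 1:
        return -1
    for _ in range(k - 1):
        word &= word - 1           # clear the lowest set bit
    if word == 0:
        return -1
    lowest = word & -word          # isolate the lowest remaining set bit
    return lowest.bit_length() - 1


if __name__ == "__main__":
    # Set bits of 0b101100 are at indices 2, 3 and 5.
    print(select_bit(0b101100, 3))
```

This loop does k - 1 dependent steps, which is the kind of cost a single hardwired instruction would remove; it is meant to pin down the semantics, not to be fast.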

_______________________________________________ math-fun mailing list math-fun@mailman.xmission.com https://mailman.xmission.com/cgi-bin/mailman/listinfo/math-fun

-- -- http://cube20.org/ -- http://golly.sf.net/ --

Tom Knight

11:34 a.m.


If you check these patents, you’ll see that we envisioned the FPGA logic having direct read/write access to the processor register files. This would allow an easy single cycle instruction of the kind you envision. Intel probably didn’t do this, although there is nothing difficult about it.

US 5,742,180 and US 6,052,773


James Cloos

4:37 p.m.

"TK" == Tom Knight <tk@ginkgobioworks.com> writes:

TK> If you check these patents, you’ll see that we envisioned the FPGA
TK> logic having direct read/write access to the processor register
TK> files. This would allow an easy single cycle instruction of the kind
TK> you envision. Intel probably didn’t do this, although there is nothing
TK> difficult about it.
TK> US 5,742,180 and US 6,052,773

The reports I read said that all they are doing is putting an existing fpga chip in the package with their xeon chip, with the two communicating over the same bus multi-socket xeons use between themselves.

-JimC
--
James Cloos <cloos@jhcloos.com> OpenPGP: 0x997A9F17ED7DAEA6

Tom Rokicki

4:47 p.m.


Oh my, they are implementing QPI in the FPGA? That's actually pretty cool; it gives you a fully coherent memory interface. QPI is much faster than the FPGA, though, so they probably have a slower wider interface unit of some sort that the FPGA talks to. But it also means you get a really fast interface to memory.

That sounds like a lot of fun! (But scary; you can probably really mess up the system by botching the coherence protocol in some subtle ways.)

-tom


Joerg Arndt

12:10 p.m.


* Warren D Smith <warren.wds@gmail.com> [Jun 23. 2014 19:44]:

> I think this is a superb development, I've felt that should happen for a long time. It also will address a lot of issues that others (e.g. Jorg Arndt)

Umlaut lecture: Joerg Arndt, or Jörg Arndt for the UTF-8 professionals, or jj for people that dislike that particular first name as much as I do.

> have been complaining about for years, e.g. "why doesn't my processor have a 'reverse bit order' instruction?" This will be a hacker renaissance.

Sadly, CPU --> FPGA (revbin word) FPGA --> CPU will be the single slowest way to do it.

There used to be a socket for the FPU coprocessor (Weitek?). I have always been puzzled why, at the time the Weitek became obsolete (when the Pentium came), that socket wasn't kept (for an FPGA or [here goes your ad]).
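For comparison with the CPU --> FPGA --> CPU round trip described above, a 64-bit word can be bit-reversed entirely in software with six mask-and-shift steps (swap adjacent bits, then pairs, nibbles, bytes, 16-bit halves and 32-bit halves). The Python sketch here illustrates that standard technique; it is not code from the thread, and the name revbin64 is an assumption borrowed from the "revbin" shorthand used above.

```python
MASK64 = (1 << 64) - 1


def revbin64(x: int) -> int:
    """Reverse the bit order of a 64-bit word using log2(64) = 6 swap steps."""
    x &= MASK64
    # Swap adjacent single bits.
    x = ((x >> 1) & 0x5555555555555555) | ((x & 0x5555555555555555) << 1)
    # Swap adjacent 2-bit pairs.
    x = ((x >> 2) & 0x3333333333333333) | ((x & 0x3333333333333333) << 2)
    # Swap adjacent nibbles.
    x = ((x >> 4) & 0x0F0F0F0F0F0F0F0F) | ((x & 0x0F0F0F0F0F0F0F0F) << 4)
    # Swap adjacent bytes.
    x = ((x >> 8) & 0x00FF00FF00FF00FF) | ((x & 0x00FF00FF00FF00FF) << 8)
    # Swap adjacent 16-bit halves.
    x = ((x >> 16) & 0x0000FFFF0000FFFF) | ((x & 0x0000FFFF0000FFFF) << 16)
    # Swap the two 32-bit halves.
    x = (x >> 32) | ((x & 0x00000000FFFFFFFF) << 32)
    return x


if __name__ == "__main__":
    # Bit 0 moves to bit 63.
    print(hex(revbin64(1)))
```

In C the same six steps compile to a few dozen register-only instructions, which is why a trip across an inter-chip bus is unlikely to win for a single word.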

> It does not bother me the slightest bit that there is a big pre-wired processor accompanying the FPGA. The stuff lots of users often want, should be available hardwired for speed, not programmable for slowness.
>
> Furthermore, in the event that some FPGA use becomes commonplace idiom, that will hopefully inspire the hardware guys to make it available without the FPGA.

As has been said, the FPGA does very well with tasks that (bit-)parallelize well and with tasks where a deep serialization kicks butt.

There is a rough similarity between this and both SIMD and GPU (the latter to a greater extent).

For running your compiler or (gasp!) Office[TeeEmm], you'll be _very_ hard pressed to beat your CPU with the FPGA.

> My question would be: how should a high level language take advantage of this new hardware capability?

I speculate that this, as an afterthought for most (all??) existing languages, is not going to integrate smoothly.

I'd be delighted to hear of languages where this may not come as in "we nailed a plank to a dog to make it an octopus".

Best, jj


Tom Rokicki

12:19 p.m.


So I see (at least) three types of work the FPGA could do:

1. Complicated work on small data---bitcoin mining and stuff like that. FPGAs tend to do this well; no latency concerns or memory access bottlenecks.

2. Streaming work. Will the FPGA have a streamable interface to the memory hierarchy? If so, all sorts of vector work with big computations are possible. It becomes SSE on steroids (of course subject to memory bandwidth limitations).

3. Random access work. Will the FPGA have random access to the memory hierarchy? And if so, how many memory accesses could be in flight at one time? This would permit another wide range of possible applications.

I don't have high expectations for the first iteration of this technology, but I hope to be pleasantly surprised.



participants (5)

James Cloos
Joerg Arndt
Tom Knight
Tom Rokicki
Warren D Smith