Original poster
Posted on 2008-5-28 12:22:46
May 27, 2008
More info about the GPU1 to GPU2 transition
There have been several questions regarding the GPU1 client and why we decided to shut it down. I hope I can shed some light here, at least on why we're doing what we're doing, so that even if people disagree with our decisions, they can at least see where we're coming from.
Some people have asked, "Why shut down the client if it's working?" The bottom line here is that the GPU1 results are no longer scientifically useful. It's pretty clear now that DirectX (DX) is not sufficiently reliable for scientific calculations. This was not known before (and some people wouldn't believe this until we proved it). With the GPU1 results, we can now show what the limitations are pretty decisively.
GPU1 also helped us a lot in terms of developing its successor and learning what's needed to run GPUs in a distributed computing fashion. The good news here is that GPU2 is behaving very well, on both ATI and NVIDIA hardware, and this is a direct result of what we've learned with GPU1 WUs. In the end, however, GPU1 will not be able to help us understand protein misfolding, Alzheimer's Disease, etc., due to these unresolvable limitations. We could keep GPU1 live, just crunching away in its current form, but that would be wasting people's electricity at this point, as we've learned everything we can about what those cards can do.
In the past, we had a somewhat similar shutdown situation, i.e. when QMD core projects stopped. In that case, donors were left hanging since we didn't give any warning before stopping QMD projects. We did try (perhaps unsuccessfully) to handle the GPU1 situation better than QMD. In QMD, we stopped needing that core and so we stopped the calculation without warning, not realizing the impact that would cause. With GPU1, we gave several months' warning (indeed, note that GPU1 is still actively running, so all of this is information in advance of shutting down GPU1). We tried to avoid the QMD situation by giving advance warning, but it looks like donors would like even more advance warning. However, there are limits to how far in advance we know the situation ourselves.
Indeed, the knowledge that it made sense to end GPU1 came reasonably recently to us. We have been working on CAL for a while and it seemed that CAL might be a solution, but we couldn't know until we got some testing "in the wild." DirectX (DX, which GPU1 is based on) works much better in the lab than in the wild, and it was possible that CAL would behave that way too. After seeing that CAL behaved well in the wild, it became clear that the GPU1 path was obsolete. However, this is a relatively recent finding and we made the announcement about the situation shortly thereafter.
It was a tough decision. Some suggested we just leave GPU1 running, even though people's electricity really would be going to waste, other than generating points. I didn't think that was a good idea. We did know it would be a tough PR hit, but when people talk about the history of FAH, I want to make it clear that we're here to address AD and other diseases, not just to run calculations for the sake of points and nothing more (which has been the critique of some other distributed computing projects).
So, what's the right thing to do? I guess it comes to this: would GPU1 donors be happier if we just kept the GPU1 servers running, doing work with no scientific value, just for points? We could do that, at the cost of taking personnel away from improving existing clients, keeping existing servers going, etc., for the sake of keeping GPU1 running. However, that's not what FAH is for, and I think it's important that FAH not devolve into a big points game, losing sight of why we're doing what we're doing.
PS: Some further discussion can be found here.
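Gist:
Because the DirectX layer underlying GPU1 cannot support high-precision scientific computation, we are shutting it down. We compute for the sake of science, not for the sake of computing, and we hope everyone understands.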