DistributedDataMining

From China Distributed Computing (中国分布式计算总站)
Revision as of 12:21, 8 May 2010 (Saturday) by Ledled (new page)

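distributedDataMining

Screensaver: none
Developer: (not listed)
Released: 2008
Operating systems: Windows, Linux
Platform: BOINC
Application: Time Series Analysis: Stock Price Prediction (approx. 229.8 MB)
Work units: (not listed)
Project status: running / open for registration
Project category: social sciences
Optimized applications: (not listed)
Computing characteristics: CPU-intensive; supports zero resource share; supports GPU computing
Official website: DistributedDataMining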


distributedDataMining (dDM) is a research project that uses Internet-connected computers to perform research in various fields of Data Analysis and Machine Learning. The project uses the Berkeley Open Infrastructure for Network Computing (BOINC) to distribute research-related tasks to many computers. The intent of BOINC is to enable researchers to tap into the enormous processing power of personal computers around the world. If you are willing to support our research challenges, please participate in the dDM project: register, then download the BOINC software and Java 1.6. After installing and starting BOINC, enter the following project URL: http://ddm.nicoschlitter.de/DistributedDataMining/. Please visit our forum to discuss dDM-related issues.
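For volunteers who prefer the command line to the BOINC Manager, the attach step can be scripted. The following is a minimal sketch, assuming the `boinccmd` tool that ships with BOINC is on the PATH and a BOINC client is running locally. The helper names and the account key placeholder are illustrative and not part of the project's instructions.

```python
import subprocess

# Project URL as given in the text above.
DDM_URL = "http://ddm.nicoschlitter.de/DistributedDataMining/"


def attach_command(account_key, url=DDM_URL):
    """Build the boinccmd argument list that attaches the local client to a project."""
    if not account_key:
        raise ValueError("an account key is required")
    return ["boinccmd", "--project_attach", url, account_key]


def attach(account_key):
    """Run the attach command; needs a running BOINC client on this machine."""
    return subprocess.run(attach_command(account_key), check=True)
```

Calling `attach("YOUR_ACCOUNT_KEY")` with the key from your project account page performs the same step as entering the URL in the BOINC Manager.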

All dDM applications use the open source framework RapidMiner. This data mining suite, developed at the University of Dortmund, provides various machine learning methods for data analysis. RapidMiner offers a convenient plug-in mechanism that makes it easy to add newly developed algorithms. This flexibility, combined with the processing power of BOINC, is an ideal foundation for scientific distributed data mining. The dDM project takes that opportunity and serves as a metaproject for different kinds of machine learning applications. Below you will find a list of our subprojects and the related scientific publications.

Time Series Prediction

Stock Price Prediction (active)

Part of our research is devoted to Time Series Analysis. Our focus is on forecasting economic time series such as the DAX and the Dow Jones. At first, we applied artificial neural networks to forecast time series. A detailed description of this approach, the design of the experimental setting, and the results are presented in [4]. Later on, we applied support vector machines to avoid the high computational complexity of neural networks. The resulting forecasts are of comparable quality, while the computational cost is significantly lower. In 2008, we published two related studies, [5] and [6]. We then extended our studies with various learning algorithms in order to determine their applicability to stock price prediction. After analyzing the results, we made two important observations: (i) the influence of the learning algorithm is much lower than expected, whereas (ii) the training window size has a stronger impact on the quality of the prediction. Since temporal effects have so far rarely been addressed in the literature, our dDM project concentrates on the study of these temporal aspects in time series analysis.
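The role of the training window size can be illustrated with a small walk-forward experiment. The sketch below is only an illustration: it is not the project's RapidMiner setup, and the `drift_forecast` learner is a deliberately trivial stand-in for the neural networks and support vector machines mentioned above. All function names are hypothetical.

```python
from statistics import mean


def walk_forward_errors(series, window, predict):
    """Forecast each point from the `window` values before it; return absolute errors."""
    errors = []
    for t in range(window, len(series)):
        history = series[t - window:t]  # training window ending just before t
        errors.append(abs(predict(history) - series[t]))
    return errors


def drift_forecast(history):
    """Placeholder learner: last value plus the average step inside the window."""
    if len(history) < 2:
        return history[-1]
    step = (history[-1] - history[0]) / (len(history) - 1)
    return history[-1] + step


def compare_windows(series, windows, predict=drift_forecast):
    """Mean absolute one-step error for each candidate training window size."""
    return {w: mean(walk_forward_errors(series, w, predict)) for w in windows}
```

Running `compare_windows(prices, [5, 20, 60])` on a price series shows how the forecast error changes with the window size while the learner stays fixed, which is the kind of comparison the observations above refer to.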

Social Network Analysis

Tanja Falkowski proposed DenGraph, a density-based graph clustering algorithm. Among other things, this algorithm can be used for Social Network Analysis. The following studies were part of her PhD thesis, which was published as a book [3].
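To give a rough idea of what density-based clustering on a graph means, here is a generic DBSCAN-style sketch on a weighted graph. It is not the published DenGraph algorithm: the parameter names `eps` (maximum distance between close neighbours) and `eta` (minimum number of close neighbours for a core node), as well as the function itself, are assumptions made for illustration.

```python
from collections import deque


def density_graph_clusters(edges, eps, eta):
    """Cluster nodes of a weighted graph given as (u, v, distance) triples."""
    # Collect, for every node, the neighbours that lie within distance eps.
    near = {}
    for u, v, dist in edges:
        near.setdefault(u, set())
        near.setdefault(v, set())
        if dist <= eps:
            near[u].add(v)
            near[v].add(u)

    # A core node has at least eta close neighbours (the node itself is not counted).
    core = {node for node, nbrs in near.items() if len(nbrs) >= eta}

    label = {}
    clusters = []
    for seed in sorted(core):
        if seed in label:
            continue
        cid = len(clusters)
        members = set()
        label[seed] = cid
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            members.add(node)
            if node not in core:
                continue  # border nodes join a cluster but do not expand it
            for nbr in near[node]:
                if nbr not in label:
                    label[nbr] = cid
                    queue.append(nbr)
        clusters.append(members)

    noise = set(near) - set(label)  # nodes that no core node could reach
    return clusters, noise
```

For example, with three mutually close users, one user attached to only one of them, and one distant user, the first four end up in one cluster and the last is reported as noise.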

Temporal Dynamics of the Last.fm Music Platform (temporarily suspended)

In this application we applied DenGraph-IO to detect and observe changes in the music listening behaviour of Last.fm users over a period of two years. The aim was to see whether the proposed clustering technique detects meaningful communities and their evolution [1], [2].

Temporal Evolution of Communities in the Enron Email Data Set (finished)

The collapse of Enron, a U.S. company named "America's Most Innovative Company" by "Fortune" for six consecutive years, caused one of the biggest bankruptcy cases in U.S. history. To investigate the case, the Federal Energy Regulatory Commission published a data set of approximately 1.5 million e-mails sent or received by Enron employees. We used the processing power of dDM to analyze the temporal evolution of communities extracted from this email correspondence [3].

References

  1. Schlitter N, Falkowski T. Mining the Dynamics of Music Preferences from a Social Networking Site. In: Proceedings of the 2009 International Conference on Advances in Social Network Analysis and Mining. Athens: IEEE Computer Society; 2009. p. 243-8.
  2. Falkowski T, Schlitter N. Analyzing the Music Listening Behavior and its Temporal Dynamics Using Data from a Social Networking Site. Zurich; 2008.
  3. Falkowski T. Community Analysis in Dynamic Social Networks. Goettingen: Sierke Verlag; 2009.
  4. Schlitter N. Analyse und Prognose ökonomischer Zeitreihen: Neuronale Netze zur Aktienkursprognose. Saarbrücken: VDM Verlag Dr. Müller; 2008.
  5. Schlitter N. A Case Study of Time Series Forecasting with Backpropagation Networks. In: Steinmüller J, Langner H, Ritter M, Zeidler J, editors. 15 Jahre Künstliche Intelligenz an der TU Chemnitz. Chemnitz: Techn. Univ. Chemnitz, Fak. für Informatik; 2008. p. 203-17. (Chemnitzer Informatik-Berichte).
  6. Möller M, Schlitter N. Analyse und Prognose ökonomischer Zeitreihen mit Support Vector Machines. In: Steinmüller J, Langner H, Ritter M, Zeidler J, editors. 15 Jahre Künstliche Intelligenz an der Fakultät für Informatik. Chemnitz: Techn. Univ. Chemnitz, Fak. für Informatik; 2008. p. 189-201. (Chemnitzer Informatik-Berichte).