| Title | Study on Parallel SVM Based on MapReduce |
| Publication Type | Conference Proceedings |
| Year of Publication | Submitted |
| Date Published | 07/2012 |
| Authors | Zhanquan, S., and G. Fox |
| Refereed Designation | Unknown |
| Conference Name | The 2012 International Conference on Parallel and Distributed Processing Techniques and Applications |
| Series Title | Proceedings of the 2012 International Conference on Parallel and Distributed Processing Techniques and Applications |
| Conference Location | Las Vegas NV USA |
| Publication Language | eng |
| Keywords | Large scale data, MapReduce, Parallel SVM, Twister |
| Abstract | Support Vector Machines (SVMs) are powerful classification and regression tools. They have been widely studied and applied in many practical fields. However, their compute and storage requirements increase rapidly with the number of training vectors, putting many problems of practical interest out of reach. To apply SVMs to large-scale data mining, parallel SVMs have been studied and several parallel SVM methods have been proposed. Most current parallel SVM methods are based on the classical MPI model, which is not easy to use in practice, especially for large-scale, data-intensive data mining problems. MapReduce is an efficient distributed computing model for processing large-scale data mining problems. Several MapReduce implementations have been developed, such as Hadoop and Twister. In this paper, a parallel SVM based on the iterative MapReduce model Twister is studied. The program flow is developed, and the efficiency of the method is illustrated by analyzing practical problems. |