An Efficient Connection between Statistical Software and Database Management System

Sunghae Jun

Abstract


In the big data era, we need to manipulate and analyze large volumes of data. As a first step in manipulating big data, we can consider a traditional database management system (DBMS). To discover novel knowledge in a big data environment, the data must also be analyzed, and many statistical methods have been applied to this task. Most statistical analysis depends on statistical software such as SAS, SPSS, or R. In addition, a considerable portion of big data is stored in database systems. However, the data types used by general statistical software differ from those of database systems such as Oracle or MySQL, so many approaches for connecting statistical software to a DBMS have been introduced. In this paper, we study an efficient connection between statistical software and a DBMS. To demonstrate its performance, we carry out a case study using a real application.
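The general idea of such a connection can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: Python's standard `sqlite3` module stands in for a MySQL server, the `statistics` module stands in for the statistical software, and the table `measurements`, its columns, and the helper `load_column` are hypothetical names invented for the example. The sketch shows the two steps the abstract describes: pulling rows out of the DBMS, and converting database values (including SQL `NULL`) into a native numeric type before analysis.

```python
import sqlite3
import statistics


def load_column(conn, table, column):
    """Fetch one numeric column from a table as a list of Python floats.

    Hypothetical helper: SQL NULLs are dropped in the query, and each
    remaining value is converted to float so the analysis side sees a
    single native type.
    """
    # Identifiers cannot be bound as query parameters, so this sketch
    # only accepts simple names.
    if not (table.isidentifier() and column.isidentifier()):
        raise ValueError("unexpected identifier")
    query = f"SELECT {column} FROM {table} WHERE {column} IS NOT NULL"
    return [float(row[0]) for row in conn.execute(query)]


# An in-memory database stands in for a real DBMS server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (value REAL, label TEXT)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [(5.1, "a"), (7.0, "b"), (6.3, "c"), (None, "a")],  # made-up rows
)
conn.commit()

values = load_column(conn, "measurements", "value")
mean_value = statistics.mean(values)  # the analysis runs on native floats
conn.close()
```

With R and MySQL, the same pattern would go through a database interface package such as RMySQL or RODBC, with query results arriving as a data frame.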


Keywords


Statistical software; Database management system; Big data analysis; Database connection; MySQL; R project

ISSN: 1694-2507 (Print)

ISSN: 1694-2108 (Online)