MySQL is a flexible, widely used database server. The client tools are installed on all compute and submit nodes for access to your database. More information can be found at: http://dev.mysql.com/
MySQL in Wharton’s HPC Environment
MySQL is available for all users of our HPC environment, but requires activation. Please e-mail research-computing@wharton.upenn.edu with a request for MySQL access.
For These Demos: Request ‘patstat2015b’ Access
E-mail research-computing@wharton.upenn.edu and ask for access to the MySQL database: ‘patstat2015b’.
Set Up ODBC Connector
Create a .odbc.ini file in your home directory (Linux) on the cluster, like so:
[DATABASE] Description = My Database Driver = MySQL FileUsage = 1 SERVER = hpcc-sql Database = DATABASE PORT = 3306 USER = USERNAME Password = PASSWORD
Change both instances of ‘DATABASE’ to the correct DB name – usually your username, or another available DB like PATSTAT or IMDb. Change ‘USERNAME’ to your username, and ‘PASSWORD’ to your MySQL password (not your Wharton or Penn password!!!).
For example, setting up your ‘patstat2015b’ access:
[patstat2015b] Description = PATSTAT 2015b DB Driver = MySQL FileUsage = 1 SERVER = hpcc-sql Database = patstat2015b PORT = 3306 USER = USERNAME Password = PASSWORD
Change ‘USERNAME’ to your username, and ‘PASSWORD’ to your MySQL password (not your Wharton or Penn password!!!).
Test DSN Configuration and Connectivity
From an SSH session, test connectivity with ‘isql DATABASE’. For example:
$ isql patstat2015b +---------------------------------------+ | Connected! | | | | sql-statement | | help [tablename] | | quit | | | +---------------------------------------+ SQL> show tables +--------------------------------------------------------------------------+ | Tables_in_patstat2015b | +--------------------------------------------------------------------------+ | tls201_appln | | tls202_appln_title | | tls203_appln_abstr | | tls204_appln_prior | | tls205_tech_rel | | tls206_person | | tls207_pers_appln | | tls209_appln_ipc | | tls210_appln_n_cls | | tls211_pat_publn | | tls212_citation | | tls214_npl_publn | | tls215_citn_categ | | tls216_appln_contn | | tls221_inpadoc_prs | | tls222_appln_jp_class | | tls223_appln_docus | | tls224_appln_cpc | | tls226_person_orig | | tls227_pers_publn | | tls228_docdb_fam_citn | | tls229_appln_nace2 | | tls230_appln_techn_field | | tls801_country | | tls802_legal_event_code | | tls901_techn_field_ipc | | tls902_ipc_nace2 | | tls906_person | | tls909_eee_ppat | +--------------------------------------------------------------------------+ SQLRowCount returns 29 29 rows fetched SQL>
MySQL Server Access
Connect to our MySQL server hpcc-sql from any HPCC node (compute or login) with your provided MySQL account information.
MySQL and Python
Set Up and Test ODBC Connector
Create an .odbc.ini file and test connectivity per the instructions at the top of this page.
Testing Python to MySQL Connectivity
As usual, you’ll want to setup a virtualenv for your Python work:
qlogin mkvirtualenv mySqldemo
OR if you already have one setup, activate it:
qlogin workon mySqldemo
You’ll need to install the pyodbc package. The following works well:
pip install pyodbc
Once you have the package installed, create and run (python scriptname) the following script:
import pyodbc #create a mySQL connection cnx = pyodbc.connect("DSN=hughmac") #create a cursor for handling inserts, selects, etc cursor = cnx.cursor() # drop table. sql = 'DROP TABLE demo' cursor.execute(sql) #create a table. sql = 'CREATE TABLE demo (id int not null auto_increment primary key, firstname varchar(75), lastname varchar(75))' cursor.execute(sql) #insert some data first = 'Max' last = 'Rockatansky' cursor.execute("INSERT INTO demo(`firstname`, `lastname`) VALUES (?,?)", (first,last)) #need to commit the insert cnx.commit() #pull the data back out again and print sql = 'SELECT * FROM demo' cursor.execute(sql) recs = cursor.fetchall() print recs cnx.close()
MySQL and R
It’s possible to work with your (or other available) MySQL database directly from within R, either in the HPCC environment (preferred and documented here), from another R installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).
Generally, instructions are from https://cran.r-project.org/web/packages/RODBC/vignettes/RODBC.pdf
Set Up and Test ODBC Connector
Create a .odbc.ini file and test connectivity per the instructions at the top of this page.
Testing R to MySQL Connectivity
Now you’re ready to test connectivity from within R.
$ qlogin $ R --no-save ... > library("RODBC") > ch <- odbcConnect("patstat2015b") > sqlTables(ch) TABLE_CAT TABLE_SCHEM TABLE_NAME TABLE_TYPE REMARKS 1 patstat2015b tls201_appln TABLE 2 patstat2015b tls202_appln_title TABLE 3 patstat2015b tls203_appln_abstr TABLE 4 patstat2015b tls204_appln_prior TABLE 5 patstat2015b tls205_tech_rel TABLE 6 patstat2015b tls206_person TABLE 7 patstat2015b tls207_pers_appln TABLE 8 patstat2015b tls209_appln_ipc TABLE 9 patstat2015b tls210_appln_n_cls TABLE 10 patstat2015b tls211_pat_publn TABLE 11 patstat2015b tls212_citation TABLE 12 patstat2015b tls214_npl_publn TABLE 13 patstat2015b tls215_citn_categ TABLE 14 patstat2015b tls216_appln_contn TABLE 15 patstat2015b tls221_inpadoc_prs TABLE 16 patstat2015b tls222_appln_jp_class TABLE 17 patstat2015b tls223_appln_docus TABLE 18 patstat2015b tls224_appln_cpc TABLE 19 patstat2015b tls226_person_orig TABLE 20 patstat2015b tls227_pers_publn TABLE 21 patstat2015b tls228_docdb_fam_citn TABLE 22 patstat2015b tls229_appln_nace2 TABLE 23 patstat2015b tls230_appln_techn_field TABLE 24 patstat2015b tls801_country TABLE 25 patstat2015b tls802_legal_event_code TABLE 26 patstat2015b tls901_techn_field_ipc TABLE 27 patstat2015b tls902_ipc_nace2 TABLE 28 patstat2015b tls906_person TABLE 29 patstat2015b tls909_eee_ppat TABLE > res <- sqlFetch(ch, "tls801_country", max = 10) > res ctry_code iso_alpha3 1 2 AD AND 3 AE ARE 4 AF AFG 5 AG ATG 6 AI AIA 7 AL ALB 8 AM ARM 9 AO AGO 10 AP st3_name state_indicator 1 unknown 2 Andorra Y 3 United Arab Emirates Y 4 Afghanistan Y 5 Antigua and Barbados Y 6 Anguilla Y 7 Albania Y 8 Armenia Y 9 Angola Y 10 African Regional Intellectual Property Organization (ARIPO) continent eu_member epo_member oecd_member discontinued 1 NA NA NA 2 Europe NA NA NA 3 Asia NA NA NA 4 Asia NA NA NA 5 America NA NA NA 6 America NA NA NA 7 Europe NA Y NA NA 8 Europe NA NA NA 9 Africa NA NA NA 10 NA NA NA
MySQL and MATLAB
It’s possible to work with your (or other available) MySQL database directly from within MATLAB, either in the HPCC environment (preferred and documented here), from another MATLAB installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).
Generally, instructions are from http://www.mathworks.com/help/database/ug/importing-data-from-databases-into-matlab.html
Testing MATLAB to MySQL Connectivity
There is no particular ‘setup’ required for MATLAB + JDBC connectivity, so you’re ready to test connectivity from within MATLAB.
$ qlogin $ matlab -nodisplay [ ... ] >> javaaddpath('/usr/local/etc/mysql_jdbc_driver/mysql-jdbc.jar') >> setdbprefs('DataReturnFormat','cellarray') >> patstat = database('patstat2015b', 'USERNAME', 'MYSQL_PASSWORD', 'com.mysql.jdbc.Driver', 'jdbc:mysql://hpcc-sql/patstat2015b'); >> sqlquery = 'show tables'; >> curs = exec(patstat, sqlquery); >> curs = fetch(curs); >> curs.Data ans = 'tls201_appln' 'tls202_appln_title' 'tls203_appln_abstr' 'tls204_appln_prior' 'tls205_tech_rel' 'tls206_person' 'tls207_pers_appln' 'tls209_appln_ipc' 'tls210_appln_n_cls' 'tls211_pat_publn' 'tls212_citation' 'tls214_npl_publn' 'tls215_citn_categ' 'tls216_appln_contn' 'tls221_inpadoc_prs' 'tls222_appln_jp_class' 'tls223_appln_docus' 'tls224_appln_cpc' 'tls226_person_orig' 'tls227_pers_publn' 'tls228_docdb_fam_citn' 'tls229_appln_nace2' 'tls230_appln_techn_field' 'tls801_country' 'tls802_legal_event_code' 'tls901_techn_field_ipc' 'tls902_ipc_nace2' 'tls906_person' 'tls909_eee_ppat' >> sqlquery = 'select * from tls801_country limit 10'; >> curs = exec(patstat, sqlquery); >> curs = fetch(curs); >> curs.Data ans = ' ' ' ' 'unknown' '' '' '' '' '' '' 'AD' 'AND' 'Andorra' 'Y' 'Europe' '' '' '' '' 'AE' 'ARE' 'United Arab Emir...' 'Y' 'Asia' '' '' '' '' 'AF' 'AFG' 'Afghanistan' 'Y' 'Asia' '' '' '' '' 'AG' 'ATG' 'Antigua and Barb...' 'Y' 'America' '' '' '' '' 'AI' 'AIA' 'Anguilla' 'Y' 'America' '' '' '' '' 'AL' 'ALB' 'Albania' 'Y' 'Europe' '' 'Y' '' '' 'AM' 'ARM' 'Armenia' 'Y' 'Europe' '' '' '' '' 'AO' 'AGO' 'Angola' 'Y' 'Africa' '' '' '' '' 'AP' ' ' 'African Regional...' '' '' '' '' '' '' >>
MySQL and Stata
It’s possible to work with your (or other available) MySQL database directly from within Stata, either in the HPCC environment (preferred and documented here), from another Stata installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).
Generally, instructions are from http://www.stata.com/manuals13/dodbc.pdf
Set Up and Test ODBC Connector
Create a .odbc.ini file and test connectivity per the instructions at the top of this page.
Testing Stata to MySQL Connectivity
Now you’re ready to test connectivity from within Stata.
$ qrsh stata ___ ____ ____ ____ ____ (R) /__ / ____/ / ____/ ___/ / /___/ / /___/ 14.0 Copyright 1985-2015 StataCorp LP Statistics/Data Analysis StataCorp 4905 Lakeway Drive Special Edition College Station, Texas 77845 USA 800-STATA-PC http://www.stata.com 979-696-4600 stata@stata.com 979-696-4601 (fax) 20-user Stata network perpetual license: Serial number: 401406203416 Licensed to: The Wharton School University of Pennsylvania Notes: 1. Stata is running in batch mode. 2. Unicode is supported; see help unicode_advice. 3. Maximum number of variables is set to 5000; see help set_maxvar. . set odbcmgr unixodbc set odbcmgr unixodbc . odbc load, exec("select * from tls801_country") noquote dsn(patstat2015b) odbc load, exec("select * from tls801_country") noquote dsn(patstat2015b) . describe describe Contains data obs: 234 vars: 9 size: 21,060 ------------------------------------------------------------------------------- storage display value variable name type format label variable label ------------------------------------------------------------------------------- ctry_code str2 %9s iso_alpha3 str3 %9s st3_name str59 %59s state_indicator str1 %9s continent str21 %21s eu_member str1 %9s epo_member str1 %9s oecd_member str1 %9s discontinued str1 %9s ------------------------------------------------------------------------------- Sorted by: Note: Dataset has changed since last saved. . list ctry_code list ctry_code +----------+ | ctry_c~e | |----------| 1. | | 2. | AD | 3. | AE | 4. | AF | 5. | AG | ...
MySQL and SAS
For an overview of SAS/ACCESS Interface for MySQL, see the appropriate SAS Documentation
A Wharton HPC and WRDS specific example — put some SAS data in MySQL:
* query WRDS and use SAS/ACCESS Interface to MySQL to create a MySQL table *; libname optionm '/wrds/optionm/sasdata'; libname mysqllib mysql server='hpcc-sql' database='USERNAME' user='USERNAME' password='MYMYSQLPASSWORD'; proc sql; create table mysqllib.sastest as select year(date)*10000+month(date)*100+day(date) as datenum, a.days, a.rate from optionm.zerocd as a;