MySQL

MySQL is a flexible, widely used database server. The client tools are installed on all compute and submit nodes for access to your database. More information can be found at: http://dev.mysql.com/

MySQL in Wharton’s HPC Environment

MySQL is available for all users of our HPC environment, but requires activation. Please e-mail research-computing@wharton.upenn.edu with a request for MySQL access.

For These Demos: Request ‘patstat2015b’ Access

E-mail research-computing@wharton.upenn.edu and ask for access to the MySQL database: ‘patstat2015b’.

Set Up ODBC Connector

Create a .odbc.ini file in your home directory (Linux) on the cluster, like so:

[DATABASE]
Description = My Database
Driver = MySQL
FileUsage = 1
SERVER = hpcc-sql
Database = DATABASE
PORT = 3306
USER = USERNAME
Password = PASSWORD

Change both instances of ‘DATABASE’ to the correct DB name – usually your username, or another available DB like PATSTAT or IMDb. Change ‘USERNAME’ to your username, and ‘PASSWORD’ to your MySQL password (not your Wharton or Penn password!!!).

For example, setting up your ‘patstat2015b’ access:

[patstat2015b]
Description = PATSTAT 2015b DB
Driver = MySQL
FileUsage = 1
SERVER = hpcc-sql
Database = patstat2015b
PORT = 3306
USER = USERNAME
Password = PASSWORD

Change ‘USERNAME’ to your username, and ‘PASSWORD’ to your MySQL password (not your Wharton or Penn password!!!).

Test DSN Configuration and Connectivity

From an SSH session, test connectivity with ‘isql DATABASE’. For example:

$ isql patstat2015b
+---------------------------------------+
| Connected! |
| |
| sql-statement |
| help [tablename] |
| quit |
| |
+---------------------------------------+
SQL> show tables
+--------------------------------------------------------------------------+
| Tables_in_patstat2015b |
+--------------------------------------------------------------------------+
| tls201_appln |
| tls202_appln_title |
| tls203_appln_abstr |
| tls204_appln_prior |
| tls205_tech_rel |
| tls206_person |
| tls207_pers_appln |
| tls209_appln_ipc |
| tls210_appln_n_cls |
| tls211_pat_publn |
| tls212_citation |
| tls214_npl_publn |
| tls215_citn_categ |
| tls216_appln_contn |
| tls221_inpadoc_prs |
| tls222_appln_jp_class |
| tls223_appln_docus |
| tls224_appln_cpc |
| tls226_person_orig |
| tls227_pers_publn |
| tls228_docdb_fam_citn |
| tls229_appln_nace2 |
| tls230_appln_techn_field |
| tls801_country |
| tls802_legal_event_code |
| tls901_techn_field_ipc |
| tls902_ipc_nace2 |
| tls906_person |
| tls909_eee_ppat |
+--------------------------------------------------------------------------+
SQLRowCount returns 29
29 rows fetched
SQL>

MySQL Server Access

Connect to our MySQL server hpcc-sql from any HPCC node (compute or login) with your provided MySQL account information.

MySQL and Python

Set Up and Test ODBC Connector

Create an .odbc.ini file and test connectivity per the instructions at the top of this page.

Testing Python to MySQL Connectivity

As usual, you’ll want to setup a virtualenv for your Python work:

qlogin
mkvirtualenv mySqldemo

OR if you already have one setup, activate it:

qlogin
workon mySqldemo

You’ll need to install the pyodbc package. The following works well:

pip install pyodbc

Once you have the package installed, create and run (python scriptname) the following script:

import pyodbc

#create a mySQL connection
cnx = pyodbc.connect("DSN=hughmac")

#create a cursor for handling inserts, selects, etc
cursor = cnx.cursor()

# drop table.
sql = 'DROP TABLE demo'
cursor.execute(sql)

#create a table.
sql = 'CREATE TABLE demo (id int not null auto_increment primary key, firstname varchar(75), lastname varchar(75))'
cursor.execute(sql)

#insert some data
first = 'Max'
last = 'Rockatansky'

cursor.execute("INSERT INTO demo(`firstname`, `lastname`) VALUES (?,?)", (first,last))
#need to commit the insert
cnx.commit()

#pull the data back out again and print
sql = 'SELECT * FROM demo'
cursor.execute(sql)

recs = cursor.fetchall()
print recs

cnx.close()

MySQL and R

It’s possible to work with your (or other available) MySQL database directly from within R, either in the HPCC environment (preferred and documented here), from another R installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).

Generally, instructions are from https://cran.r-project.org/web/packages/RODBC/vignettes/RODBC.pdf

Set Up and Test ODBC Connector

Create a .odbc.ini file and test connectivity per the instructions at the top of this page.

Testing R to MySQL Connectivity

Now you’re ready to test connectivity from within R.

$ qlogin
$ R --no-save

...

> library("RODBC")
> ch <- odbcConnect("patstat2015b")
> sqlTables(ch)
 TABLE_CAT TABLE_SCHEM TABLE_NAME TABLE_TYPE REMARKS
1 patstat2015b tls201_appln TABLE
2 patstat2015b tls202_appln_title TABLE
3 patstat2015b tls203_appln_abstr TABLE
4 patstat2015b tls204_appln_prior TABLE
5 patstat2015b tls205_tech_rel TABLE
6 patstat2015b tls206_person TABLE
7 patstat2015b tls207_pers_appln TABLE
8 patstat2015b tls209_appln_ipc TABLE
9 patstat2015b tls210_appln_n_cls TABLE
10 patstat2015b tls211_pat_publn TABLE
11 patstat2015b tls212_citation TABLE
12 patstat2015b tls214_npl_publn TABLE
13 patstat2015b tls215_citn_categ TABLE
14 patstat2015b tls216_appln_contn TABLE
15 patstat2015b tls221_inpadoc_prs TABLE
16 patstat2015b tls222_appln_jp_class TABLE
17 patstat2015b tls223_appln_docus TABLE
18 patstat2015b tls224_appln_cpc TABLE
19 patstat2015b tls226_person_orig TABLE
20 patstat2015b tls227_pers_publn TABLE
21 patstat2015b tls228_docdb_fam_citn TABLE
22 patstat2015b tls229_appln_nace2 TABLE
23 patstat2015b tls230_appln_techn_field TABLE
24 patstat2015b tls801_country TABLE
25 patstat2015b tls802_legal_event_code TABLE
26 patstat2015b tls901_techn_field_ipc TABLE
27 patstat2015b tls902_ipc_nace2 TABLE
28 patstat2015b tls906_person TABLE
29 patstat2015b tls909_eee_ppat TABLE
> res <- sqlFetch(ch, "tls801_country", max = 10)
> res
 ctry_code iso_alpha3
1
2 AD AND
3 AE ARE
4 AF AFG
5 AG ATG
6 AI AIA
7 AL ALB
8 AM ARM
9 AO AGO
10 AP
 st3_name state_indicator
1 unknown
2 Andorra Y
3 United Arab Emirates Y
4 Afghanistan Y
5 Antigua and Barbados Y
6 Anguilla Y
7 Albania Y
8 Armenia Y
9 Angola Y
10 African Regional Intellectual Property Organization (ARIPO)
 continent eu_member epo_member oecd_member discontinued
1 NA NA NA
2 Europe NA NA NA
3 Asia NA NA NA
4 Asia NA NA NA
5 America NA NA NA
6 America NA NA NA
7 Europe NA Y NA NA
8 Europe NA NA NA
9 Africa NA NA NA
10 NA NA NA

MySQL and MATLAB

It’s possible to work with your (or other available) MySQL database directly from within MATLAB, either in the HPCC environment (preferred and documented here), from another MATLAB installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).

Generally, instructions are from http://www.mathworks.com/help/database/ug/importing-data-from-databases-into-matlab.html

Testing MATLAB to MySQL Connectivity

There is no particular ‘setup’ required for MATLAB + JDBC connectivity, so you’re ready to test connectivity from within MATLAB.

$ qlogin
$ matlab -nodisplay

[ ... ]

>> javaaddpath('/usr/local/etc/mysql_jdbc_driver/mysql-jdbc.jar')
>> setdbprefs('DataReturnFormat','cellarray')
>> patstat = database('patstat2015b', 'USERNAME', 'MYSQL_PASSWORD', 'com.mysql.jdbc.Driver', 'jdbc:mysql://hpcc-sql/patstat2015b');
>> sqlquery = 'show tables';
>> curs = exec(patstat, sqlquery);
>> curs = fetch(curs);
>> curs.Data

ans =

 'tls201_appln'
 'tls202_appln_title'
 'tls203_appln_abstr'
 'tls204_appln_prior'
 'tls205_tech_rel'
 'tls206_person'
 'tls207_pers_appln'
 'tls209_appln_ipc'
 'tls210_appln_n_cls'
 'tls211_pat_publn'
 'tls212_citation'
 'tls214_npl_publn'
 'tls215_citn_categ'
 'tls216_appln_contn'
 'tls221_inpadoc_prs'
 'tls222_appln_jp_class'
 'tls223_appln_docus'
 'tls224_appln_cpc'
 'tls226_person_orig'
 'tls227_pers_publn'
 'tls228_docdb_fam_citn'
 'tls229_appln_nace2'
 'tls230_appln_techn_field'
 'tls801_country'
 'tls802_legal_event_code'
 'tls901_techn_field_ipc'
 'tls902_ipc_nace2'
 'tls906_person'
 'tls909_eee_ppat'

>> sqlquery = 'select * from tls801_country limit 10';
>> curs = exec(patstat, sqlquery);
>> curs = fetch(curs);
>> curs.Data

ans =

 ' ' ' ' 'unknown' '' '' '' '' '' ''
 'AD' 'AND' 'Andorra' 'Y' 'Europe' '' '' '' ''
 'AE' 'ARE' 'United Arab Emir...' 'Y' 'Asia' '' '' '' ''
 'AF' 'AFG' 'Afghanistan' 'Y' 'Asia' '' '' '' ''
 'AG' 'ATG' 'Antigua and Barb...' 'Y' 'America' '' '' '' ''
 'AI' 'AIA' 'Anguilla' 'Y' 'America' '' '' '' ''
 'AL' 'ALB' 'Albania' 'Y' 'Europe' '' 'Y' '' ''
 'AM' 'ARM' 'Armenia' 'Y' 'Europe' '' '' '' ''
 'AO' 'AGO' 'Angola' 'Y' 'Africa' '' '' '' ''
 'AP' ' ' 'African Regional...' '' '' '' '' '' ''

>>

MySQL and Stata

It’s possible to work with your (or other available) MySQL database directly from within Stata, either in the HPCC environment (preferred and documented here), from another Stata installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).

Generally, instructions are from http://www.stata.com/manuals13/dodbc.pdf

Set Up and Test ODBC Connector

Create a .odbc.ini file and test connectivity per the instructions at the top of this page.

Testing Stata to MySQL Connectivity

Now you’re ready to test connectivity from within Stata.

$ qrsh stata

 ___ ____ ____ ____ ____ (R)
 /__ / ____/ / ____/
___/ / /___/ / /___/ 14.0 Copyright 1985-2015 StataCorp LP
 Statistics/Data Analysis StataCorp
 4905 Lakeway Drive
 Special Edition College Station, Texas 77845 USA
 800-STATA-PC http://www.stata.com
 979-696-4600 stata@stata.com
 979-696-4601 (fax)

20-user Stata network perpetual license:
 Serial number: 401406203416
 Licensed to: The Wharton School
 University of Pennsylvania

Notes:
 1. Stata is running in batch mode.
 2. Unicode is supported; see help unicode_advice.
 3. Maximum number of variables is set to 5000; see help set_maxvar.

. set odbcmgr unixodbc
set odbcmgr unixodbc

. odbc load, exec("select * from tls801_country") noquote dsn(patstat2015b)
odbc load, exec("select * from tls801_country") noquote dsn(patstat2015b)

. describe
describe

Contains data
 obs: 234
 vars: 9
 size: 21,060
-------------------------------------------------------------------------------
 storage display value
variable name type format label variable label
-------------------------------------------------------------------------------
ctry_code str2 %9s
iso_alpha3 str3 %9s
st3_name str59 %59s
state_indicator str1 %9s
continent str21 %21s
eu_member str1 %9s
epo_member str1 %9s
oecd_member str1 %9s
discontinued str1 %9s
-------------------------------------------------------------------------------
Sorted by:
 Note: Dataset has changed since last saved.

. list ctry_code
list ctry_code

 +----------+
 | ctry_c~e |
 |----------|
 1. | |
 2. | AD |
 3. | AE |
 4. | AF |
 5. | AG |
...

MySQL and SAS

For an overview of SAS/ACCESS Interface for MySQL, see the appropriate SAS Documentation

A Wharton HPC and WRDS specific example — put some SAS data in MySQL:

* query WRDS and use SAS/ACCESS Interface to MySQL to create a MySQL table *;

libname optionm '/wrds/optionm/sasdata';
libname mysqllib mysql server='hpcc-sql'
  database='USERNAME' user='USERNAME' password='MYMYSQLPASSWORD';

proc sql;
  create table mysqllib.sastest
    as select year(date)*10000+month(date)*100+day(date)
      as datenum, a.days, a.rate
    from optionm.zerocd as a;