MariaDB is a flexible, widely used database server. The client tools are installed on all compute and submit nodes for access to your database. More information can be found at: http://dev.mysql.com/
MariaDB in Wharton’s HPC Environment
Up to 32GB of storage space is available for free use for all users of our HPC environment, but requires activation. Please e-mail research-computing@wharton.upenn.edu with a request for a MariaDB database creation, or access to other existing databases.
For These Demos: Request us_census Database Access
E-mail research-computing@wharton.upenn.edu and ask for access to the MySQL database: us_census
Use The MySQL Client
On any HPC3 system, you can:
mysql -A -h hpcc-sql.wharton.upenn.edu -p Enter password: YOUR_MARIADB_PASSWORD (NOT PennKey or Wharton password!!)
You can ‘save’ your password in a ~/.my.cnf file, like:
[client] user = USERNAME password = MARIADB_PASSWORD host = hpcc-sql.wharton.upenn.edu
Set Up ODBC Connector
To reduce the need to store passwords in code (bad idea!), you can create an .odbc.ini file in your home directory (Linux) on the cluster, like so:
[DATABASE] Description = My Database Driver = MariaDB FileUsage = 1 SERVER = hpcc-sql.wharton.upenn.edu Database = DATABASE PORT = 3306 USER = USERNAME Password = PASSWORD
Change both instances of ‘DATABASE’ to the correct DB name – usually your username, or another available DB like us_census. Change ‘USERNAME’ to your username, and ‘PASSWORD’ to your MySQL password (not your Wharton or Penn password!!!).
For example, setting up your ‘patstat2015b’ access:
[us_census] Description = US Census Driver = MariaDB FileUsage = 1 SERVER = hpcc-sql.wharton.upenn.edu Database = us_census PORT = 3306 USER = USERNAME Password = PASSWORD
Change ‘USERNAME’ to your username, and ‘PASSWORD’ to your MySQL password (not your Wharton or Penn password!!!).
Test DSN Configuration and Connectivity
From an SSH session, test connectivity with ‘isql DATABASE’. For example:
isql us_census +---------------------------------------+ | Connected! | | | | sql-statement | | help [tablename] | | quit | | | +---------------------------------------+ SQL> show tables; +--------------------------------------------------------------------------+ | Tables_in_us_census | +--------------------------------------------------------------------------+ | census_blocks | +--------------------------------------------------------------------------+ SQLRowCount returns 1 1 rows fetched
MariaDB Server Access
Connect to our MariaDB server hpcc-sql.wharton.upenn.edu from any HPC3 node (compute or login) with your provided MySQL account information, or from a Penn or Wharton VPN-connected device.
MySQL and Python
Set Up and Test ODBC Connector
Create an .odbc.ini file and test connectivity per the instructions at the top of this page.
Testing Python to MySQL Connectivity
As usual, you’ll want to setup a virtualenv for your Python work:
qlogin mkvirtualenv mySqldemo
OR if you already have one setup, activate it:
qlogin workon mySqldemo
You’ll need to install the pyodbc package. The following works well:
pip install pyodbc
Once you have the package installed, create and run (python scriptname) the following script:
import pyodbc
#create a mySQL connection
cnx = pyodbc.connect("DSN=hughmac")
#create a cursor for handling inserts, selects, etc
cursor = cnx.cursor()
# drop table.
sql = 'DROP TABLE demo'
cursor.execute(sql)
#create a table.
sql = 'CREATE TABLE demo (id int not null auto_increment primary key, firstname varchar(75), lastname varchar(75))'
cursor.execute(sql)
#insert some data
first = 'Max'
last = 'Rockatansky'
cursor.execute("INSERT INTO demo(`firstname`, `lastname`) VALUES (?,?)", (first,last))
#need to commit the insert
cnx.commit()
#pull the data back out again and print
sql = 'SELECT * FROM demo'
cursor.execute(sql)
recs = cursor.fetchall()
print recs
cnx.close()
MySQL and R
It’s possible to work with your (or other available) MariaDB database directly from within R, either in the HPC3 environment (preferred and documented here), from another R installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).
Generally, instructions are from the R CRAN odbc package documentation.
Set Up and Test ODBC Connector
Create an .odbc.ini file and test connectivity per the instructions at the top of this page.
Now you’re ready to test connectivity from within R.
$ qlogin
$ R --no-save
> install.packages(c("odbc", "data.table"))
Testing R to ODBC Connectivity
Now you’re ready to test connectivity from within R.
$ qlogin $ R --no-save
library(DBI)
library(odbc)
library(data.table)
# set connection details
dsn <- 'YOUR_DSN_NAME_HERE'
database <- 'YOUR_DATABASE_NAME_HERE'
# set query details
sql <- 'SELECT * from YOUR_TABLE_HERE limit 10'
# connect to DB
con <- dbConnect(odbc::odbc(), dsn = dsn, database = database)
# get the data from the DB to a data.table
data <- setDT(dbGetQuery(con, sql))
# disconnect from the DB
dbDisconnect(con)
# show some of the data
head(data)
# save the data to gzipped CSV
fwrite(data, file="example_data.csv.gz")
MariaDB and MATLAB
It’s possible to work with your (or other available) MariaDB database directly from within MATLAB, either in the HPC3 environment (preferred and documented here), from another MATLAB installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).
Generally, instructions are from http://www.mathworks.com/help/database/ug/importing-data-from-databases-into-matlab.html
Testing MATLAB to MariaDB Connectivity
There is no particular ‘setup’ required for MATLAB + JDBC connectivity, so you’re ready to test connectivity from within MATLAB.
$ qlogin
$ matlab -nodisplay
[ ... ]
>> javaaddpath('/usr/local/etc/mysql_jdbc_driver/mysql-jdbc.jar')
>> setdbprefs('DataReturnFormat','cellarray')
>> patstat = database('patstat2015b', 'USERNAME', 'MYSQL_PASSWORD', 'com.mysql.jdbc.Driver', 'jdbc:mysql://hpcc-sql/patstat2015b');
>> sqlquery = 'show tables';
>> curs = exec(us_census, sqlquery);
>> curs = fetch(curs);
>> curs.Data
UPDATED DEMO COMING SOON
>> sqlquery = 'select * from tls801_country limit 10';
>> curs = exec(patstat, sqlquery);
>> curs = fetch(curs);
>> curs.Data
ans =
' ' ' ' 'unknown' '' '' '' '' '' ''
'AD' 'AND' 'Andorra' 'Y' 'Europe' '' '' '' ''
'AE' 'ARE' 'United Arab Emir...' 'Y' 'Asia' '' '' '' ''
'AF' 'AFG' 'Afghanistan' 'Y' 'Asia' '' '' '' ''
'AG' 'ATG' 'Antigua and Barb...' 'Y' 'America' '' '' '' ''
'AI' 'AIA' 'Anguilla' 'Y' 'America' '' '' '' ''
'AL' 'ALB' 'Albania' 'Y' 'Europe' '' 'Y' '' ''
'AM' 'ARM' 'Armenia' 'Y' 'Europe' '' '' '' ''
'AO' 'AGO' 'Angola' 'Y' 'Africa' '' '' '' ''
'AP' ' ' 'African Regional...' '' '' '' '' '' ''
>>
MariaDB and Stata
It’s possible to work with your (or other available) MariaDB database directly from within Stata, either in the HPC3 environment (preferred and documented here), from another Stata installation on Penn’s campus (less preferred but possible), or remotely with the VPN active (highly discouraged, but again possible).
Generally, instructions are from http://www.stata.com/manuals13/dodbc.pdf
Set Up and Test ODBC Connector
Create a .odbc.ini file and test connectivity per the instructions at the top of this page.
Testing Stata to MariaDB Connectivity
Now you’re ready to test connectivity from within Stata.
$ qrsh stata
___ ____ ____ ____ ____ (R)
/__ / ____/ / ____/
___/ / /___/ / /___/ 14.0 Copyright 1985-2015 StataCorp LP
Statistics/Data Analysis StataCorp
4905 Lakeway Drive
Special Edition College Station, Texas 77845 USA
800-STATA-PC http://www.stata.com
979-696-4600 stata@stata.com
979-696-4601 (fax)
20-user Stata network perpetual license:
Serial number: 401406203416
Licensed to: The Wharton School
University of Pennsylvania
Notes:
1. Stata is running in batch mode.
2. Unicode is supported; see help unicode_advice.
3. Maximum number of variables is set to 5000; see help set_maxvar.
. set odbcmgr unixodbc
set odbcmgr unixodbc
UPDATED DEMO COMING SOON
. odbc load, exec("select * from tls801_country") noquote dsn(patstat2015b)
odbc load, exec("select * from tls801_country") noquote dsn(patstat2015b)
. describe
describe
Contains data
obs: 234
vars: 9
size: 21,060
-------------------------------------------------------------------------------
storage display value
variable name type format label variable label
-------------------------------------------------------------------------------
ctry_code str2 %9s
iso_alpha3 str3 %9s
st3_name str59 %59s
state_indicator str1 %9s
continent str21 %21s
eu_member str1 %9s
epo_member str1 %9s
oecd_member str1 %9s
discontinued str1 %9s
-------------------------------------------------------------------------------
Sorted by:
Note: Dataset has changed since last saved.
. list ctry_code
list ctry_code
+----------+
| ctry_c~e |
|----------|
1. | |
2. | AD |
3. | AE |
4. | AF |
5. | AG |
...
MariaDB and SAS
For an overview of SAS/ACCESS Interface for MariaDB, see the appropriate SAS Documentation
A Wharton HPC and WRDS specific example — put some SAS data in MySQL:
* query WRDS and use SAS/ACCESS Interface to MariaDB to create a MariaDB table *; libname optionm '/wrds/optionm/sasdata'; libname mysqllib mysql server='hpcc-sql' database='USERNAME' user='USERNAME' password='MYMYSQLPASSWORD'; proc sql; create table mysqllib.sastest as select year(date)*10000+month(date)*100+day(date) as datenum, a.days, a.rate from optionm.zerocd as a;
