Accessing MySQL Database from JupyterHub

This document will show how to connect to a MySQL database which is on the same server as JupyterHub.

In MySQL, a new user ‘megan’ has been created with a ‘test’ password. This user has been given access to the database ‘demodb’.

Python Packages

Python Packages

The following python packages are used in this notebook: - pymysql - sqlalchemy - pandas and scikitlearn - these will be used in one example

Connecting to a MySQL database

Either pymysql or sqlalchemy can be used to connect to a database, however it is better to use sqlalchemy if you want to import a pandas dataframe as a table into your MySQL database.

# pymysql
import pymysql.cursors
#connect to database
connection = pymysql.connect(host='localhost', user='megan', password='test', database='demodb')
#sqlalchemy
from sqlalchemy import create_engine
eng = create_engine('mysql+pymysql://megan:test@localhost/demodb') #connects to the database demodb as user=megan and password=test
eng.connect() #this will show an error if it fails to connect to the database


<sqlalchemy.engine.base.Connection at 0x7f4c5c739f60>

After connecting to the database, you can check the list of tables using:

#if using sqlalchemy
eng.table_names()

['example', 'iris_table', 'tutorials_tbl']
#alternatively, pymysql allows you to execute SQL commands:
cursor = connection.cursor()
query = "SELECT * FROM iris_table" #example SQL query
cursor.execute(query) #executes the query, here it states there are 150 entries in the table

#output
  150

Import dataset into MySQL

The following example will show how to store an Iris dataset as a new table in our demo database.

#import python libraries
from sklearn import datasets
import pandas as pd

iris = datasets.load_iris() #load Iris dataset
print(iris.keys())

dict_keys(['data', 'target', 'frame', 'target_names', 'DESCR', 'feature_names', 'filename'])
#store dataset in a dataframe
data = pd.DataFrame(iris.data, columns=[iris.feature_names])
print(data) #print the dataframe

#The output would be:

      sepal length (cm) sepal width (cm) petal length (cm) petal width (cm)
  0                 5.1              3.5               1.4              0.2
  1                 4.9              3.0               1.4              0.2
  2                 4.7              3.2               1.3              0.2
  3                 4.6              3.1               1.5              0.2
  4                 5.0              3.6               1.4              0.2
  ..                ...              ...               ...              ...
  145               6.7              3.0               5.2              2.3
  146               6.3              2.5               5.0              1.9
  147               6.5              3.0               5.2              2.0
  148               6.2              3.4               5.4              2.3
  149               5.9              3.0               5.1              1.8

  [150 rows x 4 columns]
#create a new table in MySQL database and import this dataset into the table
#using the engine we have created to connect to the database demodb in MySQL,
#import the pandas dataframe 'data' as a new table in the database called 'iris_table_demo'
data.to_sql(con=eng, name ='iris_table_demo')

The command above also has an option for the case when the table name matches a table which already exists in the database. This means that you can either replace and overwrite the table in the database, or append the current table in MySQL and add this dataset to the data which is already there.

#confirm that the table is in the database:
eng.table_names()

  ['example', 'iris_table', 'iris_table_demo', 'tutorials_tbl']

This shows that we have a new table in the database demodb called ‘iris_table_demo’.

Let’s look at the table iris_table and print the records from that table.

query = "SELECT * FROM iris_table"
cursor.execute(query)
records = cursor.fetchall()
print(records)


  ((None, 5.1, 3.5, 1.4, 0.2), (None, 4.9, 3.0, 1.4, 0.2), (None, 4.7, 3.2, 1.3, 0.2), (None, 4.6, 3.1, 1.5, 0.2), (None, 5.0, 3.6, 1.4, 0.2), (None, 5.4, 3.9, 1.7, 0.4), (None, 4.6, 3.4, 1.4, 0.3), (None, 5.0, 3.4, 1.5, 0.2), (None, 4.4, 2.9, 1.4, 0.2), (None, 4.9, 3.1, 1.5, 0.1), (None, 5.4, 3.7, 1.5, 0.2), (None, 4.8, 3.4, 1.6, 0.2), (None, 4.8, 3.0, 1.4, 0.1), (None, 4.3, 3.0, 1.1, 0.1), (None, 5.8, 4.0, 1.2, 0.2), (None, 5.7, 4.4, 1.5, 0.4), (None, 5.4, 3.9, 1.3, 0.4), (None, 5.1, 3.5, 1.4, 0.3), (None, 5.7, 3.8, 1.7, 0.3), (None, 5.1, 3.8, 1.5, 0.3), (None, 5.4, 3.4, 1.7, 0.2), (None, 5.1, 3.7, 1.5, 0.4), (None, 4.6, 3.6, 1.0, 0.2), (None, 5.1, 3.3, 1.7, 0.5), (None, 4.8, 3.4, 1.9, 0.2), (None, 5.0, 3.0, 1.6, 0.2), (None, 5.0, 3.4, 1.6, 0.4), (None, 5.2, 3.5, 1.5, 0.2), (None, 5.2, 3.4, 1.4, 0.2), (None, 4.7, 3.2, 1.6, 0.2), (None, 4.8, 3.1, 1.6, 0.2), (None, 5.4, 3.4, 1.5, 0.4), (None, 5.2, 4.1, 1.5, 0.1), (None, 5.5, 4.2, 1.4, 0.2), (None, 4.9, 3.1, 1.5, 0.2), (None, 5.0, 3.2, 1.2, 0.2), (None, 5.5, 3.5, 1.3, 0.2), (None, 4.9, 3.6, 1.4, 0.1), (None, 4.4, 3.0, 1.3, 0.2), (None, 5.1, 3.4, 1.5, 0.2), (None, 5.0, 3.5, 1.3, 0.3), (None, 4.5, 2.3, 1.3, 0.3), (None, 4.4, 3.2, 1.3, 0.2), (None, 5.0, 3.5, 1.6, 0.6), (None, 5.1, 3.8, 1.9, 0.4), (None, 4.8, 3.0, 1.4, 0.3), (None, 5.1, 3.8, 1.6, 0.2), (None, 4.6, 3.2, 1.4, 0.2), (None, 5.3, 3.7, 1.5, 0.2), (None, 5.0, 3.3, 1.4, 0.2), (None, 7.0, 3.2, 4.7, 1.4), (None, 6.4, 3.2, 4.5, 1.5), (None, 6.9, 3.1, 4.9, 1.5), (None, 5.5, 2.3, 4.0, 1.3), (None, 6.5, 2.8, 4.6, 1.5), (None, 5.7, 2.8, 4.5, 1.3), (None, 6.3, 3.3, 4.7, 1.6), (None, 4.9, 2.4, 3.3, 1.0), (None, 6.6, 2.9, 4.6, 1.3), (None, 5.2, 2.7, 3.9, 1.4), (None, 5.0, 2.0, 3.5, 1.0), (None, 5.9, 3.0, 4.2, 1.5), (None, 6.0, 2.2, 4.0, 1.0), (None, 6.1, 2.9, 4.7, 1.4), (None, 5.6, 2.9, 3.6, 1.3), (None, 6.7, 3.1, 4.4, 1.4), (None, 5.6, 3.0, 4.5, 1.5), (None, 5.8, 2.7, 4.1, 1.0), (None, 6.2, 2.2, 4.5, 1.5), (None, 5.6, 2.5, 3.9, 1.1), (None, 5.9, 3.2, 4.8, 1.8), (None, 6.1, 2.8, 4.0, 1.3), (None, 6.3, 2.5, 4.9, 1.5), (None, 6.1, 2.8, 4.7, 1.2), (None, 6.4, 2.9, 4.3, 1.3), (None, 6.6, 3.0, 4.4, 1.4), (None, 6.8, 2.8, 4.8, 1.4), (None, 6.7, 3.0, 5.0, 1.7), (None, 6.0, 2.9, 4.5, 1.5), (None, 5.7, 2.6, 3.5, 1.0), (None, 5.5, 2.4, 3.8, 1.1), (None, 5.5, 2.4, 3.7, 1.0), (None, 5.8, 2.7, 3.9, 1.2), (None, 6.0, 2.7, 5.1, 1.6), (None, 5.4, 3.0, 4.5, 1.5), (None, 6.0, 3.4, 4.5, 1.6), (None, 6.7, 3.1, 4.7, 1.5), (None, 6.3, 2.3, 4.4, 1.3), (None, 5.6, 3.0, 4.1, 1.3), (None, 5.5, 2.5, 4.0, 1.3), (None, 5.5, 2.6, 4.4, 1.2), (None, 6.1, 3.0, 4.6, 1.4), (None, 5.8, 2.6, 4.0, 1.2), (None, 5.0, 2.3, 3.3, 1.0), (None, 5.6, 2.7, 4.2, 1.3), (None, 5.7, 3.0, 4.2, 1.2), (None, 5.7, 2.9, 4.2, 1.3), (None, 6.2, 2.9, 4.3, 1.3), (None, 5.1, 2.5, 3.0, 1.1), (None, 5.7, 2.8, 4.1, 1.3), (None, 6.3, 3.3, 6.0, 2.5), (None, 5.8, 2.7, 5.1, 1.9), (None, 7.1, 3.0, 5.9, 2.1), (None, 6.3, 2.9, 5.6, 1.8), (None, 6.5, 3.0, 5.8, 2.2), (None, 7.6, 3.0, 6.6, 2.1), (None, 4.9, 2.5, 4.5, 1.7), (None, 7.3, 2.9, 6.3, 1.8), (None, 6.7, 2.5, 5.8, 1.8), (None, 7.2, 3.6, 6.1, 2.5), (None, 6.5, 3.2, 5.1, 2.0), (None, 6.4, 2.7, 5.3, 1.9), (None, 6.8, 3.0, 5.5, 2.1), (None, 5.7, 2.5, 5.0, 2.0), (None, 5.8, 2.8, 5.1, 2.4), (None, 6.4, 3.2, 5.3, 2.3), (None, 6.5, 3.0, 5.5, 1.8), (None, 7.7, 3.8, 6.7, 2.2), (None, 7.7, 2.6, 6.9, 2.3), (None, 6.0, 2.2, 5.0, 1.5), (None, 6.9, 3.2, 5.7, 2.3), (None, 5.6, 2.8, 4.9, 2.0), (None, 7.7, 2.8, 6.7, 2.0), (None, 6.3, 2.7, 4.9, 1.8), (None, 6.7, 3.3, 5.7, 2.1), (None, 7.2, 3.2, 6.0, 1.8), (None, 6.2, 2.8, 4.8, 1.8), (None, 6.1, 3.0, 4.9, 1.8), (None, 6.4, 2.8, 5.6, 2.1), (None, 7.2, 3.0, 5.8, 1.6), (None, 7.4, 2.8, 6.1, 1.9), (None, 7.9, 3.8, 6.4, 2.0), (None, 6.4, 2.8, 5.6, 2.2), (None, 6.3, 2.8, 5.1, 1.5), (None, 6.1, 2.6, 5.6, 1.4), (None, 7.7, 3.0, 6.1, 2.3), (None, 6.3, 3.4, 5.6, 2.4), (None, 6.4, 3.1, 5.5, 1.8), (None, 6.0, 3.0, 4.8, 1.8), (None, 6.9, 3.1, 5.4, 2.1), (None, 6.7, 3.1, 5.6, 2.4), (None, 6.9, 3.1, 5.1, 2.3), (None, 5.8, 2.7, 5.1, 1.9), (None, 6.8, 3.2, 5.9, 2.3), (None, 6.7, 3.3, 5.7, 2.5), (None, 6.7, 3.0, 5.2, 2.3), (None, 6.3, 2.5, 5.0, 1.9), (None, 6.5, 3.0, 5.2, 2.0), (None, 6.2, 3.4, 5.4, 2.3), (None, 5.9, 3.0, 5.1, 1.8))

However, it is better to first import the table into a pandas data frame:

df = pd.read_sql('SELECT * FROM iris_table', con=connection)
print(df)
index  ('sepal length (cm)',)  ('sepal width (cm)',)  \
0    None                     5.1                    3.5
1    None                     4.9                    3.0
2    None                     4.7                    3.2
3    None                     4.6                    3.1
4    None                     5.0                    3.6
..    ...                     ...                    ...
145  None                     6.7                    3.0
146  None                     6.3                    2.5
147  None                     6.5                    3.0
148  None                     6.2                    3.4
149  None                     5.9                    3.0

('petal length (cm)',)  ('petal width (cm)',)
0                       1.4                    0.2
1                       1.4                    0.2
2                       1.3                    0.2
3                       1.5                    0.2
4                       1.4                    0.2
..                      ...                    ...
145                     5.2                    2.3
146                     5.0                    1.9
147                     5.2                    2.0
148                     5.4                    2.3
149                     5.1                    1.8

[150 rows x 5 columns]

References

https://overiq.com/sqlalchemy-101/installing-sqlalchemy-and-connecting-to-database/

https://pandas.pydata.org/pandas-docs/stable/reference/index.html

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_sql.html

https://towardsdatascience.com/sqlalchemy-python-tutorial-79a577141a91

https://linuxize.com/post/show-tables-in-mysql-database/

https://pynative.com/python-mysql-select-query-to-fetch-data/

MySQL

A list of useful MySQL commands can be found here:

http://g2pc1.bu.edu/~qzpeng/manual/MySQL%20Commands.htm