Commit c7e3058

Merge branch 'mangecoeur-master'
2 parents cc6ee40 + 8e8d067 commit c7e3058

10 files changed

+1814
-876
lines changed

README.md

+1
Original file line number | Diff line number | Diff line change
@@ -106,6 +106,7 @@ pip install pandas
106106
- [Cython](http://www.cython.org): Only necessary to build development version. Version 0.17.1 or higher.
107107
- [SciPy](http://www.scipy.org): miscellaneous statistical functions
108108
- [PyTables](http://www.pytables.org): necessary for HDF5-based storage
109+
- [SQLAlchemy](http://www.sqlalchemy.org): for SQL database support. Version 0.8.1 or higher recommended.
109110
- [matplotlib](http://matplotlib.sourceforge.net/): for plotting
110111
- [statsmodels](http://statsmodels.sourceforge.net/)
111112
- Needed for parts of `pandas.stats`

ci/requirements-2.6.txt

+1
@@ -6,3 +6,4 @@ http://www.crummy.com/software/BeautifulSoup/bs4/download/4.2/beautifulsoup4-4.2
66
html5lib==1.0b2
77
bigquery==2.0.17
88
numexpr==1.4.2
9+
sqlalchemy==0.8.1

ci/requirements-2.7.txt

+1
@@ -19,3 +19,4 @@ scipy==0.10.0
1919
beautifulsoup4==4.2.1
2020
statsmodels==0.5.0
2121
bigquery==2.0.17
22+
sqlalchemy==0.8.1

ci/requirements-2.7_LOCALE.txt

+1
@@ -15,3 +15,4 @@ scipy==0.10.0
1515
beautifulsoup4==4.2.1
1616
statsmodels==0.5.0
1717
bigquery==2.0.17
18+
sqlalchemy==0.8.1

ci/requirements-3.3.txt

+1
@@ -14,3 +14,4 @@ lxml==3.2.1
1414
scipy==0.12.0
1515
beautifulsoup4==4.2.1
1616
statsmodels==0.4.3
17+
sqlalchemy==0.9.1

doc/source/install.rst

+1
@@ -95,6 +95,7 @@ Optional Dependencies
9595
version. Version 0.17.1 or higher.
9696
* `SciPy <http://www.scipy.org>`__: miscellaneous statistical functions
9797
* `PyTables <http://www.pytables.org>`__: necessary for HDF5-based storage
98+
* `SQLAlchemy <http://www.sqlalchemy.org>`__: for SQL database support. Version 0.8.1 or higher recommended.
9899
* `matplotlib <http://matplotlib.sourceforge.net/>`__: for plotting
99100
* `statsmodels <http://statsmodels.sourceforge.net/>`__
100101
* Needed for parts of :mod:`pandas.stats`

doc/source/io.rst

+163-52
@@ -1823,7 +1823,7 @@ class. The following two command are equivalent:
18231823
read_excel('path_to_file.xls', 'Sheet1', index_col=None, na_values=['NA'])
18241824
18251825
The class based approach can be used to read multiple sheets or to introspect
1826-
the sheet names using the ``sheet_names`` attribute.
1826+
the sheet names using the ``sheet_names`` attribute.
18271827

18281828
.. note::
18291829

@@ -3068,13 +3068,52 @@ SQL Queries
30683068
-----------
30693069

30703070
The :mod:`pandas.io.sql` module provides a collection of query wrappers to both
3071-
facilitate data retrieval and to reduce dependency on DB-specific API. These
3072-
wrappers only support the Python database adapters which respect the `Python
3073-
DB-API <http://www.python.org/dev/peps/pep-0249/>`__. See some
3074-
:ref:`cookbook examples <cookbook.sql>` for some advanced strategies
3071+
facilitate data retrieval and to reduce dependency on DB-specific API. Database abstraction
3072+
is provided by SQLAlchemy if installed; in addition, you will need a driver library for
3073+
your database.
30753074

3076-
For example, suppose you want to query some data with different types from a
3077-
table such as:
3075+
.. versionadded:: 0.14.0
3076+
3077+
3078+
If SQLAlchemy is not installed a legacy fallback is provided for sqlite and mysql.
3079+
These legacy modes require Python database adapters which respect the `Python
3080+
DB-API <http://www.python.org/dev/peps/pep-0249/>`__.
3081+
3082+
See also some :ref:`cookbook examples <cookbook.sql>` for some advanced strategies.
3083+
3084+
The key functions are:
3085+
:func:`~pandas.io.sql.to_sql`
3086+
:func:`~pandas.io.sql.read_sql`
3087+
:func:`~pandas.io.sql.read_table`
3088+
3089+
3090+
In the following example, we use the `SQLite <http://www.sqlite.org/>`__ SQL database
3091+
engine. You can use a temporary SQLite database where data are stored in
3092+
"memory".
3093+
3094+
To connect with SQLAlchemy you use the :func:`create_engine` function to create an engine
3095+
object from a database URI. You only need to create the engine once per database you are
3096+
connecting to.
3097+
3098+
For more information on :func:`create_engine` and the URI formatting, see the examples
3099+
below and the SQLAlchemy `documentation <http://docs.sqlalchemy.org/en/rel_0_9/core/engines.html>`__.
3100+
3101+
.. code-block:: python
3102+
3103+
from sqlalchemy import create_engine
3104+
from pandas.io import sql
3105+
# Create your connection.
3106+
engine = create_engine('sqlite:///:memory:')
3107+
3108+
Writing DataFrames
3109+
~~~~~~~~~~~~~~~~~~
3110+
3111+
<<<<<<< HEAD
3112+
Assuming the following data is in a DataFrame "data", we can insert it into
3113+
=======
3114+
Assuming the following data is in a DataFrame ``data``, we can insert it into
3115+
>>>>>>> 6314e6f... ENH #4163 Tweaks to docs, avoid mutable default args, mysql tests
3116+
the database using :func:`~pandas.io.sql.to_sql`.
30783117

30793118

30803119
+-----+------------+-------+-------+-------+
@@ -3088,81 +3127,153 @@ table such as:
30883127
+-----+------------+-------+-------+-------+
30893128

30903129

3091-
Functions from :mod:`pandas.io.sql` can extract some data into a DataFrame. In
3092-
the following example, we use the `SQlite <http://www.sqlite.org/>`__ SQL database
3093-
engine. You can use a temporary SQLite database where data are stored in
3094-
"memory". Just do:
3095-
3096-
.. code-block:: python
3130+
.. ipython:: python
3131+
:suppress:
30973132
3098-
import sqlite3
3133+
from sqlalchemy import create_engine
30993134
from pandas.io import sql
3100-
# Create your connection.
3101-
cnx = sqlite3.connect(':memory:')
3135+
engine = create_engine('sqlite:///:memory:')
31023136
31033137
.. ipython:: python
31043138
:suppress:
31053139
3106-
import sqlite3
3107-
from pandas.io import sql
3108-
cnx = sqlite3.connect(':memory:')
3140+
c = ['id', 'Date', 'Col_1', 'Col_2', 'Col_3']
3141+
d = [(26, datetime.datetime(2010,10,18), 'X', 27.5, True),
3142+
(42, datetime.datetime(2010,10,19), 'Y', -12.5, False),
3143+
(63, datetime.datetime(2010,10,20), 'Z', 5.73, True)]
3144+
3145+
data = DataFrame(d, columns=c)
3146+
3147+
Reading Tables
3148+
~~~~~~~~~~~~~~
3149+
3150+
:func:`~pandas.io.sql.read_table` will read a database table given the
3151+
table name and optionally a subset of columns to read.
3152+
3153+
.. note::
3154+
3155+
In order to use :func:`~pandas.io.sql.read_table`, you **must** have the
3156+
SQLAlchemy optional dependency installed.
31093157

31103158
.. ipython:: python
3111-
:suppress:
31123159
3113-
cu = cnx.cursor()
3114-
# Create a table named 'data'.
3115-
cu.execute("""CREATE TABLE data(id integer,
3116-
date date,
3117-
Col_1 string,
3118-
Col_2 float,
3119-
Col_3 bool);""")
3120-
cu.executemany('INSERT INTO data VALUES (?,?,?,?,?)',
3121-
[(26, datetime.datetime(2010,10,18), 'X', 27.5, True),
3122-
(42, datetime.datetime(2010,10,19), 'Y', -12.5, False),
3123-
(63, datetime.datetime(2010,10,20), 'Z', 5.73, True)])
3160+
sql.read_table('data', engine)
31243161
3162+
You can also specify the name of the column as the DataFrame index,
3163+
and specify a subset of columns to be read.
31253164

3126-
Let ``data`` be the name of your SQL table. With a query and your database
3127-
connection, just use the :func:`~pandas.io.sql.read_sql` function to get the
3128-
query results into a DataFrame:
3165+
.. ipython:: python
3166+
3167+
sql.read_table('data', engine, index_col='id')
3168+
sql.read_table('data', engine, columns=['Col_1', 'Col_2'])
3169+
3170+
And you can explicitly force columns to be parsed as dates:
31293171

31303172
.. ipython:: python
31313173
3132-
sql.read_sql("SELECT * FROM data;", cnx)
3174+
sql.read_table('data', engine, parse_dates=['Date'])
31333175
3134-
You can also specify the name of the column as the DataFrame index:
3176+
If needed, you can explicitly specify a format string, or a dict of arguments
3177+
to pass to :func:`pandas.tseries.tools.to_datetime`.
3178+
3179+
.. code-block:: python
3180+
3181+
sql.read_table('data', engine, parse_dates={'Date': '%Y-%m-%d'})
3182+
sql.read_table('data', engine, parse_dates={'Date': {'format': '%Y-%m-%d %H:%M:%S'}})
3183+
3184+
3185+
You can check if a table exists using :func:`~pandas.io.sql.has_table`.
3186+
3187+
In addition, the class :class:`~pandas.io.sql.PandasSQLWithEngine` can be
3188+
instantiated directly for more manual control over the SQL interaction.
3189+
3190+
Querying
3191+
~~~~~~~~
3192+
3193+
You can query using raw SQL in the :func:`~pandas.io.sql.read_sql` function.
3194+
In this case you must use the SQL variant appropriate for your database.
3195+
When using SQLAlchemy, you can also pass SQLAlchemy Expression language constructs,
3196+
which are database-agnostic.
31353197

31363198
.. ipython:: python
31373199
3138-
sql.read_sql("SELECT * FROM data;", cnx, index_col='id')
3139-
sql.read_sql("SELECT * FROM data;", cnx, index_col='date')
3200+
sql.read_sql('SELECT * FROM data', engine)
31403201
31413202
Of course, you can specify a more "complex" query.
31423203

31433204
.. ipython:: python
31443205
3145-
sql.read_sql("SELECT id, Col_1, Col_2 FROM data WHERE id = 42;", cnx)
3206+
sql.read_frame("SELECT id, Col_1, Col_2 FROM data WHERE id = 42;", engine)
31463207
3147-
.. ipython:: python
3148-
:suppress:
31493208
3150-
cu.close()
3151-
cnx.close()
3209+
You can also run a plain query without creating a dataframe with
3210+
:func:`~pandas.io.sql.execute`. This is useful for queries that don't return values,
3211+
such as INSERT. This is functionally equivalent to calling ``execute`` on the
3212+
SQLAlchemy engine or db connection object. Again, you must use the SQL syntax
3213+
variant appropriate for your database.
31523214

3215+
.. code-block:: python
31533216
3154-
There are a few other available functions:
3217+
sql.execute('SELECT * FROM table_name', engine)
31553218
3156-
- ``tquery`` returns a list of tuples corresponding to each row.
3157-
- ``uquery`` does the same thing as tquery, but instead of returning results
3158-
it returns the number of related rows.
3159-
- ``write_frame`` writes records stored in a DataFrame into the SQL table.
3160-
- ``has_table`` checks if a given SQLite table exists.
3219+
<<<<<<< HEAD
3220+
<<<<<<< HEAD
3221+
:func:`~pandas.io.sql.tquery` returns a list of tuples corresponding to each row.
31613222

3162-
.. note::
3223+
:func:`~pandas.io.sql.uquery` does the same thing as tquery, but instead of
3224+
returning results it returns the number of related rows.
3225+
=======
3226+
>>>>>>> ac6bf42... ENH #4163 Added more robust type coertion, datetime parsing, and parse date options. Updated optional dependancies
3227+
3228+
In addition, the class :class:`~pandas.io.sql.PandasSQLWithEngine` can be
3229+
instantiated directly for more manual control over the SQL interaction.
3230+
=======
3231+
sql.execute('INSERT INTO table_name VALUES(?, ?, ?, ?)', engine, params=[('id', 1, 12.2, True)])
3232+
3233+
>>>>>>> 6314e6f... ENH #4163 Tweaks to docs, avoid mutable default args, mysql tests
3234+
3235+
Engine connection examples
3236+
~~~~~~~~~~~~~~~~~~~~~~~~~~
3237+
3238+
.. code-block:: python
3239+
3240+
from sqlalchemy import create_engine
3241+
3242+
engine = create_engine('postgresql://scott:tiger@localhost:5432/mydatabase')
3243+
3244+
engine = create_engine('mysql+mysqldb://scott:tiger@localhost/foo')
3245+
3246+
engine = create_engine('oracle://scott:tiger@127.0.0.1:1521/sidname')
3247+
3248+
engine = create_engine('mssql+pyodbc://mydsn')
3249+
3250+
# sqlite://<nohostname>/<path>
3251+
# where <path> is relative:
3252+
engine = create_engine('sqlite:///foo.db')
3253+
3254+
# or absolute, starting with a slash:
3255+
engine = create_engine('sqlite:////absolute/path/to/foo.db')
3256+
3257+
3258+
Legacy
3259+
~~~~~~
3260+
To use the sqlite support without SQLAlchemy, you can create connections like so:
3261+
3262+
.. code-block:: python
3263+
3264+
import sqlite3
3265+
from pandas.io import sql
3266+
cnx = sqlite3.connect(':memory:')
3267+
3268+
And then issue the following queries, remembering to also specify the flavor of SQL
3269+
you are using.
3270+
3271+
.. code-block:: python
3272+
3273+
sql.to_sql(data, 'data', cnx, flavor='sqlite')
3274+
3275+
sql.read_sql("SELECT * FROM data", cnx, flavor='sqlite')
31633276
3164-
For now, writing your DataFrame into a database works only with
3165-
**SQLite**. Moreover, the **index** will currently be **dropped**.
31663277
31673278
.. _io.bigquery:
31683279
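
The legacy path documented in this diff relies on DB-API adapters, and the ``params`` example uses the "qmark" placeholder style, where each ``?`` must be matched by exactly one value. The following is a minimal sketch of that rule using only the standard-library ``sqlite3`` module; it is an illustration added alongside the diff, not code from the commit. The table layout mirrors the removed ``CREATE TABLE data`` statement, but the column types (``text``, ``real``, ``integer``) and the ISO date strings are assumptions chosen to keep the example portable.

```python
import sqlite3

# In-memory SQLite database, as in the documentation's legacy example.
cnx = sqlite3.connect(':memory:')
cu = cnx.cursor()

# Assumed column types; the docs' table has the same five columns.
cu.execute("""CREATE TABLE data(id integer,
                                date text,
                                Col_1 text,
                                Col_2 real,
                                Col_3 integer)""")

rows = [(26, '2010-10-18', 'X', 27.5, True),
        (42, '2010-10-19', 'Y', -12.5, False),
        (63, '2010-10-20', 'Z', 5.73, True)]

# Five placeholders, five values per row.
cu.executemany('INSERT INTO data VALUES (?,?,?,?,?)', rows)
cnx.commit()

# Parameters can also be bound in a SELECT.
result = cu.execute('SELECT id, Col_1, Col_2 FROM data WHERE id = ?',
                    (42,)).fetchall()
row_count = cu.execute('SELECT COUNT(*) FROM data').fetchone()[0]

# A placeholder/value count mismatch is rejected by the adapter.
mismatch_rejected = False
try:
    cu.execute('INSERT INTO data VALUES (?,?,?,?,?)',
               (99, '2010-10-21', 'W', 1.0))  # only four values
except sqlite3.ProgrammingError:
    mismatch_rejected = True

cnx.close()
```

Other adapters may use a different placeholder style (for example ``%s``), so the exact syntax depends on the driver in use.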
