Introduction to retrieving data from your database - Perl DBI/DBD::ODBC Tutorial
Contents
- Introduction
- Pre-requisites
- Assumptions
- Simple methods of retrieving data
- Specifying what you want
- Simple prepare/execute
- Obtaining metadata from the result-set
- Using parameters
- Using bound columns
- Hash and Reference methods of retrieving data
- Special cases
- SQL_CHAR types and trialing spaces
- Long columns
- Handling NULL data
- NULL in SQL
- NULL in Perl
Introduction
Pre-requisites
Before you start part 2 of this tutorial you need to ensure you have satisfy all the pre-requisites:- Perl We used Perl 5.8 but you only need the minimum required by the DBI and DBD::ODBC modules which is currently 5.6. Use perl --version to see what version of Perl you have installed.
- DBI module We used DBI 1.45 but this tutorial should work with anything after 1.40. To see if you have a recent enough version of DBI installed run:
perl -e 'use DBI 1.40;'
If you get an error like "DBI version 1.40 required--this is only version 1.30 at -e line 1." you need to upgrade DBI.
If you get an error saying DBI cannot be found in @INC you've probably not got DBI installed.
Go to CPAN to get an up to date version of the DBI module. - DBD::ODBC We
used DBD::ODBC 1.11. You can use similar methods as above to determine
if DBD::ODBC is installed and to see what version you have:
To check you have the DBD::ODBC module installed:
perl -e 'use DBD::ODBC;'
If you have not got DBD::ODBC installed you should see Enabling ODBC support in Perl with Perl DBI and DBD::ODBC for instructions.
To show the DBD::ODBC version:
perl -MDBD::ODBC -e 'print $DBD::ODBC::VERSION;'
To show all drivers DBI knows about and their versions:
perl -MDBI -e 'DBI->installed_versions;'
Go to CPAN to get an up to date version of the DBD::ODBC module. - ODBC driver and driver manager Unsurprisingly you will need an ODBC driver for most of this tutorial and we recommend you use an ODBC driver manager under DBD::ODBC
Assumptions
Previous tutorials in this seriesThis tutorial assumes you have read or understand all the concepts in the previous tutorial DBD::ODBC Tutorial Part 1 - Drivers, Data Sources and Connection.
Operating System
This tutorial was designed on UNIX and we have assumed you are using UNIX too. However, all the Perl examples should work equally well on Windows so long as minor alterations for the command line are made.
ODBC Driver Manager
We have assumed you are using the unixODBC driver manager. All discussion in this document relating to the location and definition of ODBC data sources is for unixODBC.
DBI and ODBC drivers
DBI, DBD::ODBC Architecture
Perl script using DBI methods | v API | v DBI | v DBD::ODBC | v ODBC Driver Manager (e.g. unixODBC) | v ODBC Driver
Simple methods of retrieving data
Specifying what you want
You specify the data you want to retrieve from the database using SQL select statements. This is not a SQL tutorial and we have attempted to keep the SQL simple. The basic form of a SQL select statement we use here is:select <column_list> from <table>where:
- <column_list> is a comma separated list of columns, or * (for all columns). In fact it can be many more things; consult a SQL reference manual.
- <table> is a table or view in the database.
where <column> = 'text'Your SQL select statement is passed to the DBI prepare method which in ODBC terms ends up in a call to the
SQLPrepare
or SQLExecDirect
ODBC APIs.Be careful when creating SQL from Perl variables as you can include characters which invalidate the SQL. e.g.
my $name = "can't"; my $sql = qq/select * from table where column = '$name'/;$sql is now invalid because it is:
select * from table where column = 'can't'Most databases would require the second ' to be doubled up. You can use the DBI quote method to do this for you with:
my $name = $dbh->quote("can't"); my $sql = qq/select * from table where column = $name/;The other way (recommended) is to use bound parameters
Simple prepare/execute
The most simple way of obtaining all the rows in a table and displaying them is like this:my $dbh = DBI->connect(); # connect my $sql = qq/select * from table/; # the query to execute my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(); # execute the query my @row; while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; }Here we prepare the SQL, execute it, then use the
fetchrow_array
method to return an array of column values for each row. Repeatedly calling the fetchrow_array
method until fetchrow_array
returns undef, returns all the rows in the result-set.NOTE that NULL fields are returned as undef in the array returned by
fetchrow_array
(more about NULL values later).NOTE. Don't use
fetchrow_array
in a scalar context unless you really know what you are doing as it can
return the first or last column (dependent on the driver) which could
be a NULL value and hence undef. Also fetchrow_array
can return undef if there are no more rows or if an error occurs.There are other methods or obtaining the result-set - see later.
Obtaining metadata from the result-set
ODBC can describe the result-set for you listing the number of columns and the names of the columns. You can obtain this information with the statement attributesNUM_OF_FIELDS
, NAME, NAME_uc
, NAME_lc
, NAME_hash
, NAME_lc_hash
and NAME_uc_HASH
.Assuming you have the table 'fred' with columns 'a' (integer), 'b' (char(10)) and 'c' (float):
my $sth = $dbh->prepare(select * from fred); $sth->{NUM_OF_FIELDS} returns 3. $sth->{NAME}->[0] returns 'a' although it could return 'A' $sth->{NAME_uc}->[0] returns 'A'. $sth->{NAME_lc}->[0] returns 'a'.NOTE the
NAME
attribute can return lowercase or
uppercase column names depending on the database. Some database will
always uppercase column names in create statements if they are unquoted
and some databases retain the case of column names if they are enclosed
in the identifier quote character. TIP: For this reason it is best to use NAME_uc
or NAME_lc
.NOTE. Some drivers may be unable to return a column name at all e.g. if the column was an expression like 1 or a function. e.g. "select 1 from table" in MS SQL Server returns an empty string for $sth-{NAME}->[0]. TIP: You can get around this using column aliases as in "select 1 as col1 from table".
NAME_hash
, NAME_lc_hash
and NAME_uc_hash
are like NAME
, NAME_lc
and NAME_uc
except the result is a hash instead of an array with the keys being the
column names and the values being the column index starting at 0.Similarly, the TYPE attribute returns an array reference of column types. For the 'fred' table above:
print join(", ", @{$sth->{TYPE}}), "\n";returns 4, 1, 6. The column types are defined by international standards (see the DBI manual).
Using parameters
The main reasons for using parameters are:- You can prepare the SQL once then execute many times with different parameters thus saving the prepare parsing.
- With bound parameters you don't need to bother about quoting issues.
my $dbh = DBI->connect(); # connect my $var = $dbh->quote("value to search for"); my $sql = qq/select * from table where column = $var/; # the query to execute my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(); # execute the query my @row; while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; }This is fine but what if we want to execute the query multiple times with different values for $var. We can repeat the $sql assignement with different $var and re-run $dbh->prepare but this is inefficient because it causes the ODBC driver and database to re-parse the SQL each time and is unnecessary.
A better solution is the following:
my @row; # row data my $dbh = DBI->connect(); # connect my $var = "value to search for"; my $sql = qq/select * from table where column = ?/; # the query to execute with parameter my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(($var)); # execute the query with parameter while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; } $var = "another value to search for"; $sth->execute(($var)); # execute the query (no need to re-prepare) while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; }Here the SQL contains a parameter marker (the '?') indicating to the driver that we will provide this later. The SQL is passed to the
prepare
method where the database will parse the SQL
and note a parameter is required. When the execute method is called we
pass the parameters required. You can use multiple parameter markers
e.g.$sql = q/select * from table where col1 = ? and col2 = ?/; . . $sth->execute(($param1, $param2));Note You must pass the array of parameters in the order which you want them to match the parameter markers in the SQL. i.e. in the above example $param1 will substitute the first parameter marker in "col1 = ?" and $param2 will substitute the parameter marker in "col2 = ?".
Note You cannot use parameter markers in place of column names e.g. select ? from table or select * from table where ? = 'A'.
When passing the parameters to the
execute
method a default binding type is used. DBD::ODBC attempts to find out
what the parameter markers represent and bind the parameters as the same
type. If your ODBC driver does not have the ODBC API SQLDescribeParam
then a string binding is used (SQL_VARCHAR). You can also bind the
parameters yourself before the execute call and in this way you can
specify how the parameters should be bound:# import SQL types for use in bind_param method: use DBI qw(:sql_types); $sql = q/select * from table where col1 = ?/; $sth = $dbh->prepare($sql); # uses default bind type: $sth->bind_param(1, $param1); # Specify the parameter should be bound as an SQL_INTEGER $sth->bind_param(1, $param1, {TYPE => SQL_INTEGER});Note Once the parameters are bound you cannot change the type they are bound as although you can obviously change the bound values.
Using bound columns
In previous examples in this tutorial we have used thefetchrow_array
method to retrieve the result-set. DBI also provides a method to bind the returned column data to perl variables using the bind_col
and bind_columns
methods:Going back to the first example we had:
my $dbh = DBI->connect(); # connect my $sql = qq/select * from table/; # the query to execute my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(); # execute the query my @row; while (@row = $sth->fetchrow_array) { # retrieve one row print join(", ", @row), "\n"; }which can be rewritten to use bound columns as follows:
my $dbh = DBI->connect(); # connect my $sql = qq/select * from table/; # the query to execute my $sth = $dbh->prepare($sql); # prepare the query $sth->execute(); # execute the query my ($col1, $col2, $col3); # Note columns start at 1 (not 0). $sth->bind_col(1, \$col1); $sth->bind_col(2, \$col2); $sth->bind_col(3, \$col3); while ($sth->fetch) { # retrieve one row print "$col1, $col2, $col3\n"; }Binding columns and using the
fetch
method is usually faster than using methods like fetchrow_array
. As with bind_param
you can specify the type the column is bound as.$sth->bind_col(1, \$col1, {TYPE => SQL_INTEGER}); $sth->bind_col(2, \$col2, {TYPE => SQL_CHAR});$sth->bind_col(3, \$col3, {TYPE => SQL_FLOAT});In ODBC there is usually no need to specify the type.
Instead of separate calls to
bind_col
you can bind all the columns in the above example in one go with:$sth->bind_columns(\$col1, \$col2, \$col3);or
$sth->bind_columns(\($col1, $col2, $col3));
Hash and Reference methods of retrieving data
DBI supports a number of methods returning the result-set (or part of it) in hash or array references. All the following examples assume you have already prepared and executed a select on a table containing two rows of two columns named a and b which look like:column a | Column b ---------+--------- 1 | one 2 | twofetchrow_arrayref
my $ref; while($ref = $sth->fetchrow_arrayref) { print join (", ", @{$ref}), "\n"; }
1, one 2, twofetchrow_hashref
my $ref; while($ref = $sth->fetchrow_hashref) { print join (", ", keys %$ref), "\n"; print join (", ", values %$ref), "\n"; }
a, b 1, one a, b 2, twofetchall_arrayref
# # You must set RaiseError or check $sth->err because # fetchall_arrayref returns the data fetched so far. # $sth->{RaiseError} = 1; my $ref; $ref = $sth->fetchall_arrayref; print "Number of rows returned is ", 0 + @{$ref}, "\n"; foreach $r (@{$ref}) { print join(", ", @{$r}), "\n"; }
1, one 2, twoYou can add a slice to
fetchall_arrayref
to specify which columns to return:- To return all columns as above:
$ref = $sth->fetchall_arrayref([]);
1, one 2, two
- To return only the first column
$ref = $sth->fetchall_arrayref([0]);
1 2
- To return only the last column
$ref = $sth->fetchall_arrayref([-1]);
one two
- To return each row as a hashref
$ref = $sth->fetchall_arrayref({}); print "Number of rows returned is ", 0 + @{$ref}, "\n"; print join (", ", keys %{$ref->[0]}), "\n"; foreach $r (@{$ref}) { print join(", ", (values %$r)), "\n"; } print "Number of rows returned is ", 0 + @{$ref}, "\n"; print join (", ", keys %{$ref->[0]}), "\n"; foreach $r (@{$ref}) { print join(", ", (values %$r)), "\n"; }
Number of rows returned is 2 a 1 2
Note When specifing the names of the columns to return in this way the case of keys for the returned hashes always matches the case used in the parameter tofetchall_hashref
regardless of whatFetchKeyHashName
is set to.
fetchall_arrayref
to restrict the number of rows:$ref = $sth->fetchall_arrayref([], 1);
1, oneThis method is particularly useful if your database engine does not support the "top n" SQL syntax or if you want the increased performance of
fetchall_arrayref
, want to keep memory down but also need to process all the rows because fetchall_arrayref
can be called repeatedly on the same result-set to get rows in chunks.The rather elegant way of doing this in the DBI documentation is:
$sth->{RaiseError} = 1; my $rows = []; # cache for batches of rows while(my $row = (shift(@$rows) || # get row from cache, or reload cache: shift(@{$rows=$sth->fetchall_arrayref(undef,1)||[]}))) { print join(", ", @{$row}), "\n"; }
1, one 2, two
Special cases
SQL_CHAR types and trialing spaces
Databases store char(n) columns as exactly n characters so if you have a char(10) column which you insert 'FRED' into when you retrieve it you will get 'FRED '. This often leads to confusion, especially in tests like $var eq 'FRED'. e.g. With the following table definition and inserts:create table tut2_8 (a int, b char(10)) insert into tut2_8 values (1, 'one') insert into tut2_8 values (2, 'two')the following code never prints out "Found 'two'".
my ($col1, $col2); $sth->bind_columns(\($col1, $col2)); while ($sth->fetch) { # retrieve one row print "$col1, $col2\n"; print "Found 'two'" if ($col2 eq 'two'); }The
ChopBlanks
attribute is provided to help you here. ChopBlanks
may be set on the connection or statement handle and is inherited from connections. The default is for ChopBlanks
to be false. If you insert "$sth->{ChopBlanks} = 1;" before the call to execute method call then the above test now works.Note
ChopBlanks
only works on fixed-length CHAR columns.Long columns
Suppose you have the table created with the following code:my $dbh = DBI->connect(); # connect $dbh->do(q/create table tut2_9 (a int, b text)/); my $sth = $dbh->prepare(q/insert into tut2_9 values (1, ?)/); my $p = 'x' x 500; $sth->execute($p);The 'text' column type here is a MS SQL Server long data type. Other databases have memo, blob etc.
The following retrieval code:
$sth = $dbh->prepare(q/select * from tut2_9/); $sth->execute; my ($col1, $col2); $sth->bind_columns(\($col1, $col2)); while ($sth->fetch) { # retrieve one row print "$col1, $col2\n"; }fails with
DBD::ODBC::st fetch failed: [unixODBC][][Microsoft] [ODBC SQL Server Driver] String data, right truncation (SQL-01004) (DBD: st_fetch/SQLFetch (long truncated DBI attribute LongTruncOk not set and/or LongReadLen too small) err=-1) at ./tut2_9.pl line 19.The attributes
LongReadLen
and LongTruncOk
control how long columns are retrieved. By default, DBD::ODBC has LongReadLen
set to 80 and LongTruncOk
is false. You can find the current settings with:print "LongReadLen is '", $h->{LongReadLen}, "'\n"; print "LongTruncOk is ", $h->{LongTruncOk}, "\n";When
LongTruncOk
is false (as above), if you attempt to retrieve more than LongReadLen
bytes from a column you will get an error like above. If you are not bothered about the column data being truncated then set LongTrunkOk
to 1. If you need more than 80 bytes from the column then you need to set LongReadLen
. If you want long columns ignored (i.e. no data fetched from them but undef returned instead) then you need to set LongReadLen
to 0 and LongTruncOk to true.Handling NULL data
NULL in SQL
Columns which have the NULL value are special. NULL is not the same as an empty string. You need to be careful using comparisons on columns which are nullable. For instance:Given the table:
create table fred (a int, b char(1) null)
Col1 | Col2 -----+----- 1 | Y 2 | N 3 | NULLselect * from fred where col2 = 'Y' returns:
col1 | col2 -----+----- 1 | Yand select * from fred where col2 <> 'Y' returns:
col1 | col2 -----+----- 1 | NSome people expect to see the result of the second query including row 3 with the NULL col2 because they think "col2 <> 'Y'" includes NULL; it doesn't. In relational operations, unlike normal binary, there are THREE states that a operation can result in, TRUE, FALSE, and UNKNOWN. The last state, is the one that needs a closer look.
The UNKNOWN state occurs for any logical operation where the result cannot be evaluated as either TRUE or FALSE, and is connected with the NULL concept.
A NULL in a SQL table, can be regarded as a placeholder for missing information, its not a VALUE as such, just a marker to indicate the lack of information. So in the query for all rows where col2 is not equal to 'Y' the rows containing a NULL are not returned, as the answer to the question "Does the col2 column contain values not equal to Y" returns false, as the column contains the unknown NULL and not a value that could or could not be equal to 'Y'.
The normal way to select NULL columns in SQL syntax is using "column is null" (the opposite being "column is not null"). So to find all the rows in table Fred where col2 has a NULL value you use:
select * from fred where col2 is null
NULL in Perl
In perl DBI, NULL column values are represented by undef (the undefined value). As a result, the following Perl run against the table above:$sth = $dbh->prepare(q/select * from tut2_19/); $sth->execute; my @row; while(@row = $sth->fetchrow_array) { print join(", ", @row), "\n"; }
1, Y 2, N 3,and if you have warnings enabled (and you should) then you will see "Use of uninitialised value in join or string at XXX.pl line NNN". You can use either of the following examples to avoid this:
# print out row print DBI::neat_list(\@row), "\n"; or # change all NULLs to empty strings foreach (@row) {$_ = '' unless defined};If you are using a parameterised query you may expect to do:
$sql = q/select * from table where a = ?/; $stmt = $dbh->prepare($sql); $stmt->execute(undef); # pass NULL as parameterbut with some databases this does not work. In ODBC terms, the above code results in a call to the ODBC API
SQLBindParameter
with a data pointer of NULL and a indicator of SQL_NULL_DATA. Some ODBC
drivers and databases will not select all the rows where the column is
NULL when this method is used. As shown above a standard way of
selecting a column which is NULL is "where column is null" but you
cannot parameterise this. This has been the subject of much discussion
and one method put forward is:my $test = undef; $sql = q/select * from table where (col = ? or (col is null and ? = 1))/; $stmt->execute($test,defined($test)?0:1)
No comments:
Post a Comment