Lack of Streaming leads to Screaming

The Java Specialists' Newsletter [Issue 047] - Lack of Streaming leads to Screaming

Author: Dr. Heinz M. Kabutz

JDK version:

Category: Language

You can subscribe from our home page: http://www.javaspecialists.co.za (which also hosts all previous issues, available free of charge).

Welcome to the 47th edition of The Java(tm) Specialists' Newsletter, read by over 3600 Java programmers in 82 countries. I have put my Mauritius trip with photos on my website http://www.javaspecialists.co.za, please have a look under "Courses". Warning: prolonged looking at that webpage is known to cause envy - please enter at own risk!

I am planning Java and Design Patterns courses in South Africa for June. Please have a look at our website for more information.

Is your company thinking of venturing into new business but you do not have the necessary resources? Are you scared of subcontracting work to a non-English speaking country due to the communication problems involved? (No offense to my non-English readers - this is my advert, ok? ;-) Then South Africa is your dream come true. In South Africa we speak and write English fluently, so you will not have the typical communication problems that you would find in non-English speaking countries. Our software developers are highly skilled, very good at solving problems and able to roll up their sleeves and get to work. Internationally, South Africans are known for their hard work and dedication to the task at hand. If your company is looking for such resources, please contact me by simply replying to this email, and I will personally see to it that you are contacted within 24 hours.

In the last two weeks I received two questions from readers who ran out of memory when trying to read a big object from the database. In this newsletter I want to explore how you can read a big object from a database without killing your poor JVM, and be scalable as well.

Lack of Streaming leads to Screaming

How do you retrieve big objects from the database in Java? Say you have a database containing previews of movies in DivX format, stored as IMAGE columns. How do you retrieve the 25 megabyte file from the database using JDBC?

Simple. We write a SELECT statement, execute it, and say result_set.getBytes(1). We run the code and it works well for small movie snippets, but as soon as we have a 25 megabyte file, our poor JVM throws an OutOfMemoryError. What's annoying about an OutOfMemoryError is that the stack trace is not filled in (because hey, you've run out of memory!), so you cannot exactly determine where the error occurred, unless you add trace logging. What makes it even more tricky is that some JDBC drivers try to be too clever, resulting in OutOfMemoryErrors.
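One way to add the trace logging mentioned above is to announce each JDBC step before it runs, so that the last message printed shows where the error struck. The following is only a minimal sketch: the class TraceSteps and its traced method are hypothetical names, and the small allocation in main merely stands in for a call such as rs.next() or rs.getBytes(1).

```java
import java.util.concurrent.Callable;

// Hypothetical tracing helper; not part of the original newsletter code.
final class TraceSteps {
  // Label of the most recently started step, kept for later inspection.
  static String lastStarted = "";

  // Logs the label, runs the step, and logs again once it has finished.
  static <T> T traced(String label, Callable<T> step) throws Exception {
    lastStarted = label;
    System.err.println("starting " + label);
    T result = step.call();   // an OutOfMemoryError propagates from here
    System.err.println("finished " + label);
    return result;
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a JDBC call such as rs.getBytes(1).
    byte[] data = traced("small allocation", () -> new byte[1024]);
    System.out.println("got " + data.length + " bytes");
  }
}
```

If a step fails, the "starting" message without a matching "finished" message identifies the call that ran out of memory, even when the error itself carries no stack trace.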

Let's look at some test code. I have written two test classes, TestDatabaseBlobInsert and TestDatabaseBlobFetch. What surprised me was that of the several drivers that I tested (DataDirect, iNet SPRINTA, Avenir, MS SQL Server Type 4, JDBC/ODBC bridge), the JDBC/ODBC bridge was the fastest for inserting big objects into MS SQL Server. For fetching the data it was the slowest:

import java.sql.DriverManager;
import java.sql.Statement;
import java.sql.SQLException;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.io.ByteArrayInputStream;

public class TestDatabaseBlobInsert {
  private static final String TABLE_DROP =
    "DROP TABLE MovieArchive";
  private static final String TABLE_CREATE =
    "Create Table MovieArchive (moviedata image, title varchar(255))";
  private static final String TABLE_INSERT =
    "INSERT INTO MovieArchive (title, moviedata) VALUES (?,?)";

  private static final int size = 25 * 1024 * 1024;
  private final byte[] data = new byte[size];

  private final Connection con;

  public TestDatabaseBlobInsert(String driver, String url,
      String user, String password)
      throws SQLException, ClassNotFoundException {
    Class.forName(driver);
    con = DriverManager.getConnection(url, user, password);
    System.out.println("Driver: " + driver);
    // fill the test data with random bytes
    for (int i=0; i<size; i++) data[i] = (byte)(Math.random()*255);
  }

  public void setUp() throws SQLException {
    Statement st = con.createStatement();
    try {
      System.out.println("Dropping old table");
      st.executeUpdate(TABLE_DROP);
    } catch(SQLException ex) {} // table might not exist
    System.out.println("Creating new table");
    st.executeUpdate(TABLE_CREATE);
    st.close();
  }

  public void testInsertWithBinaryStream() throws SQLException {
    long start = -System.currentTimeMillis();
    System.out.println("Inserting via BinaryStream");
    PreparedStatement stmt = con.prepareStatement(TABLE_INSERT);
    ByteArrayInputStream bis = new ByteArrayInputStream(data);
    stmt.setString(1, "Babe");
    stmt.setBinaryStream(2, bis, data.length);
    stmt.executeUpdate();
    start += System.currentTimeMillis();
    System.out.println("That took " + start + "ms");
    stmt.close();
  }

  public void testInsertWithSetBytes() throws SQLException {
    long start = -System.currentTimeMillis();
    System.out.println("Inserting via setBytes()");
    PreparedStatement stmt = con.prepareStatement(TABLE_INSERT);
    stmt.setString(1, "On Her Majesty's Secret Service");
    stmt.setBytes(2, data);
    stmt.executeUpdate();
    start += System.currentTimeMillis();
    System.out.println("That took " + start + "ms");
    stmt.close();
  }

  public void testAll() throws SQLException {
    setUp();
    testInsertWithBinaryStream();
    testInsertWithSetBytes();
  }

  public static void main(String[] args) throws Exception {
    if (args.length != 4) usage();
    TestDatabaseBlobInsert test = new TestDatabaseBlobInsert(
      args[0], args[1], args[2], args[3]);
    test.testAll();
  }

  private static void usage() {
    System.out.println(
      "Usage: TestDatabaseBlobInsert driver url username password");
    System.exit(1);
  }
}

I ran this code by setting up an ODBC source pointing to the MS SQL Server database Movies, and then running it with a maximum heap space of 256MB:

java -Xmx256m -classpath . TestDatabaseBlobInsert sun.jdbc.odbc.JdbcOdbcDriver jdbc:odbc:Movies sa ""

The result on my little notebook was the following:

Driver: sun.jdbc.odbc.JdbcOdbcDriver
Dropping old table
Creating new table
Inserting via BinaryStream
That took 78975ms
Inserting via setBytes()
That took 73419ms

Back to the issue at hand - how do we get this data out of the database? The seemingly easiest way is to do the following:

1:  PreparedStatement st = con.prepareStatement(
  "SELECT moviedata FROM MovieArchive WHERE title = ?");
2:  st.setString(1, "Babe");
3:  ResultSet rs = st.executeQuery();
4:  if (rs.next()) {
5:    byte[] data = rs.getBytes(1);
}

This code can easily cause an OutOfMemoryError if the available heap memory is less than the size of the data that you are reading. Now for the 1'000'000 dollar question: Where does OutOfMemoryError occur? That depends on your driver. If you are using the iNet SPRINTA or the Avenir drivers, then you will run out of memory on line 4, i.e. when you call rs.next(). If you are using the DataDirect, Microsoft or ODBC bridge drivers, you will only get the out of memory error on line 5.
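As a rough illustration of the condition described above, the sketch below estimates how much heap is still available and compares it with the size of the value to be read. The names HeapCheck, availableHeap and fitsInHeap are hypothetical, and the figure is only an approximation: the garbage collector and any buffers the driver allocates internally can change it at any moment.

```java
// Hypothetical helper, not part of the original test classes.
final class HeapCheck {
  /** Approximate number of bytes the JVM could still allocate. */
  static long availableHeap() {
    Runtime rt = Runtime.getRuntime();
    long used = rt.totalMemory() - rt.freeMemory();
    return rt.maxMemory() - used;
  }

  /** True if a single byte array of the given size would probably fit. */
  static boolean fitsInHeap(long bytes) {
    return bytes >= 0
        && bytes <= Integer.MAX_VALUE   // arrays are indexed by int
        && bytes < availableHeap();
  }

  public static void main(String[] args) {
    long movie = 25L * 1024 * 1024;
    System.out.println("available heap: " + availableHeap() + " bytes");
    System.out.println("25 MB fits in one piece: " + fitsInHeap(movie));
  }
}
```

Even when such a check passes, reading the value in one piece still costs at least its full size in heap per request, which is why the streaming approach scales better.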

How can we write this so that we won't get an out of memory error? Here is some sample code. It is very important that you read the data a block of bytes at a time, rather than in one big chunk, otherwise your system will definitely not scale to support many users.

import java.sql.*;
import java.io.*;

public class TestDatabaseBlobFetch {
  private static final String TABLE_SELECT =
    "SELECT moviedata FROM MovieArchive WHERE title = ?";

  private final Connection con;

  public TestDatabaseBlobFetch(String driver, String url,
      String user, String password)
      throws SQLException, ClassNotFoundException {
    Class.forName(driver);
    con = DriverManager.getConnection(url, user, password);
    System.out.println("Driver: " + driver);
  }

  public void testSelectBlocksAtATime() throws SQLException {
    long start = -System.currentTimeMillis();
    System.out.println("SELECT: 64kb blocks at a time");
    PreparedStatement stmt = con.prepareStatement(TABLE_SELECT);
    stmt.setString(1, "Babe");
    ResultSet rs = stmt.executeQuery();
    int count=0;
    if (rs.next()) {
      try {
        System.out.println("Retrieving Data");
        OutputStream out = new BufferedOutputStream(
          new FileOutputStream("Data.1"));
        InputStream in = new BufferedInputStream(
          rs.getBinaryStream(1));
        byte[] buf = new byte[65536];
        int i;
        while((i = in.read(buf, 0, buf.length)) != -1) {
          out.write(buf, 0, i);
          count += i;
        }
        out.close();
      } catch(IOException ex) { ex.printStackTrace(); }
    }
    System.out.println("fetched " + count + " bytes");
    start += System.currentTimeMillis();
    System.out.println("That took " + start + "ms");
    stmt.close();
  }

  public void testSelectWithGetBytes() throws SQLException {
    long start = -System.currentTimeMillis();
    System.out.println("SELECT: all at once");
    PreparedStatement stmt = con.prepareStatement(TABLE_SELECT);
    stmt.setString(1, "Babe");
    ResultSet rs = stmt.executeQuery();
    byte[] data = null;
    if (rs.next()) {
      System.out.println("Retrieving Data");
      data = rs.getBytes(1);
      try {
        FileOutputStream out = new FileOutputStream("Data.2");
        out.write(data, 0, data.length);
        out.close();
      } catch(IOException ex) { ex.printStackTrace(); }
    }
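    // data stays null when no row matched, so guard against a NullPointerException
    System.out.println("fetched " + (data == null ? 0 : data.length) + " bytes");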
    start += System.currentTimeMillis();
    System.out.println("That took " + start + "ms");
    stmt.close();
  }

  public void testAll() throws SQLException {
    testSelectBlocksAtATime();
    testSelectWithGetBytes();
  }

  public static void main(String[] args) throws Exception {
    if (args.length != 4) usage();
    TestDatabaseBlobFetch test = new TestDatabaseBlobFetch (
      args[0], args[1], args[2], args[3]);
    test.testAll();
  }

  private static void usage() {
    System.out.println(
      "usage: TestDatabaseBlobFetch driver url username password");
    System.exit(1);
  }
}

I tried this with several JDBC drivers; the only Type 4 driver that worked correctly was the DataDirect driver (now released under the Microsoft label). I will not go into the differences between the Microsoft driver and the others; that's for another article. If you want to try this out, you can run it like this:

java -Xmx2m -classpath .;msbase.jar;mssqlserver.jar;msutil.jar
  TestDatabaseBlobFetch com.microsoft.jdbc.sqlserver.SQLServerDriver
  jdbc:microsoft:sqlserver://localhost:1433;DatabaseName=Movies sa ""

Naturally you have to download the Microsoft SQL Server Type 4 driver and put the jar files into the directory from which you are running this code. The output from using the DataDirect Microsoft driver is the following on my machine:

Driver: com.microsoft.jdbc.sqlserver.SQLServerDriver
SELECT: 64kb blocks at a time
Retrieving Data
fetched 26214400 bytes
That took 62746ms
SELECT: all at once
Retrieving Data
Exception in thread "main" java.lang.OutOfMemoryError
        <>

The iNet SPRINTA driver falls over much sooner - actually when you call rs.next():

Driver: com.inet.tds.TdsDriver
SELECT: 64kb blocks at a time
Exception in thread "main" java.lang.OutOfMemoryError
        <>

What is wrong here?

I am finding it very hard to think of a reason to store 25 MB files in a database. They are too big to stay in the database's cache for very long. I think that the design is flawed to start with. I would personally rather store the URL to the file in the database, instead of the actual data, and then retrieve the data directly from the file system.
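To illustrate that alternative, here is a minimal sketch in which the database would hold only the location of the movie file. It assumes the location is a plain file system path read from a hypothetical varchar column; the class MovieFileStore and the method send are made-up names, and the temporary file in main stands in for a real movie. It uses java.nio.file, so it needs Java 7 or later.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch: the table stores a path, the bytes live on disk.
final class MovieFileStore {
  // 'location' stands for the value of a hypothetical varchar column
  // that holds the file's path instead of the movie bytes themselves.
  static long send(String location, OutputStream out) throws IOException {
    Path file = Paths.get(location);
    // Files.copy streams the file to 'out' and returns the byte count.
    return Files.copy(file, out);
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("movie", ".bin");
    try {
      Files.write(tmp, new byte[100000]);   // fake movie data
      ByteArrayOutputStream sink = new ByteArrayOutputStream();
      long sent = send(tmp.toString(), sink);
      System.out.println("sent " + sent + " bytes");
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

In a real application the OutputStream would typically belong to a file or a network response rather than an in-memory buffer.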

I don't know all the reasons why someone would want to do that, but just remember: you have to stream such big data out of the database chunk by chunk, otherwise you have a serious problem.
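The chunk-by-chunk idea can be captured in a small reusable helper. This is a sketch under stated assumptions rather than a definitive implementation: ChunkedCopy and copy are hypothetical names, the ByteArrayInputStream in main stands in for the stream returned by rs.getBinaryStream(1), and the try-with-resources statement requires Java 7 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper: copies a stream in fixed-size chunks so that only
// one buffer's worth of data is held in memory at a time.
final class ChunkedCopy {
  static long copy(InputStream in, OutputStream out, int bufferSize)
      throws IOException {
    if (bufferSize <= 0) {
      throw new IllegalArgumentException("bufferSize must be positive");
    }
    byte[] buf = new byte[bufferSize];
    long total = 0;
    int n;
    while ((n = in.read(buf, 0, buf.length)) != -1) {
      out.write(buf, 0, n);   // write only the bytes actually read
      total += n;
    }
    out.flush();
    return total;
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for rs.getBinaryStream(1); no database is needed
    // to demonstrate the loop.
    byte[] fake = new byte[200000];
    try (InputStream in = new ByteArrayInputStream(fake);
         ByteArrayOutputStream out = new ByteArrayOutputStream()) {
      long copied = copy(in, out, 65536);
      System.out.println("copied " + copied + " bytes");
    }
  }
}
```

With a real ResultSet, whether memory use stays bounded still depends on the driver, since some drivers load the whole value before handing out the stream.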

Until the next issue ...

Heinz


Copyright 2000-2004 Maximum Solutions, South Africa

Reprint Rights. Copyright subsists in all the material included in this email, but you may freely share the entire email with anyone you feel may be interested, and you may reprint excerpts both online and offline provided that you acknowledge the source as follows: This material from The Java(tm) Specialists' Newsletter by Maximum Solutions (South Africa). Please contact Maximum Solutions for more information.

Java and Sun are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. Maximum Solutions is independent of Sun Microsystems, Inc.
