When Java Meets CGI

By Jay Lorenzo

Both Java and ActiveX have fundamentally changed how we interact with the Internet by introducing a higher level of interactivity and functionality than was previously possible. I believe this level of interactivity will grow even further, with the emergence of a new breed of applications that rely on these technologies to be integrated into both client- and server-side components. The Jeeves API, which has recently been introduced by Sun, is an example of this trend.

As exciting as some of these changes may be, it is still important to realize that in terms of interactivity, a great many sites rely on server-based CGI programs to provide compatibility with the widest number of browser platforms. As Java and ActiveX become more common in Web use, there will be many more instances where it will be necessary to maintain compatibility with preexisting server-side CGI programs. Java-enabled pages can be an integrated part of this environment, and in the process can validate and correctly format data before it is submitted to the server.

There are some limitations to this approach. Sun has built several security restrictions into Java that will affect us when we wish to use applets for network communications. Applets loaded over the Internet are typically only permitted to communicate with the server from which they were downloaded. This means that if you want to communicate with other servers, you will most likely rely on a proxy service or another CGI back-end process on the originating host to accomplish this. In this column, we will examine how to use Java's networking capabilities to communicate via the POST method to an existing server-side CGI program.

Using the java.net.* Classes

Sun has provided a rich set of class libraries in java.net.* that make it extremely easy to provide network functionality in applets. These classes provide a quick way to create communication using TCP and UDP sockets and URL objects, without relying on platform-specific APIs to do so. For a review of these classes, point your browser to http://www.javasoft.com/products/JDK/CurrentRelease/api/ for the current API release.

How can we create code that simulates the steps taken by a browser to do a POST action to a CGI program? As you may be aware, a POST is usually handled by the browser if the FORM tag indicates that a POST action is required. Once the form data is submitted, the browser connects to the server specified in the POST action and sends a data stream that includes HTTP header information and data in a form similar to this:
POST /cgi-bin/script.cgi
Content-type: application/x-www-form-urlencoded
Content-length:

While the connection remains open, the server processes the input and returns the output, which presumably also includes HTTP header info to assist the browser in processing the request.
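To make the layout of that data stream concrete, here is a minimal sketch that assembles the request text as a single string: request line, headers, a blank line, and then the form data. The class name PostRequestText and the method name build are hypothetical, chosen only for illustration, and the sketch assumes HTTP/1.0 and form data that is already URL encoded.

```java
import java.nio.charset.StandardCharsets;

class PostRequestText {
    // Assembles the raw text of an HTTP/1.0 POST request.
    // Assumption: formData is already URL encoded, so it contains only
    // single-byte characters.
    static String build(String path, String formData) {
        // Content-length counts the bytes of the body, not the headers.
        int length = formData.getBytes(StandardCharsets.ISO_8859_1).length;
        return "POST " + path + " HTTP/1.0\r\n"
             + "Content-type: application/x-www-form-urlencoded\r\n"
             + "Content-length: " + length + "\r\n"
             + "\r\n"            // blank line: end of headers, start of data
             + formData;
    }

    public static void main(String[] args) {
        System.out.print(build("/cgi-bin/script.cgi", "input1=value1&input2=value2"));
    }
}
```

The blank line between the headers and the data is what tells the server where the headers stop, and the Content-length value tells it how many bytes of data to expect.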

To approximate this behavior in Java, we can write code that opens a new TCP socket, connecting to a specific host. For clarity, we will simplify this discussion by omitting error-handling code that should be implemented to catch any failures that may occur along the way. We will pretend that we already have an encoded string called form_data, which contains the URL encoded data that is being sent to the server:

Socket my_socket = new Socket("www.cgi-host.com", 80);
This creates the socket object, and in doing so also creates the subsequent connection to the host www.cgi-host.com on port 80, the well-known HTTP port. As a side note, the current JDK (Java Development Kit, version 1.0.2 as of this writing) will not allow the creation of sockets with a numeric IP address if that address cannot be resolved by a reverse lookup in the hosts file or the DNS server. JDK 1.1 will fix this limitation. If the creation of the socket and connection to the host is successful, we then set up two data streams that will be responsible for handling the data that will be passed between the applet and server:
// cgi_data starts out empty so that the response can be appended to it later
String cgi_data = "", temp_data;
DataOutputStream send_data = new DataOutputStream(my_socket.getOutputStream());
DataInputStream read_data = new DataInputStream(my_socket.getInputStream());
Once we succeed in setting up the streams, we then write out the necessary header information that is to be processed by the server, by using the writeBytes method of the DataOutputStream object:
send_data.writeBytes("POST /cgi-bin/script.cgi HTTP/1.0\r\n");
send_data.writeBytes("Content-type: application/x-www-form-urlencoded\r\n");
send_data.writeBytes("Content-length: " + form_data.length() + "\r\n");
// A blank line marks the end of the headers and the start of the data
send_data.writeBytes("\r\n");
send_data.writeBytes(form_data);
The connection is hopefully still open, so now we will receive the output from the CGI program. We are going to assume that there is no URL encoding in the response, which means there is no need to decode the response. The plan here is to temporarily store the response into temp_data, and then concatenate it to the cgi_data string object:

while ((temp_data = read_data.readLine()) != null) {
  cgi_data += temp_data + "\n";
}
Once we have captured the response from the server, we will then close the socket:
my_socket.close();
Then we proceed to process the data accordingly. One issue that we have intentionally ignored is the fact that the data is POSTed as x-www-form-urlencoded data, meaning that the data we send to the server needs to be encoded before transmission. This encoding is relatively simple but somewhat time-consuming: we need to collect all of our INPUT and VALUE pairs and put them into a string where each pair takes the form INPUT1=VALUE1&INPUT2=VALUE2, and so on. Within each name and value, spaces are replaced with plus signs, and reserved or non-alphanumeric characters are replaced with a percent sign followed by their two-digit hexadecimal code (for example, & becomes %26).
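As a rough illustration of that encoding step, the sketch below builds a form_data style string from a set of name and value pairs using java.net.URLEncoder. The class name FormEncoder, the field names, and the choice of UTF-8 are assumptions made for this example. It uses the two-argument URLEncoder.encode method, which takes a character set name and was added to the JDK after the release discussed in this column.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

class FormEncoder {
    // Joins the pairs as NAME1=VALUE1&NAME2=VALUE2, URL encoding each
    // name and each value separately so that '=' and '&' stay as delimiters.
    static String encode(Map<String, String> fields) throws UnsupportedEncodingException {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> field : fields.entrySet()) {
            if (out.length() > 0) {
                out.append('&');
            }
            out.append(URLEncoder.encode(field.getKey(), "UTF-8"))
               .append('=')
               .append(URLEncoder.encode(field.getValue(), "UTF-8"));
        }
        return out.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // LinkedHashMap keeps the fields in insertion order.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("author", "Jay Lorenzo");   // hypothetical form fields
        fields.put("topic", "Java & CGI");
        System.out.println(encode(fields));
    }
}
```

Encoding the names and values before joining them matters: if the whole string were encoded in one pass, the = and & separators would be escaped as well and the CGI program could no longer split the pairs apart.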

As straightforward as this example is (especially when compared to writing it from scratch in C or C++), we can simplify it even further by using the java.net.URL* classes, and rely on the classes' protocol handlers to do most of the work for us. In particular, the URLConnection class provides a simple way to pass data to and from CGI POST without requiring direct manipulation of the datastream.

One of the most significant features of the URL classes is the fact that they contain protocol handlers for commonly used Internet protocols. Even more promising is the fact that new protocol handlers can be built for specific purposes, which can remove a tremendous amount of complexity when writing network-based communications. Let's take a look at how to implement some of these classes within the context of HTTP.

The URLConnection

The URLConnection class contains a protocol handler that will do all of the decoding and handling necessary when interfacing with CGI-based systems. Once we create a new URL object, we simply pass the URL object the necessary information needed to create a URL connection, such as the protocol, host name, port, and file name, and then have the URLConnection handle the data (again, for reasons of clarity, we will omit the exception handling code that is necessary for this to run):
URL my_url = new URL("http://www.cgi-host.com/cgi-bin/script.cgi");
URLConnection connection = my_url.openConnection();
connection.setDoOutput(true);
This creates a connection to the CGI host and invokes a handler to take care of the data that will be passed back and forth. Calling setDoOutput(true) tells the URLConnection that we intend to write data to it. It also serves as a workaround for Netscape browsers: certain versions of Navigator do not allow POST methods, due to security issues. As of this writing (using the Navigator 3.0 final release), POST actions will be accepted, provided this method is set to true.

We then send out data through the use of a PrintStream. Once we are done sending our data, we close the PrintStream and read the response:

PrintStream send_data = new PrintStream(connection.getOutputStream());
send_data.print("input1=value1\n\n");
send_data.close();
We would print as many INPUT VALUE pairs as our form requires and then close the PrintStream. Once we have closed the PrintStream, we can use a DataInputStream or other mechanism to read the results back in, in a fashion similar to our first example.
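Putting the URLConnection steps together with the exception handling we left out, a complete exchange might look something like the sketch below. The class name CgiPoster, the method names post and readAll, and the ISO-8859-1 character set are assumptions made for this example. It also relies on the try-with-resources statement from later Java releases, so treat it as one possible arrangement rather than the only way to structure the code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.Reader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class CgiPoster {
    // Collects the response line by line, ending each line with "\n".
    static String readAll(Reader source) throws IOException {
        StringBuilder text = new StringBuilder();
        BufferedReader lines = new BufferedReader(source);
        String line;
        while ((line = lines.readLine()) != null) {
            text.append(line).append('\n');
        }
        return text.toString();
    }

    // Sends already-encoded form data to the CGI program and returns its output.
    static String post(URL cgiUrl, String formData) throws IOException {
        URLConnection connection = cgiUrl.openConnection();
        connection.setDoOutput(true);   // we intend to write a request body
        connection.setRequestProperty("Content-Type", "application/x-www-form-urlencoded");
        try (OutputStream out = connection.getOutputStream()) {
            out.write(formData.getBytes(StandardCharsets.ISO_8859_1));
        }
        // Asking for the input stream sends the request and waits for the reply.
        try (Reader in = new InputStreamReader(connection.getInputStream(),
                                               StandardCharsets.ISO_8859_1)) {
            return readAll(in);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java CgiPoster <cgi-url> <encoded-form-data>");
            return;
        }
        try {
            System.out.print(post(new URL(args[0]), args[1]));
        } catch (IOException e) {
            // Covers malformed URLs, unknown hosts, and failed connections.
            System.err.println("POST failed: " + e);
        }
    }
}
```

It could be run as java CgiPoster http://www.cgi-host.com/cgi-bin/script.cgi "input1=value1". Within an applet, remember that the URL must point back to the host from which the applet was loaded.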

In closing, let me mention a few resources to explore. Check out this nice "How To..." page on CGI and Java.

Finally, if you are planning to pursue CGI and Java integration, be sure to read the Javasoft security FAQ at http://www.javasoft.com/sfaq/index.html.


Copyright © 2000 internet.com Corporation. All rights reserved.
