error in input argument


athanach
11-23-2007, 07:03 AM
I am running the code below in NetBeans 5.5.1 and it can't take the input argument that I give it.

Any suggestions?


import java.io.File;

// DOM imports
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;

// Vendor parser
import org.apache.xerces.parsers.DOMParser;

/**
* <b><code>ItemSearcher</code></b> shows how the DOM Level 2 Traversal
* module can be used for searching through a document.
*/
public class ItemSearcher {

    /** The default namespace for the document to search through */
    private String docNS = "http://www.oreilly.com/javaxml2";

    /**
     * <p>This method takes a file, and searches it for specific
     * pieces of data using DOM traversal.</p>
     *
     * @param filename name of XML file to search through.
     * @throws <code>Exception</code> - generic problem handling.
     */
    public void search(String filename) throws Exception {
        // Parse into a DOM tree
        File file = new File(filename);
        DOMParser parser = new DOMParser();
        parser.parse(file.toURL().toString());
        Document doc = parser.getDocument();

        // Get node to start iterating with
        Element root = doc.getDocumentElement();
        NodeList descriptionElements =
            root.getElementsByTagNameNS(docNS, "description");
        Element description = (Element)descriptionElements.item(0);

        // Get a NodeIterator
        NodeIterator i = ((DocumentTraversal)doc)
            .createNodeIterator(description, NodeFilter.SHOW_ALL,
                new FormattingNodeFilter(), true);

        Node n;
        while ((n = i.nextNode()) != null) {
            System.out.println("Search phrase found: '" + n.getNodeValue() + "'");
        }
    }

    /**
     * <p>Provide a static entry point.</p>
     */
    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("No item files to search through specified.");
            return;
        }

        try {
            ItemSearcher searcher = new ItemSearcher();
            for (int i=0; i<args.length; i++) {
                System.out.println("Processing file: " + args[i]);
                searcher.search(args[i]);
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

class FormattingNodeFilter implements NodeFilter {

    public short acceptNode(Node n) {
        if (n.getNodeType() == Node.TEXT_NODE) {
            Node parent = n.getParentNode();
            if ((parent.getNodeName().equalsIgnoreCase("b")) ||
                (parent.getNodeName().equalsIgnoreCase("i"))) {
                return FILTER_ACCEPT;
            }
        }
        // If we got here, not interested
        return FILTER_SKIP;
    }
}

chazzy
11-23-2007, 10:26 AM
Does it give you an error when you give it the argument? What argument do you give it?

athanach
11-23-2007, 11:57 AM
No, it doesn't give me an error. It runs and prints

"No item files to search through specified."

As input I give an HTML or XML file, and I change the "description" string to the word "link" to find the link tag.

chazzy
11-24-2007, 08:18 AM
Based on your main method, that's because you didn't give it any arguments:


public static void main(String[] args) {
    if (args.length == 0) {
        System.out.println("No item files to search through specified.");
        return;
    }


When running a Java class from the command line, the syntax is:


java -cp (put in your classpath here) ItemSearcher args
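
To make the point above concrete, here is a minimal sketch of how whatever follows the class name on the command line ends up in the args array of main. ArgsDemo and the file names are hypothetical and only for illustration; they are not part of ItemSearcher.

```java
// Hypothetical demo class, not part of the ItemSearcher code in this thread.
class ArgsDemo {

    // Builds the text that main() prints, one line per argument.
    static String describe(String[] args) {
        if (args.length == 0) {
            return "No arguments given.\n";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < args.length; i++) {
            sb.append("args[").append(i).append("] = ").append(args[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Everything after the class name on the command line arrives here.
        System.out.print(describe(args));
    }
}
```

Running it as "java -cp . ArgsDemo items.xml page.html" should list both file names, while running it with nothing after the class name takes the same empty-args branch that ItemSearcher hits.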

athanach
11-24-2007, 08:39 AM
I run it from NetBeans. Can you tell me how to do that there, and what arguments to give?

chazzy
11-24-2007, 09:29 AM
I have no idea what arguments to give it. This is your program, no?

As for how to get it to run in NetBeans, maybe this is what you need?

http://www.netbeans.org/kb/55/using-netbeans/deploy.html

athanach
11-24-2007, 10:23 AM
No, it's code that I found on the web. I want it to search HTML pages and I can't understand what to give it.

athanach
11-24-2007, 10:41 AM
It's OK with the argument now, but when I run it I get a warning for the line

parser.parse(file.toURL().toString());

Warning: [deprecation] toURL() in java.io.File has been deprecated

And when I run it, it goes into the search method but doesn't tell me whether it found the word.

chazzy
11-24-2007, 05:07 PM
All that it's saying is that the file.toURL() method is deprecated, which it is. The replacement that the File documentation recommends is file.toURI().toURL().
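
A small sketch of getting rid of the deprecation warning, assuming the parser only needs a file: URL string as its system identifier. FileUrlDemo and toSystemId are hypothetical names; File.toURI() and URI.toURL() are the standard JDK calls.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical demo class showing the non-deprecated way to get a file: URL.
class FileUrlDemo {

    // Returns a file: URI string; unlike File.toURL(), toURI() escapes
    // characters such as spaces that are not legal in a URL.
    static String toSystemId(File file) {
        return file.toURI().toString();
    }

    public static void main(String[] args) throws MalformedURLException {
        // "items.xml" is only a placeholder file name.
        File file = new File(args.length > 0 ? args[0] : "items.xml");
        URL url = file.toURI().toURL();   // replacement for file.toURL()
        System.out.println(url);
        System.out.println(toSystemId(file));
    }
}
```

In the search method this would presumably mean writing parser.parse(file.toURI().toString()) instead of parser.parse(file.toURL().toString()).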

It looks like it's only going to match well-formed HTML (proper XML syntax).
So if you create a page like this to search against:

<html>
<b><i>Hello, world!</b>

it won't match it. It looks like it's looking for a <b><i></i></b> sequence only, if I read the code correctly.

Have you tried contacting the original author though?

athanach
11-25-2007, 08:20 AM
I can't find the developer.

I thought that by putting the tag I want to search for in the "description" string, I could search for that tag. I mean this line:

NodeList descriptionElements =
root.getElementsByTagNameNS(docNS, "description");


So for the HTML page

<html>
<b><i>Hello, world!</b>

I would put <b> in place of description and it would find it for me.

Can you help me with this in any way?

And chazzy, thanks a lot for the help you've given me.

chazzy
11-25-2007, 10:50 AM
Well, in this class:

class FormattingNodeFilter implements NodeFilter {

    public short acceptNode(Node n) {
        if (n.getNodeType() == Node.TEXT_NODE) {
            Node parent = n.getParentNode();
            if ((parent.getNodeName().equalsIgnoreCase("b")) ||
                (parent.getNodeName().equalsIgnoreCase("i"))) {
                return FILTER_ACCEPT;
            }
        }
        // If we got here, not interested
        return FILTER_SKIP;
    }
}

You're telling it to only accept text nodes whose parent element is b or i.

Either way, description isn't a valid HTML tag.
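
To see what that filter accepts, here is a self-contained sketch that runs the same parent-is-b-or-i test over a small well-formed snippet. It uses the JDK's built-in parser rather than the Xerces DOMParser from the original post, and it assumes the JDK's DOM implementation supports the Traversal module (the code checks for that instead of taking it for granted). FilterDemo and formattedText are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.traversal.DocumentTraversal;
import org.w3c.dom.traversal.NodeFilter;
import org.w3c.dom.traversal.NodeIterator;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; mirrors the logic of FormattingNodeFilter.
class FilterDemo {

    // Returns the text of every text node whose parent element is b or i.
    static List<String> formattedText(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse input", e);
        }
        if (!(doc instanceof DocumentTraversal)) {
            throw new IllegalStateException("DOM implementation lacks Traversal support");
        }
        // Same test as FormattingNodeFilter: accept text nodes under b or i.
        NodeFilter filter = node -> {
            if (node.getNodeType() == Node.TEXT_NODE) {
                String parent = node.getParentNode().getNodeName();
                if (parent.equalsIgnoreCase("b") || parent.equalsIgnoreCase("i")) {
                    return NodeFilter.FILTER_ACCEPT;
                }
            }
            return NodeFilter.FILTER_SKIP;
        };
        NodeIterator it = ((DocumentTraversal) doc).createNodeIterator(
                doc.getDocumentElement(), NodeFilter.SHOW_ALL, filter, true);
        List<String> found = new ArrayList<>();
        for (Node hit = it.nextNode(); hit != null; hit = it.nextNode()) {
            found.add(hit.getNodeValue());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(formattedText("<p>plain <b>bold</b> and <i>ital</i></p>"));
    }
}
```

Note that the iterator here starts at the document element. The original code starts at the first description element instead, so when the page has no such element the starting node is null and the traversal cannot begin there.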

athanach
11-26-2007, 08:01 AM
I have changed the line

parser.parse(file.toURL().toString());

to

parser.parse(file.toString());

and there is no warning or error.

But in the search method, printing the element root gives
[html:null]
the description gives
[#document:null]
and the variable i gives
null
so it doesn't enter the while loop to print the results.

Can you help me one more time?



public void search(String filename) throws Exception {
    // Parse into a DOM tree
    File file = new File(filename);
    DOMParser parser = new DOMParser();
    parser.parse(file.toURL().toString());
    Document doc = parser.getDocument();

    // Get node to start iterating with
    Element root = doc.getDocumentElement();
    NodeList descriptionElements =
        root.getElementsByTagNameNS(docNS, "description");
    Element description = (Element)descriptionElements.item(0);

    // Get a NodeIterator
    NodeIterator i = ((DocumentTraversal)doc)
        .createNodeIterator(description, NodeFilter.SHOW_ALL,
            new FormattingNodeFilter(), true);

    Node n;
    while ((n = i.nextNode()) != null) {
        System.out.println("Search phrase found: '" + n.getNodeValue() + "'");
    }
}

chazzy
11-26-2007, 08:58 PM
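File's toString() method doesn't do what you think it's doing. It returns the path string the File was built from, not a URL, so the parser is no longer handed a proper file: system identifier. (In general, the default toString method returns the FQCN plus a hash code rather than the contents of the object, but File overrides it.)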
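
A quick sketch of what the two toString() flavours print. FileToStringDemo and samePathString are hypothetical names and "items.xml" is a placeholder; the file does not need to exist.

```java
import java.io.File;

// Hypothetical demo class comparing File.toString() with the Object default.
class FileToStringDemo {

    // True when toString() is just the path string, which is what File documents.
    static boolean samePathString(File file) {
        return file.toString().equals(file.getPath());
    }

    public static void main(String[] args) {
        File file = new File("items.xml");
        // File overrides toString(): this prints the path it was created with.
        System.out.println(file.toString());
        System.out.println(samePathString(file));
        // The default from Object prints the class name, '@', and a hash code.
        System.out.println(new Object().toString());
    }
}
```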

athanach
11-27-2007, 04:36 AM
I think the error is somewhere in the lines

DOMParser parser = new DOMParser();
parser.parse(file.toURL().toString());
Document doc = parser.getDocument();

because doc is null and then the other variables become null. Am I doing something wrong?

I put the toURL method back as you said, but the only difference is the warning that toURL has been deprecated.

athanach
11-27-2007, 05:55 AM
OK, it's running!

But a new problem has occurred. The page that the program searches must be well formed, otherwise it gives me page errors. Can you tell me how to leave that check out of the program?

Also, I think the problem was that I used the wrong namespace in

/** The default namespace for the document to search through */
private String docNS = "http://www.oreilly.com/javaxml2";


so I put in www.w3.org/1999/xhtml and it works. Is it the same for HTML?
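
A sketch of why the namespace string matters: getElementsByTagNameNS only matches elements whose namespace URI is exactly the string you pass, and for XHTML that string is "http://www.w3.org/1999/xhtml", including the http:// prefix. The example uses the JDK's built-in parser with namespace awareness switched on; NamespaceDemo and countBold are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class for namespace-qualified element lookup.
class NamespaceDemo {

    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Counts <b> elements that belong to the given namespace.
    static int countBold(String xml, String namespace) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for the *NS lookups to work
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement()
                    .getElementsByTagNameNS(namespace, "b").getLength();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse input", e);
        }
    }

    public static void main(String[] args) {
        String page = "<html xmlns=\"" + XHTML_NS + "\"><body><b>hi</b></body></html>";
        // Matching namespace: the element is found.
        System.out.println(countBold(page, XHTML_NS));
        // The namespace from the original code does not occur in this page.
        System.out.println(countBold(page, "http://www.oreilly.com/javaxml2"));
    }
}
```

A page that declares no xmlns at all has elements in no namespace, so neither URI would match it.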

athanach
11-29-2007, 04:54 AM
Any suggestions for my problem?

It doesn't run for most pages: it shows page errors instead of the elements inside the tags.

For namespaces I found
html : www.w3.org/TR/REC-html40
xhtml : www.w3.org/1999/xhtml

Is the namespace the problem, or something else?

chazzy
11-29-2007, 07:27 AM
It's probably the fact that you're looking at pages that aren't well formed. Without seeing which pages you're looking at, there's not much I can tell you.

athanach
11-29-2007, 07:44 AM
For example, pages like
www.in.gr
my.ceid.upatras.gr
www.ajaxian.com
give me page errors, just as other programs I have that check HTML pages do.

chazzy
11-29-2007, 08:29 PM
The first page (in.gr) is not well formed (it has no root tag).

ajaxian.com is an XHTML page, not HTML. I'm not sure, but your code might care about the difference.

athanach
11-30-2007, 06:14 AM
For example, on www.ajaxian.com the error it gives me is

The element type "img" must be terminated by the matching end-tag "</img>"

Can I take out the page check and keep only the search, or does parsing a page always do the check?

Also, about whether the program recognizes the HTML code: I no longer use a namespace. I changed the line

root.getElementsByTagNameNS(docNS, "description");
to
root.getElementsByTagName( "description");
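
The "img" message quoted above is a well-formedness error, and as far as I know an XML parser reports those as fatal, so it is part of parsing rather than a separate check. A minimal sketch with the JDK's built-in parser (not the Xerces DOMParser used in the thread) shows the difference between an unclosed and a self-closed img tag. WellFormedDemo and isWellFormed are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: reports whether markup parses as well-formed XML.
class WellFormedDemo {

    static boolean isWellFormed(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Raised for markup that is not well formed, e.g. an unclosed tag.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException("Parser setup or input failure", e);
        }
    }

    public static void main(String[] args) {
        // HTML-style img with no end tag: rejected by an XML parser.
        System.out.println(isWellFormed("<p><img src=\"a.png\"></p>"));
        // XHTML-style self-closing img: accepted.
        System.out.println(isWellFormed("<p><img src=\"a.png\"/></p>"));
    }
}
```

Pages written as ordinary HTML often leave tags like img unclosed, so a strict XML parser will likely reject them however the search code is written; searching such pages would need a parser that tolerates HTML.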