java searching html


athanach
11-09-2007, 06:44 AM
I want to use Java to search HTML tags. More specifically, I want to know whether a web page offers an RSS feed. If it does, the feed is declared in a tag in the page's head (a link tag, as far as I can see). I then want to extract the RSS link and store it in a database. To visit the pages I use a Java crawler.

Can anyone help me?

athanach
11-09-2007, 10:29 AM
Do you need any clarification? My English isn't very good.

Could someone reply?

chazzy
11-10-2007, 07:08 PM
Couldn't you use DOM to parse the web page?

athanach
11-11-2007, 09:55 AM
Is there a good DOM API?

What I want to do is find the word "rss" and store the URL:

<head>

<title>SPORT 24</title>
<meta http-equiv="content-type" content="text/html; charset=windows-1253"/>

<link href="/ast/css/s24_022.css" type="text/css" rel="stylesheet"/>
<script language="javascript" src="/ast/js/s24_007.js"></script>
<link rel="Shortcut Icon" href="/favicon.ico" >
<link rel="icon" href="/favicon.png" type="image/png">


<link rel="alternate" type="application/rss+xml" href="http://www.sport24.gr/svc/rss/topNews/" title="SPORT 24 RSS: Σημαντικότερες ειδήσεις" />
<link rel="alternate" type="application/rss+xml" href="http://www.sport24.gr/svc/rss/lastNews/" title="SPORT 24 RSS: Ροή ειδήσεων, όλες οι κατηγορίες" />

</head>

For example, to find this tag by its type attribute:

<link rel="alternate" type="application/rss+xml" href="http://www.sport24.gr/svc/rss/topNews/" title="SPORT 24 RSS: Σημαντικότερες ειδήσεις" />

and store the

href="http://www.sport24.gr/svc/rss/topNews/"


Any help?

chazzy
11-11-2007, 02:08 PM
The org.w3c.dom package, maybe? It's what I always use.
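
A rough sketch of how that could look with the JDK's own parser, assuming the page is well-formed XHTML. The sample head posted above is not (two of its link tags are never closed), so a strict XML parser would reject it as it stands. The class name DomRssFinder and the method findRssLinks are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: collects RSS feed URLs from well-formed XHTML.
public class DomRssFinder {

    /** Returns the href of every link element whose type is application/rss+xml. */
    public static List<String> findRssLinks(String xhtml) {
        Document doc;
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            // Thrown, for example, when the markup is not well-formed XML.
            throw new IllegalArgumentException("Could not parse page as XML", e);
        }

        List<String> hrefs = new ArrayList<>();
        NodeList links = doc.getElementsByTagName("link");
        for (int i = 0; i < links.getLength(); i++) {
            Element link = (Element) links.item(i);
            // getAttribute returns "" when the attribute is absent.
            if ("application/rss+xml".equalsIgnoreCase(link.getAttribute("type"))) {
                hrefs.add(link.getAttribute("href"));
            }
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Cut-down, well-formed version of the head from the question.
        String page = "<html><head><title>SPORT 24</title>"
                + "<link rel=\"icon\" href=\"/favicon.png\" type=\"image/png\"/>"
                + "<link rel=\"alternate\" type=\"application/rss+xml\""
                + " href=\"http://www.sport24.gr/svc/rss/topNews/\"/>"
                + "</head><body/></html>";
        System.out.println(findRssLinks(page));
    }
}
```

Each returned string is the value you would then write to your database.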

This also came up in a quick Google search:

http://htmlparser.sourceforge.net/
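
If you would rather not add a library, here is a minimal sketch that scans the raw HTML with java.util.regex instead of building a DOM, so it tolerates unclosed tags like the ones in your sample. It is only an approximation: it assumes attribute values are quoted and contain no ">" character, and it does not skip comments or scripts. The names RssLinkScanner, findRssLinks and attr are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: tolerant scan for RSS <link> tags in raw HTML.
public class RssLinkScanner {

    // Matches a whole <link ...> tag, closed with ">" or "/>".
    private static final Pattern LINK_TAG =
            Pattern.compile("<link\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Matches name="value" or name='value' inside a tag.
    private static final Pattern ATTR =
            Pattern.compile("([a-zA-Z-]+)\\s*=\\s*(\"([^\"]*)\"|'([^']*)')");

    /** Returns the value of the named attribute, or null if it is missing. */
    static String attr(String tag, String name) {
        Matcher m = ATTR.matcher(tag);
        while (m.find()) {
            if (m.group(1).equalsIgnoreCase(name)) {
                return m.group(3) != null ? m.group(3) : m.group(4);
            }
        }
        return null;
    }

    /** Returns the href of every link tag whose type is application/rss+xml. */
    public static List<String> findRssLinks(String html) {
        List<String> hrefs = new ArrayList<>();
        Matcher m = LINK_TAG.matcher(html);
        while (m.find()) {
            String tag = m.group();
            String href = attr(tag, "href");
            if (href != null
                    && "application/rss+xml".equalsIgnoreCase(attr(tag, "type"))) {
                hrefs.add(href);
            }
        }
        return hrefs;
    }

    public static void main(String[] args) {
        // Part of the head from the question, including an unclosed link tag.
        String head = "<head>"
                + "<link rel=\"Shortcut Icon\" href=\"/favicon.ico\" >"
                + "<link rel=\"alternate\" type=\"application/rss+xml\""
                + " href=\"http://www.sport24.gr/svc/rss/topNews/\" />"
                + "<link rel=\"alternate\" type=\"application/rss+xml\""
                + " href=\"http://www.sport24.gr/svc/rss/lastNews/\" />"
                + "</head>";
        for (String url : findRssLinks(head)) {
            System.out.println(url);
        }
    }
}
```

Your crawler would pass each downloaded page to findRssLinks and store whatever comes back. For pages with messier markup, a real HTML parser such as the one linked above is likely the safer choice.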