How to get a list of links in a web page?

fonkfonk
New User
Posts: 9
Joined: Mon Oct 06, 2003 10:50 am

Post by fonkfonk »

Hello,

I'm planning to write a program that generates a list of the links in a web page.

I assume I would have to download the page and then parse the <a href=""></a> lines.

Is there a way to recursively read all the links online? (I found nothing for this particular purpose.)

Thanks for any ideas.

Pierre
plouf
Enthusiast
Posts: 284
Joined: Fri Apr 25, 2003 6:35 pm
Location: Athens,Greece

Post by plouf »

There are some COM ways (DCOM, I think), but I don't know more.
Personally, I believe the simpler way is to download the whole page as .html and repeatedly search for "href=" references.
Christos
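
The "search for href=" approach above can be sketched as follows. This is only an illustration in Python, not PureBasic, and it assumes the page has already been downloaded into a string; the function name find_hrefs and the sample HTML are made up for the example. It only picks up quoted attribute values, and it will also catch href= in tags other than <a> (for example <link>).

```python
def find_hrefs(html):
    """Return the quoted values that follow each href= in html, in order."""
    links = []
    pos = 0
    while pos < len(html):
        # Compare a 5-character window case-insensitively (HREF= also counts).
        if html[pos:pos + 5].lower() != "href=":
            pos += 1
            continue
        pos += 5  # step over "href="
        if pos < len(html) and html[pos] in "\"'":
            quote = html[pos]
            end = html.find(quote, pos + 1)
            if end == -1:
                break  # unterminated attribute value, stop scanning
            links.append(html[pos + 1:end])
            pos = end + 1
    return links


# Hypothetical sample page; a real program would download the HTML first.
sample = '<p><a href="http://example.com/a">A</a> <A HREF=\'b.html\'>B</a></p>'
for link in find_hrefs(sample):
    print(link)
```

Relative links such as b.html would still have to be resolved against the page's address before they can be followed.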
Max.
Enthusiast
Posts: 225
Joined: Fri Apr 25, 2003 8:39 pm

Post by Max. »

Maybe the source code of Freak's IETool can give you a hint:

viewtopic.php?t=2698

It's an extension to MS Internet Explorer that lets you compile selected text from an HTML page with PureBasic.

Next, I'd check FloHimself's Regular Expression Library; it's ideal for parsing links with an expression like

Code:

<a href=".*</a>
I don't have a link for Flo's lib handy, but if you search for "Regular Expression", you should find it quickly.
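
One caveat with an expression like the one above: in most regex engines ".*" is greedy, so if two links sit on the same line it can match from the first <a to the last </a> in one go. Here is a rough sketch of a non-greedy variant, written with Python's re module rather than FloHimself's library (whose syntax may differ); the pattern, the name extract_links and the sample string are assumptions for illustration, and only double-quoted href values are handled.

```python
import re

# Non-greedy variant of the pattern from the post: ".*?" stops at the first
# closing quote or </a> instead of running on to the last one.
LINK_RE = re.compile(
    r'<a\s[^>]*?href="(.*?)"[^>]*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)


def extract_links(html):
    """Return (url, link text) pairs for each double-quoted anchor in html."""
    return LINK_RE.findall(html)


# Hypothetical sample with two links on one line.
sample = '<a href="one.html">One</a> and <a class="x" href="two.html">Two</a>'
for url, text in extract_links(sample):
    print(url, text)
```

A regular expression is fine for a quick list of links, but it will miss unusual markup, so treat the result as approximate.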
freak
PureBasic Team
Posts: 5953
Joined: Fri Apr 25, 2003 5:21 pm
Location: Germany

Post by freak »

> Maybe the source code of Freak's IETool can give you a hint:

My tool only simulates CTRL+C to copy the text; you can't read the whole
contents of the page that way.
Timo
quidquid Latine dictum sit altum videtur
fonkfonk
New User
Posts: 9
Joined: Mon Oct 06, 2003 10:50 am

Post by fonkfonk »

Well, it seems that plouf and I agree on the same idea...
And this is easier for me! :roll:

Thanks all for taking the time to find a solution!

Regards,

Pierre