
Is This A Function Value Collision Or A Default Override?

Look at this piece of code at the top of the script ...

[code]
function crawl_page($url, $depth = 5)
{
[/code]

And this one at the bottom of the script ...

[code]
crawl_page("http://google.com", 2);
[/code]

What is the depth of the pages to crawl: 5 or 2? And why are there two different values here?
Is one a default and the other an override? If so, is the value in the top code the default?

Full code from StackOverflow ...
I can't ask over there because I do not have the 50 points needed to comment on the question, so I am asking here.

[code]
<?php

//https://stackoverflow.com/questions/2313107/how-do-i-make-a-simple-crawler-in-php
//WORKING!

ini_set('display_errors', true);
error_reporting(E_ALL);

function crawl_page($url, $depth = 5)
{
    static $seen = array();
    if (isset($seen[$url]) || $depth === 0) {
        return;
    }

    $seen[$url] = true;

    $dom = new DOMDocument('1.0');
    @$dom->loadHTMLFile($url);

    $anchors = $dom->getElementsByTagName('a');
    foreach ($anchors as $element) {
        $href = $element->getAttribute('href');
        if (0 !== strpos($href, 'http')) {
            $path = '/' . ltrim($href, '/');
            if (extension_loaded('http')) {
                $href = http_build_url($url, array('path' => $path));
            } else {
                $parts = parse_url($url);
                $href = $parts['scheme'] . '://';
                if (isset($parts['user']) && isset($parts['pass'])) {
                    $href .= $parts['user'] . ':' . $parts['pass'] . '@';
                }
                $href .= $parts['host'];
                if (isset($parts['port'])) {
                    $href .= ':' . $parts['port'];
                }
                $href .= dirname($parts['path'], 1).$path;
            }
        }
        crawl_page($href, $depth - 1);
    }
    echo "URL:",$url,PHP_EOL,"CONTENT:",PHP_EOL,$dom->saveHTML(),PHP_EOL,PHP_EOL;
}
crawl_page("http://localhost/ebrute/crawler/6/1.php", 2);

?>
[/code]


6 Comments

@Nachfolger (May 12, 2020): Yes, that's a default parameter. You could have tested this yourself, very easily.

Function

[code]
function xyz($message = "No message provided") {
    echo $message;
}
[/code]

Tests

[code]
xyz(); // No message provided
xyz("Hey.. I overrode the default value"); // Hey.. I overrode the default value
...
[/code]
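
Applied to the crawler question, the same rule can be sketched as follows. This is only an illustration: `report_depth` is a hypothetical helper (not part of the crawler) that simply returns the depth it received, and `http://example.com` is a placeholder URL.

```php
<?php
// Hypothetical helper: same signature shape as crawl_page(), but it only
// reports which depth value it ended up with.
function report_depth(string $url, int $depth = 5): int
{
    return $depth;
}

// Second argument omitted: the default from the function definition applies.
$withDefault = report_depth("http://example.com");

// Second argument supplied: it replaces the default for this call only.
$withOverride = report_depth("http://example.com", 2);

echo $withDefault, PHP_EOL;   // 5
echo $withOverride, PHP_EOL;  // 2
```

So the call at the bottom of the script crawls to a depth of 2. Inside `crawl_page()` the recursive call always passes `$depth - 1` explicitly, so the default of 5 would only matter if the first call left the second argument out.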
@NogDog (May 12, 2020): FYI (or RTFM): https://www.php.net/manual/en/functions.arguments.php#functions.arguments.default
@developer_web (author, May 12, 2020): @Nachfolger #1618373

I added a few lines. The comments in the code mark where I added them.

I added a counter so that each URL the crawler finds and echoes gets numbered. Guess what? Every link found is numbered 3! Why is that?

Look what gets echoed:

**3:

URL:http://localhost/test/crawler/6/50.php CONTENT:

This is Page 50.

Go to:-> 1

Go to:-> 40

3:

URL:http://localhost/test/crawler/6/40.php CONTENT:

This is Page 40.

Go to:-> 50

Go to:-> 30

3:

URL:http://localhost/test/crawler/6/30.php CONTENT:

This is Page 30.

Go to:-> 40

Go to:-> 20

3:

URL:http://localhost/test/crawler/6/20.php CONTENT:

This is Page 20.

Go to:-> 30

Go to:-> 10

3:

URL:http://localhost/test/crawler/6/10.php CONTENT:

This is Page 10.

Go to:-> 20

Go to:-> 5

3:

URL:http://localhost/test/crawler/6/5.php CONTENT:

This is Page 5.

Go to:-> 10

Go to:-> 4

3:

URL:http://localhost/test/crawler/6/4.php CONTENT:

This is Page 4.

Go to:-> 5

Go to:-> 3

3:

URL:http://localhost/test/crawler/6/3.php CONTENT:

This is Page 3.

Go to:-> 4

Go to:-> 2

3:

URL:http://localhost/test/crawler/6/2.php CONTENT:

This is Page 2.

Go to:-> 3

Go to:-> 1

3:

URL:http://localhost/test/crawler/6/1.php CONTENT:

This is Page 1.

Go to:-> 2

Go to:-> 50**

Why are they all numbered "3"? The numbering should start at 1 and end at 10, since the crawler crawled 10 URLs after finding them on each page.


[code]
<?php

//https://stackoverflow.com/questions/2313107/how-do-i-make-a-simple-crawler-in-php
//WORKING!

ini_set('display_errors', true);
error_reporting(E_ALL);

function crawl_page($url, $depth = 20)
{
    static $seen = array();
    if (isset($seen[$url]) || $depth === 0) {
        return;
    }

    $seen[$url] = true;

    $dom = new DOMDocument('1.0');
    @$dom->loadHTMLFile($url);

    $anchors = $dom->getElementsByTagName('a');
    $i=1;// MY ADDED LINE
    foreach ($anchors as $element)
    {
        $href = $element->getAttribute('href');
        if (0 !== strpos($href, 'http'))
        {
            $path = '/' . ltrim($href, '/');
            if (extension_loaded('http'))
            {
                $href = http_build_url($url, array('path' => $path));
            }
            else
            {
                $parts = parse_url($url);
                $href = $parts['scheme'] . '://';
                if (isset($parts['user']) && isset($parts['pass']))
                {
                    $href .= $parts['user'] . ':' . $parts['pass'] . '@';
                }
                $href .= $parts['host'];
                if (isset($parts['port']))
                {
                    $href .= ':' . $parts['port'];
                }
                $href .= dirname($parts['path'], 1).$path;
            }
        }
        crawl_page($href, $depth - 1);
        $i++;//MY ADDED LINE
    }
    //echo "URL:",$url,PHP_EOL,"CONTENT:",PHP_EOL,$dom->saveHTML(),PHP_EOL,PHP_EOL;
    echo "$i:"; echo "<br>"; echo "URL:",$url,PHP_EOL,"CONTENT:<br>",PHP_EOL,$dom->saveHTML(),PHP_EOL,PHP_EOL; //My Added line. Added <br>.
}
crawl_page("http://localhost/test/crawler/6/1.php", 20);

?>
[/code]
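
A likely explanation for the repeated "3": `$i` is an ordinary local variable, so every call to `crawl_page()` starts it again at 1, and the loop then increments it once per link on that page. Each test page in the output above has two links, so `$i` is 3 by the time the echo runs. A counter that is shared across calls (for example a `static` variable, like `$seen`) and incremented once per crawled page would number the pages instead. The sketch below shows the idea without any network access; `$site`, `crawl_numbered`, `$count` and `$log` are hypothetical names invented for this illustration.

```php
<?php
// Hypothetical in-memory "site": page name => links found on that page.
$site = [
    'p1' => ['p2', 'p3'],
    'p2' => ['p1', 'p3'],
    'p3' => ['p1', 'p2'],
];

function crawl_numbered(array $site, string $page, int $depth = 5): array
{
    static $seen = [];   // shared across all calls, like $seen in the crawler
    static $count = 0;   // shared across all calls, so it is never reset
    static $log = [];    // collected output lines

    if (isset($seen[$page]) || $depth === 0) {
        return $log;
    }
    $seen[$page] = true;

    $count++;                    // one increment per crawled page, not per link
    $log[] = "$count: $page";

    foreach ($site[$page] as $link) {
        crawl_numbered($site, $link, $depth - 1);
    }
    return $log;
}

$log = crawl_numbered($site, 'p1');
echo implode(PHP_EOL, $log), PHP_EOL;
// 1: p1
// 2: p2
// 3: p3
```

Note that this sketch numbers a page before recursing into its links, so the numbers follow visit order. The crawler above echoes after the loop, which is why its output lists the deepest page first.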
@developer_web (author, May 12, 2020): @NogDog #1618376

Thanks for the link.

However, this was way too far over my head for me to memorise the syntax or code from the FM ;):
[code]
<?php
function makecoffee($types = array("cappuccino"), $coffeeMaker = NULL)
{
    $device = is_null($coffeeMaker) ? "hands" : $coffeeMaker;
    return "Making a cup of ".join(", ", $types)." with $device.\n";
}
echo makecoffee();
echo makecoffee(array("cappuccino", "lavazza"), "teapot");
?>
[/code]
@NogDog (May 12, 2020):

> @developer_web #1618380: However, this was way too much over my head to memorise the syntaxes or code from the FM 😉

You cannot memorize every PHP function, but you can learn how to use/read/interpret the manual. Your productivity will go way up if you concentrate on learning _**how**_ to program, instead of copying/pasting code from the web, then copying/pasting it here when it doesn't work.
@developer_web (author, May 15, 2020): @NogDog #1618388

The manual is too deep for me. For work experience, so to speak, I check out PHP tutorials, and when they show errors I come here so you guys can fix them and I learn non-buggy code.