Hi there IR
I wonder if it's possible (with PHP everything is possible, right?) to get data from other websites.
Example:
I know a news website and I wanted to create an algorithm (I don't even know what this means, but more like a system) that catches all the news about a subject I specify. For example, I give it "Sporting" (my soccer team) and it searches the recent news page for news with that name.
So for that I need to retrieve every news headline and so on into an array, compare each one against my given string, and then display only what I want.
Thank you.
PHP Get Data from other websites
- vitinho444
- Posts: 2825
- Joined: Mon Mar 21, 2011 4:54 pm
Re: PHP Get Data from other websites
You definitely can use an API in PHP. Not sure how, but I know you can!
- vitinho444
- Posts: 2825
- Joined: Mon Mar 21, 2011 4:54 pm
Re: PHP Get Data from other websites
What API? I googled it and people talked about a function built into PHP, but I can't make it work :S Maybe there are other methods.
- Jackolantern
- Posts: 10893
- Joined: Wed Jul 01, 2009 11:00 pm
Re: PHP Get Data from other websites
There are two methods to get data from another website: web services and screen scraping. A web service uses a publicly available interface to request data, often in JSON or XML format that you can use. Web services are the basis for "mashup" websites that combine data from many different sources. Of course, to use a web service, the data source has to actually publish it and make it available. Here is some more info and links on how to use web services. Halls also created a program that is somewhere around here that accesses a web service for stock info, which was his stock market game. You may not be able to use the website you intended since they may not maintain a web service, but there are literally thousands out there, so it is likely somebody has a web service that offers what you want for free (many web services are not free). Here is one of the definitive web service directories.
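To make the web service route a bit more concrete, here is a minimal sketch of consuming a JSON service from PHP. The endpoint URL, the "title" field, and the helper names (parseNewsJson, filterByKeyword, fetchBody) are all assumptions for illustration; a real service will document its own URL and response layout.

```php
<?php
// Hypothetical endpoint: replace it with a real web service you are allowed to use.
const NEWS_API_URL = 'https://api.example.com/news?format=json';

/**
 * Decode a JSON response body into a PHP array.
 * Assumes the service returns a JSON array of objects with a "title" field.
 */
function parseNewsJson(string $json): array
{
    $data = json_decode($json, true);
    if (!is_array($data)) {
        throw new RuntimeException('Unexpected response: ' . json_last_error_msg());
    }
    return $data;
}

/** Keep only the items whose title contains the keyword (case-insensitive). */
function filterByKeyword(array $items, string $keyword): array
{
    return array_values(array_filter($items, function ($item) use ($keyword) {
        return isset($item['title']) && stripos($item['title'], $keyword) !== false;
    }));
}

/** Fetch the raw response body; needs allow_url_fopen to be enabled. */
function fetchBody(string $url): string
{
    $body = file_get_contents($url);
    if ($body === false) {
        throw new RuntimeException("Could not fetch $url");
    }
    return $body;
}

// Usage (commented out because the URL above is only a placeholder):
// $news = filterByKeyword(parseNewsJson(fetchBody(NEWS_API_URL)), 'Sporting');
// foreach ($news as $item) { echo $item['title'], "\n"; }
```

Keeping the download, the decoding, and the filtering in separate functions means the filtering can be tried on a hand-written JSON string before any real service is involved.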
The second method is quite a bit more complicated, and should only be used if there is no available web service. However, it can be made quite a bit easier by several libraries out there. Screen scraping is basically where you download an entire HTML file from another site and parse out the info you want from it. The more complicated aspect is building up the DOM inside of PHP and then traversing it, but there are libraries for that. Here is a tutorial that can help get you started if you have to go that route.
EDIT: Here is Halls' stock market game that consumes stock market web services.
The indelible lord of tl;dr
- vitinho444
- Posts: 2825
- Joined: Mon Mar 21, 2011 4:54 pm
Re: PHP Get Data from other websites
Wow that's a nice explanation.
I've followed a tutorial about a stock market analyzer that pulled data from Yahoo Finance as an XLS file, which can be read with PHP line by line.
I never knew those web services existed. I checked out some about sports and live scores, and found this one: http://www.programmableweb.com/api/visu ... tball-pool
But now, how do I start getting the data I need or want?
Maybe the easiest way is actually the harder one, screen scraping?
- Jackolantern
- Posts: 10893
- Joined: Wed Jul 01, 2009 11:00 pm
Re: PHP Get Data from other websites
No, it would still be easier and much more efficient to simply learn how to use the web service if it offers what you want. There are also legal issues with screen scraping if you republish data you scraped from another webpage, whereas you are generally able to use data from a web service you legally access in any way you want.
The web service you linked is a SOAP web service, which is a type of WS protocol. Here is the PHP manual page for the PHP SoapClient, which allows you to access SOAP WS. Here is a short tutorial on nuSOAP, which is a PHP SOAP WS library. And did you see the listing of the public interface for that web service?
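As a rough sketch of what using PHP's built-in SoapClient can look like (it needs the soap extension to be enabled): the WSDL address, the operation name "SomeOperation", and the "teamName" parameter below are placeholders, not taken from the linked service, and the two helper functions are hypothetical wrappers around the real SoapClient methods __getFunctions() and __soapCall().

```php
<?php
// Hypothetical WSDL address; use the one published by the service you picked.
const WSDL_URL = 'https://example.com/service?wsdl';

/** List the operations a SOAP service exposes, as described by its WSDL. */
function listSoapOperations(SoapClient $client): array
{
    return $client->__getFunctions() ?? [];
}

/** Call one operation by name, passing the parameters as a named array. */
function callSoapOperation(SoapClient $client, string $operation, array $params)
{
    // Many services expect a single wrapper array holding the named parameters.
    return $client->__soapCall($operation, [$params]);
}

// Usage (commented out: the WSDL URL and the operation name are placeholders):
// try {
//     $client = new SoapClient(WSDL_URL, ['exceptions' => true]);
//     print_r(listSoapOperations($client));
//     print_r(callSoapOperation($client, 'SomeOperation', ['teamName' => 'Sporting']));
// } catch (SoapFault $e) {
//     echo 'SOAP error: ', $e->getMessage(), "\n";
// }
```

Printing the list of operations first is a cheap way to see which calls and parameter types the service actually offers before guessing at any of them.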
The indelible lord of tl;dr
Re: PHP Get Data from other websites
Hey man, I'm also doing a similar project, which consists of getting data from Facebook groups in which people exchange items.
For screen scraping you first need a library like cURL, which can get you the HTML of a website. A script like this will work:
Code:
<?php
// Keep cookies in a temp file (the filesystem root is usually not writable).
$cookie_file = sys_get_temp_dir() . "/" . time();
$url = "http://www.uefa.com/";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
// Some sites refuse requests that do not look like they come from a browser.
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0)');
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Accept-Language: en, es-es"));
curl_setopt($ch, CURLOPT_TIMEOUT, 10);        // give up after 10 seconds
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);  // follow redirects
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);  // return the page instead of printing it
curl_setopt($ch, CURLOPT_COOKIEFILE, $cookie_file);
curl_setopt($ch, CURLOPT_COOKIEJAR, $cookie_file);
$html = curl_exec($ch);  // false on failure
$error = curl_error($ch);
curl_close($ch);
if ($html === false) {
    echo $error;
} else {
    echo $html;
}
?>
You could then use DOMDocument and XPath to get, for example, the value inside a div:
Code:
$dom = new DOMDocument();
@$dom->loadHTML($html); // @ hides the warnings caused by invalid HTML
$xpath = new DOMXPath($dom);
// getting the text inside a div id = 'aaa'
$resultDom = $xpath->evaluate("//div[@id='aaa']");
// item(0) is null when no such div exists
$value = $resultDom->length > 0 ? $resultDom->item(0)->nodeValue : null;
That's just a quick example. It's not always this neat, and sometimes you won't get the results you want. In my experience, dealing with HTML can be a great headache, since not everybody respects web standards. The example above won't work with Facebook; its markup is such a mess.
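Tying this back to the original question (pull every headline into an array, then keep only the ones that mention a team), a minimal sketch could look like the following. The function names, the sample markup, and the '//h2/a' XPath expression are assumptions made for illustration; a real news page needs an expression that matches its own markup.

```php
<?php
/**
 * Collect the text of every node matched by an XPath expression.
 * The default expression (links inside <h2>) is only a guess at a news page layout.
 */
function extractHeadlines(string $html, string $xpathExpr = '//h2/a'): array
{
    $previous = libxml_use_internal_errors(true); // tolerate sloppy markup quietly
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $headlines = [];
    foreach ((new DOMXPath($dom))->query($xpathExpr) as $node) {
        $headlines[] = trim($node->textContent);
    }
    return $headlines;
}

/** Keep only the headlines that mention the keyword, ignoring case. */
function headlinesAbout(array $headlines, string $keyword): array
{
    return array_values(array_filter($headlines, function ($headline) use ($keyword) {
        return stripos($headline, $keyword) !== false;
    }));
}

// Made-up sample markup standing in for the HTML downloaded with cURL.
$sample = '<html><body>'
    . '<h2><a href="/1">Sporting win the derby</a></h2>'
    . '<h2><a href="/2">Transfer window opens</a></h2>'
    . '<h2><a href="/3">New coach for Sporting</a></h2>'
    . '</body></html>';

print_r(headlinesAbout(extractHeadlines($sample), 'Sporting'));
```

In practice the $sample string would be replaced by the HTML fetched from the news site, and the XPath expression adjusted after looking at that page's source.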
Oh, and DOMDocument is enabled by default in PHP, but the cURL extension is not.
Good luck
Orgullo Catracho
Re: PHP Get Data from other websites
vitinho444 wrote:Hi there IR
I wonder if it's possible (with PHP everything is possible, right?) to get data from other websites.
Example:
I know a news website and I wanted to create an algorithm (I don't even know what this means, but more like a system) that catches all the news about a subject I specify. For example, I give it "Sporting" (my soccer team) and it searches the recent news page for news with that name.
So for that I need to retrieve every news headline and so on into an array, compare each one against my given string, and then display only what I want.
Thank you.
Every computer program is essentially an algorithm. An algorithm is just a series of steps or instructions, like a recipe. Obviously some are far more advanced than others.
Check this out, algorithms go back to before 1600 B.C.:
http://en.wikipedia.org/wiki/Timeline_of_algorithms
Very cool stuff!
"In order to understand recursion, one must first understand recursion".