Saturday 19th of May 2012


Technology
Data scraping using cURL in PHP
CodeFire Blog - Technology
Written by Pranjal   
Friday, 01 July 2011 15:23

(Please make sure that you are a genuine user of the site, and that the site allows this kind of automation and does not treat it as a hacking attempt.)

Data scraping

Let’s assume that the task for the day is to pull some data out of a site (e.g. download a CSV file) programmatically, where the data is behind a login. We shall use PHP/cURL for this automation. I am not going to cover cURL itself or the basic technical details associated with it, so please do not treat this as a tutorial for learning cURL :). I am only going to talk about the high-level process to follow and some gotchas that could make your life difficult along the way.

First step: You will need to log in to the site using cURL. Inspect the login form with a tool such as Firebug, or view the page source, to see which fields are sent and what the endpoint of the request is. You will need all of this information to send the login request with cURL. In fact, make a small HTML copy of the form (using the code from the site) on your local machine and try to log in with that. If it works, part of the job is done.
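As a rough sketch of this inspection step, the fields of a login form can also be listed from PHP itself once you have the page's HTML. The function name extract_form_fields and the sample markup are made up for illustration; a real login page will look different and may contain more than one form.

```php
<?php
// Collect the action URL, the method and every named <input> (hidden ones
// included) from the first <form> found in a page.
function extract_form_fields(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so keep parser warnings quiet.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    $form = $doc->getElementsByTagName('form')->item(0);
    if ($form === null) {
        return ['action' => null, 'method' => null, 'fields' => []];
    }

    $fields = [];
    foreach ($form->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            // Inputs without a name (e.g. a plain submit button) are not sent.
            $fields[$name] = $input->getAttribute('value');
        }
    }

    return [
        'action' => $form->getAttribute('action'),
        'method' => strtoupper($form->getAttribute('method') ?: 'GET'),
        'fields' => $fields,
    ];
}

// Made-up login form, standing in for the markup copied from the site.
$sample = '<form action="/login.php" method="post">'
        . '<input type="hidden" name="token" value="abc123">'
        . '<input type="text" name="username">'
        . '<input type="password" name="password">'
        . '<input type="submit" value="Log in">'
        . '</form>';

$loginForm = extract_form_fields($sample);
```

Here $loginForm['fields'] lists the hidden token alongside the visible username and password fields, which is the kind of detail that is easy to miss when reading the form by eye.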

Another trick that can be very helpful here: if the form uses a POST request, change it to GET in your local copy and point it at a local URL to see all the parameters that are passed. Sometimes there are hidden variables that are not easy to track down. Now inspect the query string from the GET request and build the POST data for cURL from it. One important point when writing the login request is not to forget to save the cookie. Set the option CURLOPT_COOKIEJAR to a filename so that cURL writes the cookies there, and set CURLOPT_COOKIEFILE to the same filename so that later requests send them back. You can get a filename with $tmp_fname = tempnam("/tmp", "COOKIE"); since /tmp may not exist on Windows, tempnam(sys_get_temp_dir(), "COOKIE") is the safer choice across Windows, Linux and Mac.
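A minimal sketch of such a login request might look like the following. The function names, the example.com URL and the field values are placeholders, not taken from any real site. Note that in PHP's cURL binding CURLOPT_COOKIEJAR is the option that writes cookies to a file, while CURLOPT_COOKIEFILE is the one that reads them back.

```php
<?php
// Turn a query string captured from the local GET experiment into an
// array of POST fields.
function query_string_to_fields(string $queryString): array
{
    $fields = [];
    parse_str(ltrim($queryString, '?'), $fields);
    return $fields;
}

// POST the login fields and keep the session cookies in $cookieFile.
// Returns the response body, or false if the request failed.
function curl_login(string $loginUrl, array $fields, string $cookieFile)
{
    $ch = curl_init($loginUrl);
    curl_setopt_array($ch, [
        CURLOPT_POST           => true,
        CURLOPT_POSTFIELDS     => http_build_query($fields),
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are written to this file
        CURLOPT_COOKIEFILE     => $cookieFile, // and read from it on redirects
    ]);
    $body = curl_exec($ch);
    unset($ch); // releasing the handle flushes the cookies to the jar file
    return $body;
}

// Placeholder values; use the query string you captured from your own test.
$fields     = query_string_to_fields('username=demo&password=secret&token=abc123');
$cookieFile = tempnam(sys_get_temp_dir(), 'COOKIE');

// Uncomment once the URL and fields match the real login form:
// $page = curl_login('https://example.com/login.php', $fields, $cookieFile);
```

Keeping the cookie file path in a variable matters because the download request in the next step has to point at the same file.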

Every site comes with its own set of rules for login, but broadly, keeping the above points in mind, you should be able to log in to most sites using cURL.

 

Second step: After login, the next step is to get the file and save it on disk. If the file's URL is simple (not dynamic), there is no problem: just issue a plain GET request for that URL with cURL and save the response to disk. However, if the URL for the file is dynamic, you need to fetch the page that contains the link, search the page for the link by its text (knowledge of regular expressions comes in handy here) and extract the dynamic URL. Then invoke cURL again on that URL to get the data. One point to keep in mind with dynamic URLs: when you pull the URL out of the HTML into a PHP variable, any & in it appears as &amp;, so requesting the URL as-is will not work. Use htmlspecialchars_decode to get the actual URL and you should be able to save the data.
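The dynamic-URL case could be sketched roughly as below. The function names, the link text "Download CSV" and the sample markup are assumptions for illustration; the regular expression only handles a quoted href on a plain anchor and will need adjusting for real pages.

```php
<?php
// Find the href of the first <a> whose visible text is $linkText and return
// it with HTML entities such as &amp; decoded. Returns null if not found.
function extract_download_url(string $html, string $linkText): ?string
{
    $pattern = '~<a\b[^>]*\bhref=(["\'])([^"\']*)\1[^>]*>\s*'
             . preg_quote($linkText, '~') . '\s*</a>~i';
    if (!preg_match($pattern, $html, $m)) {
        return null;
    }
    return htmlspecialchars_decode($m[2]);
}

// Fetch $url with the cookies saved at login and stream the body to $target.
// $url must be absolute, so prefix the site's host if the link is relative.
function curl_download(string $url, string $cookieFile, string $target): bool
{
    $out = fopen($target, 'wb');
    if ($out === false) {
        return false;
    }
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_COOKIEFILE     => $cookieFile, // send the login cookies
        CURLOPT_FILE           => $out,        // write the body straight to disk
        CURLOPT_FOLLOWLOCATION => true,
    ]);
    $ok = curl_exec($ch);
    unset($ch);
    fclose($out);
    return $ok !== false;
}

// Made-up page fragment containing a dynamic link with encoded ampersands.
$sample = '<p><a href="/files/other.csv">Other</a> '
        . '<a class="dl" href="/export.php?id=42&amp;format=csv&amp;key=9f">Download CSV</a></p>';

$url = extract_download_url($sample, 'Download CSV');
// $url now holds /export.php?id=42&format=csv&key=9f
```

Matching on the link text rather than on the URL is deliberate here, since the text usually stays the same between visits while the dynamic part of the URL changes.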

Open Source option (Gammu) to connect your Phone
Written by Pranjal   
Sunday, 26 June 2011 12:25
I started this experiment to test the integration of a web-based (PHP) application with SMS, using a phone connected to the machine on which the web server is running.
Gammu seems to be a good open source option for setting up a PC-to-phone connection. Gammu comes with multiple utilities, such as Wammu (which adds a graphical interface to it). Since I was looking specifically for SMS-based options, the relevant one is gammu-smsd, a utility bundled with Gammu that runs as a daemon and can be used to integrate incoming and outgoing SMS. It offers multiple integration options, such as file-based integration and SQL-based integration (MySQL, PostgreSQL, SQLite).
When trying Gammu or gammu-smsd, one of the first things to do is set up a configuration file so that these applications can connect to the phone and, if needed, to a DB (which was true in our case). The config file follows the ini file pattern, and all the options are very well explained at gammu-config; for the gammu-smsd config options, look at gammu-smsd config. After going through these links I created a simple config file:
 
---------------------------------
[gammu]
device = com3:
connection = at115200
[smsd]
Service = sql
Driver = native_mysql
PIN = 1234
LogFile = c:\syslog
User = root
Password = password
PC = localhost
Database = sms  
----------------------------------------
 
I connected my Nokia 5800 to my Windows-based machine using Bluetooth. COM3 was the port on which my phone was connected. After setting up the config file, I ran the information command from the command line and got the following output:
 
 
So that means the connection was established. The next step I tried was connecting gammu-smsd. Gammu comes with a DB schema for creating the database that gammu-smsd connects to. The schema is located in the source code under the folder
Gammu-1.29.0-Windows\Gammu-1.29.0-Windows\share\doc\gammu\examples\sql
 
I created a DB named smsd, imported the tables, ran the command to connect gammu-smsd and got an error:
 
 
 

I thought I had set up the DB etc. correctly, but it was somehow looking for a database named sms and not smsd (initially I had set ‘smsd’ as the database name in the config file). So it seems there is a bug in version 1.29.0 that makes it always look for a database named sms. Realizing this, I renamed the DB, updated the config file accordingly, and then gammu-smsd ran successfully.
It seems that Gammu is not able to send SMS using the Nokia 5800 XpressMusic, since I got the following output:
 
 
 

The same is reported on the Gammu site as well (Nokia 5800). I think I will have to get hold of a supported phone to test outgoing and incoming messages.
Gammu also comes with a PHP class, gammu, which wraps sending and the other functions exposed by Gammu, and which can easily be used to integrate with PHP.
So it seems very much possible to create a web-based application that sends SMS using a GSM phone connected via Gammu.
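With the SQL service, one way a PHP application could hand a message to gammu-smsd is by inserting a row into the outbox table of its database. The sketch below is untested against a real phone (none was available here, as noted above). The function names are made up, and the column names DestinationNumber, TextDecoded and CreatorID are assumed from the MySQL schema shipped with Gammu, so check them against the SQL file you imported. It only covers short, single-part messages.

```php
<?php
// Normalise the destination number and build the bound parameters for the
// outbox insert. Throws if nothing is left of the number after clean-up.
function outbox_params(string $number, string $text, string $creator = 'webapp'): array
{
    $number = preg_replace('/[^0-9+]/', '', $number); // drop spaces, dashes, brackets
    if ($number === '') {
        throw new InvalidArgumentException('Destination number is empty');
    }
    return [':number' => $number, ':text' => $text, ':creator' => $creator];
}

// Queue one outgoing message; gammu-smsd picks up rows from outbox and sends
// them. Column names are an assumption, see the note above.
function queue_sms(PDO $pdo, string $number, string $text): bool
{
    $stmt = $pdo->prepare(
        'INSERT INTO outbox (DestinationNumber, TextDecoded, CreatorID)
         VALUES (:number, :text, :creator)'
    );
    return $stmt->execute(outbox_params($number, $text));
}

// Connection details mirror the sample config file; replace with your own.
// $pdo = new PDO('mysql:host=localhost;dbname=sms', 'root', 'password');
// queue_sms($pdo, '+1 (000) 000-0000', 'Hello from PHP');
```

Going through the database keeps the web application decoupled from the phone: the PHP side only needs DB access, while the daemon deals with the device.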

 
Adding custom Facebook tab
Written by Pranjal   
Friday, 24 June 2011 12:34
CodeFire Facebook page

 

Facebook lets you add Facebook applications as tabs on your existing pages. There are also (programmatic) ways to find out whether users have liked your page or not. This functionality can very well be used to 'entice' users to like your pages. We have added a "Welcome" tab to our Facebook page. Do take a look at it and let us know what you think. This application is completely CMS driven, so you can change what is displayed on the pages without knowing any HTML. If you like the application and want to know more about it, mail us at fbapp (at) codefire.in

 



 Copyright © 2010 CodeFire Technologies Pvt Ltd All Rights Reserved.