Simple PHP Adwords Click Fraud Tracker

RJM62

Hi all. No, I'm not dead. I've just been unusually busy. But I thought sharing this script would be as good a reason as any to check back in to POA. :lol:

Here's the situation: One of my clients had a MAJOR problem with a particular type of click fraud (competitors clicking on his Google Adwords ads to deplete his budget and raise his costs), to the tune of thousands of dollars a month.

He asked me to look into commercial monitoring services, but they're pretty pricey and provided way more functionality than he needed; so I decided to write a simpler, home-grown script. I'm sharing it here in case anyone else is having the same problem. (Obviously, the script pages and screenshots are redacted, but the functionality should still be clear.)

This script is very simple and doesn't have all the functionality that the commercial services do. It's designed to detect only one thing -- multiple clicks from the same IP address -- and tabulate and display the information in a user-friendly manner. It also incidentally gathers other somewhat useful information, such as the pages the ads were displayed on.

The client can then send the information to Google to have them investigate the clicks, and refund the money for the bad clicks.

HOW IT WORKS

After trying out a variety of ways to track whether clicks were coming from Adwords, I became persuaded that the most dependable way was to have the client append "?source=adw" to every landing page URL in his Adwords account. This is imperfect, but less so than the other options I considered.

The script uses a MySQL database with a single table, tblAdwordsClicks, that has six columns: id, ClickDate, DestinationPage, SourcePage, RemoteAddress, and RemoteHost. It also requires a bit of code in the header of every potential landing page:

PHP:
$currentFile = $_SERVER["PHP_SELF"];
$parts = explode('/', $currentFile);
$currentPage = $parts[count($parts) - 1];
$currentPage = "www.[domain.tld]/" . $currentPage;
// isset() avoids an undefined-index notice on visits without the tracking variable
if (isset($_GET["source"]) && $_GET["source"] == "adw") include("tracker/gatherer.php");
"domain.tld" being replaced with the actual domain.
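
The landing-page check can also be wrapped in two small functions, which makes it easier to test outside a web request. This is only a minimal sketch: the names isAdwordsClick() and landingPageName() are made up for illustration and are not part of the original script, and "example.com" stands in for the real domain.

```php
<?php
// Hypothetical helpers; the names are illustrative, not from the original script.

// True only when the query string carries the tracking value "adw".
function isAdwordsClick(array $query)
{
    return isset($query['source']) && $query['source'] === 'adw';
}

// Builds the "www.domain.tld/page.php" string that is stored as DestinationPage.
function landingPageName($phpSelf, $domain)
{
    return 'www.' . $domain . '/' . basename($phpSelf);
}

// Usage on a landing page (commented out so the sketch runs standalone):
// if (isAdwordsClick($_GET)) {
//     $currentPage = landingPageName($_SERVER['PHP_SELF'], 'example.com');
//     include 'tracker/gatherer.php';
// }
```

Passing $_GET in as an argument, rather than reading it inside the function, is what allows the check to be exercised with plain arrays.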

If the variable is detected, it includes gatherer.php:

PHP:
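<?php
$date = time(); // unused below: the INSERT does not set ClickDate (presumably a column default)
$ipaddy = $_SERVER['REMOTE_ADDR'];
// the referrer header is optional, so fall back to an empty string
$adpage = isset($_SERVER["HTTP_REFERER"]) ? $_SERVER["HTTP_REFERER"] : '';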
$hostname = gethostbyaddr($ipaddy);

// ignores google's various robots
if    (
    (preg_match("/1e100\.net/i", $hostname)) ||
    (preg_match("/rate\-limited\-proxy.*google\.com/i", $hostname)) ||
    (preg_match("/postnews.*google\.com/i", $hostname)) )
    {
    die;
    }

// connect to database
$con = mysql_connect("localhost","dbUserName","dbPassword");
if (!$con)
    {
    die('Could not connect: ' . mysql_error());
}

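// store data (every value is escaped; the page name and referrer are visitor-controlled)
mysql_select_db("dbName", $con);
$sql = sprintf(
    "INSERT INTO tblAdwordsClicks (DestinationPage,SourcePage,RemoteAddress,RemoteHost) VALUES ('%s','%s','%s','%s')",
    mysql_real_escape_string($currentPage, $con),
    mysql_real_escape_string($adpage, $con),
    mysql_real_escape_string($ipaddy, $con),
    mysql_real_escape_string($hostname, $con)
);
mysql_query($sql, $con);
mysql_close($con);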
?>
which stores the data.
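
The robot check in gatherer.php can be pulled out into a function of its own so the patterns live in one place. A minimal sketch, assuming the same three hostname patterns; the function name isGoogleRobotHost() is hypothetical.

```php
<?php
// Hypothetical helper: the three patterns are the ones used in gatherer.php.
function isGoogleRobotHost($hostname)
{
    // gethostbyaddr() can return false on malformed input, so normalize first.
    $hostname = (string) $hostname;
    $patterns = array(
        '/1e100\.net/i',
        '/rate-limited-proxy.*google\.com/i',
        '/postnews.*google\.com/i',
    );
    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $hostname) === 1) {
            return true;
        }
    }
    return false;
}

// Usage inside an included tracker file (commented out so the sketch runs standalone):
// if (isGoogleRobotHost(gethostbyaddr($_SERVER['REMOTE_ADDR']))) {
//     return; // in an included file, return hands control back to the landing page
// }
```

One design point worth knowing: inside an included file, return skips the rest of that file but lets the landing page finish rendering, whereas die ends the whole request.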

When the client wants to view the clicks coming in from Adwords, he logs in (I'm omitting all the password stuff) and index.php opens:

PHP:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Adwords Click Tracker</title>
<link href="tracker.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div id="content">
  <h1>Click Tracker Dashboard </h1>
  <p>&nbsp;</p>
  <p><a href="recent.php">Most Recent 250 Clicks</a></p>
  <p><a href="multiple_clicks.php">Multiple Clicks from Same IP</a></p>
  <p>&nbsp;</p>
</div>
</body>
</html>
Which is just a simple "dashboard" with two links. The first opens recent.php, which just displays the most recent 250 clicks coming from Adwords, with time, user's IP, hostname, sending page, and landing page:

PHP:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Recent Ad Clicks</title>
<link href="tracker.css" rel="stylesheet" type="text/css" />
</head>
<body>

<div id="content">
<p><a href="index.php">Back to Dashboard</a></p>
<p>&nbsp; </p>
<h1>Most Recent 250 Ad Clicks</h1>
<ol>

<?php
// initialize
date_default_timezone_set('America/New_York');

// connect
$con = mysql_connect("localhost","dbUserName","dbPassword");
if (!$con)
    {
    die('Could not connect: ' . mysql_error());
}

// get data
    mysql_select_db("dbName", $con);
    $result = mysql_query("SELECT * FROM tblAdwordsClicks ORDER BY id DESC LIMIT 250");
    $num = mysql_num_rows( $result );
$i=0;

// assign variables
while ($i < $num) {
    $timeClicked = mysql_result($result,$i,"ClickDate");
    // escape stored values before echoing them; page names, referrers and hostnames come from the visitor
    $landingPage = htmlspecialchars(mysql_result($result,$i,"DestinationPage"), ENT_QUOTES);
    $refererPage = htmlspecialchars(mysql_result($result,$i,"SourcePage"), ENT_QUOTES);
    if (empty($refererPage)) $refererPage = '(Direct, Bookmarked, or Unknown)';
    $ipAddress = mysql_result($result,$i,"RemoteAddress");
    $hostName = htmlspecialchars(mysql_result($result,$i,"RemoteHost"), ENT_QUOTES);
    if (empty($hostName)) $hostName = 'unresolved host';

echo "<li>";
echo "<strong>Date and Time:</strong> <span class='data'>" . $timeClicked . " US Eastern Time</span><br />";
echo "<strong>Landing Page:</strong> <span class='data'>" . $landingPage . "</span><br />";
echo "<strong>Sending Page:</strong> <blockquote>" . $refererPage . "</blockquote>";
echo "<strong>Visitor's IP:</strong> <span class='data'>" . $ipAddress . "</span><br />";
echo "<strong>Hostname:</strong><span class='data'> " . $hostName . "</span><br />";
echo "</li><br />";
$i++;
}
?>
</ol>
<?php
// display retrieval time
echo "<p>Report generated on " . date('l jS \of F Y h:i:s A') . " US Eastern Time<br />\nEnd of report</p>";
?>
</div>
</body>
</html>
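
Because the referrer, the page name and the hostname all originate with the visitor, anything read back from the table is best escaped before it is echoed into a report. A minimal sketch of a row renderer that does this; the function name renderClickRow() is hypothetical, and $row is assumed to be an array keyed by the column names of tblAdwordsClicks.

```php
<?php
// Hypothetical helper: turns one click record into an HTML list item.
// $row is assumed to use the column names of tblAdwordsClicks as keys.
function renderClickRow(array $row)
{
    $referer = empty($row['SourcePage']) ? '(Direct, Bookmarked, or Unknown)' : $row['SourcePage'];
    $host = empty($row['RemoteHost']) ? 'unresolved host' : $row['RemoteHost'];

    $fields = array(
        'Date and Time' => $row['ClickDate'],
        'Landing Page'  => $row['DestinationPage'],
        'Sending Page'  => $referer,
        "Visitor's IP"  => $row['RemoteAddress'],
        'Hostname'      => $host,
    );

    $html = '<li>';
    foreach ($fields as $label => $value) {
        // Escape label and value so stored markup is displayed as text, not executed.
        $html .= '<strong>' . htmlspecialchars($label, ENT_QUOTES, 'UTF-8') . ':</strong> '
               . htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8') . '<br />';
    }
    return $html . '</li>';
}
```

The markup here is simplified (no span or blockquote classes), so it would need adjusting to match tracker.css.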
The other link goes to multiple_clicks.php, which will look at the last 1,000 clicks coming through Adwords and find multiple clicks from the same IP addresses:

PHP:
<?php
//initialize
date_default_timezone_set('America/New_York');
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Multiple Clicks from Same IP</title>
<link href="tracker.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div id="content"> <a href="index.php">Back to Dashboard</a>
<p>&nbsp;</p>
  <h1>Multiple Clicks from Same IP Addresses</h1>
  <p>Analyzing Most Recent 1,000 clicks and sorted by number of clicks (highest number of clicks first)</p>
<ol>

<?php
//connect
$con = mysql_connect("localhost","dbUserName","dbPassword");
if (!$con)
    {
    die('Could not connect: ' . mysql_error());
}

//fetch data: count clicks per IP within the most recent 1,000 clicks only
    mysql_select_db("dbName", $con);
    $result = mysql_query("SELECT RemoteAddress,COUNT(*) as count FROM (SELECT RemoteAddress FROM tblAdwordsClicks ORDER BY id DESC LIMIT 1000) AS recent GROUP BY RemoteAddress ORDER BY count DESC");
    $num = mysql_num_rows( $result );
$i=0;

//assign variables
while ($i < $num) {
    $ipAddress = mysql_result($result,$i,"RemoteAddress");
    $clicks = mysql_result($result,$i,"count");
    $hostname = gethostbyaddr($ipAddress);
    
if ($clicks > 1) {
//print IP Address Heading
echo "<li>";
echo "<strong>IP Address:</strong> " . $ipAddress . " (" . $clicks . " clicks)<br />";
echo "<strong>Hostname:</strong> " . $hostname . "<br />";
echo "<a href='details.php?ip=" . $ipAddress ."' target='_blank'>Details</a>";
echo "</li><br />";
}
$i++;
}
?>
</ol>
<?php
// display retrieval time
echo "<p>Report generated on " . date('l jS \of F Y h:i:s A') . " US Eastern Time<br />\nEnd of report</p>";
?>
</div>
</body>
</html>
along with links to the details of the clicks for each IP. That link will open details.php:

PHP:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>IP Address Details</title>
<link href="tracker.css" rel="stylesheet" type="text/css" />
</head>
<body>
<div id="content">
<ol>

<?php
// initialize
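    $ipAddress = isset($_GET["ip"]) ? $_GET["ip"] : '';
    // the value is echoed into the page below, so accept only a well-formed IP address
    if (filter_var($ipAddress, FILTER_VALIDATE_IP) === false) die('Invalid IP address.');
    date_default_timezone_set('America/New_York');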

//connect
$con = mysql_connect("localhost","dbUserName","dbPassword");
if (!$con)
    {
    die('Could not connect: ' . mysql_error());
}
// sanitize GET data
    $ipAddress = mysql_real_escape_string($ipAddress);

// fetch data
    mysql_select_db("dbName", $con);
    $result = mysql_query("SELECT * FROM tblAdwordsClicks WHERE RemoteAddress='$ipAddress' ORDER BY id DESC LIMIT 100");
    $num = mysql_num_rows( $result );
$i=0;

// display heading
echo "<h1>Click Tracker Ad Click Details for IP " . $ipAddress . "</h1>";
echo "<p>The " . $num . " most recent ad clicks (100 at most) from IP address " . $ipAddress . " are listed here in order of recency (newest clicks first).</p>";

//assign variables
while ($i < $num) {
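    $timeClicked = mysql_result($result,$i,"ClickDate");
    // escape stored values before echoing them; page names, referrers and hostnames come from the visitor
    $landingPage = htmlspecialchars(mysql_result($result,$i,"DestinationPage"), ENT_QUOTES);
    $refererPage = htmlspecialchars(mysql_result($result,$i,"SourcePage"), ENT_QUOTES);
    if (empty($refererPage)) $refererPage = '(Direct, Bookmarked, or Unknown)';
    $hostName = htmlspecialchars(mysql_result($result,$i,"RemoteHost"), ENT_QUOTES);
    if (empty($hostName)) $hostName = 'unresolved host';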


// display data
echo "<li>";
echo "<strong>Date and Time:</strong><span class='data'> " . $timeClicked . "</span><br />";
echo "<strong>Landing Page:</strong><span class='data'> " . $landingPage . "</span><br />";
echo "<strong>Sending Page:</strong><blockquote> " . $refererPage . "</blockquote>";
echo "<strong>Hostname:</strong><span class='data'> " . $hostName . "</span><br />";
echo "</li><br />";
$i++;
}
?>
</ol>
<?php
echo "<p>End of record for IP " . $ipAddress . ".</p>";
echo "<p>Report generated on " . date('l jS \of F Y h:i:s A') . " US Eastern Time</p>";
?>
</div>
</body>
</html>
This is a simple script and still needs some tidying up (the client was at his wits' end and wanted an immediate solution), but it does do the one thing it was designed to do, and has already uncovered quite a few instances of fraudulent clicks. The client then takes screenshots of the "details" pages and sends them off to Google with the Click Quality Form for reimbursement.

To their credit, in most cases, Google has already detected the fraud and no-billed it before my client even submits the Click Quality Form. But they've missed a few cases, which Google promptly acknowledged and made good on when they were brought to their attention.

But the client still likes being able to get real-time information on clicks, and he's also looking at taking action against the biggest offenders who are multiple-clicking his ads, once we have enough data to identify them.
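
Identifying those offenders comes down to the same tally that multiple_clicks.php does in SQL. For anyone who would rather do it in PHP, here is a minimal sketch. It assumes the RemoteAddress values of the most recent clicks have already been read into an array; the function name findRepeatClickers() and the sample addresses are made up for illustration.

```php
<?php
// Hypothetical helper: $addresses is a list of RemoteAddress values,
// e.g. taken from the most recent 1,000 rows of tblAdwordsClicks.
function findRepeatClickers(array $addresses)
{
    $counts = array_count_values($addresses); // IP => number of clicks

    $repeats = array();
    foreach ($counts as $ip => $clicks) {
        if ($clicks > 1) {
            $repeats[$ip] = $clicks; // keep only IPs seen more than once
        }
    }

    arsort($repeats); // highest click count first, keys preserved
    return $repeats;
}

// Made-up sample data using documentation-only address ranges.
$sample = array(
    '203.0.113.5', '198.51.100.7', '203.0.113.5',
    '192.0.2.1', '203.0.113.5', '198.51.100.7',
);
$repeatClickers = findRepeatClickers($sample);
```

With the sample above, $repeatClickers holds two entries: 203.0.113.5 with 3 clicks, then 198.51.100.7 with 2.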

-Rich

ADDENDUM: here's the stylesheet, tracker.css, if anyone wants it.

Code:
body {
    margin: 0;
    padding: 0;
    background: #BAF9FC;
    font-family: "Palatino Linotype", "Book Antiqua", Palatino, serif;
    font-size: 12px;
}

h1 { font-size: 14px }

#content {
    margin-left: auto;
    margin-right: auto;
    margin-top: 10px;
    margin-bottom: 10px;
    min-height: 350px;
    background: #FFF;
    width: 95%;
    padding: 10px;
    border: #03C 2px solid;
    border-radius: 10px;
    -moz-border-radius: 10px;
}

.data { position: absolute; left: 170px  }

li { background: #FFF; word-wrap: break-word; height: auto }

li:hover { background: #FF9; }
 

Attachments

  • dashboard.jpg
  • most_recent.jpg
  • IP_details.jpg
  • multiple_clicks_summary.jpg
  • table_describe.jpg

A little refinement was made necessary by the unexpected number of users who bookmark the landing page -- along with the source variable -- making it appear that they multiple-clicked the ads.

To me, these clicks were obviously to be ignored; but to the client, not so much. Besides, they clogged the database and the output. So I inserted a few lines to ignore them based on the absence of referrer information for those clicks. So:

PHP:
// ignores google's various robots
if    (
    (preg_match("/1e100\.net/i", $hostname)) ||
    (preg_match("/rate\-limited\-proxy.*google\.com/i", $hostname)) ||
    (preg_match("/postnews.*google\.com/i", $hostname)) )
    {
    die;
    }

was changed to:

PHP:
// ignores google's various robots and users who bookmarked the ad pages
if    (
    (preg_match("/1e100\.net/i", $hostname)) ||
    (preg_match("/rate\-limited\-proxy.*google\.com/i", $hostname)) ||
    (preg_match("/postnews.*google\.com/i", $hostname)) ||
    (empty($adpage)) ) // ignores bookmarks or direct entry
    {
    echo "<meta http-equiv=\"refresh\" content=\"0;URL=http://" . $currentPage . "\">";
    die;
    }

This kills the tracker script and redirects the visitor to the intended page, minus the variable used for tracking.
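
Note that the redirect target, $currentPage, is built from the page name alone, so any other query-string variables are dropped along with "source". If those need to survive, one option is to strip only the tracking variable from the requested URL. A minimal sketch; the function name stripQueryParam() is hypothetical, and it is a variation on the approach above, not what the script currently does.

```php
<?php
// Hypothetical helper: removes a single query parameter from a URL
// and leaves the rest of the URL untouched.
function stripQueryParam($url, $param)
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['query'])) {
        return $url; // nothing to strip
    }

    parse_str($parts['query'], $query); // query string -> array
    unset($query[$param]);

    $base = substr($url, 0, strpos($url, '?')); // everything before the "?"
    $rebuilt = http_build_query($query, '', '&');
    $fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';

    return $base . ($rebuilt === '' ? '' : '?' . $rebuilt) . $fragment;
}

// Usage (commented out so the sketch runs standalone); the URL is an example:
// $target = stripQueryParam('http://www.example.com/page.php?source=adw&id=5', 'source');
// $target is then 'http://www.example.com/page.php?id=5'
```

http_build_query() re-encodes the remaining values, so an unusual query string may come back in a slightly different but equivalent form.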



-Rich
 