When reports first started coming out that massive click fraud was taking place on Google's AdSense platform, I remember thinking that it was the end for Google. After all, virtually all of Google's revenues come in the form of what are essentially commissions for getting people to look at other companies' advertisements. And given how easy it is to build a simple web bot to randomly click through those ads, I had a hard time understanding how those clickstreams weren't getting so polluted with phony clicks that the companies bankrolling Google's financial future would be up in arms and leave Google dead in the water. I have since come to the realization that Google must have some very smart folks, probably dozens of PhDs, working on combating click fraud using intelligent statistical algorithms, and that they have luckily thus far been able to stem the tide by deducting the majority of any revenues coming from fraudulent clickthroughs as chargebacks. But still, no algorithm will ever be 100% foolproof against click fraud, not even close.
I mean, let's think for a moment about what kinds of things you have at your disposal to combat click fraud if you're Google. You can look at where the IP addresses are coming from and set up detectors to monitor any clickstreams above a certain threshold all coming from the same IP. You can look at the referrer (the page that was visited before the ad was clicked) and assume any clickthroughs that arrive without one are fraudulent. You can look at historical trendlines, detect any highly bursty clickstreams whose volumes are extreme, and assume that some of those IPs are probably fraudulent and should be further investigated. Probably the best way to do it, and the way I would imagine Google is doing it, is to collect a large historical dataset of traffic signatures for IPs that are known to be real humans as well as those that are known to be fraudulent, and build a machine learning model that will automatically classify IPs as being either fraud or not fraud.
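To make the simpler heuristics above a bit more concrete, here is a minimal Python sketch of a rule-based detector. Everything in it is an assumption made for illustration: the Click record, the flag_suspicious_ips function, the reason labels, and the two thresholds (one click per IP per ad, twenty ads per IP per day) are hypothetical and say nothing about how Google actually does it.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Set


@dataclass(frozen=True)
class Click:
    """One ad click. A hypothetical record layout, not a real AdSense schema."""
    ip: str
    ad_id: str
    referrer: Optional[str]  # page visited before the click, if any
    day: str                 # e.g. "2007-03-01"


def flag_suspicious_ips(
    clicks: Iterable[Click],
    max_clicks_per_ad: int = 1,   # illustrative threshold, an assumption
    max_ads_per_day: int = 20,    # illustrative threshold, an assumption
) -> Dict[str, Set[str]]:
    """Return a mapping of IP -> set of reasons that IP looks suspicious."""
    clicks_per_ad = Counter()        # (ip, ad_id, day) -> number of clicks
    ads_per_day = defaultdict(set)   # (ip, day) -> distinct ads clicked
    flagged = defaultdict(set)

    for click in clicks:
        # Heuristic 1: a click that arrives with no referrer is suspect.
        if not click.referrer:
            flagged[click.ip].add("missing_referrer")
        clicks_per_ad[(click.ip, click.ad_id, click.day)] += 1
        ads_per_day[(click.ip, click.day)].add(click.ad_id)

    # Heuristic 2: the same IP clicking the same ad too often in one day.
    for (ip, _ad_id, _day), count in clicks_per_ad.items():
        if count > max_clicks_per_ad:
            flagged[ip].add("repeat_clicks")

    # Heuristic 3: the same IP clicking too many different ads in one day.
    for (ip, _day), ads in ads_per_day.items():
        if len(ads) > max_ads_per_day:
            flagged[ip].add("too_many_ads")

    return dict(flagged)
```

Note that an IP whose clicks all carry a referrer and stay within both thresholds comes back unflagged, which is exactly the weakness of fixed rules like these.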
But even if you use advanced machine learning algorithms, it will always be a game of cat and mouse, since 1) machine learning algorithms are trained on historical data and thus can only detect patterns they have already seen, and 2) if fraud signatures are indistinguishable from non-fraud signatures, then they can't be detected. So all a fraudster has to do is figure out what the signature for a non-fraud IP looks like, e.g. no more than one click per IP on a given ad and no more than twenty ads clicked on per IP per day, and he can always stay under the radar. See this article for an example of what I'm talking about.
Now one might think the hardest part for an individual fraudster would be collecting enough IP addresses that each one can stay below Google's radar while the fraudster uses them collectively to perpetrate massive amounts of fraud. But this is simply not the case. Explaining why IP banning is ineffective for stopping spam, Adam Kalsey writes:
... IP addresses are very easy to get or fake for spammers who care about such things. There are hundreds of thousands of open proxies that will let anyone direct Web traffic through them. When I’m using an open proxy, my IP address is effectively masked. And I can use simple software to switch to a different open proxy (and thus a different IP address) every few minutes. So my spamming activity isn’t tied to a specific IP address.
Hypothetically speaking, if the problem of open proxies were to disappear overnight, there are two other mechanisms that provide a limitless set of IP addresses to spammers: dialup and spoofing.
All of which leads me to believe that one day a rogue trader and a rogue programmer (they might even be the same person) might conspire to carry out what I call "the great Google heist." The idea is simple. Slowly accumulate millions of dollars' worth of out-of-the-money puts on Google under different pseudonyms, with long enough expiration dates on the earlier trades that they don't expire too soon. Simultaneously pollute Google's ad clickstream so heavily, while staying below Google's fraud detectors, that Google's ad revenues rise far above what they were before the heist. Then wait. First for Google's corporate clients to start complaining and eventually abandon Google over the ridiculous amounts of money they are being asked to pay. Then for the news media to develop such a frenzy around the issue that many of Google's shareholders panic and begin to sell their shares. Once the whole situation hits a blazing crescendo, start selling all of those puts.
While it is relatively unlikely that such a scenario will ever be carried out as I have envisioned it, given the likelihood of getting caught, the fact that such a scenario can even be envisioned has always made me scared of ever buying any Google stock. While so far that has all been to my detriment, I still contend that Google, so long as it has no alternative sources of revenue, and even if it had a P/E ratio closer to 30 instead of above 100, will always be an extremely risky investment.