How ‘Big Data’ is an inescapable force that is shaping the future

In the digital world, everything a person does can be recorded and analysed. More than ever, this data is being used, for both good and bad.

Bletchley Park, Buckinghamshire – the British intelligence hub during the Second World War where codebreakers used theorems to unravel Nazi ciphers. Those theorems are still in use. SSPL / Getty Images

You’ve spent an eternity tracking down and applying for a loan online, and finally get to the bit about “I have read the terms and conditions”. You tick it and press “send”. And through the wonders of technology, in an instant, you get a decision: “We regret to inform you that your request has been rejected.”

What did you do wrong? You were honest about your income and haven’t missed a credit card payment in years.

In fact, your application was doomed when you muttered “Yeah, whatever” when you got to that tickbox about the T&Cs.

No, your computer didn’t hear you say it, or read your mind. It didn’t need to. It simply checked how long you spent reading the T&Cs, found you hadn’t bothered and duly binned your application on the grounds you’re too cavalier about financial risk.

Welcome to the brave new world of Big Data, where all your online behaviour is sliced, diced and teased apart by computers to find out all about you.

It’s a world whose landscape will be explored in depth at the GCC Big Data Summit, which opens today at The Address Dubai Marina.

And it’s a world in which the UAE will become increasingly involved. Deputy Prime Minister Sheikh Saif bin Zayed has already dropped hints that the Government is about to announce a major Big Data initiative for the region.

The level of interest in Big Data is clear from the list of attendees, which includes Abu Dhabi Commercial Bank, Etisalat, Daman health insurance, IBM and Microsoft.

With its potential to boost profitability and customer satisfaction, there’s already talk of Big Data becoming as vital to the region’s future as oil.

Inevitably, the race to make the most of Big Data has also sparked concerns about invasions of privacy.

Comforting assurances are routinely made about personal identities being fully protected by anonymisation. Yet these are looking increasingly hollow in the face of recent demonstrations of the power of computers to piece together our IDs from fragments of Big Data.

In January, the journal Science published disturbing evidence of this, in a study by data scientists given credit card records of more than one million shoppers.

Despite having no access to any personal data, the researchers showed that 90 per cent of the shoppers could be successfully identified using just the dates and locations of a few random purchases.

This followed a similar analysis of anonymised taxi cab records in New York City, which revealed the identities and travel details of passengers – including some Hollywood stars.

The potential for criminals to use the same techniques to plan kidnaps – or worse – hardly bears thinking about.
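To see why a handful of dates and locations can be so revealing, consider the rough sketch below. It is not the method used in the Science study, only an illustration of the underlying idea: every extra known purchase shrinks the pool of people who could have made it. The records, the user labels and the function name `matching_users` are all invented for the example.

```python
def matching_users(records, known_points):
    """Return the pseudonymous users whose history contains every known point.

    records: dict mapping a pseudonym to a set of (date, shop) purchases.
    known_points: set of (date, shop) pairs an outsider happens to know.
    """
    return [user for user, purchases in records.items()
            if known_points <= purchases]  # subset test: all points must match


# Invented, "anonymised" purchase records: no names, only dates and shops.
records = {
    "user_a": {("03-01", "bakery"), ("03-02", "cinema"), ("03-05", "pharmacy")},
    "user_b": {("03-01", "bakery"), ("03-03", "bookshop"), ("03-05", "pharmacy")},
    "user_c": {("03-01", "bakery"), ("03-02", "cinema"), ("03-04", "petrol station")},
}

one_point = matching_users(records, {("03-01", "bakery")})
two_points = matching_users(records, {("03-01", "bakery"), ("03-02", "cinema")})
three_points = matching_users(
    records,
    {("03-01", "bakery"), ("03-02", "cinema"), ("03-05", "pharmacy")},
)

print(len(one_point), len(two_points), len(three_points))
```

In this toy data set, one known purchase matches all three users, two purchases narrow the field to two, and three leave a single candidate. Anyone who can link those three purchases to a real person has, in effect, unmasked the whole record.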

Whether the benefits of Big Data are worth the threats has yet to be settled. But it is already clear that the billions of terabytes online are at the centre of an unseen war taking place all around us.

Bizarrely, a short mathematical formula first derived more than 250 years ago is emerging as one of the most powerful weapons in this 21st century war.

Known as Bayes’ theorem, it was used by Alan Turing and his colleagues at Bletchley Park, England, to break Nazi codes, including the notorious Enigma ciphers. Now it is being combined with Big Data to halt the advance of fraudsters, hackers and other cyber criminals.

Named after an English amateur mathematician, Bayes’ theorem was developed in the 18th century to solve a simple but fundamental problem: how to turn data into insight.

Put simply, the theorem shows how to update existing beliefs in the light of fresh evidence.

For example, if you believe a coin is fair but then you see it land heads-up 10 times on the trot, Bayes’ theorem allows you to calculate the impact of the evidence you’ve seen on your prior belief about the coin.
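The coin example can be worked through in a few lines of code. The sketch below rests on assumptions that are not in the original example: that the only alternative to a fair coin is a double-headed one, and that you start out 99 per cent sure the coin is fair. The function name `posterior_fair` is likewise just a label chosen for illustration.

```python
def posterior_fair(prior_fair, heads_in_a_row, p_heads_if_biased=1.0):
    """Apply Bayes' theorem to two rival hypotheses: fair coin vs biased coin.

    prior_fair: belief that the coin is fair before seeing any tosses.
    heads_in_a_row: number of consecutive heads observed.
    p_heads_if_biased: chance of heads if the coin is not fair
        (1.0 means a double-headed coin, which is an assumption here).
    """
    likelihood_fair = 0.5 ** heads_in_a_row
    likelihood_biased = p_heads_if_biased ** heads_in_a_row
    # Total probability of seeing this run of heads under either hypothesis.
    evidence = (prior_fair * likelihood_fair
                + (1 - prior_fair) * likelihood_biased)
    return prior_fair * likelihood_fair / evidence


print(round(posterior_fair(prior_fair=0.99, heads_in_a_row=10), 3))
```

Under these assumptions, 10 heads on the trot cuts the belief that the coin is fair from 99 per cent to roughly 9 per cent. A different starting belief or a less extreme rival hypothesis would give a different figure, but the direction of travel is the same.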

Turing and his colleagues used the theorem to turn Big Data – in the form of intercepted signals – into evidence about how enemy cipher machines had been set up.

Now the theorem is proving its value once again. Fraudsters are being fought by financial institutions using so-called Bayesian classifiers, which are trained on huge data sets to spot the telltale traits of fraudulent behaviour.
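What such a classifier does can be sketched in miniature. The example below is a toy "naive Bayes" classifier, one common type of Bayesian classifier, and makes no claim to reflect any bank's actual system. The transaction traits ("foreign", "night", "large") and the eight training examples are invented, and the class name `NaiveBayes` is only a label for this sketch.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Toy naive Bayes classifier over sets of yes/no traits."""

    def __init__(self):
        self.class_counts = Counter()
        self.trait_counts = defaultdict(Counter)
        self.traits = set()

    def fit(self, samples, labels):
        # Count how often each trait appears within each class.
        for sample, label in zip(samples, labels):
            self.class_counts[label] += 1
            for trait in sample:
                self.trait_counts[label][trait] += 1
                self.traits.add(trait)
        return self

    def predict_proba(self, sample):
        total = sum(self.class_counts.values())
        log_scores = {}
        for label, count in self.class_counts.items():
            log_p = math.log(count / total)  # prior belief in this class
            for trait in self.traits:
                # Add-one smoothing avoids zero probabilities for unseen traits.
                p_present = (self.trait_counts[label][trait] + 1) / (count + 2)
                log_p += math.log(p_present if trait in sample else 1 - p_present)
            log_scores[label] = log_p
        # Convert log scores back into probabilities that sum to one.
        top = max(log_scores.values())
        weights = {k: math.exp(v - top) for k, v in log_scores.items()}
        norm = sum(weights.values())
        return {k: w / norm for k, w in weights.items()}


# Invented training data: each transaction is a set of observed traits.
samples = [
    {"foreign", "night", "large"}, {"foreign", "large"}, {"night", "large"},
    {"night"}, set(), {"large"}, {"foreign"}, set(),
]
labels = ["fraud", "fraud", "fraud", "legit", "legit", "legit", "legit", "legit"]

model = NaiveBayes().fit(samples, labels)
suspicious = model.predict_proba({"foreign", "night", "large"})
ordinary = model.predict_proba(set())
print(round(suspicious["fraud"], 2), round(ordinary["fraud"], 2))
```

The "naive" part is the assumption that each trait counts as independent evidence. Real systems are trained on vastly more data and far richer traits, but the principle of weighing each telltale sign with Bayes' theorem is the same.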

Meanwhile, hackers are being kept at bay using a new approach to cyber security, with Bayes’ theorem at its core.

Until now, the standard means of keeping intruders out of computer networks has been to build ever-higher firewalls, in the form of tougher passwords and other obstacles.

When these fail, the usual response has been simply to upgrade the standard, which works well until the hackers upgrade their techniques to match.

Yet there’s one thing that never changes about hackers: by definition, they’re not authorised system users. As such, no matter how long they pretend otherwise, they eventually have to behave like hackers – snooping for sensitive files, for example, or attempting to link with outside agencies.

In short, hackers – be they electronic or human – have telltale signs that can be learnt and looked for.

Now growing numbers of cyber security companies are combining this strategy with Bayes’ theorem to give them a decisive edge in the war against hackers.

For example, UK-based Darktrace uses software that first identifies what normal system activity in a company looks like. This sets the baseline for the probability that hackers are at work.

The whole computer network of the company is then monitored for any signs of anomalous behaviour.

Using so-called recursive Bayesian estimation, the security software ranks the potential threats according to the chances of there being hackers. This allows countermeasures to be launched before the hackers succeed.
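The "recursive" part simply means that each new observation updates the current belief, and the updated belief becomes the starting point for the next observation. The sketch below illustrates that general idea only and is not Darktrace's software. The device names, the 1 per cent baseline and the two probabilities of anomalous behaviour are all invented for the example.

```python
def update(belief, anomalous, p_anom_if_compromised=0.6, p_anom_if_clean=0.05):
    """One Bayesian update of the belief that a device is compromised.

    The two probabilities of seeing anomalous behaviour are assumed values.
    """
    if anomalous:
        like_compromised, like_clean = p_anom_if_compromised, p_anom_if_clean
    else:
        like_compromised, like_clean = (1 - p_anom_if_compromised,
                                        1 - p_anom_if_clean)
    evidence = belief * like_compromised + (1 - belief) * like_clean
    return belief * like_compromised / evidence


def rank_devices(observations, baseline=0.01):
    """Rank devices by the probability that hackers are at work on them."""
    beliefs = {}
    for device, events in observations.items():
        belief = baseline  # start from the "normal activity" baseline
        for anomalous in events:
            # Recursive step: yesterday's posterior is today's prior.
            belief = update(belief, anomalous)
        beliefs[device] = belief
    return sorted(beliefs.items(), key=lambda item: item[1], reverse=True)


# Invented monitoring log: True marks an interval with anomalous behaviour.
observations = {
    "laptop-17": [False, True, True, True],
    "printer-02": [False, False, True, False],
    "server-db": [False, False, False, False],
}

ranked = rank_devices(observations)
for device, probability in ranked:
    print(device, round(probability, 3))
```

With these made-up numbers, a single odd event barely moves the needle, while three in a row push one device to the top of the list. That is what allows attention to be focused on the most likely intrusions first.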

Whether we like it or not, Big Data is here to stay. But those seeking to exploit it have a duty to keep it safe.

And that means ensuring they make the most of techniques such as Bayes’ theorem – just as Turing and his colleagues did in the First Big Data War, against the Nazis more than 70 years ago.

newsdesk@thenational.ae

Robert Matthews is Visiting Reader in Science at Aston University, Birmingham