“The ability to autonomously learn and evolve as new data is introduced – without explicitly programming to do so – is the holy grail of business intelligence.” Carlton Sapp, research director at Gartner.


Big data. Machine learning. They go together like, well, a horse and carriage. Or maybe more like jet fuel and a jet. Without machine learning (ML), big data doesn’t reach its full potential. Without data, machine learning algorithms can’t learn.

So, to briefly review: In machine learning, algorithms are fed data and learn to process it without being explicitly programmed for the task. It’s all about pattern recognition and self-learning: the algorithms evaluate data, learn from experience, and change their behavior accordingly. This is similar to the way we humans learn, at least when we’re at the top of our game.

The results are most accurate when the machine has access to massive amounts of data to refine its algorithm. That’s where big data comes in. Big data machine learning algorithms can yield big results.
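As a rough illustration of that point, here is a minimal sketch on a toy problem rather than a real big-data workload. It fits the slope of a noisy straight line by least squares and measures how far the estimate lands from the true value at different sample sizes. The function names, the true slope of 2.0, and the noise level are all assumptions chosen for illustration.

```python
import random


def fit_slope(xs, ys):
    """Least-squares slope for a line through the origin: sum(x*y) / sum(x*x)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def mean_slope_error(n_samples, true_slope=2.0, noise=1.0, trials=200, seed=0):
    """Average absolute error of the fitted slope over several random datasets.

    All parameters are illustrative assumptions, not values from any real system.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs give the same numbers
    total_error = 0.0
    for _ in range(trials):
        xs = [rng.uniform(0.0, 1.0) for _ in range(n_samples)]
        # Each observation is the true line plus Gaussian noise.
        ys = [true_slope * x + rng.gauss(0.0, noise) for x in xs]
        total_error += abs(fit_slope(xs, ys) - true_slope)
    return total_error / trials


# Compare the average error at three sample sizes.
for n in (10, 100, 1000):
    print(n, round(mean_slope_error(n), 3))
```

Under these assumptions, the average error should shrink as the sample size grows, which is the same intuition behind giving a learning algorithm more data to refine itself on.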

So what is big data?

Data is the Fuel

Gartner defines “big data” as “high-volume, -velocity, and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making.”

That would take more than a blog post to unpack, so we’ll leave it there. SAS has a far simpler, business-centric definition: Big data “describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.”

Big data also makes your machine learning applications smarter.

“Data is the fuel for machine learning, as you need data to train algorithms,” said Markus Noga, SAP’s SVP of machine learning, in an interview with Networks Asia. “Depending on the quality of the underlying data, ML can even outperform human performance (e.g., in image recognition). Hence, enterprises need to capture, prepare, and clean data in order to build intelligent models.”

Big data, properly deployed, enables businesses to make informed, real-time decisions. Big-data machine-learning examples abound. Deji Atoyebi, founder of Cray Visibility, offered the following relatively simple example in a column for Entrepreneur. In 2009, Orbitz, using machine learning, discovered a pattern in its data: Mac users were willing to spend significantly more per night on hotel rooms than Windows users. So Orbitz began listing costlier hotels higher in the results it showed to Mac users.

Here’s another one: Mayo Clinic and IBM Watson Health partnered to develop a machine-learning tool that accurately matches breast cancer patients with relevant clinical trials. Preliminary results? Mayo Clinic saw an 80 percent increase in trial enrollment.

Combining big data with machine learning improves business intelligence, and it doesn’t really matter what business you are in. McKinsey has identified more than 400 use cases across 19 industries and nine business functions.

Preparing the Data: Still Old School

Machines may be smart, but you can’t just dump raw data into them and expect stellar results. The data must be relevant, cleaned, prepared and, if it isn’t already, digitized.

It’s not surprising that data preparation remains a huge bottleneck for organizations. Sixty percent of IT professionals spend half or more of their time at work on data quality assurance, cleanup or preparation; organizations are spending over $450 billion on data preparation, according to a survey from Trifacta, a company that makes data preparation technology.
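To give a sense of what that cleanup work involves, here is a minimal sketch that normalizes, validates, and de-duplicates a handful of customer records. The record layout (an "email" key and a "spend" field) and the cleaning rules are hypothetical, invented for this example; real pipelines will have their own fields and rules.

```python
def clean_records(raw_records):
    """Normalize, validate, and de-duplicate hypothetical customer records."""
    cleaned = []
    seen_emails = set()
    for record in raw_records:
        # Normalize the key: trim whitespace and lowercase it.
        email = (record.get("email") or "").strip().lower()
        if not email or "@" not in email:
            continue  # drop rows without a usable email key

        # Strip currency formatting such as "$1,200.50" before parsing.
        spend_text = str(record.get("spend", "")).replace("$", "").replace(",", "").strip()
        try:
            spend = float(spend_text)
        except ValueError:
            continue  # drop rows whose spend value cannot be parsed

        if email in seen_emails:
            continue  # keep only the first occurrence of each customer
        seen_emails.add(email)
        cleaned.append({"email": email, "spend": spend})
    return cleaned


# Made-up sample data showing the kinds of problems raw records can have.
raw = [
    {"email": " Ann@Example.com ", "spend": "$1,200.50"},
    {"email": "ann@example.com", "spend": "99"},     # duplicate customer
    {"email": "", "spend": "10"},                    # missing key
    {"email": "bob@example.com", "spend": "n/a"},    # unparseable value
    {"email": "cy@example.com", "spend": 42},
]
for row in clean_records(raw):
    print(row)
```

Even in this toy case, three of the five rows are discarded, which hints at why preparation takes up so much time before any learning can start.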

Context Matters

Machine-learning algorithms can identify patterns, correlations, etc. But they aren’t going to know which data are relevant. TechEmergence offered the following example of how and why context matters.

An algorithm could be given data about historical customer lifetime value, without taking into account that many of the customers with the highest lifetime value were contacted via a phone outreach program that ran for over two years but failed to break even, despite generating new sales. If such a telephone follow-up program will not be part of future eCommerce sales growth, then those sales shouldn’t have been fed to the machine.
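One way to act on that advice is to filter the training data by business context before any model sees it. The sketch below is a simplified illustration, not a method from the example's source: the "channel" and "lifetime_value" field names, the sample numbers, and the helper functions are all hypothetical.

```python
def exclude_channels(customers, discontinued_channels):
    """Keep only customers acquired through channels that will still exist."""
    return [c for c in customers if c["channel"] not in discontinued_channels]


def average_lifetime_value(customers):
    """Mean lifetime value, or 0.0 for an empty list."""
    if not customers:
        return 0.0
    return sum(c["lifetime_value"] for c in customers) / len(customers)


# Made-up history: the phone-outreach customer inflates the average.
history = [
    {"channel": "web", "lifetime_value": 100.0},
    {"channel": "web", "lifetime_value": 200.0},
    {"channel": "phone_outreach", "lifetime_value": 900.0},
]

# The phone program will not be part of future growth, so drop those records.
relevant = exclude_channels(history, {"phone_outreach"})

print(average_lifetime_value(history))
print(average_lifetime_value(relevant))
```

The algorithm cannot make this call on its own. Deciding that the phone-outreach records are irrelevant takes knowledge of the business, which is why a person has to make it before the data is handed over.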

It gets back to the advice we’ve given many times before: Know your use cases. Have a business problem to solve.

A Virtuous Circle

The relationship between big data and machine learning is often described as a virtuous circle: Data fuels better insights via machine learning, and those insights eventually lead to more data. Here’s how market intelligence company Aberdeen frames it:
  • AI-ingested data enables improved insights.
  • Improved insights result in better products.
  • Better products attract even more users.
  • Users generate new data, which AI ingests, and the virtuous cycle begins anew.

Teradata VP Anu Jain also calls it “a virtuous circle.” He explains: “Machine algorithms are virtually useless without clean data. It’s the old garbage in equals garbage out axiom. However, if you feed them clean data, they can enhance your insights far beyond [what] your non-ML-augmented analytics can achieve. And, ML algorithms can also help themselves by helping you clean your data. That’s a win-win.”

That only works if the cycle continues: It’s never once and done. If you decide to embrace the power of big data and machine learning, it’s a commitment. It’s an iterative process.

Too many companies think machine learning is a “one-shot process,” Dr. Danko Nikolic, brain scientist and AI expert, told TechEmergence. It’s a common mistake: “They send data to data scientists, and data scientists send back THE model. In contrast to that, finding good ML solutions is an iterative process that involves research, trials and errors, experimenting, talking to the business experts, etc.”

Next Steps: Your Machine Learning and Big Data Projects

Now is the time: 90 percent of all the data in the world has been generated over the last two years. “Data and analytics have even greater potential to create value today than they did when companies first began using them,” according to a McKinsey Global Institute report.

As with any important technology, organizations that move quickly into machine learning will be able to create significant value; others risk being left behind.

Deriving the most value from machine learning requires trained talent, expert guidance and, usually, a tremendous amount of “data cleansing,” TechEmergence counsels. “If Google, Amazon, and Facebook could get their interns to set up ML systems, would they really be spending millions and millions of dollars to scoop the world’s top AI talent out of academics to work for them?”

You need big-data machine-learning tools and expertise to spin data into intelligence. If you are ready to turn big data into big insight, we have the expertise you need. Let’s get that virtuous circle spinning.