With Big Data, Asking Right Questions Is Key

In the realm of Big Data, many people are trained to use analytics to find the right answers. But do we have enough people who can ask the right questions?

Many articles have been written in the business and tech press about intuition, the gut feeling that allows a professional to cognitively leap to an insight. It is a skill needed to ask the right question. Yet no one knows how to create an “intuitive,” much less train one. And guidance on when to use intuition is limited to “use your judgment,” which is not exactly precise, or helpful.

So where does intuition come from?

You can cultivate intuition, perhaps by finding people who have it. Or you can ignore people entirely and focus on artificial intelligence and machine learning. Which one of these methods will serve you best?

Making Space for Intuition

Intuitive people do not just walk in off the street to join a team. Intuition must be kindled somewhere in the team to make analysis work.

“It’s really hard to quantify and give a recipe for creation,” said Ted Dunning, chief application architect at MapR, a California-based software firm that develops data platforms based on Apache Hadoop. “It is hard doing the unexpected by rote. By definition, it can’t happen that way.”

Instead, one can exercise the team to think intuitively. Such thought exercises encourage approximation as a first step in pursuing exactness.

One way is to ask team members questions that they must answer with a range, explained Ellen Freidman, a solutions consultant at MapR. For example, what is the weight of a Boeing 747? Or what is the diameter of the moon? If the factual answer falls within the range, then the approximate answer is correct. Narrowest range wins.

“Developing the ability to approximate is a huge skill that goes underdeveloped,” Freidman said. Approximation does not contradict precision, nor is it a wild guess. It does allow people to solve a problem within the boundaries of error, she noted: “People need to recognize the difference between what they know and what they don’t know.”
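
The range exercise Freidman describes has a simple scoring rule, which can be sketched in a few lines of Python. This is only an illustration: the function name, the player names and their guesses are made up, and the moon's diameter is rounded to roughly 3,474 kilometers.

```python
from typing import NamedTuple, Optional, Sequence


class RangeGuess(NamedTuple):
    player: str   # hypothetical team member
    low: float    # lower bound of the guessed range
    high: float   # upper bound of the guessed range


def pick_winner(guesses: Sequence[RangeGuess], actual: float) -> Optional[RangeGuess]:
    """Return the narrowest guess whose range contains `actual`, or None."""
    # A guess counts as correct only if the factual answer falls inside it.
    correct = [g for g in guesses if g.low <= actual <= g.high]
    if not correct:
        return None
    # Among correct guesses, the narrowest range wins.
    return min(correct, key=lambda g: g.high - g.low)


MOON_DIAMETER_KM = 3474  # approximate figure, rounded

# Made-up guesses for illustration only.
guesses = [
    RangeGuess("Ana", 1000, 10000),  # correct but wide
    RangeGuess("Ben", 3000, 4000),   # correct and narrower
    RangeGuess("Cy", 5000, 6000),    # misses the factual answer
]

winner = pick_winner(guesses, MOON_DIAMETER_KM)
print(winner.player if winner else "no correct guess")
```

The rule rewards knowing the limits of what you know: a range that is too narrow misses the answer entirely, while one that is too wide loses to a tighter correct guess.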

Another key step: formally setting some time aside to think, even if it is just 30 minutes a week, Freidman continued. This allows you to get to the essential concept of the problem you are trying to solve. “It leaves the door open to making new connections,” she said. “No one takes time to do this.”

Also, “you need to make room for them (the team) to be wrong,” Dunning added. “If you have room to be wrong 90 percent of the time and catch errors quickly, then you have solid preparation for high creativity.”

In the end, the purpose is to arrive at questions quickly, test them, and, if they fail, discard them and start over just as quickly. Intuition can be a guide to a question, but it is analysis that tests the premise, pass or fail.

Typecasting

Gauging the limits of intuition may still come down to the team member’s personality or academic background, noted Scott Gnau, chief technology officer at Hortonworks.

“The world is full of really great data scientists,” Gnau said. But you can’t “create” a person with a solid work ethic or intuition: “Some people are more that way. Some people are less that way.”

The process of gauging intuition is the same on both a team and individual level: come up with a question, test it with analytics to see if it yields a plausible answer, and keep testing should the query yield a dead end. The type of person who does this must be “dogged, determined to get an answer—something that makes sense,” Gnau said.

Technique must be shared between teams as well as within them. One method is to get data scientists to compile and share best practices using Apache Notebook. “Some successful companies are using internal social networks,” Gnau added.

There has also been a proliferation of toolsets, such as Apache Zeppelin, which can be used to aid team building. Such apps allow team leaders to monitor who is communicating and who is collaborating.

In the end, “perhaps it is not about intuition and more about a willingness to experiment—try something and fail fast,” Gnau said. Keep trying until something works.

Humans Are Biased, Machines Not

“Watson Analytics is the difference between biased and unbiased discovery,” said Marc Altshuller, general manager for business analysis at IBM. His starting point: use analytics to develop questions and get answers.

This insight came from the experience of seeing users rely on IBM’s Cognos Analytics to develop insights based on Big Data. Users were taking any data they had and going straight to visualization to make their presentations. Major decisions were being made despite the fact that “no one knew where that data came from,” Altshuller pointed out, adding that the data in question was probably biased.

Those users based the premise of their inquiry on a preconceived notion and obtained data that may have fit that notion.

The alternative, empowered by Watson Analytics, relies on machine learning and artificial intelligence to search for specific terms, then show all the associated data, and the pattern in which the data emerges. Watson will also display an “interestingness pattern,” flagging any field of data that affects any other field of data, Altshuller continued. Watson will even suggest questions—called ‘starting points’—to help users sharpen a line of inquiry.

“People didn’t necessarily know how to ask a question,” Altshuller said. They will change the wording of their query, hoping to get a different answer. Or they may simply query within the bounds of conventional expectation.

Altshuller offered one example illustrating this. A company wanted to know which department had the highest attrition. A straight-up numbers crunch showed that the research and development department suffered the most turnover.

Conventional wisdom pointed to the unit’s management as the cause of attrition. But a Watson-based analysis showed that an increase in the R&D workers’ commuting time was a significant factor, given that the office had just relocated to a new address. Commuting time was the red flag that did not show up in the conventional query, and it was a factor that affected turnover in other departments as well.

Single-factor cause-and-effect thinking has become entrenched as industry wisdom because of the limitations of analysis 10-20 years ago, Altshuller noted. Now that a query can be analyzed for multiple factors, the expected answer may be wrong and the unexpected answer may be material.
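
To see why a single-factor query can mislead, consider a toy version of the attrition example. The sketch below is not how Watson Analytics works, and the records are invented purely for illustration. It simply computes the attrition rate grouped by one field at a time, so the same data can be cut by department and by whether the commute got longer.

```python
from collections import defaultdict

# Hypothetical records: (department, commute_got_longer, employee_left)
RECORDS = [
    ("R&D", True, True),
    ("R&D", True, True),
    ("R&D", True, True),
    ("R&D", False, False),
    ("Sales", True, True),
    ("Sales", True, False),
    ("Sales", False, False),
    ("Sales", False, False),
    ("Ops", True, True),
    ("Ops", False, False),
    ("Ops", False, False),
    ("Ops", False, True),
]


def attrition_rate(records, field):
    """Share of employees who left, grouped by the value at index `field`."""
    left = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        key = record[field]
        total[key] += 1
        left[key] += int(record[2])  # True counts as 1, False as 0
    return {key: left[key] / total[key] for key in total}


by_department = attrition_rate(RECORDS, 0)  # the conventional, single-factor cut
by_commute = attrition_rate(RECORDS, 1)     # the factor the conventional query missed

print(by_department)
print(by_commute)
```

In this made-up data, R&D tops the by-department cut, while the by-commute cut shows a gap that runs across all three departments.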

Big Data, Big Answers

The science and practice of Big Data have moved from infancy, past proof of concept, into early maturity. Fishing for insight in a big, unstructured pool of facts and figures has become practical for many companies. Projects that once took weeks now take minutes, making Big Data easier and cheaper to use.

No matter how good the science of Big Data gets, humans will be using this system. One can only hope that an open mind will see more, unbound from conventional wisdom.

Image Credit: solarseven/Shutterstock.com
