Grubhub went looking for help. It came in the shape of Melissa Schreiber, a culinary school grad and author of two books about the food of Brooklyn. “I came in and they handed me the classifications of all the menu items on our platform, and they weren’t organized into usable categories for search,” Schreiber says. “I basically tuned up what the data had turned up.”
Schreiber created a cuisine dictionary for the data team that broke down the ingredients in many of the dishes, an internal document that included names of cuisines, history, sometimes maps to show the geographic relationships. She built decks to explain to the data scientists dishes that didn’t have obvious names. “The taxonomy was obviously data driven, and it needed that human touch, that finesse of somebody that understood food more than data,” Schreiber says.
She helped the team map dishes to cuisines, drawing lines like the one between Japanese curry rice and Indian curries, let’s say, or deciding how to separate tacos from burritos. “Do you have Sushirrito in San Francisco?” Schreiber asks me. “That was weeks of conversation. Is it sushi? Is it a burrito? Every time someone would go they’d take a picture of it and post it to me.”
Pay dirt. Days later, Wikelski’s sensors detected “dynamic body acceleration,” meaning the animals expended dramatically more energy than usual as many as 14 hours before the aftershocks hit, at times when they’d normally be asleep or docile. He’s putting the finishing touches on a study, due out later this year in the journal Science, that solidifies the concept of movement ecology—the causes and effects of organisms’ movements on the world around them. The sensors turn animals into something like environmental buoys, using them to predict and monitor things beyond earthquakes, perhaps illustrating environmental patterns with broad economic significance for humans. The big data collected from the animals can “do absolutely crazy things,” he says.
Soon the best example will be Wikelski’s Icarus project, an open-source online database designed to follow animals around the world via embedded tracking devices that relay their locations to a satellite scheduled for launch in October. He says the 16-year venture by the Max Planck Institute, the German Aerospace Center, and the Russian Federal Space Agency has tagged dozens of mammals, birds, fish, and even flying insects with the tiny tracking sensors, usually with super glue. With help from volunteers who sign up online, he says, he expects to hit a few thousand by the end of next year.
The financial services industry is going high-tech, and so the behemoth corporate organizations behind the big-name banks, asset managers and insurance companies are facing a fundamental challenge: how to attract, train and retain a very different sort of talent. “We are a tech company,” Goldman Sachs CEO Lloyd Blankfein has said on a number of occasions — admittedly, Wall Street execs have been saying this for years — but it’s finally beginning to ring true. Goldman now employs around “9,000 individuals in various engineering roles,” according to a recent company report, out of the company’s 33,000 or so employees. More than one in three new recruits are STEM majors.
Gone are the days when salespeople, traders, bankers and managers ruled Wall Street — in 2000 Goldman had 600 equity traders; now it has two. The technological shift is forcing finance companies to reexamine their recruiting practices and reinvent their corporate culture to bring a little Silicon Valley style to Wall Street.
In this post, I cover the data science that went into analyzing mountains of data before, during and after the America’s Cup races. Oracle, a vendor of many of the technologies used by the USA team, invited me and a few other analysts to witness a few races last weekend in Bermuda. In Part 1, I wrote about the design thinking and material science that have gone into the AC yachts. In Part 2, I discussed the extreme athletic conditioning and performance of the crews. In Part 3, I covered the broadcast technology that brought the races to millions of fans around the world.
Spurred by Moneyball, every sport is seeing an explosion of data – injury analytics, fan analytics and stadium revenue analytics among others. ESPN and other networks have come up with new metrics and acronyms, like WAR in baseball, that fans have embraced. MIT hosts an annual Sloan Sports Analytics Conference. Even benchmarked against the data gathered over decades in other sports, I would venture to say the data collected and analyzed during the AC was an order of magnitude greater.
The USA team said they collected as many as 40,000 data points per second. That could amount to 500 GB of data from each practice run. Much of that came from sensors such as the MEMS devices that measure aerodynamic pressure at 400 points on the sails, the monitoring harnesses on the crew members, and the video from the on-board cameras and those on chase boats and drones. They used Oracle Exadata and Oracle R Advanced Analytics, among other technologies, to crunch and process the data.
Some of the data applications included:
Design and adjustment of the yacht – There was simulation through virtual wind tunnels and towing tanks using computational fluid dynamics. That required solving, among others, the Navier-Stokes equations, which describe flow patterns in fluids. Data from each run was also used to make adjustments to the yacht, as the video below describes.
Weather mapping – In the lead-up to the AC finals, the teams had collected millions of weather data points over several months in and around Bermuda’s Great Sound. This is far more granular data than anything meteorology agencies track.
Crew monitoring – Sensors captured heart rate, perspiration, lactic acid levels, the wattage the grinders were generating and other data to help the crews fine-tune their diets and techniques.
Race playbooks – While each team was coy about the tools and data available to the crew during the races, most had tactical tools like the Land Rover BAR team’s in the video below.
Broadcast – The largest data volumes came from hours of raw video feeds from helicopters, drones, chase boats and on-board cameras, plus the data for the LiveLine augmented reality superimposed on the frame. The photo below shows the multiple feeds into Ross Mobile Productions’ Future is Now 5 (FIN5) production van (they supported the NBC coverage of the races). We only got to see one at a time on our TVs or mobile devices.
Drowning in data would be an appropriate metaphor for the AC. But that would be grossly unfair to the teams, which used the data to literally fly in and out of the water and exposed a new generation of fans to metrics like velocity made good.
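For readers new to the metric, velocity made good (VMG) is commonly defined as the component of a boat's speed in the direction it actually wants to go, such as straight upwind or toward the next mark. The short sketch below illustrates that definition only; the function name and the sample speeds and angles are made up for illustration and are not figures from any AC team.

```python
import math


def velocity_made_good(boat_speed_knots: float, angle_off_target_deg: float) -> float:
    """Speed component along the desired direction (wind axis or bearing to the mark).

    angle_off_target_deg is the angle between the boat's course and that direction.
    """
    return boat_speed_knots * math.cos(math.radians(angle_off_target_deg))


# Hypothetical comparison: sailing faster but at a wider angle
# versus sailing slower but pointing closer to the target.
fast_but_wide = velocity_made_good(30.0, 60.0)    # roughly 15 knots toward the target
slower_but_tight = velocity_made_good(24.0, 45.0)  # roughly 17 knots toward the target

print(f"fast but wide:    {fast_but_wide:.2f} kn")
print(f"slower but tight: {slower_but_tight:.2f} kn")
```

The point of the metric is visible in the comparison: the boat with the lower speed through the water can still be making better progress toward the mark.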
The partnership was logical because Teledyne’s GroundLink technology is installed on more than half of Boeing’s in-service aircraft and about 70% of the Airbus fleet, or about 10,000 aircraft in total. “There has been a move to wirelessly enabled aircraft because it eliminates manual downloads,” says Cecil.
The partnership focuses on getting data off the aircraft more simply, enriching it with other sources to provide more value, and delivering that value to operators faster, says Nelson.
Initially the partnership will focus on post-flight data collection for GE engine-powered aircraft, with real-time data collection to follow later. “There is still a lot of value to be gained from post-flight data collection,” which captures the low-hanging fruit, says Nelson.
Sea travel in the Caribbean became routine for Spain, which is why it kept detailed records of ship voyages. Storms accounted for many of the shipwrecks in the Caribbean.
The Florida Keys' tree-ring records extend all the way back to 1707. These tree rings show when a hurricane struck in a particular year, because ring growth slowed down whenever one occurred. The team gathered wood samples from shipwrecks and began dating them.
The team used two books in the study to combine shipwreck data with tree-ring data, namely "Shipwrecks In The Americas: A Complete Guide To Every Major Shipwreck In The Western Hemisphere" by Robert F. Marx and "Shipwrecks Of Florida: A Comprehensive Listing" by Steven D. Singer.
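The idea of matching slow-growth years against shipwreck records can be sketched roughly as follows. This is not the team's method: the rule used here (a ring narrower than a fixed fraction of the average of the preceding years), the function names, the window and threshold values, and the toy numbers are all assumptions made for illustration. The years are placeholder indices, not real dates.

```python
from statistics import mean


def suppressed_growth_years(ring_widths, window=5, threshold=0.75):
    """Flag years whose ring width is below threshold * mean of the preceding `window` years.

    ring_widths maps year -> ring width; years are assumed to be consecutive.
    """
    years = sorted(ring_widths)
    flagged = []
    for i, year in enumerate(years):
        if i < window:
            continue  # not enough history to form a baseline yet
        baseline = mean(ring_widths[y] for y in years[i - window:i])
        if ring_widths[year] < threshold * baseline:
            flagged.append(year)
    return flagged


def years_in_both(flagged_years, shipwreck_years):
    """Years that show both suppressed growth and at least one recorded shipwreck."""
    return sorted(set(flagged_years) & set(shipwreck_years))


# Toy, made-up inputs (placeholder year indices and arbitrary width units).
toy_widths = {1: 1.0, 2: 1.1, 3: 0.9, 4: 1.0, 5: 1.0, 6: 0.6, 7: 0.95, 8: 1.0}
toy_shipwreck_years = [3, 6]

flagged = suppressed_growth_years(toy_widths)
print("suppressed-growth years:", flagged)
print("also in shipwreck list: ", years_in_both(flagged, toy_shipwreck_years))
```

A year that appears in both lists is a candidate hurricane year under these assumptions; a shipwreck year without suppressed growth (year 3 in the toy data) would need another explanation.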
MIT has developed a predictive tool it says can give ships and their crews a two- to three-minute advance warning, allowing them to shut down essential operations on a ship or offshore platform.
Combining ocean-wave data available from measurements taken by ocean buoys with a nonlinear analysis of the underlying water wave equations, Sapsis' team quantified the range of wave possibilities for a given body of water. They then developed a simpler and faster way to predict which wave groups will evolve into rogue waves.
The resulting tool is based on an algorithm that sifts through data from surrounding waves. Depending on a wave group’s length and height, the algorithm computes a probability that the group will turn into a rogue wave within the next few minutes.
Currently, as few as 37% of puppies make it through the raising program to become successful service dogs for the blind. Given that it costs Guiding Eyes more than $40,000 to raise each dog, even a 5% increase in performance can yield the non-profit considerable savings.
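To see why a small improvement matters, the figures above can be turned into a rough cost per successful guide dog. The sketch below makes several assumptions: it uses $40,000 as a round lower bound for the cost per dog, treats 37% as the baseline success rate, and reads the "5% increase" as five percentage points (37% to 42%). The function name is hypothetical.

```python
def cost_per_successful_dog(cost_per_dog: float, success_rate: float) -> float:
    """Average spend needed to produce one successful dog at a given success rate."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_dog / success_rate


COST_PER_DOG = 40_000   # source says "more than $40,000"; treated here as a lower bound
BASELINE_RATE = 0.37    # "as few as 37%" of puppies succeed
IMPROVED_RATE = BASELINE_RATE + 0.05  # assumption: +5 percentage points

baseline_cost = cost_per_successful_dog(COST_PER_DOG, BASELINE_RATE)
improved_cost = cost_per_successful_dog(COST_PER_DOG, IMPROVED_RATE)

print(f"baseline: ${baseline_cost:,.0f} per successful dog")
print(f"improved: ${improved_cost:,.0f} per successful dog")
print(f"saving:   ${baseline_cost - improved_cost:,.0f} per successful dog")
```

Under these assumptions the cost per successful dog falls from roughly $108,000 to roughly $95,000. If the 5% is instead read as a relative improvement, the saving would be smaller.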
The first step was to move all the data — which includes 30 years of structured genetic breeding data and thousands of unstructured questionnaire documents — to IBM Cloud.
Now, Professor Chris Tseng of San Jose State University and a group of his machine-learning students are using IBM Watson services on Bluemix to look for insight in all that data.
By combining the hard and soft data, the study will connect complex patterns, and yield useful insights that will help inform every stage of guide dog development.
Medium – thanks to Vijay Vijayasankar of IBM for sharing
Over the past two years, Under Armour has spent close to $1 billion buying and investing in three leading makers of activity- and diet-tracking mobile apps. By doing so, the company has amassed the world's largest digital health-and-fitness community, with 150 million users. Plank envisions all of those users, and their metrics, as a big data engine to drive everything from product development to merchandising to marketing.
Today, Under Armour has 13,500 employees around the world and nearly $4 billion in revenue. But Plank is still every bit the entrepreneur, chasing audacious dreams, chief among them overtaking Nike as the world's largest sportswear maker. Under Armour leapfrogged the longtime number two, Adidas, in the U.S. sportswear market in 2014, but worldwide it's still third. And Nike remains far larger, with more than $30 billion in revenue in 2015, which is part of why Plank wants to move so aggressively. Nike has about a fifth as many users on its Nike+ platform as Under Armour does on its apps, and in 2014 the shoe giant shut down its FuelBand fitness-tracker business.
Steve Ballmer’s USAFacts
That conversation led Mr. Ballmer to pursue what may be one of the most ambitious private projects undertaken to answer a question that has long vexed the public and politicians alike. He sought to “figure out what the government really does with the money,” Mr. Ballmer said. “What really happens?”
On Tuesday, Mr. Ballmer plans to make public a database and a report that he and a small army of economists, professors and other professionals have been assembling as part of a stealth start-up over the last three years called USAFacts. The database is perhaps the first nonpartisan effort to create a fully integrated look at revenue and spending across federal, state and local governments.
Want to know how many police officers are employed in various parts of the country and compare that against crime rates? Want to know how much revenue is brought in from parking tickets and the cost to collect? Want to know what percentage of Americans suffer from diagnosed depression and how much the government spends on it? That’s in there. You can slice the numbers in all sorts of ways.
New York Times
April 18, 2017 in Analytics, Industry Commentary