Smart Devices and Open Data Centers
May 21, 2012, Open Source Business Conference, San Francisco: Elliot Garbus from Intel presented "Sparks of intelligence: where smart devices connect mobile users to open data centers". He noted that systems need to change to adapt to the emergence of big data.
One of the first drivers for today's data explosion was a pack of gum. Thirty-eight years ago, Juicy Fruit was the first consumer product to use barcodes. This change elevated the supply chain and started a data drive that now exceeds 7 exabytes a day and is doubling every year. By mastering the data, vendors found that they could increase margins by up to 60 percent.
The sources of data are growing: smart phones are up 20 percent per year and network sensors are up 30 percent per year, Kia Motors is collecting data to predict vehicle issues, and China is collecting large amounts of data on its citizens. Users must now apply computing to analyze all of this data. Machine-generated and social media data are the biggest drivers of increasing data volume.
Data attributes, such as whether the data is sensed or user generated, are now helping to drive business decisions. One challenge is that data volume is projected to increase by 50X by 2020, but IT budgets are expected to increase by less than 10 percent per year. To make matters worse, about 90 percent of all that data is unusable.
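The size of that gap can be estimated with a few lines of arithmetic. The sketch below is illustrative only: the talk gives the 50X and 10 percent figures but not the baseline year, so it assumes the projection is measured from 2012 and that budgets grow at the full 10 percent ceiling every year. All variable names are hypothetical.

```python
# Assumptions (not from the talk): the 50X projection is measured from 2012,
# and IT budgets grow at the full 10 percent ceiling each year.
BASE_YEAR = 2012
TARGET_YEAR = 2020
DATA_GROWTH = 50.0   # projected growth in data volume over the period
BUDGET_RATE = 0.10   # upper bound on yearly IT budget growth

years = TARGET_YEAR - BASE_YEAR

# Compound budget growth over the whole period.
budget_growth = (1 + BUDGET_RATE) ** years

# How much more data each budget dollar has to cover by the target year.
gap = DATA_GROWTH / budget_growth

# Yearly data growth rate implied by 50X over the period.
implied_data_rate = DATA_GROWTH ** (1 / years) - 1

print(f"Budget growth over {years} years: {budget_growth:.2f}x")
print(f"Data per budget dollar grows by:  {gap:.1f}x")
print(f"Implied yearly data growth:       {implied_data_rate:.0%}")
```

Under these assumptions the budget roughly doubles while data volume grows 50 times, so each budget dollar would have to cover more than 20 times as much data. A different baseline year changes the numbers but not the conclusion.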
To address the growing gap, Anukool Lakhina from Guavus described how they changed the data flow. At Sprint, automated sensor nets generate large volumes of data on the data center. The challenge is to collect and centralize that data. The economics for data centers are about one half for storage, 3/8 for transport, and 1/8 for compute on the data. Big data changes the ratios to 30 percent for storage, 10 percent for transport, and 60 percent for compute.
One way to reduce costs is to do the compute at the data source. This eliminates the need to transport and store all of the data, so that only the useful data is transported and stored. The changes in architecture require edge analytics after ingest to enable data fusion and aggregation. This new flow requires the system to perform analytics on data in motion, but it reduces the data transfer to the data center by a factor of 1000.
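The idea of aggregating at the edge can be sketched in a few lines. This is a minimal illustration, not the Guavus or Sprint implementation: the record schema, the function name, and the synthetic data are all hypothetical, and the data is sized so that the reduction works out to the factor of 1000 mentioned in the talk.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class UsageRecord(NamedTuple):
    """One raw measurement seen at the edge (hypothetical schema)."""
    subscriber_id: str
    hour: int          # hour of day, 0-23
    bytes_used: int


def aggregate_at_edge(records: Iterable[UsageRecord]) -> dict[tuple[str, int], int]:
    """Collapse raw records into one byte total per (subscriber, hour)."""
    totals: dict[tuple[str, int], int] = defaultdict(int)
    for rec in records:
        totals[(rec.subscriber_id, rec.hour)] += rec.bytes_used
    return dict(totals)


# Synthetic data: 2 subscribers x 3 hours x 1000 raw records each.
raw = [
    UsageRecord(sub, hour, 1500)
    for sub in ("sub-a", "sub-b")
    for hour in (9, 10, 11)
    for _ in range(1000)
]

# Only the summary would be sent on to the data center.
summary = aggregate_at_edge(raw)
reduction = len(raw) / len(summary)

print(f"raw records: {len(raw)}, forwarded records: {len(summary)}")
print(f"reduction factor: {reduction:.0f}")
```

The reduction factor depends entirely on how many raw records fall into each aggregation bucket. Coarser buckets transfer less data but lose detail, so the bucket choice has to match the queries the data center needs to answer.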
Now, Sprint can query the remaining data and get useful information for billing purposes and for users to track their usage patterns. Cell phone customers can now get details on their data use at the same granularity as their voice minutes. Data usage used to be reported only as total volume, because the data is a constant feed. The voice data include calls, number of minutes, time of day, etc.
New opportunities require changes in economic models and compute patterns. Data from multiple sources needs to be fused and analyzed to be useful for decision making. The differences in use patterns affect all areas of a company. The changes also introduce a class of apps to address the upcoming big data.
Garbus then described the efforts Intel is making in creating platforms for solutions. The company has invested in new manufacturing, is developing new systems architectures and software, focuses on energy efficiency in its new designs, and considers security in all facets of its work. Its efforts in software include work with the open source communities. Intel is the number 2 corporate contributor to the Linux kernel.