Source: msrg.org
THE BIGBENCH PROPOSAL

  End to end benchmark
    Application level
    Based on a product retailer (TPC-DS)
    Focused on Parallel DBMS and MR engines
  History
    Launched at 1st WBDB, San Jose
    Published at SIGMOD 2013
    Spec at WBDB proceedings 2012 (queries & data set)
    Full kit at WBDB 2014
  Collaboration with Industry & Academia
    First: Teradata, University of Toronto, Oracle, InfoSizing
    Now: bankmark, CLDS, Cisco, Cloudera, Hortonworks, InfoSizing, Intel, Microsoft, MSRG, Oracle, Pivotal, SAP

05.09.2014 EXTENDING BIGBENCH

DATA MODEL

  Structured: TPC-DS + market prices
  Semi-structured: website click-stream
  Unstructured: customers' reviews

  [Figure: data model diagram. Structured Data: Marketprice, Item, Sales, Web Page, Customer. Semi-Structured Data: Web Log. Unstructured Data: Reviews. Legend: Adapted TPC-DS, BigBench Specific.]

DATA MODEL – 3 VS

  Variety
    Different schema parts
  Volume
    Based on scale factor
    Similar to TPC-DS scaling, but continuous
    Weblogs & product reviews also scaled
  Velocity
    Refresh for all data

WORKLOAD

  Workload Queries
    30 "queries"
    Specified in English (sort of)
    No required syntax (first implementation in Aster SQL MR)
    Kit implemented in Hive, HadoopMR, Mahout, OpenNLP
  Business functions (Adapted from McKinsey)
    Marketing
      Cross-selling, Customer micro-segmentation, Sentiment analysis, Enhancing multichannel consumer experiences
    Merchandising
      Assortment optimization, Pricing optimization
    Operations
      Performance transparency, Product return analysis
    Supply chain
      Inventory management
    Reporting (customers and products)

WORKLOAD - TECHNICAL ASPECTS

  Generic Characteristics

    Data Sources          #Queries   Percentage
    Structured            18         60%
    Semi-structured       7          23%
    Un-structured         5          17%

    Analytic techniques   #Queries   Percentage
    Statistics analysis   6          20%
    Data mining           17         57%
    Reporting             8          27%

  Hive Implementation Characteristics

    Query Types           #Queries   Percentage
    Pure HiveQL           14         46%
    Mahout                5          17%
    OpenNLP               5          17%
    Custom MR             6          20%
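
The percentages in the WORKLOAD - TECHNICAL ASPECTS breakdown can be checked against the 30-query total with a few lines of Python. This is only an illustrative sketch: the counts are copied from the slide, while the helper names `shares` and `is_partition` are made up for this example. It shows that the data-source and query-type counts each add up to 30, whereas the analytic-technique counts add up to 31. One possible reading, not stated in the slides, is that at least one query is counted under more than one technique.

```python
# Query counts per category, copied from the slide's breakdown.
TOTAL_QUERIES = 30

data_sources = {"Structured": 18, "Semi-structured": 7, "Un-structured": 5}
query_types = {"Pure HiveQL": 14, "Mahout": 5, "OpenNLP": 5, "Custom MR": 6}
analytic_techniques = {"Statistics analysis": 6, "Data mining": 17, "Reporting": 8}


def shares(counts, total=TOTAL_QUERIES):
    """Return each category's share of `total` as a percentage (float)."""
    return {name: 100.0 * n / total for name, n in counts.items()}


def is_partition(counts, total=TOTAL_QUERIES):
    """True if the category counts add up to exactly `total` queries."""
    return sum(counts.values()) == total


if __name__ == "__main__":
    for label, counts in [
        ("Data sources", data_sources),
        ("Query types", query_types),
        ("Analytic techniques", analytic_techniques),
    ]:
        print(label, "| sums to total:", is_partition(counts))
        for name, pct in shares(counts).items():
            print(f"  {name:<20} {pct:5.1f}%")
```

Computed this way, Pure HiveQL comes out at about 46.7%, so the 46% on the slide appears to be truncated rather than rounded.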