Tag results for big_data
Results from all users' collections (20 out of ~20)
Cluster Computing and MapReduce Lecture 1

Lecture 1 in a five-part series introducing MapReduce and cluster computing. See http://code.google.com/edu/content/submissions/mapreduce-minilecture/listing.html for slides and other resources.
Semantically Augmenting Hadoop with Geotemporal Reasoning and Social Networking Analytics

E-commerce sites, auction sites, financial institutions, insurance companies, and telephone companies all have event-based data that describes transactions between customers (social networks) that are located in time and space (geotemporal). All these transactions together form interesting social graphs and patterns of customer behavior. Some of these behaviors are very interesting from a marketing perspective; other behaviors might point to fraudulent actions. Analyzing graphs and geospatially oriented data is notoriously hard to do with typical big data solutions such as Hadoop, so we use a hyper-scalable graph database to do this analysis. We will present a number of new technologies to make it very straightforward and user-friendly to analyze behavioral patterns. We discuss extending SPARQL 1.1 with a large number of magic predicates for geospatial, temporal, and social network analysis so that non-specialists can very easily build very powerful queries. We will present new visual discovery capabilities added to Gruff, a graphical user interface for graph search. We will demonstrate how users can explore visual graphs and easily turn interesting patterns into SPARQL queries.
High Performance Predictive Analytics in R and Hadoop

Hadoop is rapidly being adopted as a major platform for storing and managing massive amounts of data, and for computing descriptive and query types of analytics on that data. However, it has a reputation for not being a suitable environment for high-performance complex iterative algorithms such as logistic regression, generalized linear models, and decision trees. At Revolution Analytics we think that reputation is unjustified, and in this talk I discuss the approach we have taken to porting our suite of high-performance analytics algorithms to run natively and efficiently in Hadoop. Our algorithms are written in C and R and are based on a platform that automatically and efficiently parallelizes a broad class of algorithms called Parallel External Memory Algorithms (PEMAs). This platform abstracts both the inter-process communication layer and the data source layer, so that the algorithms can work in almost any environment in which messages can be passed among processes, and with almost any data source. MPI and RPC are two traditional ways to send messages, but messages can also be passed using files, as in Hadoop. I describe how we use the file-based communication choreographed by MapReduce and how we efficiently access data stored in HDFS.
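The chunk-and-combine idea behind this description can be illustrated with a minimal Python sketch. It is not Revolution Analytics' implementation: the function names (`process_chunk`, `combine`, `finalize`, `fit_via_files`), the JSON file layout, and the choice of a simple least-squares line as the algorithm are all assumptions made for illustration. Each chunk of data is reduced to a small intermediate result, the intermediate results are exchanged as files rather than over MPI or RPC, and a final step turns the combined result into the answer.

```python
import json
import tempfile
from pathlib import Path


def process_chunk(chunk):
    """Reduce one chunk of (x, y) pairs to a small set of sufficient statistics."""
    return {
        "n": len(chunk),
        "sx": sum(x for x, _ in chunk),
        "sy": sum(y for _, y in chunk),
        "sxx": sum(x * x for x, _ in chunk),
        "sxy": sum(x * y for x, y in chunk),
    }


def combine(a, b):
    """Merge two intermediate results by adding their statistics key by key."""
    return {key: a[key] + b[key] for key in a}


def finalize(stats):
    """Turn the combined statistics into the slope and intercept of a fitted line."""
    n, sx, sy, sxx, sxy = (stats[k] for k in ("n", "sx", "sy", "sxx", "sxy"))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return slope, intercept


def fit_via_files(chunks, workdir):
    """Pass intermediate results between steps as files (hypothetical layout)."""
    workdir = Path(workdir)
    # Each "worker" writes its intermediate result to its own file.
    for i, chunk in enumerate(chunks):
        part = process_chunk(chunk)
        (workdir / f"part-{i:05d}.json").write_text(json.dumps(part))
    # The "master" reads every file back and combines the results.
    total = None
    for path in sorted(workdir.glob("part-*.json")):
        part = json.loads(path.read_text())
        total = part if total is None else combine(total, part)
    return finalize(total)


if __name__ == "__main__":
    # Toy data on the line y = 2x + 1, split into three uneven chunks.
    data = [(float(x), 2.0 * x + 1.0) for x in range(10)]
    chunks = [data[0:4], data[4:7], data[7:10]]
    with tempfile.TemporaryDirectory() as tmp:
        print(fit_via_files(chunks, tmp))
```

Because the combine step is plain addition, the chunks can be processed in any order and on any number of workers. That property is what makes this class of algorithm a natural fit for a setting where the only communication channel is a set of files.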
Big Data - The Technology Behind the Solutions (BrightTALK)

This webinar will define the attributes of big data and the behavior of big analytics that break traditional IT infrastructure. We will explore the technologies that not only enable large amounts of data to be streamed in real time, but also provide the processing capabilities to sort and analyze the data to gain meaningful insight.
A New Paradigm for Big Data Storage (BrightTALK)

The unparalleled growth of unstructured and semi-structured data is leading IT managers to look for new ways to scale storage systems. Other trends such as virtualization, cloud computing, and big data are also driving the need for a modern storage architecture. Yet many IT departments are forced to deal with traditional storage challenges. In this webinar we