
Faculty Candidate Talk: MapReduce Programming Model in Hadoop and Big Data Workflow Systems

Abstract: 

MapReduce is a programming model and software framework first developed by Google to facilitate and simplify the processing of vast amounts of data in parallel on large clusters of commodity hardware. Hadoop is an Apache Software Foundation project that provides two key components: a distributed filesystem called HDFS (the Hadoop Distributed File System) and a framework and API for building and running MapReduce jobs. In this lecture, I will talk about MapReduce in Hadoop, and then I will present a MapReduce workflow that I developed using our big data workflow system, DATAVIEW (www.dataview.org). DATAVIEW is a big data workflow system for managing data analysis processes. A big data workflow is the computerized modeling and automation of a process consisting of a set of computational tasks and their data interdependencies, used to process and analyze data of ever-increasing scale, complexity, and rate of acquisition.
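
For readers unfamiliar with the model, the canonical word-count example against the standard Hadoop MapReduce API gives a concrete sense of the map and reduce phases the abstract refers to. This sketch is illustrative only, not material from the talk; the class names and input/output paths are assumptions.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every token in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts emitted for each distinct word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // optional local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Hadoop handles the partitioning, shuffling, and sorting between the two phases, which is what lets the same job scale from a single machine to a large commodity cluster without code changes.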

Speaker: Mahdi Ebrahimi
Location: H350
Event Date: 04/24/2017 - 4:15pm
Committee: Dr. James Abello
Event Type: Faculty Candidate Talk
Organization: Computer Science