Cloud Computing for Big Data – Monitoring & Resource Management


1 Cloud Computing for Big Data – Monitoring & Resource Management

2 Cloud Monitoring Requirements
Cloud ≈ virtualization + elasticity
Types of clouds:
  IaaS: virtual machines and network devices, elasticity in number/size of devices
  PaaS: virtual, elastically sized platform
  SaaS: software provided by employing virtual, elastic resources
Cloud is a collection of virtual resources provided on physical infrastructure
Cloud provides resources elastically

3 Cloud Monitoring Requirements
Why should someone use clouds?
Cloud consumer can outsource IT infrastructure
  No fixed costs for cloud consumer
  Pay for resource utilization
  Cloud provider responsible for building and maintaining physical infrastructure
Cloud provider can rent out unused IT infrastructure
  Eliminate waste
  Get money back for overcapacity

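4 Introduction to Mesos
Developed at the University of California, Berkeley
Virtualizes and manages all server resources in the data center (CPU, memory, storage, etc.)
Allocates resources dynamically according to user demand
Supports a variety of applications through frameworks:
  Marathon → Docker containers
  Myriad → Big Data Analysis Platform
  Spark → run Spark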

5 Introduction to Mesos
Mesos kernel: Mesos-master, Mesos-slave
  Manages Mesos tasks
  Reports resources
Framework
  Compute framework, e.g. Spark
Executor
  Launches the tasks inside a framework
  MesosExecutorDriver

6 Mesos Goals
High utilization of resources
Support diverse frameworks (current & future)
Scalability to 10,000s of nodes
Reliability in face of failures
Resulting design: small microkernel-like core that pushes scheduling logic to frameworks

7 Element 1: Fine-Grained Sharing
[Figure: two cluster diagrams, each on top of a storage system (e.g. HDFS). Coarse-Grained Sharing (HPC): Framework 1, Framework 2, and Framework 3 each occupy their own block of nodes. Fine-Grained Sharing (Mesos): tasks of Fw. 1, Fw. 2, and Fw. 3 are interleaved across all nodes.]
+ Improved utilization, responsiveness, data locality
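The utilization benefit of fine-grained sharing can be illustrated with a small sketch. The demand numbers and the 9-slot cluster below are made up for illustration, and static equal partitioning is only one assumed form of coarse-grained sharing.

```python
# Hypothetical demand (in task slots) of three frameworks over four time steps.
demand = {
    "fw1": [6, 1, 2, 5],
    "fw2": [1, 6, 2, 1],
    "fw3": [2, 2, 8, 3],
}
CLUSTER_SLOTS = 9  # assumed cluster size


def coarse_grained_used(demand, total_slots):
    """Static partitioning: each framework owns an equal, fixed share of slots."""
    share = total_slots // len(demand)
    steps = len(next(iter(demand.values())))
    return [sum(min(d[t], share) for d in demand.values()) for t in range(steps)]


def fine_grained_used(demand, total_slots):
    """Shared pool: any free slot can run a task of any framework."""
    steps = len(next(iter(demand.values())))
    return [min(sum(d[t] for d in demand.values()), total_slots) for t in range(steps)]


coarse = coarse_grained_used(demand, CLUSTER_SLOTS)
fine = fine_grained_used(demand, CLUSTER_SLOTS)
print("slots in use per step, coarse-grained:", coarse)
print("slots in use per step, fine-grained:  ", fine)
```

With static partitions, a framework whose demand exceeds its share cannot use slots that another framework leaves idle; the shared pool can.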

8 Element 2: Resource Offers
Option: Global scheduler
Frameworks express needs in a specification language, global scheduler matches them to resources
+ Can make optimal decisions
– Complex: language must support all framework needs
– Difficult to scale and to make robust
– Future frameworks may have unanticipated needs

9 Element 2: Resource Offers
Mesos: Resource offers
Offer available resources to frameworks, let them pick which resources to use and which tasks to launch
+ Keeps Mesos simple, lets it support future frameworks
– Decentralized decisions might not be optimal

10 Mesos Architecture
[Figure: an MPI job and a Hadoop job are submitted to an MPI scheduler and a Hadoop scheduler. The Mesos master, through its allocation module, picks a framework to offer resources to and sends it a resource offer. Mesos slaves run MPI executors, which run tasks.]

11 Mesos Architecture
Resource offer = list of (node, availableResources)
E.g. { (node1, <2 CPUs, 4 GB>), (node2, <3 CPUs, 2 GB>) }
[Figure: the Mesos architecture diagram (MPI and Hadoop jobs and schedulers, Mesos master with allocation module, Mesos slaves with MPI executors and tasks), with the resource offer annotated.]
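A minimal sketch of the offer format above and of a framework choosing what to launch from it. The resource keys (`cpus`, `mem_gb`), the `pick_tasks` helper, and the greedy placement rule are illustrative assumptions, not part of the Mesos API.

```python
from typing import Dict, List, Tuple

Resources = Dict[str, float]          # e.g. {"cpus": 2, "mem_gb": 4}
Offer = List[Tuple[str, Resources]]   # list of (node, availableResources)

# The example offer from the slide.
offer: Offer = [
    ("node1", {"cpus": 2, "mem_gb": 4}),
    ("node2", {"cpus": 3, "mem_gb": 2}),
]


def pick_tasks(offer: Offer, task_need: Resources, max_tasks: int) -> List[str]:
    """Framework-side decision: greedily place identical tasks on offered nodes.

    Returns the node chosen for each launched task; anything left over is
    implicitly declined and stays available for other frameworks.
    """
    launches: List[str] = []
    for node, available in offer:
        free = dict(available)
        while len(launches) < max_tasks and all(free[k] >= v for k, v in task_need.items()):
            launches.append(node)
            for k, v in task_need.items():
                free[k] -= v
    return launches


placed = pick_tasks(offer, {"cpus": 1, "mem_gb": 2}, max_tasks=4)
print(placed)
```

The framework wanted four tasks but the offer only fits three, so it launches what fits and waits for a later offer.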

12 Mesos Architecture
[Figure: the architecture diagram with roles annotated. The MPI and Hadoop schedulers perform framework-specific scheduling and send tasks. The Mesos master's allocation module picks the framework to offer resources to. The Mesos slaves launch and isolate executors (MPI executors and a Hadoop executor), which run tasks.]
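The slides do not say how the allocation module picks a framework. One plausible policy, sketched here purely as an assumption, is to offer to the framework with the smallest dominant share of the cluster. The cluster totals and current allocations are made-up numbers.

```python
# Assumed cluster capacity and current per-framework allocations.
TOTAL = {"cpus": 10.0, "mem_gb": 20.0}

allocated = {
    "mpi": {"cpus": 4.0, "mem_gb": 2.0},
    "hadoop": {"cpus": 1.0, "mem_gb": 6.0},
}


def dominant_share(alloc, total):
    """Largest fraction of any single resource type held by a framework."""
    return max(alloc[r] / total[r] for r in total)


def pick_framework(allocated, total):
    """Offer next to the framework with the smallest dominant share."""
    return min(allocated, key=lambda fw: dominant_share(allocated[fw], total))


next_fw = pick_framework(allocated, TOTAL)
print(next_fw)
```

Because the policy lives in one pluggable function, swapping it for priorities or strict quotas would not change the offer mechanism itself.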

13 Users
Twitter uses Mesos on > 100 nodes to run ~12 production services (mostly stream processing)
Berkeley machine learning researchers are running several algorithms at scale on Spark
Conviva is using Spark for data analytics
UCSF medical researchers are using Mesos to run Hadoop and eventually non-Hadoop apps

14 Installing Mesos (1/5)
On the Master and Slave, configure the package source and install Mesos
    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
    DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
    CODENAME=$(lsb_release -cs)
    echo "deb ${CODENAME} main" | sudo tee /etc/apt/sources.list.d/mesosphere.list
    sudo apt-get update
    sudo apt-get -y install mesos

15 Installing Mesos (2/5)
On the Master, set the ZooKeeper ID to 1, then set the IP address of ZooKeeper server #1, and restart ZooKeeper
    sudo vi /etc/zookeeper/conf/myid
    sudo vi /etc/zookeeper/conf/zoo.cfg
    sudo service zookeeper restart
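The slide shows only the files being opened in an editor. As a rough sketch of what the two edits typically contain, the helper below renders the `myid` content and the matching `server.N` line for `zoo.cfg`. The IP address is a placeholder, and the ports 2888 and 3888 are the conventional ZooKeeper peer and election ports, assumed here rather than taken from the slides.

```python
def zookeeper_settings(server_id: int, server_ip: str):
    """Return (myid file content, zoo.cfg server line) for one ZooKeeper server."""
    myid = f"{server_id}\n"
    # Assumed conventional ports: 2888 for peer traffic, 3888 for leader election.
    zoo_cfg_line = f"server.{server_id}={server_ip}:2888:3888\n"
    return myid, zoo_cfg_line


# 192.0.2.10 is a documentation-only placeholder for the Master's IP.
myid, cfg_line = zookeeper_settings(1, "192.0.2.10")
print(myid, cfg_line, sep="")
```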

16 Installing Mesos (3/5)
On the Master and Slave, set zk to the IP of the Master's ZooKeeper
    sudo vi /etc/mesos/zk
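The content of `/etc/mesos/zk` is not shown on the slide. It is normally a single ZooKeeper URL; the sketch below builds one, assuming the default client port 2181 and a `/mesos` znode. The IP is a placeholder and the helper name is hypothetical.

```python
def mesos_zk_url(zk_hosts, port=2181, znode="/mesos"):
    """Build the one-line URL that /etc/mesos/zk is expected to hold."""
    hosts = ",".join(f"{host}:{port}" for host in zk_hosts)
    return f"zk://{hosts}{znode}"


# 192.0.2.10 stands in for the Master's ZooKeeper IP.
url = mesos_zk_url(["192.0.2.10"])
print(url)
```

Passing several hosts yields a comma-separated list, which is how a multi-server ZooKeeper ensemble would be written.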

17 Installing Mesos (4/5)
On the Master and Slave, start the respective services
Master
    sudo service mesos-master restart
    sudo service mesos-slave restart
Slave
    sudo service zookeeper stop
    sudo sh -c "echo manual > /etc/init/zookeeper.override"
    sudo service mesos-master stop
    sudo sh -c "echo manual > /etc/init/mesos-master.override"

18 Installing Mesos (5/5)
Open a browser to see the two resources we started

19 Mesos API
Scheduler Callbacks
  resourceOffer(offerId, offers)
  offerRescinded(offerId)
  statusUpdate(taskId, status)
  slaveLost(slaveId)
Scheduler Actions
  replyToOffer(offerId, tasks)
  setNeedsOffers(bool)
  setFilters(filters)
  getGuaranteedShare()
  killTask(taskId)
Executor Callbacks
  launchTask(taskDescriptor)
  killTask(taskId)
Executor Actions
  sendStatus(taskId, status)
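How the scheduler callbacks and actions fit together can be sketched with a toy framework scheduler. The method names follow the slide, but `RecordingDriver`, the task dictionaries, and the one-task-per-node rule are hypothetical stand-ins. This is not the real Mesos client library.

```python
class RecordingDriver:
    """Hypothetical stand-in for the Mesos side of the API; it only records calls."""

    def __init__(self):
        self.replies = []

    def replyToOffer(self, offerId, tasks):
        self.replies.append((offerId, tasks))


class SimpleScheduler:
    """Toy framework scheduler: launches at most one 1-CPU task per offered node."""

    def __init__(self, driver, pending):
        self.driver = driver
        self.pending = list(pending)  # task ids waiting to be launched
        self.status = {}

    def resourceOffer(self, offerId, offers):
        # Callback: Mesos offers resources; answer with the tasks to launch.
        tasks = []
        for node, resources in offers:
            if self.pending and resources["cpus"] >= 1:
                tasks.append({"taskId": self.pending.pop(0), "node": node, "cpus": 1})
        self.driver.replyToOffer(offerId, tasks)

    def statusUpdate(self, taskId, status):
        # Callback: an executor reported a task state change.
        self.status[taskId] = status


driver = RecordingDriver()
scheduler = SimpleScheduler(driver, ["t1", "t2", "t3"])
scheduler.resourceOffer("offer-1", [("node1", {"cpus": 2}), ("node2", {"cpus": 0})])
scheduler.statusUpdate("t1", "TASK_FINISHED")
print(driver.replies, scheduler.status)
```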

20 Big Data Analysis Platform Processing Engine
YARN Framework Installation

21 Introduction to Myriad (1/3)
Combines Mesos and YARN in the data center
Dynamically scales out the YARN cluster
Performs better than a virtual cluster built from VMs

22 Introduction to Myriad (2/3)

23 Introduction to Myriad (3/3)
Compute engines can be added on demand
  Batch → MapReduce
  Video → Storm
  In-memory → Spark

24 YARN Framework Installation (1/12)
Recommended hardware
  CPU: 4 cores
  Memory: 8 GB
On the Master, install Gradle, the build tool for Myriad
    sudo add-apt-repository ppa:cwchien/gradle
    sudo apt-get update
    sudo apt-get install -y gradle

25 YARN Framework Installation (2/12)
On the Master, clone the Myriad source code from GitHub
    git clone

26 YARN Framework Installation (3/12)
On the Master, change to the Myriad source directory, set JAVA_HOME, then build
    cd incubator-myriad
    echo "export JAVA_HOME=/opt/java-8-openjdk-amd64" >> ~/.bashrc
    source ~/.bashrc
    ./gradlew build

27 YARN Framework Installation (4/12)
On the Master, once the build finishes, edit mesosMaster, frameworkUser, and zkServers in myriad-config-default.yml
    vi myriad-scheduler/src/main/resources/myriad-config-default.yml

28 YARN Framework Installation (5/12)
On the Master, edit myriad-config-default.yml and add a new profile "micro": 1 core CPU, 1 GB memory
    vi myriad-scheduler/src/main/resources/myriad-config-default.yml
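The slide does not show the resulting file. As a rough sketch, the helper below renders what a "micro" profile entry might look like. The key names (`profiles`, `cpu`, `mem`) and the use of megabytes for memory are assumptions about the Myriad configuration layout and should be checked against the profiles already present in myriad-config-default.yml.

```python
def render_profile(name: str, cpu: int, mem_mb: int) -> str:
    """Render one profile entry as YAML text (key names are assumed, see above)."""
    return f"  {name}:\n    cpu: {cpu}\n    mem: {mem_mb}\n"


# "micro" from the slide: 1 CPU core, 1 GB of memory (written as 1024 MB here).
snippet = "profiles:\n" + render_profile("micro", 1, 1024)
print(snippet)
```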

29 YARN Framework Installation (6/12)
On the Master, edit the executor path, YARN_HOME, and JAVA_HOME in myriad-config-default.yml, and set the default startup …
    vi myriad-scheduler/src/main/resources/myriad-config-default.yml

30 YARN Framework Installation (6/12)
On the Master, after editing myriad-config-default.yml, build the Scheduler and the Executor
    ./gradlew :myriad-scheduler:build
    ./gradlew :myriad-executor:build

31 YARN Framework Installation (7/12)
On the Master, copy all the built libs into the Hadoop YARN lib folder
    cp myriad-scheduler/build/libs/*.jar /opt/hadoop-2.7.1/share/hadoop/yarn/lib/
    cp myriad-executor/build/libs/myriad-executor jar /opt/hadoop-2.7.1/share/hadoop/yarn/lib/
    cp myriad-scheduler/build/resources/main/myriad-config-default.yml /opt/hadoop-2.7.1/etc/hadoop/

32 YARN Framework Installation (8/12)
On the Master, edit hadoop-env.sh and add the path to the Mesos lib
    vi /opt/hadoop-2.7.1/etc/hadoop/hadoop-env.sh

33 YARN Framework Installation (9/12)
On the Master, edit yarn-site.xml and add the parameters marked with a red box in the figure below
    vi /opt/hadoop-2.7.1/etc/hadoop/yarn-site.xml

34 YARN Framework Installation (10/12)
On the Master, edit mapred-site.xml and add the parameters marked with a red box in the figure below
    vi /opt/hadoop-2.7.1/etc/hadoop/mapred-site.xml

35 YARN Framework Installation (11/12)
On the Master, copy the fully configured Hadoop folder to the Slave, then start the YARN Framework
    scp -r /opt/hadoop Slave:/opt
    /opt/hadoop-2.7.1/sbin/yarn-daemon.sh start resourcemanager

36 YARN Framework Installation (12/12)
Once installation is complete, YARN compute nodes can be started (Flex Up) or stopped (Flex Down) dynamically
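Flexing is usually driven through Myriad's web UI or REST interface, which the slide does not detail. The sketch below only builds such a request without sending it. The endpoint path `/api/cluster/flexup`, port 8192, the JSON field names, and the host name `master.example` are all assumptions to verify against the Myriad documentation for the installed version.

```python
import json
from urllib.request import Request


def flex_request(base_url: str, direction: str, profile: str, instances: int) -> Request:
    """Build (but do not send) a flex up/down request; the API shape is assumed."""
    if direction not in ("flexup", "flexdown"):
        raise ValueError("direction must be 'flexup' or 'flexdown'")
    body = json.dumps({"profile": profile, "instances": instances}).encode("utf-8")
    return Request(
        f"{base_url}/api/cluster/{direction}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Ask for one more node manager using the "micro" profile defined earlier in the deck.
req = flex_request("http://master.example:8192", "flexup", "micro", 1)
print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` once the URL and payload are confirmed for the actual cluster.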

37 Benchmarking
Simple wordcount, sort, randomtextwriter…

38 randomtextwriter
Randomly generates a 1 GB text file
    /opt/hadoop-2.7.1/bin/hadoop jar /opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples jar randomtextwriter -D mapreduce.randomtextwriter.bytespermap= /1GB

39 wordcount
Runs a word count over the generated 1 GB file
    /opt/hadoop-2.7.1/bin/hadoop jar /opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples jar wordcount /1GB /wordcount
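What the wordcount benchmark computes can be sketched in plain Python as a map step followed by a reduce step. This is a single-process illustration with made-up input lines, not the Hadoop example's actual code.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word, 1) for word in line.split()]


def reduce_phase(pairs):
    """Reduce: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


lines = ["big data on mesos", "yarn on mesos"]
pairs = [pair for line in lines for pair in map_phase(line)]
counts = reduce_phase(pairs)
print(counts)
```

On the cluster, the map calls run in parallel over blocks of the 1 GB input and the framework groups pairs by word before the reduce step.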

40 sort
Sorts the generated 1 GB file
    /opt/hadoop-2.7.1/bin/hadoop jar /opt/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples jar sort -outKey org.apache.hadoop.io.Text -outValue org.apache.hadoop.io.Text /1GB /sort

