Hadoop Security Design
Owen O’Malley, Kan Zhang, Sanjay Radia, Ram Marti, and Christopher Harrell
Yahoo!
{owen,kan,sradia,rmari,cnh}@yahoo-inc.com
October 2009
Contents
1 Overview
1.1 Security risks
1.2 Requirements
1.3 Design considerations
2 Use Cases
2.1 Assumptions
2.2 High Level Use Cases
2.3 Unsupported Use Cases
2.4 Detailed Use Cases
3 RPC
4 HDFS
4.1 Delegation Token
4.1.1 Overview
4.1.2 Design
4.2 Block Access Token
4.2.1 Requirements
4.2.2 Design
5 MapReduce
5.1 Job Submission
5.2 Task
5.2.1 Job Token
5.3 Shuffle
5.4 Web UI
6 Higher Level Services
6.1 Oozie
7 Token Secrets Summary
7.1 Delegation Token
7.2 Job Token
7.3 Block Access Token
8 API and Environment Changes

1 Overview
1.1 Security risks
We have identified the following security risks, among others, to be addressed first.
1. Hadoop services do not authenticate users or other services. As a result, Hadoop is subject to the following security risks.
(a) A user can access an HDFS or MapReduce cluster as any other user. This makes it impossible to enforce access control in an uncooperative environment. For example, file permission checking on HDFS can be easily circumvented.
(b) An attacker can masquerade as Hadoop services. For example, user code running on a MapReduce cluster can register itself as a new TaskTracker.
2. DataNodes do not enforce any access control on accesses to their data blocks. This makes it possible for an unauthorized client to read a data block as long as she can supply its block ID. It is also possible for anyone to write arbitrary data blocks to DataNodes.
1.2 Requirements
1. Users are only allowed to access HDFS files that they have permission to access.
2. Users are only allowed to access or modify their own MapReduce jobs.
3. User to service mutual authentication to prevent unauthorized NameNodes, DataNodes, JobTrackers, or TaskTrackers.
4. Service to service mutual authentication to prevent unauthorized services from joining a cluster’s HDFS or MapReduce service.
5. The acquisition and use of Kerberos credentials will be transparent to the user and applications, provided that the operating system acquired a Kerberos Ticket Granting Ticket (TGT) for the user at login.
6. The degradation of GridMix performance should be no more than 3%.
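Requirement 6 can be read as a simple relative-overhead bound. The sketch below assumes that "performance" is measured as GridMix wall-clock runtime and that degradation means the relative increase over the non-secure baseline; the source does not state the metric, and the function names are hypothetical.

```python
def relative_degradation(baseline_seconds, secured_seconds):
    """Relative runtime increase of the secured run over the baseline run.

    Assumption: performance is measured as wall-clock runtime in seconds.
    """
    return (secured_seconds - baseline_seconds) / baseline_seconds


def meets_requirement(baseline_seconds, secured_seconds, limit=0.03):
    """True if the secured run is at most `limit` (3% by default) slower."""
    return relative_degradation(baseline_seconds, secured_seconds) <= limit
```

Under this reading, a baseline run of 1000 seconds leaves a budget of 30 seconds for all security overhead.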
1.3 Design considerations
We choose to use Kerberos for authentication (we also complement it with a second mechanism as explained later). Another widely used mechanism is SSL. We choose Kerberos over SSL for the following reasons.
1. Better performance: Kerberos uses symmetric key operations, which are orders of magnitude faster than public key operations used by SSL.
2. Simpler user management: For example, revoking a user can be done by simply deleting the user from the centrally managed Kerberos KDC (key distribution center), whereas in SSL a new certificate revocation list has to be generated and propagated to all servers.
2 Use Cases
2.1 Assumptions
1. For backwards compatibility and single-user clusters, it will be possible to configure the cluster with the current style of security.
2. Hadoop itself does not issue user credentials or create accounts for users. Hadoop depends on external user credentials (e.g., OS login, Kerberos credentials, etc.). Users are expected to acquire those credentials from Kerberos at operating system login. Hadoop services should also be configured with suitable credentials depending on the cluster setup to authenticate with each other.
3. Each cluster is set up and configured independently. To access multiple clusters, a client needs to authenticate to each cluster separately. However, a single sign on that acquires a Kerberos ticket will work on all appropriate clusters.
4. Users will not have access to root accounts on the cluster or on the machines that are used to launch jobs.
5. HDFS and MapReduce communication will not travel on untrusted networks.
6. A Hadoop job will run no longer than 7 days (configurable) on a MapReduce cluster; beyond that, accessing HDFS from the job will fail.
7. Kerberos tickets will not be stored in MapReduce jobs and will not be available to the job’s tasks. Access to HDFS will be authorized via delegation tokens as explained in section 4.1.
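Assumptions 6 and 7 together imply that a job carries a time-limited token rather than a Kerberos ticket. The following is a minimal sketch of that idea only, assuming an HMAC-authenticated token with an expiry time; the token layout, the key, and the function names are hypothetical and are not the delegation token design of section 4.1.

```python
import hashlib
import hmac

# Hypothetical secret known only to the token issuer (for example, the NameNode).
MASTER_KEY = b"illustrative-master-key"
# Assumption 6: the default maximum job lifetime is 7 days (configurable).
MAX_LIFETIME_SECONDS = 7 * 24 * 60 * 60


def issue_token(owner, issue_time, max_lifetime=MAX_LIFETIME_SECONDS):
    """Return (token_id, authenticator) for a job to carry instead of a ticket."""
    token_id = f"{owner}|{int(issue_time)}|{int(issue_time + max_lifetime)}"
    authenticator = hmac.new(MASTER_KEY, token_id.encode(), hashlib.sha1).hexdigest()
    return token_id, authenticator


def verify_token(token_id, authenticator, now):
    """Accept the token only if the HMAC matches and it has not expired."""
    expected = hmac.new(MASTER_KEY, token_id.encode(), hashlib.sha1).hexdigest()
    if not hmac.compare_digest(expected, authenticator):
        return False
    # The last two fields are the issue and expiry times (seconds since epoch).
    _owner, _issued, expires = token_id.rsplit("|", 2)
    return now <= int(expires)
```

In this sketch, a task presenting the token after the configured lifetime is rejected, which mirrors the statement that HDFS access from a job running longer than 7 days will fail.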
2.2 High Level Use Cases
1. Applications accessing files on HDFS clusters: Non-MapReduce applications, including hadoop fs, access files stored on one or more HDFS clusters.
[Figure: data flows among Application, Oozie, Job Tracker, Task Tracker, MapReduce Task, Name Node, Data Node, NFS, and Other Service, over HDFS and MapReduce links; labels include application credential, job token, and block token.]
[Figure: authentication paths among User Process, Browser, Oozie, Job Tracker, Task Tracker, Task, Name Node, Data Node, NFS, and ZooKeeper. Legend: HTTP plug auth, HTTP HMAC, RPC Kerberos, RPC DIGEST, Block Access, Third Party.]
(b) Admin adds/changes specific users and groups to a cluster’s service authorization list
i. Only these users/groups will have access to the cluster regardless of whether file or job queue permission allows access.
ii. Admin adds himself and a group to the superuser and super-group.
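The ordering described in item (b) i, where the service authorization list gates access before any file or job queue permission is considered, can be sketched as follows. This is an illustrative model only; the class and function names are hypothetical and do not correspond to Hadoop APIs.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceAuthorizationList:
    """Hypothetical per-cluster list of users and groups allowed to use the service."""
    users: set = field(default_factory=set)
    groups: set = field(default_factory=set)

    def permits(self, user, user_groups):
        """True if the user, or any of the user's groups, is on the list."""
        return user in self.users or bool(self.groups & set(user_groups))


def can_access(acl, user, user_groups, permission_allows):
    """Check the service authorization list first; file or job queue permission
    is consulted only for principals that pass that first check."""
    if not acl.permits(user, user_groups):
        return False
    return permission_allows
```

A user who is not on the list is denied even when the file or queue permission would otherwise allow the request.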