1. Decentralized Access Control with Anonymous Authentication of Data Stored in Clouds
Abstract
We propose a new decentralized access control scheme for secure data storage in clouds that supports anonymous authentication. In the proposed scheme, the cloud verifies the authenticity of the user without knowing the user's identity before storing data. Our scheme also has the added feature of access control, in which only valid users are able to decrypt the stored information. The scheme prevents replay attacks and supports creation, modification, and reading of data stored in the cloud. We also address user revocation. Moreover, our authentication and access control scheme is decentralized and robust, unlike other access control schemes designed for clouds, which are centralized. The communication, computation, and storage overheads are comparable to those of centralized approaches.
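To make the access-control idea concrete, the following is a minimal, non-cryptographic Python sketch of the policy check such a scheme enforces: the cloud sees only an anonymous token carrying certified attributes, never the user's identity, and access is granted only if those attributes satisfy the access policy. The policy, token, and attribute names are our own illustrations, not the paper's cryptographic construction.

def satisfies(policy, attributes):
    # Evaluate a policy tree of ('AND', ...) / ('OR', ...) nodes
    # whose leaves are attribute strings.
    if isinstance(policy, str):              # leaf: a single attribute
        return policy in attributes
    op, *children = policy
    results = [satisfies(c, attributes) for c in children]
    return all(results) if op == 'AND' else any(results)

# Policy: (doctor AND cardiology) OR auditor
policy = ('OR', ('AND', 'doctor', 'cardiology'), 'auditor')

# The cloud verifies the token's certification, not the holder's identity.
anonymous_token = {'attrs': {'doctor', 'cardiology'}}
print(satisfies(policy, anonymous_token['attrs']))   # True: access granted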
2. Modeling of Distributed File Systems for Practical Performance Analysis
Abstract—Cloud computing has received significant attention recently. Delivering quality-guaranteed services in clouds is highly desired. Distributed file systems (DFSs) are the key component of any cloud-scale data processing middleware. Evaluating the performance of DFSs is accordingly very important. To avoid the cost of late life-cycle performance fixes and architectural redesign, providing performance analysis before the deployment of DFSs is also particularly important. In this paper, we propose a systematic and practical performance analysis framework, driven by architecture and design models, for defining the structure and behavior of typical master/slave DFSs. We put forward a configuration guideline for specifying configuration alternatives of such DFSs, and a practical approach for both qualitative and quantitative performance analysis of DFSs with various configuration settings in a systematic way. What distinguishes our approach from others is that 1) most existing works rely on performance measurements under a variety of workloads/strategies, comparisons with other DFSs, or running application programs, whereas our approach is based on architecture- and design-level models and systematically derived performance models; 2) our approach is able to evaluate the performance of DFSs both qualitatively and quantitatively; and 3) our approach can evaluate not only the overall performance of a DFS but also its components and individual steps. We demonstrate the effectiveness of our approach by evaluating the Hadoop distributed file system (HDFS). A series of real-world experiments on EC2 (Amazon Elastic Compute Cloud) and the Tansuo and Inspur clusters was conducted to qualitatively evaluate the effectiveness of our approach. We also performed a set of HDFS experiments on EC2 to quantitatively analyze the performance and limitations of the metadata server of DFSs. Results show that our approach can achieve sufficiently accurate performance analysis. The proposed approach could similarly be applied to evaluate other DFSs such as MooseFS, GFS, and zFS.
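As a flavor of what such model-driven analysis can predict before deployment, here is a small Python sketch using a textbook M/M/1 queueing approximation for the master (metadata server) of a master/slave DFS; the service rate and utilization levels are assumed values for illustration, not figures or models taken from the paper.

def mm1_response_time(arrival_rate, service_rate):
    # Mean response time of an M/M/1 queue; diverges at saturation.
    if arrival_rate >= service_rate:
        return float('inf')                  # metadata server saturated
    return 1.0 / (service_rate - arrival_rate)

service_rate = 10_000                        # metadata ops/sec (assumed)
for load in (0.5, 0.8, 0.95, 0.99):
    rt = mm1_response_time(load * service_rate, service_rate)
    print(f"utilization {load:.0%}: mean response time {rt * 1e3:.2f} ms")

The point mirrors the metadata-server analysis above: response time grows sharply as the master approaches saturation, which is exactly the kind of limitation a pre-deployment model can expose.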
3. Balancing Performance, Accuracy, and Precision for Secure Cloud Transactions
Abstract—In distributed transactional database systems deployed over cloud servers, entities cooperate to form proofs of authorization that are justified by collections of certified credentials. These proofs and credentials may be evaluated and collected over extended time periods, with the risk that the underlying authorization policies or the user credentials are in inconsistent states. It therefore becomes possible for policy-based authorization systems to make unsafe decisions that might threaten sensitive resources. In this paper, we highlight the criticality of the problem. We then define the notion of trusted transactions when dealing with proofs of authorization. Accordingly, we propose several increasingly stringent levels of policy consistency constraints and present different enforcement approaches to guarantee the trustworthiness of transactions executing on cloud servers. We propose a Two-Phase Validation Commit protocol as a solution, which is a modified version of the basic Two-Phase Commit protocol. We finally analyze the different approaches presented, using both analytical evaluation of the overheads and simulations, to guide decision makers in choosing which approach to use.
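The control flow of such a protocol can be illustrated with a short Python sketch; it abstracts away the real distributed messaging and failure handling, and the class and version fields are our own stand-ins. Phase one asks every participant to re-validate its proofs of authorization against the freshest policy version (alongside the usual prepare vote), and phase two commits only if all checks succeed.

class Participant:
    def __init__(self, name, proof_policy_version, current_policy_version):
        self.name = name
        self.proof_policy_version = proof_policy_version
        self.current_policy_version = current_policy_version

    def validate(self):
        # A proof is trusted only if it was built under the latest policy.
        return self.proof_policy_version == self.current_policy_version

    def prepare(self):
        return True                          # assume no data conflicts

def two_phase_validation_commit(participants):
    if all(p.validate() and p.prepare() for p in participants):  # phase 1
        return "COMMIT"                      # phase 2: global commit
    return "ABORT"                           # phase 2: global abort

parts = [Participant("A", 3, 3), Participant("B", 2, 3)]
print(two_phase_validation_commit(parts))    # ABORT: B holds a stale proof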
4. A Scalable Two-Phase Top-Down Specialization Approach for Data Anonymization Using MapReduce on Cloud
Abstract—A large number of cloud services require users to share private data like electronic health records for data analysis or mining, bringing privacy concerns. Anonymizing data sets via generalization to satisfy certain privacy requirements such as k-anonymity is a widely used category of privacy-preserving techniques. At present, the scale of data in many cloud applications is increasing tremendously in accordance with the Big Data trend, making it a challenge for commonly used software tools to capture, manage, and process such large-scale data within a tolerable elapsed time. As a result, it is a challenge for existing anonymization approaches to achieve privacy preservation on privacy-sensitive large-scale data sets due to their insufficient scalability. In this paper, we propose a scalable two-phase top-down specialization (TDS) approach to anonymize large-scale data sets using the MapReduce framework on cloud. In both phases of our approach, we deliberately design a group of innovative MapReduce jobs to concretely accomplish the specialization computation in a highly scalable way. Experimental evaluation results demonstrate that with our approach, the scalability and efficiency of TDS can be significantly improved over existing approaches.
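For readers unfamiliar with top-down specialization, the following toy single-machine Python sketch shows the core test each specialization step must pass; the job values and k are invented, and in the paper this counting is carried out by MapReduce jobs over partitioned data rather than in memory.

from collections import Counter

K = 2
jobs = ["engineer", "engineer", "lawyer", "engineer", "lawyer", "doctor"]

# Current generalization: every record is published as "professional".
current = {j: "professional" for j in set(jobs)}

def is_k_anonymous(mapping, data, k):
    # Every quasi-identifier group must contain at least k records.
    counts = Counter(mapping[j] for j in data)
    return all(c >= k for c in counts.values())

# Candidate specialization: publish each job at its concrete value.
candidate = {j: j for j in set(jobs)}

print(is_k_anonymous(current, jobs, K))      # True: one group of six
print(is_k_anonymous(candidate, jobs, K))    # False: "doctor" has one record

TDS proceeds from the most general values downward, accepting a specialization only while the k-anonymity test still holds.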
5. Dynamic Optimization of Multiattribute Resource Allocation in Self-Organizing Clouds
By leveraging virtual machine (VM) technology, which provides performance and fault isolation, cloud resources can be provisioned on demand in a fine-grained, multiplexed manner rather than in monolithic pieces. By integrating volunteer computing into cloud architectures, we envision a gigantic self-organizing cloud (SOC) being formed to reap the huge potential of untapped commodity computing power over the Internet. Toward this new architecture, in which each participant may autonomously act as both resource consumer and provider, we propose a fully distributed, VM-multiplexing resource allocation scheme to manage decentralized resources. Our approach not only achieves maximized resource utilization using the proportional share model (PSM), but also delivers provably and adaptively optimal execution efficiency. We also design a novel multiattribute range query protocol for locating qualified nodes. Contrary to existing solutions, which often generate bulky messages per request, our protocol produces only one lightweight query message per task on the Content-Addressable Network (CAN). It works effectively to find for each task its qualified resources under a randomized policy that mitigates contention among requesters. We show that the SOC with our optimized algorithms can improve system throughput by 15-60 percent compared with a P2P Grid model. Our solution also exhibits fairly high adaptability in a dynamic node-churning environment.
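The allocation principle behind PSM can be shown in a few lines of Python: a node's divisible capacity is split among co-located tasks in proportion to their weights. The capacity and weights below are illustrative, and this sketch omits the paper's multiattribute range queries and adaptive optimization.

def proportional_share(capacity, weights):
    # Each task receives capacity proportional to its weight.
    total = sum(weights.values())
    return {task: capacity * w / total for task, w in weights.items()}

node_cpu = 8.0                               # cores on this node (assumed)
weights = {"taskA": 3, "taskB": 1, "taskC": 4}
print(proportional_share(node_cpu, weights))
# {'taskA': 3.0, 'taskB': 1.0, 'taskC': 4.0}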
6. Scalable and Secure Sharing of Personal Health Records in Cloud Computing Using Attribute-Based Encryption
The personal health record (PHR) is an emerging patient-centric model of health information exchange, which is often outsourced to be stored at a third party, such as cloud providers. However, there have been wide privacy concerns, as personal health information could be exposed to those third-party servers and to unauthorized parties. To assure patients' control over access to their own PHRs, encrypting the PHRs before outsourcing is a promising method. Yet, issues such as risks of privacy exposure, scalability in key management, flexible access, and efficient user revocation have remained the most important challenges toward achieving fine-grained, cryptographically enforced data access control. In this paper, we propose a novel patient-centric framework and a suite of mechanisms for data access control to PHRs stored in semitrusted servers. To achieve fine-grained and scalable data access control for PHRs, we leverage attribute-based encryption (ABE) techniques to encrypt each patient's PHR file. Different from previous works in secure data outsourcing, we focus on the multiple-data-owner scenario and divide the users in the PHR system into multiple security domains, which greatly reduces the key management complexity for owners and users. A high degree of patient privacy is guaranteed simultaneously by exploiting multiauthority ABE. Our scheme also enables dynamic modification of access policies or file attributes, and supports efficient on-demand user/attribute revocation and break-glass access under emergency scenarios. Extensive analytical and experimental results are presented, which show the security, scalability, and efficiency of our proposed scheme.
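Conceptually, ABE-protected PHR sharing gates a per-file data key behind an attribute policy. The Python sketch below imitates that gate with a plain set check and a toy XOR cipher; real ABE enforces the gate cryptographically, and every name, policy, and attribute here is our own illustration.

import os

def encrypt_phr(plaintext, required_attrs):
    data_key = os.urandom(16)                # per-file symmetric key
    ciphertext = bytes(b ^ data_key[i % 16]  # toy XOR stand-in for AES
                       for i, b in enumerate(plaintext))
    return {"policy": required_attrs, "key": data_key, "ct": ciphertext}

def decrypt_phr(record, user_attrs):
    if not record["policy"] <= user_attrs:   # in ABE, key derivation fails
        raise PermissionError("attributes do not satisfy policy")
    return bytes(b ^ record["key"][i % 16]
                 for i, b in enumerate(record["ct"]))

rec = encrypt_phr(b"blood pressure log", {"physician", "hospitalX"})
print(decrypt_phr(rec, {"physician", "hospitalX", "cardiology"}))

Note that the key travels with the record here only for brevity; in an actual ABE deployment the key material is never stored alongside the ciphertext in recoverable form.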
7. On Data Staging Algorithms for Shared Data Accesses in Clouds
In this paper, we study strategies for efficiently achieving data staging and caching on a set of vantage sites in a cloud system at minimum cost. Unlike traditional research, we do not intend to identify access patterns to facilitate future requests. Instead, with such information presumably known in advance, our goal is to efficiently stage the shared data items to predetermined sites at advocated time instants to align with the patterns, while minimizing the monetary costs of caching and transmitting the requested data items. To this end, we follow the cost and network models in [1] and extend the analysis to multiple data items, each with single or multiple copies. Our results show that under a homogeneous cost model, when the ratio of transmission cost to caching cost is low, a single copy of each data item can efficiently serve all the user requests. In the multicopy situation, we also consider the tradeoff between transmission cost and caching cost by controlling the upper bounds on transmissions and copies. The upper bound can be given either on a per-item basis or on an all-item basis. We present efficient optimal solutions based on dynamic programming techniques for all these cases, provided that the upper bound is polynomially bounded by the number of service requests and the number of distinct data items. In addition to the homogeneous cost model, we also briefly discuss this problem under a heterogeneous cost model with some simple yet practical restrictions, and present a 2-approximation algorithm for the general case. We validate our findings by implementing a data staging solver and conducting extensive simulation studies on the behaviors of the algorithms.
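To illustrate the caching-versus-transmission tradeoff in its simplest form, here is a Python sketch of a restricted special case (one item, one site, homogeneous costs), not the paper's general dynamic program: between consecutive requests we either keep the item cached, paying a cost proportional to the gap, or evict it and pay the transmission cost again at the next request, and picking the cheaper option per gap is optimal in this restricted setting. The request times and cost parameters are assumed.

def min_staging_cost(request_times, cache_rate, transfer_cost):
    total = transfer_cost                    # initial staging of the item
    for prev, nxt in zip(request_times, request_times[1:]):
        hold = cache_rate * (nxt - prev)     # keep it cached over the gap
        total += min(hold, transfer_cost)    # vs. evict and re-fetch later
    return total

times = [0, 2, 3, 10, 11]                    # request instants (assumed)
print(min_staging_cost(times, cache_rate=1.0, transfer_cost=4.0))   # 12.0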