What problems is Apache Ranger solving and how is that benefiting you?
Centralized policy management for a production Hadoop ecosystem.
Ranger provides dynamic data masking (on data in motion) for several frameworks of the Hadoop stack (e.g., HBase, Storm, Knox, Solr, Kafka, and YARN). Our recently published paper in Future Generation Computer Systems (FGCS, a Q1 journal), https://www.sciencedirect.com/science/article/pii/S0167739X19315948, used the Ranger admin console to set and modify policies (policy enforcement). In that research, we defined a reference architecture for big data systems that uses Apache Ranger and Hadoop ACLs to manage repository policies. Ranger verifies fine-grained client access control, i.e., which HBase/Hive databases, tables, and columns a client can access, which Kafka topics, and what level of HDFS access. The ACLs verify access control for the remaining entities. Ranger policies take priority over ACLs: if no Ranger policy exists for a resource, the local ACL takes effect. Hadoop daemon authentication and internal communication (such as task status) rely primarily on Kerberos principals and keytab files and are enforced through Hadoop core access control, i.e., ACLs.
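The precedence rule described above (a Ranger policy decides when one exists, otherwise the local ACL applies) can be sketched in a few lines of Python. This is only an illustration of the decision order, not Ranger's actual plugin code: the `Request` type, the `RANGER_POLICIES` and `LOCAL_ACLS` dictionaries, and the resource strings are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Request:
    user: str
    resource: str  # hypothetical naming, e.g. "hive:sales.orders.amount"
    action: str    # e.g. "select", "read"


# Hypothetical in-memory stand-ins for the two policy sources.
# Keys are (resource, action); values are the users allowed.
RANGER_POLICIES = {
    ("hive:sales.orders.amount", "select"): {"analyst"},
}
LOCAL_ACLS = {
    ("hive:sales.orders.amount", "select"): {"analyst", "etl"},
    ("hdfs:/data/raw", "read"): {"analyst", "etl"},
}


def ranger_decision(req: Request) -> Optional[bool]:
    """Return True/False if a Ranger policy covers the request, else None."""
    allowed = RANGER_POLICIES.get((req.resource, req.action))
    if allowed is None:
        return None  # no Ranger policy exists for this resource/action
    return req.user in allowed


def acl_decision(req: Request) -> bool:
    """Fallback check against the local ACL (deny if nothing is listed)."""
    return req.user in LOCAL_ACLS.get((req.resource, req.action), set())


def is_allowed(req: Request) -> bool:
    """Ranger takes priority; the local ACL only applies without a policy."""
    decision = ranger_decision(req)
    if decision is not None:
        return decision
    return acl_decision(req)


if __name__ == "__main__":
    # Covered by a Ranger policy that excludes "etl", so the ACL is ignored.
    print(is_allowed(Request("etl", "hive:sales.orders.amount", "select")))
    # No Ranger policy for this path, so the local ACL decides.
    print(is_allowed(Request("etl", "hdfs:/data/raw", "read")))
```

In this sketch the user `etl` is listed in the ACL for the Hive column but is still denied, because the Ranger policy for that resource wins; the HDFS path has no Ranger policy, so the ACL grants access.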
Another use case was employing the Ranger Audit Server in a Hadoop federation configuration. Our proposed big data federation access broker aggregates all access logs into a centralized repository (RDBMS, HDFS, or Log4j). We demonstrated how to use Ranger and other frameworks for audit management and analysis. In sum, Apache Ranger provides centralized Hadoop security administration and management, while Knox streamlines security for services and external users who access the cluster's data and execute jobs.
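As a rough illustration of the log aggregation idea, the sketch below merges per-cluster access logs into one time-ordered list and runs a simple analysis on it. It is a minimal sketch under stated assumptions: the `AuditEvent` fields, the cluster names, and the `aggregate` and `denied_by_user` helpers are hypothetical and do not reflect Ranger's real audit schema or the broker from the paper.

```python
import heapq
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True, order=True)
class AuditEvent:
    # Hypothetical fields; ordering compares timestamp first.
    timestamp: float
    cluster: str
    user: str
    resource: str
    allowed: bool


def aggregate(*streams: Iterable[AuditEvent]) -> List[AuditEvent]:
    """Merge per-cluster logs (each already sorted by time) into one list."""
    return list(heapq.merge(*streams))


def denied_by_user(events: Iterable[AuditEvent]) -> Counter:
    """Count denied access attempts per user across all clusters."""
    return Counter(e.user for e in events if not e.allowed)


if __name__ == "__main__":
    cluster_a = [
        AuditEvent(1.0, "cluster-a", "alice", "hive:db.t1", True),
        AuditEvent(4.0, "cluster-a", "bob", "hdfs:/secure", False),
    ]
    cluster_b = [
        AuditEvent(2.0, "cluster-b", "bob", "kafka:payments", False),
        AuditEvent(3.0, "cluster-b", "alice", "hbase:users", True),
    ]
    central = aggregate(cluster_a, cluster_b)
    for event in central:
        print(event)
    print(denied_by_user(central))
```

`heapq.merge` assumes each input is already sorted, which fits logs that are appended in time order; a real central repository would persist the merged events rather than hold them in memory.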