Cloudera Data Platform Features
Reports (5)
Reports Interface
Reports interface for standard and self-service reports is intuitive and easy to use.
Steps to Answer
Requires a minimal number of steps/clicks to answer a business question.
Graphs and Charts
Offers a variety of attractive graph and chart formats.
Score Cards
Score cards visually track KPIs.
Dashboards
Provides business users an interface to easily design, refine and collaborate on their dashboards
Self Service (6)
Calculated Fields
Using formulas based on existing data elements, users can create and calculate new field values.
Data Column Filtering
Business users have the ability to filter data in a report based on predefined or automodeled parameters.
Data Discovery
Users can drill down and explore data to discover new insights.
Search
Ability to search the global data set to find and discover data.
Collaboration / Workflow
Ability for users to share data and reports they have built within the BI tool and outside the tool through other collaboration platforms.
Automodeling
Tool automatically suggests data types, schemas and hierarchies.
Advanced Analytics (3)
Predictive Analytics
Analyze current and historical trends to make predictions about future events.
Data Visualization
Communicate complex information clearly and effectively through advanced graphical techniques.
Big Data Services
Ability to handle large, complex, and/or siloed data sets.
Building Reports (4)
Data Transformation
Converts data formats of source data into the format required for the reporting system without mistakes.
Data Modeling
Ability to (re)structure data in a manner that allows extracting insights quickly and accurately.
WYSIWYG Report Design
Provides business users an interface to easily design and refine their dashboards and reports. (What You See Is What You Get)
Integration APIs
An application programming interface (API) is a specification for how the application communicates with other software. APIs typically enable integration of data, logic, objects, etc. with other software applications.
Database Features (7)
Storage
Availability
Stability
Scalability
Security
Data Manipulation
Query Language
Model Development (5)
Language Support
Supports programming languages such as Java, C, or Python. Supports front-end languages such as HTML, CSS, and JavaScript
Drag and Drop
Offers the ability for developers to drag and drop pieces of code or algorithms when building models
Pre-Built Algorithms
Provides users with pre-built algorithms for simpler model development
Model Training
Supplies large data sets for training individual models
Feature Engineering
Transforms raw data into features that better represent the underlying problem to the predictive models
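As a rough illustration of the Feature Engineering entry above, the sketch below derives a few model-ready features from one raw record. The record schema (`ordered_at`, `amount`, `items`) and the function name `engineer_features` are hypothetical and are not part of any Cloudera API.

```python
from datetime import datetime


def engineer_features(record: dict) -> dict:
    """Derive model-ready features from one raw order record (hypothetical schema)."""
    ts = datetime.fromisoformat(record["ordered_at"])
    return {
        "hour_of_day": ts.hour,                # time-of-day signal
        "is_weekend": ts.weekday() >= 5,       # Saturday is 5, Sunday is 6
        "amount_per_item": record["amount"] / record["items"],
    }


# Illustrative raw record; the values are made up for this example.
raw = {"ordered_at": "2024-03-09T14:30:00", "amount": 120.0, "items": 4}
features = engineer_features(raw)
print(features)
```

The raw timestamp and totals are rarely useful to a model as-is; the derived hour, weekend flag, and per-item amount express the same information in a form a model can use directly.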
Machine/Deep Learning Services (6)
Computer Vision
Offers image recognition services
Natural Language Processing
Offers natural language processing services
Natural Language Generation
Offers natural language generation services
Artificial Neural Networks
Offers artificial neural networks for users
Natural Language Understanding
Offers natural language understanding services
Deep Learning
Provides deep learning capabilities
Deployment (10)
Managed Service
Manages the intelligent application for the user, reducing the need for infrastructure
Application
Allows users to insert machine learning into operating applications
Scalability
Provides easily scaled machine learning applications and infrastructure
Language Flexibility
Allows users to input models built in a variety of languages.
Framework Flexibility
Allows users to choose the framework or workbench of their preference.
Versioning
Records versioning as models are iterated upon.
Ease of Deployment
Provides a way to quickly and efficiently deploy machine learning models.
Scalability
Offers a way to scale the use of machine learning models across an enterprise.
On-Premise
Provides On-Premise deployment options.
Cloud
Provides Cloud deployment options (private or public cloud, hybrid cloud).
Database (3)
Real-Time Data Collection
Collects, stores, and organizes massive, unstructured data in real time
Data Distribution
Facilitates the dissemination of collected big data throughout parallel computing clusters
Data Lake
Creates a repository to collect and store raw data from sensors, devices, machines, files, etc.
Integrations (2)
Hadoop Integration
Aligns processing and distribution workflows on top of Apache Hadoop
Spark Integration
Aligns processing and distribution workflows on top of Apache Spark
Platform (3)
Machine Scaling
Enables the solution to run on and scale to a large number of machines and systems
Data Preparation
Curates collected data for big data analytics solutions to analyze, manipulate, and model
Spark Integration
Aligns processing and distribution workflows on top of Apache Spark
Processing (2)
Cloud Processing
Moves big data collection and processing to the cloud
Workload Processing
Processes batch, real-time, and streaming data workloads in singular, multi-tenant, or cloud systems
Data Transformation (2)
Real-Time Analytics
Facilitates analysis of high-volume, real-time data.
Data Querying
Allows users to query data through query languages like SQL.
Connectivity (4)
Hadoop Integration
Aligns processing and distribution workflows on top of Apache Hadoop
Spark Integration
Aligns processing and distribution workflows on top of Apache Spark
Multi-Source Analysis
Integrates data from multiple external databases.
Data Lake
Creates a repository to collect and store raw data from sensors, devices, machines, files, etc.
Operations (7)
Data Visualization
Processes data and represents interpretations in a variety of graphic formats.
Data Workflow
Strings together specific functions and datasets to automate analytics iterations.
Governed Discovery
Isolates certain datasets and facilitates management of data access.
Notebooks
Use notebooks for tasks such as creating dashboards with predefined, scheduled queries and visualizations
Metrics
Control model usage and performance in production
Infrastructure management
Deploy mission-critical ML applications where and when you need them
Collaboration
Easily compare experiments—code, hyperparameters, metrics, predictions, dependencies, system metrics, and more—to understand differences in model performance.
Data Governance (3)
User Access Management
Allows administrators to assign role-based user access for specific data sets
Dynamic Data Masking
Hides and masks sensitive data automatically based on user permissions
Data Lineage
Provides historical insights into original data sources and transformations made to data sets
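The Dynamic Data Masking entry above can be pictured with a minimal sketch: sensitive columns are masked at read time unless the caller's role is allowed to see them. The policy table `UNMASKED_ROLES`, the role names, and both function names are assumptions made up for this illustration; a real platform would enforce this inside its own policy engine.

```python
def mask_value(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` characters with asterisks."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


# Hypothetical policy: roles allowed to read each sensitive column unmasked.
UNMASKED_ROLES = {
    "ssn": {"compliance_officer"},
    "email": {"compliance_officer", "support"},
}


def apply_masking(row: dict, role: str) -> dict:
    """Return a copy of `row` with sensitive columns masked for `role`."""
    masked = {}
    for column, value in row.items():
        allowed = UNMASKED_ROLES.get(column)
        if allowed is not None and role not in allowed:
            masked[column] = mask_value(str(value))
        else:
            masked[column] = value  # non-sensitive column or permitted role
    return masked


row = {"name": "Ada", "ssn": "123-45-6789", "email": "ada@example.com"}
analyst_view = apply_masking(row, "analyst")
officer_view = apply_masking(row, "compliance_officer")
print(analyst_view)
print(officer_view)
```

The stored row is never modified; only the view returned to each role differs, which is what makes the masking "dynamic".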
Data Preparation (6)
Search
Offers simple search capabilities to discover specific data sets
Data Quality and Cleansing
Allows users and administrators to easily clean data to maintain quality and integrity
Data Transformation
Converts data formats of source data into the format required for the reporting system without mistakes
Data Modeling
Tools to (re)structure data in a manner that allows extracting insights quickly and accurately
Connectors
Ability to connect the analytics platform with a wide range of connector options for common data sources, including popular enterprise applications.
Data Governance
Connects to enterprise data governance software, or provides integrated data governance features to avoid misuse of data
Collaboration (4)
Commenting
Allows users to comment on data sets to help future users better interact and interpret the data
Profiling and Classification
Permits profiling of data sets for increased organization, both by users and machine learning
Business and Data Glossary
Creates a business glossary for faster understanding by the average business user
Metadata Management
Indexes metadata descriptions for easier searching and enhanced insights
Artificial Intelligence (3)
Machine Learning Recommendations
Automates recommendations for users based on machine learning functionality
Natural Language Query
Offers natural language querying functionality for non-technical users
Automatic Data Cleansing
Cleans data to improve quality via automation
Administration (4)
Data Modeling
Tools to (re)structure data in a manner that allows extracting insights quickly and accurately
Recommendations
Analyzes data to find and recommend the highest value customer segmentations.
Workflow Management
Tools to create and adjust workflows to ensure consistency.
Dashboards and Visualizations
Presents information and analytics in a digestible, intuitive, and visually appealing way.
Compliance (4)
Sensitive Data Compliance
Supports compliance with PII, GDPR, HIPAA, PCI, and other regulatory standards.
Training and Guidelines
Provides guidelines or training related to sensitive data compliance requirements.
Policy Enforcement
Allows administrators to set policies for security and data governance
Compliance Monitoring
Monitors data quality and sends alerts based on violations or misuse
Data Quality (3)
Data Preparation
Curates collected data for big data analytics solutions to analyze, manipulate, and model
Data Distribution
Facilitates the dissemination of collected big data throughout parallel computing clusters
Data Unification
Compile data from across all systems so that users can view relevant information easily.
Management (16)
Reporting
View ETL process data via reports and visualizations like charts and graphs.
Auditing
Record ETL historical data for auditing and potential data correction needs.
Cataloging
Records and organizes all machine learning models that have been deployed across the business.
Monitoring
Tracks the performance and accuracy of machine learning models.
Governing
Provisions users based on authorization to both deploy and iterate upon machine learning models.
Model Registry
Allows users to manage model artifacts and tracks which models are deployed in production.
Data Dictionary
Stores the database metadata, that is, the definitions of data elements, types, relationships, etc.
Data Replication
Creates a copy of the database to maintain consistency and integrity.
Query Language
Allows users to create, update and retrieve data in a database.
Data Modeling
Defines the logical design of the data before building the schemas.
Performance Analysis
Monitors and analyzes critical database attributes such as query performance, user sessions, deadlock details, and system errors, and visualizes them on a custom dashboard.
Business Glossary
Lets users build a glossary of business terms, vocabulary and definitions across multiple tools.
Data Discovery
Provides a built-in integrated data catalog that allows users to easily locate data across multiple sources.
Data Profiling
Monitors and cleanses data with the help of business rules and analytical algorithms.
Reporting and Visualization
Visualizes data flows and lineage to demonstrate compliance, with reports and dashboards available through a single console.
Data Lineage
Provides automated data lineage functionality that gives visibility over the entire data movement journey, from data origination to destination.
Functionality (5)
Extraction
Extract data from the designated source(s) like relational databases, JSON files, and XML files.
Transformation
Cleanse and re-format extracted data to the needed target format.
Loading
Load reformatted data into the target database, data warehouse, or other storage location.
Automation
Schedule ETL processes to run automatically at the needed interval (e.g., daily, weekly, monthly).
Scalability
Capable of scaling processing power up or down based on ETL volume.
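The extraction, transformation, and loading entries above describe the classic ETL sequence. The sketch below walks through it end to end using only the Python standard library: a JSON string stands in for the source, and an in-memory SQLite database stands in for the target. The sample data, table name, and function names are illustrative assumptions, not a Cloudera interface.

```python
import json
import sqlite3

# Illustrative source payload; a real source could be a file, API, or database.
RAW_JSON = (
    '[{"id": 1, "name": " alice ", "signup": "2024-01-05"},'
    ' {"id": 2, "name": "BOB", "signup": "2024-02-10"}]'
)


def extract(source: str) -> list:
    """Extraction: parse records out of the source format."""
    return json.loads(source)


def transform(rows: list) -> list:
    """Transformation: cleanse names and reshape records into target tuples."""
    return [(r["id"], r["name"].strip().title(), r["signup"]) for r in rows]


def load(rows: list, conn: sqlite3.Connection) -> None:
    """Loading: write the reformatted rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, name TEXT, signup TEXT)"
    )
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_JSON)), conn)
loaded = conn.execute("SELECT id, name, signup FROM customers ORDER BY id").fetchall()
print(loaded)
```

Keeping the three stages as separate functions means each can be tested, scheduled, or scaled on its own, which is the point of the Automation and Scalability entries.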
System (1)
Data Ingestion & Wrangling
Gives users the ability to import data from a variety of sources for immediate use
Data Modeling and Blending (3)
Data Querying
Allows users to query data through query languages like SQL.
Data Filtering
Business users have the ability to filter data in a report based on predefined or automodeled parameters.
Data Blending
Allows the user to combine data from multiple sources into a functioning dataset.
Data Management (16)
Data Integration
Consolidate data from various disparate sources in a single unified view
Data Discovery
Understand the state of data, applications, systems and services
Multi-Platform
Manage data across environments (on-premises, cloud, hybrid, and multi-cloud)
Metadata
Provides metadata management and lineage capabilities
Data Integration
Consolidates, cleanses, and normalizes data from multiple disparate sources.
Data Compression
Helps save storage capacity and improves query performance.
Data Quality
Eliminates data inconsistency and duplications ensuring data integrity.
Built-In Data Analytics
Provides SQL-based analytics functions such as time series, pattern matching, and geospatial analytics.
In-Database Machine Learning
Provides built-in capabilities such as machine learning algorithms, data preparation functions, and model evaluation and management.
Data Lake Analytics
Allows querying across data formats like Parquet, ORC, and JSON, and analysis of complex data types on HDFS.
Data Model
Stores data as key-value pairs where key is a unique identifier.
Data Types
Supports multiple data types such as lists, sets, hashes (similar to maps), and sorted sets.
Data Integration
Integrates data and data-related technologies into a single environment.
Metadata
Provides metadata management capabilities.
Self-service
Empowers the user via a self-service capability to manage data workflows.
Automated workflows
Completely automates end-to-end data workflows across the data integration lifecycle.
Analytics (3)
Data Analytics
Supports advanced analytics solutions for better business decision making
Analytics capabilities
Provides a high-performance, flexible analytics platform to support data management and embrace data-driven decision making.
Dashboard visualizations
Collects and displays data integration metrics via a dashboard.
Security (17)
Compliance
Rules and regulations inherited from source systems or defined to secure sensitive data
Governance
Grant or restrict data access and control
Data Protection
Provides built-in backup and disaster recovery.
Data Encryption
Encrypts and transforms data at the database from a readable state into a ciphertext of unreadable characters.
User Access Control
Restricts users' ability to modify data depending on their access level.
Database Locking
Prevents other users and applications from accessing data while it is being updated, to avoid data loss or conflicting updates.
Access Control
Allows permissions to be granted or revoked at the database, schema, or table level.
Encryption
Built-in native encryption with enterprise key management.
Authentication
Provides multi-factor authentication with certificates.
Role-Based Authorization
Provides predefined system roles, privileges, and user-defined roles to users.
Authentication
Allows integration with external security mechanisms such as Kerberos and LDAP authentication.
Encryption
Provides encryption capability for all the data at rest using encryption keys.
Data Governance
Policies, procedures and standards to manage and access data.
Data Security
Restricts data access at the cell level, masks or hides parts of cells, and encrypts data at rest and in motion
Access Control
Authenticates and authorizes individuals to access the data they are allowed to see and use.
Roles Management
Helps identify and manage the roles of owners and stewards of data.
Compliance Management
Helps adhere to data privacy regulations and norms.
Integration (3)
AI/ML Integration
Integrates with data science workflows, machine learning (ML), and artificial intelligence (AI) capabilities.
BI Tool Integration
Integrates with BI tools to transform data into actionable insights.
Data Lake Integration
Provides speed in data processing and capturing unstructured, semi-structured and streaming data.
Performance (6)
Scalability
Manages huge volumes of data and scales up or down on demand.
Disaster Recovery
Provides data recovery functionality to protect and restore data in a database.
Data Concurrency
Allows multi-version concurrency control.
Workload Management
Handles workloads, from single machines to data warehouses or web services with many concurrent users.
Advanced Indexing
Allows users to quickly retrieve data through various types of indexes, such as B-tree and hash indexes.
Query Optimizer
Helps interpret SQL queries and determine the fastest method of execution.
Maintenance (3)
Data Migration
Allows data movement from one database to another.
Backup and Recovery
Provides data backup and recovery functionality to protect and restore a database.
Multi-User Environment
Allows users to access and work on data concurrently, supporting several views of the data.
Support (4)
Text Search
Provides support for international character sets and full text search.
Data Types
Supports multiple data types such as primitive, structured, and document types.
Languages
Supports multiple procedural programming languages such as PL/pgSQL, Perl, and Python.
Operating Systems
Available on multiple operating systems such as Linux, Windows, and macOS.
Management (4)
Data Schema
Organizes data as a set of tables with columns and rows.
Query Language
Allows users to create, update and retrieve data in a database.
ACID-Compliant
Adheres to ACID (atomicity, consistency, isolation, durability), a set of database transaction properties.
Data Replication
Provides log-based and/or trigger-based replication.
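To make the ACID entry above concrete, here is a minimal sketch of atomicity: either both halves of a transfer are committed or neither is. It uses Python's built-in `sqlite3` module purely as a stand-in for any ACID-compliant database; the `accounts` table and the `transfer` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts "
    "(name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move `amount` from src to dst as one transaction; return True on commit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # would overdraw alice, so it is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the failed transfer, the credit to `bob` runs first and is then undone when the debit violates the constraint, so no partial update is ever visible after the transaction ends.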
Availability (3)
Auto Sharding
Implements auto horizontal data partitioning that allows storing data on more than one node to scale out.
Auto Recovery
Restores a database to a correct (consistent) state in the event of a failure.
Data Replication
Copies data across multiple servers through replication architectures such as master-slave or peer-to-peer.
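The horizontal partitioning behind the Auto Sharding entry above can be sketched as hash-based routing: each key is hashed to pick the node that stores it. The example below shows only the routing step, with in-memory dictionaries standing in for nodes. It does not cover rebalancing when nodes are added, and `shard_for` and `ShardedStore` are made-up names for illustration.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash (same key, same shard)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedStore:
    """Toy key-value store that spreads keys across several in-memory shards."""

    def __init__(self, num_shards: int) -> None:
        self.shards = [dict() for _ in range(num_shards)]

    def put(self, key: str, value) -> None:
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str):
        return self.shards[shard_for(key, len(self.shards))].get(key)


store = ShardedStore(4)
for i in range(100):
    store.put(f"user:{i}", i)

print([len(shard) for shard in store.shards])  # how the 100 keys were spread
print(store.get("user:42"))
```

A stable hash such as SHA-256 is used instead of Python's built-in `hash()`, which is randomized per process for strings and would route the same key to different shards across restarts.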
Support (2)
Multi-Model
Provides support to handle structured, semi-structured, and unstructured data with equal effect.
Operating Systems
Available on multiple operating systems such as Linux, Windows, and macOS.
Maintenance (2)
Data Quality Management
Defines, validates, and monitors business rules to safeguard master data readiness.
Policy Management
Allows users to create and review data policies to make them consistent across the organization.
Monitoring and Management (2)
Data Observability
Focuses on monitoring data pipelines, sending alerts, and troubleshooting data issues.
Testing capabilities
Deploys testing capabilities such as report testing, big data testing, cloud data migration testing, ETL and data warehouse testing.
Cloud Deployment (2)
Hybrid cloud support
Supports analytical platforms and data pipelines across complex hybrid environments.
Cloud migration capabilities
Supports migration of components or pipelines to different cloud environments.
Generative AI (3)
AI Text Generation
Allows users to generate text based on a text prompt.
AI Text Summarization
Condenses long documents or text into a brief summary.
AI Text-to-Image
Provides the ability to generate images from a text prompt.
Agentic AI - Machine Learning Data Catalog (5)
Autonomous Task Execution
Capability to perform complex tasks without constant human input
Multi-step Planning
Ability to break down and plan multi-step processes
Cross-system Integration
Works across multiple software systems or databases
Adaptive Learning
Improves performance based on feedback and experience
Decision Making
Makes informed choices based on available data and objectives
Agentic AI - Data Governance (6)
Autonomous Task Execution
Capability to perform complex tasks without constant human input
Multi-step Planning
Ability to break down and plan multi-step processes
Cross-system Integration
Works across multiple software systems or databases
Adaptive Learning
Improves performance based on feedback and experience
Natural Language Interaction
Engages in human-like conversation for task delegation
Decision Making
Makes informed choices based on available data and objectives
Agentic AI - Data Fabric (5)
Autonomous Task Execution
Capability to perform complex tasks without constant human input
Multi-step Planning
Ability to break down and plan multi-step processes
Cross-system Integration
Works across multiple software systems or databases
Adaptive Learning
Improves performance based on feedback and experience
Decision Making
Makes informed choices based on available data and objectives
Agentic AI - DataOps Platforms (5)
Autonomous Task Execution
Capability to perform complex tasks without constant human input
Multi-step Planning
Ability to break down and plan multi-step processes
Cross-system Integration
Works across multiple software systems or databases
Adaptive Learning
Improves performance based on feedback and experience
Decision Making
Makes informed choices based on available data and objectives
Agentic AI - Analytics Platforms (7)
Autonomous Task Execution
Capability to perform complex tasks without constant human input
Multi-step Planning
Ability to break down and plan multi-step processes
Cross-system Integration
Works across multiple software systems or databases
Adaptive Learning
Improves performance based on feedback and experience
Natural Language Interaction
Engages in human-like conversation for task delegation
Proactive Assistance
Anticipates needs and offers suggestions without prompting
Decision Making
Makes informed choices based on available data and objectives
Agentic AI - Data Science and Machine Learning Platforms (7)
Autonomous Task Execution
Capability to perform complex tasks without constant human input
Multi-step Planning
Ability to break down and plan multi-step processes
Cross-system Integration
Works across multiple software systems or databases
Adaptive Learning
Improves performance based on feedback and experience
Natural Language Interaction
Engages in human-like conversation for task delegation
Proactive Assistance
Anticipates needs and offers suggestions without prompting
Decision Making
Makes informed choices based on available data and objectives
Deployment & Integration - Analytics Platforms (4)
No-code Dashboard Builder
Enables non-technical users to build dashboards through intuitive, drag-and-drop interfaces
Report Scheduling and Automation
Enables automated report generation and scheduled delivery to stakeholders
Embedded Analytics and White-labeling
Allows dashboards and analytics to be embedded into external apps with branding flexibility
Data Source Connectivity
Supports integration with major data sources like cloud data warehouses, SQL/NoSQL databases, and SaaS applications
Performance & Scalability - Analytics Platforms (2)
Large Data Handling and Query Speed
Efficiently processes large datasets with minimal lag and ensures high performance under load
Concurrent User Support
Maintains performance and uptime during high traffic from multiple users or teams
Advanced Analytics & Modeling - Analytics Platforms (3)
Data Modeling and Governance
Supports semantic data layers, role-based access controls, and metadata governance
Notebook and Script Integration
Integrates with Jupyter, Python, or R for custom analytics and modeling
Built-in Predictive and Statistical Models
Provides native tools for statistical analysis, forecasting, and trend prediction
Agentic AI Capabilities - Analytics Platforms (4)
Auto-generated Insights and Narratives
Uses AI to generate textual summaries, key takeaways, and data stories from dashboards
Natural Language Queries
Allows users to query data and build reports using conversational or plain language
Proactive KPI Monitoring and Alerts
Detects and notifies users about KPI anomalies or significant metric changes in real time
AI Agents for Analytical Follow-ups
Recommends next questions, analyses, or exploration paths using autonomous AI agents
Personalized Intelligence - Analytics Platforms (3)
Behavioral Learning for Contextual Query Refinement
Learns from historical user interactions to improve and personalize query results over time
Role-based Insight Personalization
Tailors dashboard views and suggestions based on user roles, access levels, and past behavior
Conversational and Prompt-based Analytics
Supports AI-driven exploration via prompts or multi-turn conversations for iterative querying
Technology Glossary Features
View definitions of the features and discover new technology terms.
Data modeling is the process of creating visual representations of information systems to better communicate the connections between data points and structures.
Data manipulation is the process of organizing, modifying, and transforming data to improve accuracy, usability, and analysis across systems and workflows.




