Best Machine Learning Data Catalog Software

    Machine learning data catalogs allow companies to categorize, access, interpret, and collaborate on company data across multiple data sources while maintaining a high level of governance and access management. Artificial intelligence underpins many of their features, enabling functionality such as machine learning recommendations, natural language querying, and dynamic data masking for enhanced security.

    Companies can use machine learning data catalogs to maintain data sets in a single location so that searching for and discovering data is simple for everyday business users and analysts alike. Users can comment on, share, and recommend data sets so that colleagues immediately understand what they are querying. Additionally, IT administrators can set up user provisioning so that unauthorized employees cannot access sensitive data.

    Machine learning data catalogs are most frequently implemented by companies that have multiple data sources, are searching for one source of truth, and are attempting to scale data usage company-wide. These products are generally administered by IT departments, which maintain organization and security, but the data can be accessed by data scientists, analysts, and everyday business users. The data can then be transformed, modeled, and visualized either directly in the machine learning data catalog or through an integration with business intelligence software.

    Not all machine learning data catalogs provide data preparation capabilities; those that do not may require an integration with a business intelligence platform. Additionally, these tools differ from master data management software in their enhanced governance, collaboration, and machine learning functionality.

    To qualify for inclusion in the Machine Learning Data Catalog category, a product must:

    Organize and consolidate data from all company sources in a single repository
    Provide user access management for security and data governance purposes
    Allow business users to search and access the data from within the catalog
    Offer collaboration features around data sets, including categorizing, commenting, and sharing
    Give intelligent recommendations based on machine learning for quicker access to relevant data

    Compare Machine Learning Data Catalog Software

    G2 takes pride in showing unbiased reviews on user satisfaction in our ratings and reports. We do not allow paid placements in any of our ratings, rankings, or reports.
    Results: 37
    4.0 out of 5 (11 reviews)

    Aginity transforms the way world-leading companies compete on analytics. Aginity Amp software creates, catalogs and manages all analytics (analytic logic and data) as assets.

    4.1 out of 5 (16 reviews)
    Entry Level Price: Free

    IBM Watson® Knowledge Catalog is a unified data catalog that can help your data users quickly find, curate, categorize and share data, analytical models and their relationships with other members of your organization. It serves as a single source of truth for data engineers, data stewards, data scientists and business analysts to shop for data they can trust, accelerating the implementation and value of DataOps for your organization. With active policy management, it helps your organization prot

    4.3 out of 5 (17 reviews)

    Alation pioneered the data catalog market and today is leading its evolution into a platform for a broad range of data intelligence solutions including data search & discovery, data governance, data stewardship, analytics, and digital transformation. Thanks to its powerful Behavioral Analysis Engine, inbuilt collaboration capabilities, and open interfaces, Alation combines machine learning with human insight to successfully tackle even the most demanding challenges in data and metadata manag

    4.3 out of 5 (17 reviews)

    Collibra is the Data Intelligence company. We accelerate trusted business outcomes by connecting the right data, insights and algorithms to all Data Citizens. Our cloud-based platform connects IT and the business to build a data-driven culture for the digital enterprise. Global organizations choose Collibra to unlock the value of their data and turn it into a strategic, competitive asset. We have a diverse global footprint, with offices in the U.S., Belgium, Australia, Czech Republic, France, Po

    4.0 out of 5 (33 reviews)

    Denodo provides performance and unified access to the broadest range of enterprise, Big Data, cloud and unstructured sources.

    Oracle Enterprise Metadata Management (OEMM) is a comprehensive metadata management platform. OEMM can harvest and catalog metadata from virtually any metadata provider, including relational, Hadoop, ETL, BI, data modeling, and many more.

    5.0 out of 5 (4 reviews)

    An AI-augmented data quality platform with built-in processes and technologies to improve and monitor data quality and to prepare “ready-to-use” data for reporting, analytics, and MDM solutions. DQLabs was created with the vision of providing a simple way for organizations to handle data quality, governance, curation, and master data management issues effectively. With the use of AI and machine learning, the sophistication of the technology is blended carefully with the art of simplicity.

    Intel® Machine Learning Scaling Library (Intel® MLSL) is a library providing an efficient implementation of communication patterns used in deep learning.

    4.0 out of 5 (1 review)

    At Zaloni, we believe in the unrealized power of data. Our data management software, Arena, provides an augmented catalog that enables self-service data enrichment and consumption. We work with the world's leading companies, delivering exceptional data governance built on an extensible, machine-learning platform that both improves and safeguards enterprises’ data assets.

    5.0 out of 5 (1 review)

    Cloudera Navigator is a complete data governance solution for Hadoop, offering critical capabilities such as data discovery, continuous optimization, audit, lineage, metadata management, and policy enforcement. As part of Cloudera Enterprise, Cloudera Navigator enables high-performance agile analytics, supports continuous data architecture optimization, and helps meet regulatory compliance requirements.

    5.0 out of 5 (1 review)

    Each entry in the dataset consists of a unique MP3 and a corresponding text file. Many of the 1,368 recorded hours in the dataset also include demographic metadata such as age, sex, and accent that can help improve the accuracy of speech recognition engines. The dataset currently consists of 1,087 validated hours in 18 languages, but we're always adding more voices and languages.

    4.5 out of 5 (1 review)

    The data catalog powered by a knowledge graph. It maps your data to familiar and consistent business concepts so your people get clear, accurate, fast answers to any business question.

    4.0 out of 5 (2 reviews)

    A fully managed and highly scalable data discovery and metadata management service.

    3.5 out of 5 (1 review)

    Data Catalog automatically crawls, profiles, organizes, links, and enriches all your metadata. Up to 80% of the information associated with the data is documented automatically and kept up-to-date through smart relationships and machine learning, continually delivering the most meaningful data to the user.

    3.8 out of 5 (3 reviews)

    Unifi is a single data interface for the enterprise.

    0 ratings

    Alex is a Metadata Management Platform designed to empower everyone to securely find, understand, protect, and ethically use the world’s data. It is recognised by Gartner as a Leader in the Magic Quadrant for Metadata Management Solutions. Alex Solutions is an Australian-based start-up that is bringing innovation and disruptive ideas to the way organisations manage and leverage their information assets.

    0 ratings

    Altair Knowledge Hub is an enterprise data prep solution that empowers individuals and organizations to intelligently tap into more data to drive faster insight and better value. Knowledge Hub provides clear lineage, evidence of integrity, and organizational governance controls as well as cross-team sharing and collaboration in a centralized marketplace where users can publish their output to any analytics or reporting platform.

    0 ratings

    A Semantic Layer for the Enterprise. Enabling Connected Data Access and Analytics on Demand. Anzo Smart Data Lake (ASDL) connects to both internal and external data sources, including cloud or on-premises Hadoop-based data lakes, to rapidly ingest and catalog large volumes of structured and unstructured data. It does so through horizontally scaled, automated Extract, Transform and Load (ETL) processes that can be mapped to establish a Semantic Layer of business meaning.

    0 ratings

    Appen collects and labels images, text, speech, audio, video, and other data to create training data used to build and continuously improve the world’s most innovative artificial intelligence systems. We offer a state of the art, licensable data annotation platform to annotate training data use cases in computer vision and natural language processing. Our platform enhances accuracy and efficiency through our Smart Labeling and Pre-Labeling features which use Machine Learning to ease human anno

    0 ratings

    Atlan is a Modern Data Workspace with the vision to enable data democratization within organizations, while maintaining the highest standards of governance and security. The diverse users of today’s modern data team, ranging from data engineers to business users, come together to collaborate on Atlan. By enabling data discovery, context sharing, governance, and security, data teams using Atlan are able to free upwards of 30% of their time—replacing manual, repetitive tasks with automation and m

    0 ratings

    BigID’s data intelligence platform enables organizations to know their enterprise data and take action for privacy, protection, and perspective. Customers deploy BigID to proactively discover, manage, protect, and get more value from their regulated, sensitive, and personal data across their data landscape. By applying advanced machine learning and deep data insight, BigID transforms data discovery and data intelligence to address data privacy, data security, and data governance challenges acro

    0 ratings

    signal processing, machine learning, and AI to solve real-world business challenges including in financial services

    0 ratings

    Insights are only as valuable as the quality of the data used to construct them. Start by identifying all accessible data using the automated catalog, search, and discovery features in Data360. Translate highly technical metadata into meaningful business information that will benefit everyone – and can be utilized by anyone.

    0 ratings

    This machine-learning-based data catalog lets you classify and organize data assets across cloud, on-premises, and big data environments. It provides maximum value and reuse of data across your enterprise.

    0 ratings

    erwin Data Catalog (erwin DC) is metadata management software that helps organizations learn what data they have and where it’s located, including data at rest and in motion.

    0 ratings

    Immuta is the fastest way for algorithm-driven enterprises to accelerate the development and control of machine learning and advanced analytics. The company's hyperscale data management platform provides data scientists with rapid, personalized data access to dramatically improve the creation, deployment and auditability of machine learning and AI.

    0 ratings

    Data3Sixty facilitates answers to fundamental questions about data, such as source, use, meaning, ownership, and quality through a robust suite of governance solutions, including business glossary, data dictionary, data catalog, data lineage, and metadata management. Customizable dashboards and zero-code workflows ensure users can quickly and easily leverage data to maximum advantage.

    A machine-learning-based data catalog for classifying and organizing data assets across cloud, on-premises, and big data environments. It provides maximum value and reuse of data across the enterprise.

    4.5 out of 5 (2 reviews)

    PoolParty Semantic Suite is the most complete and advanced semantic middleware platform on the global market. It uses innovative means to help organizations build and manage enterprise knowledge graphs as a basis for their AI strategy. Text Mining and NLP: PoolParty Semantic Suite uses advanced text mining algorithms and Natural Language Processing (NLP) to automatically extract relevant entities, terms and other metadata from text and documents. An Enterprise Knowledge Graph provides additiona

    4.0 out of 5 (1 review)

    Qlik Data Catalyst accelerates the transition towards modern data management by providing essential capabilities in four areas.

    G2 Grid® for Machine Learning Data Catalog
    Check out the G2 Grid® for the top Machine Learning Data Catalog Software products. G2 scores products and sellers based on reviews gathered from our user community, as well as data aggregated from online sources and social networks. Together, these scores are mapped on our proprietary G2 Grid®, which you can use to compare products, streamline the buying process, and quickly identify the best products based on the experiences of your peers.
    [G2 Grid® chart; labels recovered: High Performers, IBM Watson Knowledge Catalog, Market Presence]