What do you like best about Databricks?
My comments on the Lakehouse are specific to Unity Catalog (UC):
Governance is all about being a "benevolent bad cop" to enterprise audiences! Until the advent of UC, that message could mostly (if not only) be delivered through a stale PowerPoint deck, typically after the governance teams enforced compliance standards, possibly in response to an adverse event such as a data breach. What I have been able to show-and-tell through live DBX UC demos to users at the largest healthcare provider enterprise has captured the rapt attention of the audience! That is my experience. Now, coming to the features that UC offers: Okta integration to rope the identities of any IAM system into UC; APIs to set up access grants and create schema objects; security via RLS/CLM; and, above all, I feel, the cross-workspace access setup, which goes a long way toward seamless and ubiquitous data sharing for LOBs/teams whose data assets sit across several catalogs.
The features allow power users who are skilled in ANSI SQL to run their queries across the three-level namespace (catalog.schema.table) once cross-workspace access is set up. As for the data scientists and citizen data scientists who build ML models, model experiments and their features can be registered centrally in Unity Catalog, which ensures centralized governance of the ensuing endpoints that enable Model Serving.
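To make the three-level namespace point concrete, here is a minimal sketch in Python that only builds SQL strings; it does not connect to a workspace. The catalog, schema, table, and group names (`claims`, `curated`, `encounters`, `lob_analysts`) and the helper functions are hypothetical, made up for illustration. The sketch assumes that a reader of a table needs USE CATALOG and USE SCHEMA on the parents plus SELECT on the table itself.

```python
def _quote(identifier: str) -> str:
    """Back-quote a single identifier, rejecting empty names and embedded backticks."""
    if not identifier or "`" in identifier:
        raise ValueError(f"invalid identifier: {identifier!r}")
    return f"`{identifier}`"


def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Return the three-level name catalog.schema.table with each part quoted."""
    return ".".join(_quote(part) for part in (catalog, schema, table))


def read_grants(catalog: str, schema: str, table: str, principal: str) -> list[str]:
    """Build the GRANT statements assumed necessary for read access to one table."""
    who = _quote(principal)
    return [
        f"GRANT USE CATALOG ON CATALOG {_quote(catalog)} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {_quote(catalog)}.{_quote(schema)} TO {who}",
        f"GRANT SELECT ON TABLE {qualified_name(catalog, schema, table)} TO {who}",
    ]


if __name__ == "__main__":
    # Hypothetical names; substitute your own catalog, schema, table, and group.
    for statement in read_grants("claims", "curated", "encounters", "lob_analysts"):
        print(statement)
    # Once the grants are in place, a power user queries with the full name.
    print(f"SELECT COUNT(*) FROM {qualified_name('claims', 'curated', 'encounters')}")
```

Generating the statements from one helper keeps the grants and the queries pointed at the same fully qualified name, which is where hand-written scripts tend to drift.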
The future release of ABAC (as opposed to RBAC) could deliver compute/cluster economies of scale and scope from a cost perspective, while making sensitive data masking and tagging at the DDL level seamless.
Another eagerly anticipated feature is automated sensitive data identification and tagging, via the Okera integration, of all "DBx registered Data Assets in DBx Catalogs".
The use of service principals as identities opens up scope to intelligently manage the limit on the number of AD groups/global groups that can be created.
These are my current observations. Review collected by and hosted on G2.com.
What do you dislike about Databricks?
This is not a "poke in the eye" of the hard-working Solutions Engineers who face us, the clients, and the music, but...
1. The Product Engineering teams do not appear to have digested the governance narratives that enterprises expect out of the box, rather than having to wait for a product release.
2. Spark-engine-centric DBx computes/workspaces will see heavy legacy SQL code with all its fun (hard coding, nested sub-queries, temp table use, CTAS, et al.), yet the Product Engineering teams do not appear to have people with that background at the "Product Design" phase. Ditto, even more so, for point #1.
3. The publicly available documentation pertaining to features appears to be stale when compared with the features being released.
4. The commitment to deliver a feature (for example, ABAC) on the set date has slipped across several quarters over close to two years! When you promise to solve world hunger and keep moving the goalposts, credibility is impaired.