Its performance has improved significantly with the Databricks Input and Output module. With better support for Spark, it combines well with Microsoft Azure and Amazon AWS. Version 5 has faster execution and faster read and write operations.
A few schema-related queries are still on the slower side, considering the size of the data clusters and the processing they involve.
It runs on clusters of machines managed by Databricks, which gives us the assurance that data is managed in a distributed manner. It includes Spark and adds a number of components and updates to perform big data analytics and data processing. Its parallel processing on RDDs is amazing.