Standard compute requirements and limitations

This page lists the requirements and limitations for standard compute. If you are using classic compute, Databricks recommends using standard access mode unless your workload depends on a feature that is listed below as unsupported.

Important

Init scripts and libraries have different support across access modes and Databricks Runtime versions. See Where can init scripts be installed? and Compute-scoped libraries.

Current standard compute limitations

The following sections list limitations for standard compute based on the most recent Databricks Runtime version. For limitations that apply to older Databricks Runtime versions, see Runtime-dependent limitations.

If your workload requires a feature that is listed as unsupported, use dedicated compute instead.

General standard compute limitations

Databricks Runtime for ML is not supported. Instead, install any ML library not bundled with the Databricks Runtime as a compute-scoped library.
GPU-enabled compute is not supported.
Spark-submit job tasks are not supported. Use a JAR task instead.
DBUtils and other clients can only read from cloud storage using an external location.
DBFS root and mounts do not support FUSE.

Language limitations

R is not supported.

Spark API limitations

Spark Context (sc), spark.sparkContext, and sqlContext are not supported for Scala:
- Azure Databricks recommends using the spark variable to interact with the SparkSession instance.
- The following sc functions are also not supported: emptyRDD, range, init_batched_serializer, parallelize, pickleFile, textFile, wholeTextFiles, binaryFiles, binaryRecords, sequenceFile, newAPIHadoopFile, newAPIHadoopRDD, hadoopFile, hadoopRDD, union, runJob, setSystemProperty, uiWebUrl, stop, setJobGroup, setLocalProperty, getConf.
Setting certain Spark configuration properties is not supported. Creating or editing a cluster that sets a restricted property fails with an error. For the complete list, see Spark configuration limitations.
When creating a DataFrame from local data using spark.createDataFrame, row sizes cannot exceed 128MB.
RDD APIs are not supported.
Spark Connect, which is used in more recent versions of Databricks Runtime, defers analysis and name resolution to execution time, which may change the behavior of your code. See Compare Spark Connect to Spark Classic.
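
The row-size limit for spark.createDataFrame can be checked before the call. The following is a minimal sketch and not a Databricks API: the helper name oversized_rows is hypothetical, the limit is assumed to mean 128 * 1024 * 1024 bytes, and the pickled size of a row is used only as a rough proxy for the size that Spark actually serializes.

```python
import pickle
from typing import Any, Iterable, List

# Assumption: "128MB" is interpreted here as 128 * 1024 * 1024 bytes.
MAX_ROW_BYTES = 128 * 1024 * 1024


def oversized_rows(rows: Iterable[Any], limit: int = MAX_ROW_BYTES) -> List[int]:
    """Return the indexes of rows whose pickled size exceeds `limit` bytes.

    The pickled size only approximates the size Spark sees for a row.
    """
    return [i for i, row in enumerate(rows) if len(pickle.dumps(row)) > limit]


# Small demonstration with an artificially low limit of 500 bytes.
sample = [("a", 1), ("x" * 1000, 2)]
print(oversized_rows(sample, limit=500))
```

Rows flagged by a check like this could be split or written to storage and read back as a table instead of being passed as local data.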

UDF limitations

Hive UDFs are not supported. Instead, use UDFs in Unity Catalog.
Scala UDFs cannot be used inside higher-order functions.

Streaming limitations

Note

Some of the listed Kafka options have limited support when used for supported configurations on Azure Databricks. All listed Kafka limitations are valid for both batch and stream processing. See Connect to Apache Kafka.

Working with socket sources is not supported.
The sourceArchiveDir must be in the same external location as the source when you use option("cleanSource", "archive") with a data source managed by Unity Catalog.
For Kafka sources and sinks, the following options are not supported:
- kafka.sasl.client.callback.handler.class
- kafka.sasl.login.callback.handler.class
- kafka.sasl.login.class
- kafka.partition.assignment.strategy
Python foreachBatch does not support ThreadPoolExecutor or multi-threaded execution. Multi-threaded execution may not throw errors but can result in data corruption or inconsistent results.
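
Because Python foreachBatch does not support ThreadPoolExecutor or other multi-threaded execution, work that might otherwise be fanned out across threads can be run one step at a time on the calling thread. The following is a minimal sketch under stated assumptions: the table names in TARGET_TABLES and the function name write_batch_sequentially are hypothetical, and stream_df stands for a streaming DataFrame that already exists.

```python
from typing import Any, Sequence

# Hypothetical target tables, used only for illustration.
TARGET_TABLES: Sequence[str] = ("main.demo.events_raw", "main.demo.events_audit")


def write_batch_sequentially(batch_df: Any, batch_id: int) -> None:
    """Write one micro-batch to each target in turn, without spawning threads."""
    for table in TARGET_TABLES:
        batch_df.write.mode("append").saveAsTable(table)


# Usage, assuming `stream_df` is a streaming DataFrame:
# stream_df.writeStream.foreachBatch(write_batch_sequentially).start()
```

The loop trades some latency for predictable results, since each write finishes before the next one starts.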

Network and file system limitations

Standard compute runs commands as a low-privilege user that cannot access sensitive parts of the file system.
POSIX-style paths (/) for DBFS are not supported.
Only workspace admins and users with ANY FILE permissions can directly interact with files using DBFS.

You cannot connect to the instance metadata service or Azure WireServer.

Environment variable limitations

Only a predefined set of environment variables is available to the Spark engine and init scripts on standard compute. Environment variables that you set on a cluster but that aren't in this set remain available to your user code, including UDFs, but aren't available to the Spark engine or init scripts.

This set includes common configuration, credential, and runtime variables. The following examples are not exhaustive:

Networking and proxies: HTTP_PROXY, HTTPS_PROXY, NO_PROXY
TLS certificates: REQUESTS_CA_BUNDLE, SSL_CERT_FILE
AWS credentials and region: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_REGION, AWS_DEFAULT_REGION
Azure credentials: AZURE_CLIENT_ID, AZURE_CLIENT_SECRET, AZURE_TENANT_ID
Google Cloud credentials: GOOGLE_APPLICATION_CREDENTIALS
Databricks connection: DATABRICKS_HOST, DATABRICKS_TOKEN
Locale and time zone: TZ, LANG, LC_ALL
Parallelism and threading: OMP_NUM_THREADS, OPENBLAS_NUM_THREADS, MKL_NUM_THREADS, NUMEXPR_NUM_THREADS
Observability: DD_API_KEY, DD_SITE, DD_ENV
Unity Catalog defaults: CATALOG, CATALOG_NAME
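
Variables outside the predefined set remain readable from user code. The following is a minimal sketch of that case: MY_APP_MODE is a hypothetical variable name assumed to be set in the cluster configuration, and read_setting is an illustrative helper rather than part of any Databricks API.

```python
import os


def read_setting(name: str, default: str) -> str:
    """Return the value of an environment variable, or `default` if it is unset."""
    return os.environ.get(name, default)


# MY_APP_MODE is a hypothetical variable; "dev" is used when it is not set.
mode = read_setting("MY_APP_MODE", "dev")
print(mode)
```

Supplying a default keeps the code working in places where the variable is not set.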

Scala kernel limitations

The following limitations apply when using the Scala kernel on standard compute:

Certain classes cannot be used in your code if they conflict with the internal almond kernel library, most notably Input. For a list of almond's defined imports, see almond imports.
Logging directly to log4j is not supported.
In the UI, the dataframe schema dropdown is not supported.
If your driver runs out of memory (OOM), the Scala REPL does not terminate.
//connector/sql-aws-connectors:sql-aws-connectors is not in the Scala REPL's Bazel target, so using it results in a ClassNotFoundException.
The Scala kernel is incompatible with SQLImplicits.

Runtime-dependent limitations

The following limitations have been resolved through runtime updates, but might still apply to your workload if you use an older runtime.

Language support

Feature	Required Databricks Runtime version
Scala	13.3 or above
All runtime-bundled Java and Scala libraries available by default	15.4 LTS or above (for 15.3 or below, set `spark.databricks.scala.kernel.fullClasspath.enabled=true`)

Spark API support

Feature	Required Databricks Runtime version
Spark ML	17.0 or above
Python: `SparkContext (sc)`, `spark.sparkContext`, `sqlContext`	14.0 or above
Scala `Dataset` ops: `map`, `mapPartitions`, `foreachPartition`, `flatMap`, `reduce`, `filter`	15.4 LTS or above

UDF support

Feature	Required Databricks Runtime version
`applyInPandas`, `mapInPandas`	14.3 LTS or above
Scala scalar UDFs and Scala UDAFs	14.3 LTS or above
Import modules from Git folders, workspace files, or volumes in PySpark UDFs	14.3 LTS or above
Use custom versions of `grpc`, `pyarrow`, or `protobuf` in PySpark UDFs via notebook- or compute-scoped libraries	14.3 LTS or above
Non-scalar Python and Pandas UDFs, including UDAFs, UDTFs, and Pandas on Spark	14.3 LTS or above
Python scalar UDFs and Pandas UDFs	13.3 LTS or above

Streaming support

Feature	Required Databricks Runtime version
`transformWithStateInPandas`	16.3 or above
`applyInPandasWithState`	14.3 LTS or above
Scala `foreach`	16.1 or above
Scala `foreachBatch` and `flatMapGroupsWithState`	16.2 or above
Scala `from_avro`	14.2 or above
Kafka options `kafka.ssl.truststore.location` and `kafka.ssl.keystore.location` (specified location must be an external location managed by Unity Catalog)	13.3 LTS or above
Scala `StreamingQueryListener`	16.1 or above
Python `StreamingQueryListener` interacting with Unity Catalog-managed objects	14.3 LTS or above

Additionally, for Python, foreachBatch has the following behavior changes on Databricks Runtime 14.0 and above:

print() commands write output to the driver logs.
You cannot access the dbutils.widgets submodule inside the function.
Any files, modules, or objects referenced in the function must be serializable and available on Spark.
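
One way to work within these behavior changes is to read widget values outside the function and pass them in as plain, serializable values. The following is a minimal sketch under stated assumptions: make_batch_handler, the widget name target_table, and the table name main.demo.events are all hypothetical, and stream_df stands for a streaming DataFrame that already exists.

```python
from typing import Any, Callable


def make_batch_handler(target_table: str) -> Callable[[Any, int], None]:
    """Build a foreachBatch function that closes over a plain string value."""

    def handle_batch(batch_df: Any, batch_id: int) -> None:
        # On Databricks Runtime 14.0 and above, this output goes to the driver logs.
        print(f"writing batch {batch_id} to {target_table}")
        batch_df.write.mode("append").saveAsTable(target_table)

    return handle_batch


# In a notebook, read the widget outside the function, for example:
#   target = dbutils.widgets.get("target_table")
target = "main.demo.events"  # hypothetical value standing in for the widget
handler = make_batch_handler(target)

# Usage, assuming `stream_df` is a streaming DataFrame:
# stream_df.writeStream.foreachBatch(handler).start()
```

The closure captures only a string, so nothing in the function refers to dbutils.widgets at run time.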

Network and file system support

Feature	Required Databricks Runtime version
Connections to ports other than 80 and 443	12.2 LTS or above

Spark configuration limitations

On Databricks Runtime 19 and above, you can't set the following Spark configuration properties in standard access mode. Creating or editing a cluster that sets any of these properties, or a property that begins with one of the listed prefixes (shown with a trailing .*), fails with an error. Remove these properties from your cluster configuration, compute policies, and job definitions before upgrading.

Category	Restricted Spark configuration properties
JVM, classpath, and native library options	`spark.driver.extraJavaOptions`, `spark.driver.defaultJavaOptions`, `spark.executor.extraJavaOptions`, `spark.executor.defaultJavaOptions`, `spark.yarn.am.extraJavaOptions`, `spark.yarn.am.defaultJavaOptions`, `spark.driver.extraClassPath`, `spark.executor.extraClassPath`, `spark.driver.extraLibraryPath`, `spark.executor.extraLibraryPath`
Environment variable injection	`spark.executorEnv.*`, `spark.yarn.appMasterEnv.*`, `spark.kubernetes.driverEnv.*`
Libraries, JARs, and files	`spark.jars`, `spark.jars.packages`, `spark.jars.ivy`, `spark.jars.ivySettings`, `spark.jars.repositories`, `spark.files`, `spark.archives`, `spark.submit.pyFiles`, `spark.sql.maven.additionalRemoteRepositories`, `spark.yarn.jars`, `spark.yarn.archive`, `spark.yarn.dist.jars`, `spark.yarn.dist.files`, `spark.yarn.dist.archives`, `spark.yarn.dist.pyFiles`, `spark.yarn.dist.forceDownloadSchemes`
Hive metastore JARs	`spark.sql.hive.metastore.jars`, `spark.sql.hive.metastore.jars.path`
Python and R executables	`spark.pyspark.python`, `spark.pyspark.driver.python`, `spark.r.command`, `spark.r.driver.command`, `spark.r.shell.command`
Kubernetes pod configuration	`spark.kubernetes.container.image`, `spark.kubernetes.driver.container.image`, `spark.kubernetes.executor.container.image`, `spark.kubernetes.driver.podTemplateFile`, `spark.kubernetes.executor.podTemplateFile`, `spark.kubernetes.driver.volumes.*`, `spark.kubernetes.executor.volumes.*`, `spark.kubernetes.driver.secrets.*`, `spark.kubernetes.executor.secrets.*`
Local file system access	`spark.connect.copyFromLocalToFs.allowDestLocal`, `spark.sql.artifact.copyFromLocalToFs.allowDestLocal`
Azure Databricks isolation and access control	`spark.databricks.pyspark.enableProcessIsolation`, `spark.databricks.pyspark.enablePy4JSecurity`, `spark.databricks.pyspark.runAsLowPrivilegeUser`, `spark.databricks.pyspark.enableIptables`, `spark.databricks.pyspark.onlyAllowTrustedFilesystems`, `spark.databricks.pyspark.trustedFilesystems`, `spark.databricks.pyspark.pythonUdfsOnly`, `spark.databricks.acl.dfAclsEnabled`, `spark.databricks.acl.fileAccess.enabled`, `spark.databricks.acl.allowTransformUsing`, `spark.databricks.passthrough.enabled`, `spark.databricks.runtimeConfigAllowlist.enabled`, `spark.databricks.runtimeConfigAllowlist.extraConfs`, `spark.testing.databricks.runtimeConfigAllowlist.enabled`, `spark.databricks.sql.jdbc.enableLakeguardDriver`
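
Before upgrading, a cluster's Spark configuration can be scanned for restricted entries. The following is a minimal sketch, not an official tool: find_restricted is a hypothetical helper, and RESTRICTED_KEYS and RESTRICTED_PREFIXES contain only an illustrative subset of the table above, so a real check would need the full list.

```python
from typing import Dict, List

# Illustrative subset of the exact property names listed in the table.
RESTRICTED_KEYS = {
    "spark.driver.extraJavaOptions",
    "spark.executor.extraJavaOptions",
    "spark.jars",
    "spark.jars.packages",
    "spark.pyspark.python",
    "spark.databricks.passthrough.enabled",
}

# Illustrative subset of the prefix entries, stored without a trailing wildcard.
RESTRICTED_PREFIXES = (
    "spark.executorEnv.",
    "spark.yarn.appMasterEnv.",
    "spark.kubernetes.driverEnv.",
)


def find_restricted(spark_conf: Dict[str, str]) -> List[str]:
    """Return the sorted keys that match a restricted name or prefix."""
    return sorted(
        key
        for key in spark_conf
        if key in RESTRICTED_KEYS or key.startswith(RESTRICTED_PREFIXES)
    )


# Hypothetical cluster configuration used for demonstration.
example_conf = {
    "spark.executorEnv.MY_FLAG": "1",
    "spark.sql.shuffle.partitions": "200",
    "spark.jars": "/Volumes/main/demo/libs/app.jar",
}
print(find_restricted(example_conf))
```

The same function could be run over the spark_conf blocks in compute policies and job definitions, which the paragraph above also asks you to clean up.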

Last updated on 2026-06-26