New
- We are now more permissive when specifying configuration schema, in order to make constructing
configuration schema more concise.
- When specifying the value of scalar inputs in config, one can now specify that value directly
under the key of the input, rather than having to embed it within a value key.
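As an illustration of the scalar input shorthand, the two config fragments in the sketch below are meant to be equivalent. The input name num and the helper expand_scalar_inputs are hypothetical and only model the idea; this is not Dagster's implementation.

```python
# Hypothetical model of the scalar-input shorthand; not Dagster's own code.

def expand_scalar_inputs(solid_config):
    """Rewrite shorthand scalar inputs into the explicit {"value": ...} form."""
    expanded = {}
    for name, spec in solid_config.get("inputs", {}).items():
        if isinstance(spec, dict):
            # Already a dict: assume the explicit form and leave it untouched.
            expanded[name] = spec
        else:
            # Bare scalar: wrap it under the "value" key.
            expanded[name] = {"value": spec}
    return {**solid_config, "inputs": expanded}


# Explicit form: the scalar is embedded under a "value" key.
verbose = {"inputs": {"num": {"value": 5}}}
# Shorthand form: the scalar is given directly under the input's name.
concise = {"inputs": {"num": 5}}

normalized = expand_scalar_inputs(concise)
print(normalized == verbose)
```

The sketch treats any non-dict value as a scalar, which is a simplification chosen for brevity.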
Breaking
- The implementation of SQL-based event log storages has been consolidated,
which has entailed a schema change. If you have event logs stored in a
Postgres- or SQLite-backed event log storage, and you would like to maintain
access to these logs, you should run dagster instance migrate. To check
what event log storages you are using, run dagster instance info.
- Type matches on both sides of an InputMapping or OutputMapping are now enforced.
New
- Dagster is now tested on Python 3.8
- Added the dagster-celery library, which implements a Celery-based engine for parallel pipeline
execution.
- Added the dagster-k8s library, which includes a Helm chart for a simple Dagit installation on a
Kubernetes cluster.
Dagit
- The Explore UI now allows you to render a subset of a large DAG via a new solid
query bar that accepts terms like solid_name+* and +solid_name+. When viewing
very large DAGs, nothing is displayed by default and * produces the original behavior.
- Performance improvements in the Explore UI and config editor for large pipelines.
- The Explore UI now includes a zoom slider that makes it easier to navigate large DAGs.
- Dagit pages now render more gracefully in the presence of inconsistent run storage and event logs.
- Improved handling of GraphQL errors and backend programming errors.
- Minor display improvements.
dagster-aws
- A default prefix is now configurable on APIs that use S3.
- S3 APIs now parametrize region_name and endpoint_url.
dagster-gcp
- A default prefix is now configurable on APIs that use GCS.
dagster-postgres
- Performance improvements for Postgres-backed storages.
dagster-pyspark
- Pyspark sessions may now be configured to be held open after pipeline execution completes, to
enable extended test cases.
dagster-spark
- spark_outputs must now be specified when initializing a SparkSolidDefinition, rather than in
config.
- Added new create_spark_solid helper and new spark_resource.
- Improved EMR implementation.
Bugfix
- Fixed an issue retrieving output values using SolidExecutionResult (e.g., in test) for
dagster-pyspark solids.
- Fixes an issue when expanding composite solids in Dagit.
- Better errors when solid names collide.
- Config mapping in composite solids now works as expected when the composite solid has no top
level config.
- Compute log filenames are now guaranteed not to exceed the POSIX limit of 255 chars.
- Fixes an issue when copying and pasting solid names from Dagit.
- Termination now works as expected in the multiprocessing executor.
- The multiprocessing executor now executes parallel steps in the expected order.
- The multiprocessing executor now correctly handles solid subsets.
- Fixed a bad error condition in dagster_ssh.sftp_solid.
- Fixed a bad error message giving incorrect log level suggestions.
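The changelog does not say how compute log filenames are kept within the 255-character limit. The sketch below shows one possible approach, truncating the name and appending a short digest; the function name bounded_filename and the truncation scheme are assumptions, not Dagster's method.

```python
import hashlib

MAX_FILENAME_LENGTH = 255  # limit cited in the changelog entry


def bounded_filename(name, limit=MAX_FILENAME_LENGTH):
    """Return name unchanged if it fits, else a truncated name ending in a digest."""
    # Note: this counts characters; names with multi-byte characters would
    # need a byte-length check instead.
    if len(name) <= limit:
        return name
    # A short digest of the full name keeps distinct long names distinct.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:8]
    stem, dot, ext = name.rpartition(".")
    if not dot or len(ext) > 16:
        # No usable extension: treat the whole name as the stem.
        stem, ext_part = name, ""
    else:
        # Preserve the extension so the file type stays recognizable.
        ext_part = "." + ext
    suffix = "-" + digest + ext_part
    return stem[: limit - len(suffix)] + suffix


long_name = "step_" * 80 + ".out"
print(len(long_name), len(bounded_filename(long_name)))
```

Appending a digest rather than truncating alone avoids collisions between two long names that share a prefix.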
Documentation
- Minor fixes and improvements.
Thank you
Thank you to all of the community contributors to this release!! In alphabetical order: @cclauss,
@deem0n, @irabinovitch, @pseudoPixels, @Ramshackle-Jamathon, @rparrapy, @yamrzou.
Breaking
- The selector argument to PipelineDefinition has been removed. This API made it possible to
construct a PipelineDefinition in an invalid state. Use PipelineDefinition.build_sub_pipeline
instead.
New
- Added the dagster_prometheus library, which exposes a basic Prometheus resource.
- Dagster Airflow DAGs may now use GCS instead of S3 for storage.
- Expanded interface for schedule management in Dagit.
Dagit
- Performance improvements when loading, displaying, and editing config for large pipelines.
- Smooth scrolling zoom in the explore tab replaces the previous two-step zoom.
- No longer depends on internet fonts to run, allowing fully offline dev.
- Typeahead behavior in search has improved.
- Invocations of composite solids remain visible in the sidebar when the solid is expanded.
- The config schema panel now appears when the config editor is first opened.
- Interface now includes hints for autocompletion in the config editor.
- Improved display of solid inputs and outputs in the explore tab.
- Provides visual feedback while filter results are loading.
- Better handling of pipelines that aren't present in the currently loaded repo.
Bugfix
- Dagster Airflow DAGs previously could crash while handling Python errors in DAG logic.
- Step failures when running Dagster Airflow DAGs were previously not being surfaced as task
failures in Airflow.
- Dagit could previously get into an invalid state when switching pipelines in the context of a
solid subselection.
- frozenlist and frozendict now pass Dagster's parameter type checks for list and dict.
- The GraphQL playground in Dagit is now working again.
Nits
- Dagit now prints its pid when it loads.
- Third-party dependencies have been relaxed to reduce the risk of version conflicts.
- Improvements to docs and example code.
Breaking
- The interface for type checks has changed. Previously the type_check_fn on a custom type was
required to return None (=passed) or else raise Failure (=failed). Now, a type_check_fn may
return True/False to indicate success/failure in the ordinary case, or else return a
TypeCheck. The new success field on TypeCheck now indicates success/failure. This obviates
the need for the typecheck_metadata_fn, which has been removed.
- Executions of individual composite solids (e.g. in test) now produce a
CompositeSolidExecutionResult rather than a SolidExecutionResult.
- dagster.core.storage.sqlite_run_storage.SqliteRunStorage has moved to
dagster.core.storage.runs.SqliteRunStorage. Any persisted dagster.yaml files should be updated
with the new classpath.
- is_secret has been removed from Field. It was not being used to any effect.
- The environmentType and configTypes fields have been removed from the dagster-graphql
Pipeline type. The configDefinition field on SolidDefinition has been renamed to
configField.
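To make the new type check contract concrete, the sketch below models the two return styles described in the first item of this list: a plain True/False, or an object with a success field. TypeCheckStandIn is a stand-in written for this example rather than Dagster's TypeCheck class; its description field, the single-argument check signature, and the helper check_passed are all assumptions.

```python
from collections import namedtuple

# Stand-in for the TypeCheck object mentioned above; only the success field
# comes from the changelog, the description field is an assumption.
TypeCheckStandIn = namedtuple("TypeCheckStandIn", ["success", "description"])


def is_positive_int(value):
    """Ordinary case: report success or failure with a plain bool."""
    return isinstance(value, int) and value > 0


def is_short_string(value):
    """Richer case: return an object carrying a success flag."""
    ok = isinstance(value, str) and len(value) <= 10
    return TypeCheckStandIn(
        success=ok,
        description="expected a string of at most 10 characters",
    )


def check_passed(result):
    """Interpret either return style as a single pass/fail flag."""
    if isinstance(result, bool):
        return result
    if isinstance(result, TypeCheckStandIn):
        return result.success
    raise TypeError("type check must return a bool or a TypeCheckStandIn")


print(check_passed(is_positive_int(3)), check_passed(is_short_string("x" * 11)))
```

Under the previous contract the same checks would have had to return None on success and raise on failure, so callers could not distinguish a failed check from an error inside the check function.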
Bugfix
- PresetDefinition.from_files is now guaranteed to give identical results across all Python
minor versions.
- Nested composite solids with no config, but with config mapping functions, now behave as expected.
- The dagster-airflow DagsterKubernetesPodOperator has been fixed.
- Dagit is more robust to changes in repositories.
- Improvements to Dagit interface.
New
- dagster_pyspark now supports remote execution on EMR with the @pyspark_solid decorator.
Nits
- Documentation has been improved.
- The top level config field features in the dagster.yaml will no longer have any effect.
- Third-party dependencies have been relaxed to reduce the risk of version conflicts.
- Scheduler errors are now visible in Dagit
- Run termination button no longer persists past execution completion
- Fixes run termination for multiprocess execution
- Fixes run termination on Windows
- dagit no longer prematurely returns control to terminal on Windows
- raise_on_error is now available on the execute_solid test utility
- check_dagster_type added as a utility to help test type checks on custom types
- Improved support in the type system for Set and Tuple types
- Allow composite solids with config mapping to expose an empty config schema
- Simplified graphql API arguments to single-step re-execution to use retryRunId, stepKeys
execution parameters instead of a reexecutionConfig input object
- Fixes missing step-level stdout/stderr from dagster CLI
Adds a type_check parameter to PythonObjectType, as_dagster_type, and @as_dagster_type to
enable custom type checks in place of default isinstance checks.
See documentation here:
https://dagster.readthedocs.io/en/latest/sections/learn/tutorial/types.html#custom-type-checks
Improved the type inference experience by automatically wrapping bare python types as dagster
types.
Reworked our tutorial (now with more compelling/scary breakfast cereal examples) and public API
documentation.
See the new tutorial here:
https://dagster.readthedocs.io/en/latest/sections/learn/tutorial/index.html
New solids explorer in Dagit allows you to browse and search for solids used across the
repository.
Enabled solid dependency selection in the Dagit search filter.
- To select a solid and its upstream dependencies, search +{solid_name}.
- To select a solid and its downstream dependents, search {solid_name}+.
- For both, search +{solid_name}+.
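The three search forms above can be modeled as a small graph traversal. The sketch below is illustrative only: the graph, the solid names, and the select helper are hypothetical, and it assumes that dependencies are followed transitively, which the text above does not specify.

```python
# Hypothetical dependency graph: each solid maps to the solids it depends on.
UPSTREAM = {
    "load": [],
    "clean": ["load"],
    "train": ["clean"],
    "report": ["train"],
    "audit": ["load"],
}


def _walk(start, edges):
    """Collect every node reachable from start by following edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def select(query, upstream=UPSTREAM):
    """Resolve +name, name+, +name+ or a bare name to a set of solid names."""
    name = query.strip("+")
    if name not in upstream:
        raise KeyError("unknown solid: " + name)
    # Invert the graph to find downstream dependents.
    downstream = {n: [] for n in upstream}
    for node, deps in upstream.items():
        for dep in deps:
            downstream[dep].append(node)
    selected = {name}
    if query.startswith("+"):
        selected |= _walk(name, upstream)    # assumed transitive
    if query.endswith("+"):
        selected |= _walk(name, downstream)  # assumed transitive
    return selected


print(sorted(select("+clean+")))
```

In this toy graph, searching +train selects train together with clean and load, while clean+ selects clean together with train and report.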
Added a terminate button in Dagit to terminate an active run.
Added an --output flag to dagster-graphql CLI.
Added confirmation step for dagster run wipe and dagster schedule wipe commands (Thanks
@shahvineet98).
Fixed a wrong title in the dagster-snowflake library README (Thanks @Step2Web).