ETL Meets AI: dbt, Airflow, and Databricks All Deepen Their AI Integrations

Biweekly Data & Analytics Digest: Cliffside Chronicle

dbt’s Handy Weighs in on How AI is Disrupting Data Engineering

Tristan Handy’s article explores how AI is reshaping data engineering by automating mundane tasks, cutting down on manual coding, and freeing data professionals to focus on higher-value work. He suggests that as AI-driven tools mature, data engineers will spend less time wrangling pipelines and more time on strategic initiatives that produce real business impact.

One key takeaway is that AI can lower the barrier to entry for less technical team members. Rather than needing a deep background in SQL or Python, teams can use AI-assisted tooling to transform raw data, generate insights, and move faster on projects. This shift places more emphasis on collaboration and domain expertise: if AI handles the technical heavy lifting, skilled data experts become even more essential for strategic direction and data governance.

Handy also warns that AI tools are only as good as the data and systems around them. We need to ensure our existing data pipelines, governance, and models remain robust. By pairing human oversight with AI’s productivity gains, we can get the best of both worlds: efficiency and reliability.

Apache Airflow adoption soars, per Astronomer’s 2025 report

Astronomer (the company behind Astro, a managed Airflow platform) released its State of Airflow 2025 report, highlighting the popular orchestration tool's surging growth and critical role. Monthly Airflow downloads ballooned from roughly 888,000 in late 2020 to over 31 million in November 2024, and the number of organizations running Airflow tripled to 77,000+ over the same period.

Over 90% of surveyed data teams now consider Airflow mission-critical, and the report notes Airflow’s use has expanded beyond ETL into powering machine learning and AI workflows in production, cementing its status as foundational infrastructure in modern data stacks.
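To ground that claim, here is a minimal sketch of the kind of ML workflow Airflow can orchestrate, written with Airflow's TaskFlow API. The DAG, task names, and placeholder logic are our own illustration, not something taken from the report.

```python
# A minimal sketch of an ML-oriented Airflow DAG; all task logic is a placeholder.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2025, 1, 1), catchup=False)
def daily_feature_and_training_pipeline():
    @task
    def extract_events() -> list[dict]:
        # Placeholder for pulling raw events from a source system.
        return [{"user_id": 1, "clicks": 3}, {"user_id": 2, "clicks": 7}]

    @task
    def build_features(events: list[dict]) -> list[dict]:
        # Placeholder feature-engineering step.
        return [{**e, "clicked": e["clicks"] > 0} for e in events]

    @task
    def train_model(features: list[dict]) -> None:
        # Placeholder training step; in practice this might hand off to
        # MLflow, a Databricks job, or another training service.
        print(f"Training on {len(features)} rows")

    train_model(build_features(extract_events()))


daily_feature_and_training_pipeline()
```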

dbt Labs unveils next-gen engine and AI assistant

dbt Labs announced several enhancements aimed at speeding up analytics engineering workflows. The company is launching a next-generation dbt engine (integrating technology from its SDF Labs acquisition) and an official Visual Studio Code extension, promising dramatically faster project parsing and iterative development. dbt Labs also announced the general availability of dbt Copilot, an AI assistant that auto-generates code, tests, and documentation, and previewed dbt Core 1.10, which adds new features (such as "sample mode" and stricter validations) to help teams move quickly without sacrificing data quality.

Airbyte adds unstructured data and Iceberg support in latest release

Open-source data integration platform Airbyte launched a Winter 2025 update with new connectors to help enterprises leverage their data for AI while preserving data sovereignty. The release includes connectors for unstructured data sources such as Google Drive and SharePoint, plus compatibility with Apache Iceberg, an open table format for large analytic datasets. Airbyte is also offering a new enterprise connector bundle (covering systems like NetSuite, SAP, ServiceNow, and Workday) to extract key business data, enabling companies to run AI and analytics on their own infrastructure without exposing sensitive data to third-party SaaS tools.
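To make the Iceberg piece concrete, here is a minimal sketch of reading an Iceberg table with PyIceberg after a tool like Airbyte has landed data in it. This is a downstream reader, not Airbyte's own API, and the catalog settings and table name are hypothetical placeholders.

```python
# A minimal sketch of querying an Iceberg table downstream of a sync.
# The catalog endpoint, warehouse path, and table name are assumptions.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "default",
    **{
        "uri": "http://localhost:8181",           # REST catalog endpoint (assumed)
        "warehouse": "s3://my-bucket/warehouse",   # object-store warehouse path (assumed)
    },
)

# Load a hypothetical table that an integration job has been writing to.
table = catalog.load_table("analytics.raw_invoices")

# Scan a slice of the table into pandas for analysis or model training.
df = table.scan(limit=1000).to_pandas()
print(df.head())
```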

Blog Spotlight: Data Visualization With Databricks

This article dives into storytelling with data in Databricks, exploring how to move beyond static visuals to create engaging, narrative-driven insights. Whether it’s using interactive dashboards, dynamic filtering, or tailored annotations, the goal is to help decision makers understand the “why” behind the numbers.
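To make "tailored annotations" concrete, below is a minimal sketch from a hypothetical Databricks notebook (where a SparkSession named spark is predefined). The sales.orders table and the explanation attached to the peak are invented for illustration.

```python
# A minimal sketch of annotating a chart so it tells the "why" behind a spike.
# Assumes a Databricks notebook; the table name and annotation are hypothetical.
import matplotlib.pyplot as plt

# Aggregate monthly revenue from a hypothetical sales table.
monthly = (
    spark.table("sales.orders")
    .groupBy("order_month")
    .sum("revenue")
    .withColumnRenamed("sum(revenue)", "revenue")
    .orderBy("order_month")
    .toPandas()
)

fig, ax = plt.subplots()
ax.plot(monthly["order_month"], monthly["revenue"], marker="o")

# A narrative annotation explaining the peak, rather than leaving readers
# to guess from the raw line.
peak = monthly.loc[monthly["revenue"].idxmax()]
ax.annotate(
    "Holiday campaign launch",
    xy=(peak["order_month"], peak["revenue"]),
    xytext=(0, 15),
    textcoords="offset points",
    ha="center",
)

ax.set_xlabel("Month")
ax.set_ylabel("Revenue")
ax.set_title("Monthly revenue, annotated with the event behind the peak")
plt.show()
```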

What topics interest you most in AI & Data?

We’d love your input to help us better understand your needs and prioritize the topics that matter most to you in future newsletters.


Data are just summaries of thousands of stories – tell a few of those stories to help make the data meaningful.

– Chip and Dan Heath