From Lakehouses and OLTP Convergence to AI Workflows, Snowflake Gen2, and the Decline of Cloud Primitives

Biweekly Data & Analytics Digest: Cliffside Chronicle

From Analytics to Transactions: Databricks Expands Its Ambitions with Neon

Databricks is acquiring Neon, the open-source, Postgres-compatible serverless database startup, for $1 billion. Neon has made waves with its modern take on Postgres: decoupled storage and compute, autoscaling, and cloud-native architecture designed for performance and simplicity. With this move, Databricks isn’t just bolstering its storage layer…it’s stepping directly into the OLTP world, traditionally outside its core strength in analytics and AI. Neon’s open-source DNA also fits neatly with Databricks’ open-core strategy, reinforcing its position against Snowflake’s more closed ecosystem.

This is not just an acqui-hire or a bolt-on feature…this is a strategic realignment of Databricks’ stack. Until now, Databricks thrived in batch-heavy analytics workloads (OLAP), while transactional systems (OLTP) remained the domain of traditional databases and cloud-native Postgres vendors. By bringing Neon in-house, Databricks could enable a unified lakehouse that supports both real-time transactional processing and analytical workloads…all on open standards. That’s a direct shot at Snowflake, which has been expanding into streaming, ML, and unstructured data. We’ve seen this play before: Snowflake added support for Iceberg to court openness; Databricks is doubling down on open Postgres to absorb OLTP.

If executed well, this could collapse the OLTP/OLAP boundary for good and make Databricks the go-to platform not just for data science, but for app backends too.
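
To make the “Postgres-compatible” point concrete, here’s a minimal sketch of the pattern Neon enables: one connection serving both a transactional write and an analytical read. It uses the stock psycopg2 driver with a hypothetical connection string and table…nothing here is a Databricks- or Neon-specific API.

```python
# Because Neon speaks the standard Postgres wire protocol, a stock driver
# works unchanged. Connection string, table, and values are placeholders.
import psycopg2

conn = psycopg2.connect(
    "postgresql://app_user:secret@my-endpoint.example.neon.tech/appdb"
)

with conn:  # commits on success, rolls back on error
    with conn.cursor() as cur:
        # OLTP side: a small, latency-sensitive transactional write
        cur.execute(
            "INSERT INTO orders (customer_id, amount) VALUES (%s, %s)",
            (42, 19.99),
        )
        # OLAP side: an aggregate over the same live tables, no ETL hop
        cur.execute(
            "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
        )
        totals = cur.fetchall()

conn.close()
print(totals)
```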

Snowflake Gen2 Warehouses: Pricing Confusion or Sanity?

Snowflake’s Gen2 Warehouses just dropped, promising automatic elasticity, lower latency, and fewer scaling headaches. The new architecture delivered significantly better query response times (up to 2x faster in some cases), improved memory utilization, and eliminated the notorious “cold start” delay. It even scaled seamlessly under load without manual intervention. That’s a big win for teams burned by Gen1’s clunky auto-scaling and startup lags. However, it’s not a silver bullet: query costs stayed flat, and some workflows saw minimal gain.

This is more than just a backend tweak…it’s Snowflake trying to outmaneuver competitors in the performance arms race. What’s impressive is that Snowflake is finally pushing past the “scale up by default” mindset. Gen2 is built to scale smart, not just big. That’s a subtle but important shift for modern analytics—especially for teams running bursty or variable workloads. We’ve seen too many clients over-provision Gen1 clusters to avoid latency spikes, which just burns budget. If Gen2 truly removes the cold start tax and smooths scaling without tradeoffs, it could change how orgs think about provisioning. But we’re watching closely: there’s still too little transparency into how Snowflake’s “adaptive” logic makes decisions.

While Gen2 Warehouses are a real upgrade, they’re not a revolution. Worth testing if you care about latency-sensitive workloads. But we still want more control and visibility.
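
If you do want to test, here’s a rough harness for that advice: run the same query against a Gen1 and a Gen2 warehouse and compare wall-clock latency, suspending between runs to expose any cold-start penalty. Credentials, warehouse names, and the query are placeholders, and it assumes both warehouses already exist with AUTO_RESUME enabled.

```python
# Crude cold-start comparison using the snowflake-connector-python driver.
import time
import snowflake.connector
from snowflake.connector.errors import ProgrammingError

QUERY = "SELECT COUNT(*) FROM my_db.my_schema.big_fact_table"  # placeholder

def timed_cold_run(warehouse: str) -> float:
    """Suspend the warehouse, then time one query including the resume."""
    conn = snowflake.connector.connect(
        account="my_account",  # placeholder credentials
        user="my_user",
        password="...",
        warehouse=warehouse,
    )
    try:
        cur = conn.cursor()
        try:
            cur.execute(f"ALTER WAREHOUSE {warehouse} SUSPEND")
        except ProgrammingError:
            pass  # already suspended
        start = time.perf_counter()
        cur.execute(QUERY)  # AUTO_RESUME spins the warehouse back up
        cur.fetchall()
        return time.perf_counter() - start
    finally:
        conn.close()

for wh in ("MY_GEN1_WH", "MY_GEN2_WH"):  # placeholders for your warehouses
    print(f"{wh}: {timed_cold_run(wh):.2f}s (resume + query)")
```

Run it a few times and at different times of day…a single measurement proves little about “adaptive” scaling behavior.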

AI vs. Data Purism: Stop Choosing Sides, Start Shipping Value

Joe Reis sparked a lively discussion on LinkedIn this week about a growing fracture in the data world: the AI evangelists vs. the data traditionalists. On one side, folks treat GenAI like it’s a silver bullet. On the other, the same old gatekeepers are still obsessed with perfect data models, flawless lineage, and ceremonies that have yielded… well, not much. The brutal truth? The “craft” of data has failed more often than it’s succeeded. AI won’t fix that alone, but clinging to 2015-era best practices won’t either.

We’ve seen this play out in our own work. Stakeholder patience is wearing thin. Speed beats purity. When you can go from fuzzy stakeholder notes to a working MVP dashboard or model in hours using AI-enhanced workflows, that’s a dopamine loop the business loves. We’re talking GPTs for stakeholder interview summaries, prototype pipelines, instant semantic layer drafts. Are they perfect? No. Are they better than the nothing that comes from 6-month data roadmap paralysis? Absolutely.
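
As a concrete example of that loop, here’s a minimal sketch of the “fuzzy notes to first draft” step using the OpenAI Python SDK. The model name and prompts are illustrative placeholders…the point is the shape of the workflow, not this exact stack.

```python
# Turn raw stakeholder notes into a draft semantic layer for human review.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

notes = """
Sales wants weekly revenue by region, finance keeps asking about churn,
and nobody agrees on what an 'active customer' means.
"""

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; use whatever model you have access to
    messages=[
        {
            "role": "system",
            "content": (
                "You are a data analyst. Turn stakeholder notes into a draft "
                "semantic layer: proposed metrics, dimensions, and open "
                "definition questions."
            ),
        },
        {"role": "user", "content": notes},
    ],
)

draft = response.choices[0].message.content
print(draft)  # a first draft, for a human to review and make real
```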

AI isn’t replacing the craft…it’s refactoring it. Let the AI do the first draft. Let the human make it real. And let’s finally kill the myth that slow = good.

If you’re still debating AI’s role, you’re missing the point. It’s not about choosing sides—it’s about reducing cycle time, improving iteration, and finally delivering wins in a space that’s overdue for some. What’s one tradition you think the data world should let go of?

Is MCP Just Rebranded Complexity?

Raz Polak’s recent blog post dismantles the current buzz around Modern Cloud Primitives (MCPs)…tools like Kafka, Redis, and Airflow…arguing they’ve outlived their usefulness as foundational components for modern data and application stacks. He suggests that MCPs, once hailed as scalable building blocks, now create brittle, over-engineered systems requiring excessive DevOps overhead. With startups increasingly choosing managed APIs (like Vercel, PlanetScale, or Railway) over infrastructure-heavy primitives, the tide is shifting toward productivity, not purity. The post even questions whether the idealization of “composable architectures” has become a liability, not an asset.

This hits close to home for every data and platform leader trying to balance flexibility with focus. We’ve seen too many teams burn cycles wiring up best-in-class primitives only to end up maintaining a distributed Rube Goldberg machine. While MCPs enabled scale at FAANG-tier orgs, they could be overkill for 80% of use cases, especially in mid-market environments. Tools like Airflow or Kafka aren’t evil, but they assume a level of engineering capacity most companies just don’t have or need. We think Raz is right to call out the inertia in tech decision-making. Teams default to “what’s cool” instead of “what delivers value fastest.”

Here’s the real question: Should more data teams rethink their architecture choices as product decisions rather than engineering exercises?

The Data 3.0 Roadmap: Lakehouses, Metadata, and the New Enterprise Stack

Bessemer Venture Partners just shared their “Roadmap for Data 3.0,” and it’s a sharp lens on where data infrastructure is headed. Their core thesis? The lakehouse has won the storage wars, and now the frontier is shifting to metadata, governance, and intelligent orchestration. From the death of the data warehouse vs. data lake debate to the rise of horizontal platforms (e.g., Snowflake, Databricks, BigQuery) acting as the “OS of data,” Bessemer lays out a world where the value moves up the stack, to discovery layers, governance frameworks, and AI-infused ops. They highlight new architectural primitives (like table formats, decoupled compute, and data products) and the emerging role of metadata as a first-class citizen in building scalable, governable, intelligent data systems.

We think this piece is a must-read because it formalizes what many of us have been sensing: Data is no longer about plumbing…it’s about productizing, governing, and activating your data assets. The lakehouse is the default, yes, but what’s really happening is that control planes are taking over the value layer. If you’re a mid-market firm still wrestling with data silos or brittle Airflow pipelines, the roadmap is a wake-up call. Metadata is now your leverage point…not just for discovery, but for AI observability, lineage, cost control, and LLM grounding. We’ve seen this firsthand: clients that invest in metadata (e.g., via Unity Catalog, Atlan, or open standards like OpenMetadata) leapfrog past those obsessing over yet another dbt model. The winners in Data 3.0 won’t just store data better; they’ll orchestrate and trust it at scale.
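
What does “metadata as a first-class citizen” look like in practice? A deliberately tiny, vendor-neutral sketch: every data product carries ownership, lineage, and a freshness contract alongside the data itself. In real systems this lives in a catalog (Unity Catalog, Atlan, OpenMetadata, and the like); the fields below are illustrative, not any product’s schema.

```python
# Toy metadata record for a data product: ownership, lineage, contracts.
from dataclasses import dataclass, field
from datetime import timedelta

@dataclass
class DataProductMetadata:
    name: str
    owner: str                                         # who answers when it breaks
    upstream: list[str] = field(default_factory=list)  # lineage inputs
    freshness_sla: timedelta = timedelta(hours=24)     # a contract, not a hope
    grounding_approved: bool = False                   # safe to feed to an LLM?

orders_daily = DataProductMetadata(
    name="orders_daily",
    owner="analytics-eng@example.com",
    upstream=["raw.orders", "raw.customers"],
    freshness_sla=timedelta(hours=6),
    grounding_approved=True,
)
print(orders_daily)
```

Once records like this exist, observability, cost attribution, and LLM grounding checks become queries over metadata rather than tribal knowledge.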

Upcoming Event: Talking Data with Joe Reis

Join us for a data-driven evening with Joe Reis, renowned data strategist and co-author of “Fundamentals of Data Engineering,” as we network, learn, and unwind in the heart of San Francisco.

What topics interest you most in AI & Data?

We’d love your input to help us better understand your needs and prioritize the topics that matter most to you in future newsletters.

“Figures don’t lie, but liars do figure.”

— Mark Twain