Traditional semantic layer → AI-ready dbt Semantic Layer with MCP integration
dbt moved toward an AI-ready semantic layer architecture integrated with Model Context Protocol (MCP) servers and agent skills for AI-assisted analytics.
Source →Automated tracking of feature launches, pricing changes, partnerships, and architecture shifts across data integration and ingestion companies — updated daily.
Source →Airbyte Agents are now available in ChatGPT, enabling users to create and manage AI agents through natural language.
Source →A unified MCP gateway that simplifies server management, routing, security, and observability across every tool and integration.
Source →AI agents available in ChatGPT enabling users to create, orchestrate, and manage AI agents through natural language.
Source →Airbyte Agents are now available in ChatGPT, allowing users to create and manage AI agents through natural language.
Source →ChatGPT users can now access Airbyte Agents to create, orchestrate, and manage AI agents through natural language.
Source →A framework for shipping autonomous agents in production with faster feedback loops for analytics engineering workflows.
Source →Framework for developing and operationalizing autonomous agents in production with faster feedback loops for analytics engineering.
Source →Prefect, Snowflake, and HNI Corporation are expanding their integration to modernize data orchestration and accelerate AI initiatives on the Snowflake AI Data Cloud.
Source →Prefect, Snowflake, and HNI Corporation expanded their integration to modernize data orchestration and accelerate AI initiatives on the Snowflake AI Data Cloud.
Source →Core feature enabling multimodal content synthesis and processing capabilities.
Source →Support for multimodal synthesis capabilities in core, enabling handling of multimodal types with template variable formatting and prompt templates.
Source →Support for multimodal types and operations in core, including multimodal template var formatting, prompt templates, and chat prompt helpers.
Source →Capability enabling assets to be shared and managed across multiple teams.
Source →Catalog system for managing and organizing workflow assets.
Source →Default result storage capability for workflow executions.
Source →Dashboards for monitoring and visualizing rate limits across workflows.
Source →Configuration plugin system for extending Prefect Cloud capabilities.
Source →Automated management system to handle stuck workflow runs without manual intervention.
Source →UI-based infrastructure debugging capability shipped as part of Prefect's feature releases based on patterns from massive scale operations.
Source →Core feature enabling multimodal synthesis capabilities, supporting basic operations for multimodal types with template variable formatting and chat prompt helpers.
Source →Cross-team assets capability enabling asset sharing and collaboration across organizational teams.
Source →Assets Catalog feature for organizing and managing data assets across workflows.
Source →Default result storage functionality for automatic workflow result persistence.
Source →Rate limit monitoring dashboards to track and manage rate limiting across workflows.
Source →Config plugins capability added to Prefect Cloud for enhanced configuration management.
Source →Managed automations feature for automatically handling stuck workflow runs.
Source →Infrastructure debugging capabilities added to the Prefect Cloud UI for improved observability and troubleshooting.
Source →dbt-core v1.12.0-b1 adds JavaScript UDF support alongside existing SQL capabilities, with overload capability for managing multiple function variants.
Source →dbt-core v1.12.0-b1 introduces extensive support for new semantic layer YAML v2 schema with parsing and processing of semantic models, dimensions, metrics, and derived entities.
Source →Added support for Python 3.14.
Source →Added support for vars.yml to declare project variables and new selector method for resource selection.
Source →Added --sql flag to dbt run-operation for executing ad-hoc SQL/Jinja statements; compiled SQL for snapshots is now written to target/compiled/.
Source →Environment variables can now be loaded from .env file, and dbt seed supports --empty flag to create tables without loading data.
Source →Support for JavaScript user-defined functions with overload capability via overloads block.
Source →Multiple new semantic layer features including support for dimensions, metrics, semantic models, derived entities, and agg_time_dimension in v2 YAML schema.
Source →Support for partial parsing of function nodes, new UnparsedMetricV2 for Semantic Layer metrics, and function arguments with default values.
Source →Enabled partial parsing support specifically for function nodes to improve parsing performance.
Source →Comprehensive implementation of new Semantic Layer YAML v2 schema with support for dimensions, entities, metrics, primary_entity fields, agg_time_dimension, and derived semantic entities.
Source →Write compiled SQL for snapshots to target/compiled/ during dbt compile.
Source →Add unit tests to the Jinja graph object, enabling tools like dbt-project-evaluator to run checks on unit tests.
Source →Implement config.meta_get and config.meta_require for accessing configuration metadata.
Source →Allow jinja suffixed extensions for markdown and sql files.
Source →Environment variables can now be loaded from .env file.
Source →Added 'selector' selector method for more flexible node selection.
Source →Support for overloaded UDFs via overloads block in function YAML entries with per-overload start and result log events.
Source →Added --sql flag to dbt run-operation for executing ad-hoc SQL/Jinja statements.
Source →dbt seed now supports --empty flag to create tables without loading data.
Source →Added directory change instruction after dbt init and execute dbt debug logic after creating a new project.
Source →Added support for vars.yml to declare project variables.
Source →Support for JavaScript user-defined functions in dbt-core.
Source →Partial parsing support for function nodes, new UnparsedMetricV2 for Semantic Layer Metrics, function arguments with default values, and extensive semantic layer YAML v2 implementation including dimensions, entities, and metrics.
Source →The dbt Orchestrator moves from pod-per-task model to shared concurrency primitives, reducing infrastructure overhead and enabling efficient state-aware caching.
Source →dbt integration through Prefect dbt Orchestrator for model-by-model execution with state-aware caching and durable retries.
Source →Executes dbt graph model by model with state-aware caching, durable retries, and shared concurrency primitives, eliminating redundant work and reducing pod-per-task overhead.
Source →dbt-core v1.12.0-b1 adds support for declaring project variables in a separate vars.yml file for improved organization.
Source →dbt-core v1.12.0-b1 introduces support for JavaScript UDFs and allows defining overloaded functions via overloads block in function YAML entries.
Source →dbt-core v1.12.0-b1 adds support for new YAML format for semantic metrics, moving away from legacy SQL-based metric definitions to structured YAML configuration.
Source →Implement config.meta_get and config.meta_require for accessing metadata within configurations.
Source →Add catalogs.yml usage tracking for improved catalog management visibility.
Source →Execute dbt debug logic after creating a new project in dbt init for immediate project validation.
Source →Allow jinja suffixed extensions for markdown and sql files for more flexible templating.
Source →Allow continue running child on parent error to improve workflow resilience.
Source →Add Reused to NodeStatus and RunStatus, and per-overload start and result log events for overloaded UDFs.
Source →Add support for Python 3.14 to enable compatibility with latest Python releases.
Source →Write compiled SQL for snapshots to target/compiled/ during dbt compile for better debugging.
Source →Environment variables can now be loaded from .env file for better configuration management.
Source →Add 'selector' selector method for more flexible model selection during runs.
Source →Add --sql flag to dbt run-operation for executing ad-hoc SQL/Jinja statements.
Source →Support for vars.yml to declare project variables, allowing configuration of variables separately from dbt_project.yml.
Source →Added ability to indicate dbt Model also represents a Semantic Model, enabling dual representation in data lineage.
Source →Support for JavaScript UDFs (User Defined Functions) with overloaded UDFs via overloads block in function YAML entries.
Source →Add unit tests to Jinja graph object, enabling tools like dbt-project-evaluator to run checks on unit tests.
Source →Add semantic model YAML configuration including dimensions, entities, derived metrics, and primary_entity field support for v2 semantic YAML.
Source →Support for partial parsing of function nodes, new-style YAML Semantic Layer Metrics (UnparsedMetricV2), function arguments with default values, and directory change instructions after dbt init.
Source →Prefect dbt Orchestrator integration enables native orchestration of dbt DAGs with state-aware caching and durable retries.
Source →Prefect dbt Orchestrator executes dbt graph models with state-aware caching, durable retries, and shared concurrency primitives, eliminating unnecessary rebuilds and pod-per-task overhead.
Source →Introduction of JavaScript UDFs support with ability to define overloaded UDFs via overloads block in function YAML entries.
Source →dbt-core v1.12.0-b1 adds support for UnparsedMetricV2 and new YAML schema for semantic layer entities with extended functionality including derived dimensions and agg_time_dimension.
Source →Disable unit tests whose model is disabled and support unit testing models that depend on sources with the same name.
Source →Added per-overload start and result log events for overloaded UDFs (LogStartOverload, LogOverloadResult) and Reused to NodeStatus and RunStatus.
Source →Write compiled SQL for snapshots to target/compiled/ during dbt compile and add Python 3.14 support.
Source →Semantic layer enhancements including agg_time_dimension, primary_entity field, derived semantic entities, and derived dimensions parsing for v2 YAML.
Source →Added selector method for selection logic and support for jinja suffixed extensions for markdown and sql files.
Source →Added support for vars.yml to declare project variables; environment variables can now be loaded from a .env file.
Source →Added --empty flag to dbt seed to create tables without loading data and --sql flag to dbt run-operation for executing ad-hoc SQL/Jinja statements.
Source →Support for JavaScript UDFs with overloaded UDFs via overloads block in function YAML entries.
Source →Added config.meta_get and config.meta_require methods for runtime configuration object.
Source →Added directory change instruction after dbt init and ability to indicate dbt Model also represents a Semantic Model.
Source →Support for partial parsing of function nodes, new-style YAML Semantic Layer Metrics (UnparsedMetricV2), and function arguments with default values.
Source →Self-hosted HTTP backend for document parsing that preserves spatial layout of text rather than raw content, supporting PDFs, Word documents, PowerPoints, spreadsheets, and images with structured text and vision model endpoints.
Source →Self-hosted HTTP backend for document parsing that sits between naive extraction tools and expensive cloud APIs, supporting a wide range of formats and providing structured text with position data or page rendering.
Source →LiteParse Server offers both a slim version with no infrastructure dependencies and a fuller setup with caching, rate limiting, and tracing, for flexible deployment.
Source →A self-hosted HTTP backend for document parsing that preserves spatial layout of text with support for PDFs, Word documents, PowerPoints, spreadsheets, and images, featuring structured text with position data and page rendering as images.
Source →Self-hostable HTTP backend for document parsing that preserves spatial layout of text and supports PDFs, Word documents, PowerPoints, spreadsheets, and images with two main endpoints for structured text and image rendering.
Source →Self-hosted HTTP backend for document parsing that preserves spatial layout with support for PDFs, Word documents, PowerPoints, spreadsheets, and images, featuring two endpoints for structured text with position data and page rendering as images.
Source →dbt Labs introduced AI agents and developer agents that automate coding and testing tasks, enabling faster analytics engineering without manual intervention.
Source →An AI-powered coding agent for analytics engineering that is grounded in your dbt project to help ship faster without breaking downstream dependencies.
Source →Introduction of dbt Developer Agent and agentic approaches represents a shift from manual coding to AI-assisted, autonomous analytics engineering workflows.
Source →An AI-powered coding agent for analytics engineering that is now in Preview, enabling developers to automate analytics engineering tasks.
Source →An AI-powered coding agent for analytics engineering now available in Preview that assists with analytics engineering tasks.
Source →An AI coding agent for analytics engineering that is now in preview, enabling automated development assistance.
Source →Generally Available release with disaster recovery, governance, and audit logging capabilities on customer terms.
Source →Enterprise orchestration platform featuring Disaster Recovery, Governance, and Audit Logging capabilities now Generally Available.
Source →Introduces disaster recovery, governance, and audit logging capabilities for Airflow-as-a-service in customer environments, now generally available.
Source →Ship AI agents faster with the Airbyte Agent SDK: skip custom data plumbing and connect agents to reliable, production-ready data pipelines.
Source →One connection gives AI agents secure access to your entire business context through Airbyte MCP—without rebuilding data integrations.
Source →Introducing Airbyte Agents—a new way to give AI agents trusted business context with Airbyte.
Source →Ship AI agents faster with the Airbyte Agent SDK, allowing teams to skip custom data plumbing and connect agents to reliable, production-ready data pipelines.
Source →One connection gives AI agents secure access to entire business context through Airbyte MCP without rebuilding data integrations.
Source →New way to give AI agents trusted business context with Airbyte, enabling agents to autonomously collect, process, analyze, and act on data.
Source →SDK enabling developers to ship AI agents faster by skipping custom data plumbing and connecting agents to production-ready data pipelines.
Source →New feature introducing Airbyte Agents to give AI agents trusted business context with Airbyte.
Source →The only data engineering agent built for Airflow, bringing Astronomer's operational Airflow knowledge directly to engineers doing the work.
Source →The only data engineering agent built for Airflow, purpose-built to bring Astronomer's operational Airflow knowledge directly to engineers doing the work.
Source →The only data engineering agent built for Airflow, bringing Astronomer's operational knowledge directly to engineers doing the work.
Source →Model Context Protocol tools for agentic OCR with Parse, Classify, and Split capabilities, enabling integration with Claude, Cursor, or Copilot for agentic document processing.
Source →Official provider migration for AI and LLM workflows, built on PydanticAI with same decorators and engine, plus 350+ provider hooks as agent tools.
Source →Refactored MCP around agentic document processing with Parse, Classify, and Split tools for integrating LlamaParse with Claude, Cursor, or Copilot.
Source →Refactored Model Context Protocol integration around agentic document processing with Parse, Classify, and Split capabilities for AI agents like Claude, Cursor, and Copilot.
Source →Official provider for AI and LLM workflows in Apache Airflow, built on PydanticAI with durable execution and human review.
Source →Agentic OCR tools for AI agents with Parse, Classify, and Split capabilities, allowing connection to Claude, Cursor, or Copilot for complex document reading.
Source →Agentic OCR tools for AI agents refactored around agentic document processing with Parse, Classify, and Split capabilities, compatible with Claude, Cursor, and Copilot.
Source →Migration from airflow-ai-sdk to apache-airflow-providers-common-ai 0.1.0, Airflow's official provider for AI and LLM workflows built on PydanticAI with improved toolsets.
Source →New Model Context Protocol (MCP) servers for Transform, Catalog, and Quality, with hands-on learning opportunities.
Source →New capabilities for Transform, Catalog, and Quality with hands-on training and demonstrations available.
Source →Transform, Catalog, and Quality MCPs enabling AI-ready data transformations and management capabilities.
Source →Generally Available disaster recovery feature with architecture decisions around database replication and warm standby compute with one-click failover.
Source →Generally Available disaster recovery feature with cross-region failover capabilities, Aurora Global Clusters database replication, and warm standby compute.
Source →Enables cross-region disaster recovery for Astro with Aurora Global Clusters, warm standby compute, and one-click failover, now generally available.
Source →Launched ParseBench, a new leaderboard and benchmark for evaluating document parsers, OCR systems, and agentic document understanding on enterprise files.
Source →LlamaIndex and Kaggle launched ParseBench, a new leaderboard and benchmark for evaluating document parsers and OCR systems for AI agents.
Source →HTTP client support integration for NVIDIA embeddings.
Source →Added HTTP client support to NVIDIA embeddings integration for flexible client configuration.
Source →Delivers significant improvements to Watcher execution mode for running dbt in Airflow, plus restructured documentation built around production usage.
Source →dbt Labs expanded partnership with Google Cloud including 45-day Enterprise free trial for startups and recognition as 2026 Google Cloud Partner of the Year.
Source →Added HTTP client support to NVIDIA embeddings integration for improved connectivity.
Source →Significant improvements to Watcher execution mode for running dbt in Airflow, plus fully restructured documentation built around production usage patterns.
Source →dbt Labs won a 2026 Google Cloud Partner of the Year Award recognizing its collaboration and impact on the Google Cloud platform.
Source →HTTP client support added to NVIDIA embeddings integration.
Source →Addition of HTTP client support to NVIDIA embeddings integration.
Source →Delivers significant improvements to Watcher execution mode for running dbt in Airflow, plus restructured documentation focused on production use cases.
Source →dbt Labs offers 45-day Enterprise free trial via Google Cloud Startup Perks program and won 2026 Google Cloud Partner of the Year Award.
Source →dbt Labs is integrating with Google's Antigravity agentic IDE to enable AI-assisted analytics engineering.
Source →dbt integration with Google's Antigravity agentic IDE, enabling AI agents to work within dbt projects for advanced analytics automation.
Source →Self-service DAG authoring for entire organizations with drag-and-drop no-code interface, allowing non-technical users to create Airflow pipelines without Python knowledge.
Source →dbt Labs partnered with Tableau through MCP integration to enable impact analysis, metric reconciliation, and structured context for reliable AI analytics.
Source →dbt and Tableau integration through MCPs enables structured context for reliable AI analytics, including impact analysis and metric reconciliation.
Source →Self-service DAG authoring tool with drag-and-drop no-code interface allowing anyone to create Airflow pipelines without Python or Airflow knowledge.
Source →Enables self-service DAG authoring with drag-and-drop no-code interface, allowing non-technical users to create Airflow pipelines using templates defined by data engineers.
Source →Integration enabling structured context for reliable AI analytics through Model Context Protocol support, unlocking impact analysis and metric reconciliation.
Source →Connect Unstructured to everything that comes after using webhooks for integration.
Source →Webhooks feature enabling connection of Unstructured to downstream systems and applications.
Source →Webhooks feature to connect Unstructured to downstream applications and services.
Source →Structure-aware PDF QA pipeline built using LanceDB for retrieval and LlamaParse for document parsing.
Source →Partnership on secure document agents, enabling secure authentication and access control for LlamaParse-powered document processing workflows.
Source →Structure-aware PDF QA pipeline integration with LanceDB for improved document question-answering capabilities.
Source →Partnership on secure document agents for agentic document processing with built-in security features.
Source →Prefect Cloud self-serve plans compared to Dagster+ with detailed pricing and feature breakdown for small teams and solo practitioners.
Source →Open-source benchmark for evaluating document OCR and parsing quality for AI agents, containing ~2,000 human-verified enterprise document pages with over 167,000 test rules across five critical dimensions: tables, charts, content faithfulness, semantic formatting, and visual grounding.
Source →First document parsing benchmark for AI agents with ~2,000 human-verified enterprise document pages and 167,000+ test rules, evaluating parsers across five critical dimensions: tables, charts, content faithfulness, semantic formatting, and visual grounding.
Source →The first document parsing benchmark for AI agents with ~2,000 human-verified enterprise document pages and 167,000+ test rules, evaluating parsers across five critical dimensions including tables, charts, content faithfulness, semantic formatting, and visual grounding.
Source →First open-source document parsing benchmark for AI agents with ~2,000 human-verified enterprise document pages and 167,000+ test rules, evaluating parsers across tables, charts, content faithfulness, semantic formatting, and visual grounding.
Source →Open-source benchmark for document parsing and OCR systems with ~2,000 human-verified enterprise document pages and 167,000+ test rules, evaluating parsers across five dimensions: tables, charts, content faithfulness, semantic formatting, and visual grounding.
Source →Documentation updates enable display of unit tests, add support for Saved Query node and UDF lineage graph visualization.
Source →Shift from traditional OCR that transcribes documents to agentic document processing that uses reasoning, self-correction, and adaptive model selection for reliable automation handling layout changes and complex documents.
Source →Add support for UDF (function) resource type in lineage graph visualization.
Source →Enable display of unit tests in dbt-docs and add support for Saved Query node type.
Source →Added @requires.catalogs decorator to compile and test commands to support REST Catalog-Linked database compilation.
Source →Enable display of unit tests and add support for Saved Query node type in dbt-docs.
Source →Major release featuring asset partitions, async tasks, and continued improvements to the Airflow 3 platform.
Source →New Airflow release featuring asset partitions, async tasks, and continued improvements to the Airflow 3 platform.
Source →Introduces asset partitions, async tasks, and continued improvements to the Airflow 3 platform.
Source →New file transformation feature introduced by Unstructured.
Source →A program to recognize and support community leaders within the dbt ecosystem.
Source →New feature introduced for extracting data from documents, part of Unstructured's product offerings.
Source →New feature or product for document extraction capabilities.
Source →GPT-5.4 Mini and Nano variants support added.
Source →MiniMax LLM provider integration with M2.7 default model.
Source →New LLM provider integration with MiniMax M2.7 as default model.
Source →New LLM integration with MiniMax supporting M2.7 model as default.
Source →Support for OpenAI GPT-5 chat model variants including Mini and Nano models.
Source →MiniMax LLM provider integration with M2.7 default model support.
Source →Open-source local document parser with zero Python dependencies, runs entirely locally and uses grid projection algorithm to extract text while preserving spatial structure for tables, columns, and alignment.
Source →Open-source, lightweight, local document parser built from LlamaParse development, runs entirely locally with zero Python dependencies, and is optimized for how agents iterate on documents.
Source →Open-source, lightweight local document parser with zero Python dependencies, runs entirely locally and is built around how agents iterate on documents: parse text fast with fallback to screenshots for visual reasoning.
Source →Open-source lightweight document parser with zero Python dependencies, running entirely locally and built around agentic iteration with fast text parsing and visual reasoning fallback.
Source →Open-source, lightweight document parser running entirely locally with zero Python dependencies, designed for how agents iterate on documents with fast text parsing and screenshot fallback for visual reasoning.
Source →Architectural evolution incorporating visual grounding, bounding boxes, and spatial awareness to handle complex layouts, charts, and images alongside text extraction.
Source →CLI tool for agentic document search, with a full migration to the LlamaParse v2 API and a v3.0.0 major release featuring new capabilities for coding agents.
Source →Enhanced with standalone mode, automatic port management, and direct API access for faster Airflow pipeline development and deployment.
Source →Removed LlamaParse reader integration and updated Llama Cloud managed index API.
Source →Deprecated Python 3.9 support across all packages, moving to Python 3.10 and later versions.
Source →Updated managed index integration and removed LlamaParse reader dependency.
Source →Updated to use Chat API with retry logic and improved error handling.
Source →Support for Gemini 3 with temperature control and improved file API handling.
Source →Multiple enhancements including gpt-5 variants, reasoning content support, and improved tool calling.
Source →New LLM provider integration for ModelsLab with custom model support.
Source →Deprecated Python 3.9 support across all LlamaIndex integrations requiring Python 3.10 or higher.
Source →New LLM provider integration for ModelsLab API.
Source →Enhanced Astro CLI with standalone mode, automatic port management, and direct API access for faster pipeline development and deployment.
Source →New LLM provider integration with ModelsLab offering OpenAI-compatible LLM capabilities.
Source →Build model-agnostic infrastructure for AI agents that works across multiple LLMs, enabling flexibility and scalability.
Source →Build infrastructure that works across multiple LLMs, enabling flexibility, scalability, and future-proof AI systems.
Source →Neo4j graph store enhancement with APOC sample parameter for schema introspection.
Source →ModelsLab LLM provider integration.
Source →New ModelsLab LLM provider integration for accessing ModelsLab API.
Source →Strict per-minute rate limiting enforcer using sliding window algorithm for API rate control.
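For readers unfamiliar with the sliding window technique named in this entry, the following is a minimal sketch of the general idea, not the implementation shipped in the release. The class name `SlidingWindowLimiter`, its parameters, and the injectable `clock` are all hypothetical choices made for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical sketch: allow at most `max_calls` in any rolling window."""

    def __init__(self, max_calls, window_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so behaviour can be checked without sleeping
        self._stamps = deque()  # timestamps of accepted calls, oldest first

    def try_acquire(self):
        """Return True and record the call if it fits in the window, else False."""
        now = self.clock()
        # Discard timestamps that have aged out of the rolling window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False
```

Because every accepted call is timestamped, the cap holds for any 60-second span rather than only for calendar-aligned minutes, which is what makes the limit "strict". The cost is memory proportional to `max_calls`.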
Source →Rate limiting mechanism for LLM and embedding API calls to manage request throttling.
Source →Enhanced reranking capability supporting multimodal input types for improved relevance ranking.
Source →Astronomer's new dataplane-based auth architecture improves Astro resilience, reduces blast radius, and keeps DAG runs running during control plane outages.
Source →Added RestrictedUnpickler to SimpleObjectNodeMapping to prevent unsafe pickle deserialization (CWE-502).
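The general pattern behind a restricted unpickler is to override `pickle.Unpickler.find_class` so that only allowlisted globals can be resolved during deserialization. The sketch below illustrates that standard-library pattern only. The class name `AllowlistUnpickler`, the `ALLOWED_GLOBALS` set, and `restricted_loads` are hypothetical and are not taken from the release described above.

```python
import io
import pickle

# Hypothetical allowlist: (module, name) pairs that may be resolved on load.
ALLOWED_GLOBALS = {
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    """Refuse to resolve any global that is not explicitly allowlisted."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data):
    """Deserialize bytes using the allowlisting unpickler."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

Plain containers and primitives (dicts, lists, strings, numbers) load without ever reaching `find_class`, while a payload that references an arbitrary callable is rejected before it can run.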
Source →Deprecated asyncio_module in favor of get_asyncio_module for better async context handling.
Source →Added user agent support and APOC sample parameter for large database schema introspection.
Source →Added multimodal support to the LLMReranker component for improved ranking with multimodal inputs.
Source →New rate limiter implementation for strict per-minute caps on API requests.
Source →Added token-bucket rate limiter for LLM and embedding API calls to prevent rate limiting issues.
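As a rough illustration of the token-bucket technique mentioned in this entry, here is a minimal sketch under stated assumptions. It is not the library's implementation. `TokenBucket`, `rate_per_second`, `capacity`, and the injectable `clock` are hypothetical names chosen for the example.

```python
import time


class TokenBucket:
    """Hypothetical sketch: tokens refill at a steady rate up to `capacity`."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock  # injectable so behaviour can be checked without sleeping
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def try_acquire(self, cost=1.0):
        """Spend `cost` tokens if available; return whether the call may proceed."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

Unlike a sliding window, a token bucket permits short bursts up to `capacity` and then settles to the refill rate, which tends to suit API clients that send requests in batches.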
Source →Startup program offering free credits and founder access to LlamaParse for early-stage companies.
Source →Deprecated asyncio_module parameter in favor of get_asyncio_module() function for better async context management.
Source →Enhanced Neo4j integration with user agent support and APOC schema introspection capabilities.
Source →Added apoc_sample parameter for efficient large database schema introspection.
Source →Added support for GPT-5 variants including GPT-5.2-chat, GPT-5.2-pro, and reasoning models.
Source →New reranker that supports multimodal inputs for improved relevance ranking.
Source →Added token-bucket and sliding window rate limiters for LLM and embedding API calls to prevent exceeding rate limits.
Source →Integration of Gemini Embedding 2 (3072-dimensional vectors with multimodal support) into the LlamaIndex stack for document processing and retrieval workflows.
Source →Reranker supporting multimodal inputs for improved ranking of retrieved documents in multimodal scenarios.
Source →Rate limiter for LLM and embedding API calls with support for both token-bucket and sliding window rate limiting strategies.
Source →Integration of Gemini Embedding 2 (3072-dimensional vectors, multimodal support) into LlamaIndex stack for building searchable audio knowledge bases.
Source →LlamaParse now offers a Startup Program with free credits and founder access for early-stage companies.
Source →Astronomer's new dataplane-based auth proxy architecture improves Astro resilience, reduces blast radius, and keeps DAG runs running during control plane outages.
Source →Coalesce is shifting from separate point solutions to an integrated Data Operating Layer that includes quality, observability, and data governance as core components.
Source →Coalesce acquired SYNQ to bring native data quality capabilities into the Data Operating Layer.
Source →Data observability solution integrated as part of the Data Operating Layer, providing automated testing, lineage, and compliance capabilities.
Source →Data observability integrated as part of the Data Operating Layer, bringing native data quality capabilities to the platform.
Source →Coalesce's architectural evolution incorporating data quality observability as a native component of the data operating layer.
Source →Data observability feature integrated as part of the data operating layer for monitoring and managing data quality.
Source →Decline of traditional RAG in favor of autonomous agents that reshape retrieval, reasoning, and knowledge workflows.
Source →Decline of traditional RAG in favor of autonomous agents reshaping retrieval, reasoning, and knowledge workflows in agentic AI.
Source →Shift from traditional RAG to agentic AI where autonomous agents reshape retrieval, reasoning, and knowledge workflows.
Source →Shift from traditional RAG to agentic AI systems reshaping retrieval, reasoning, and knowledge workflows.
Source →Strategic evolution of LlamaIndex from a RAG framework to a comprehensive agentic document processing platform serving 300k+ users across 50+ document formats.
Source →Integration enabling use of Claude Code for database work including querying, schema design, migrations, and optimization.
Source →Enterprise feature enabling fine-grained access control to individual DAGs within a shared deployment for Enterprise customers.
Source →Platform enabling smarter, agent-driven data integration with faster setup and reduced operational overhead, entered public beta.
Source →Airbyte Agent Engine enables smarter, agent-driven data integration with faster setup and reduced operational overhead, entering public beta.
Source →Fine-grained access control feature allowing Enterprise customers to control access to individual DAGs within shared deployments.
Source →Enterprise customers can now control access to individual DAGs within a shared deployment with fine-grained role-based access control.
Source →OCI DataScience streaming endpoint support for streaming predictions.
Source →Mistral LLM integration updated to support Azure SDK.
Source →Enhanced GitHubRepoReader with selective file fetching and deduplication.
Source →SharePoint reader enhancement with Microsoft Graph API pagination support.
Source →Added support for OCI DataScience /predictWithStream endpoint for streaming use cases.
Source →Updated Mistral LLM integration to support Azure SDK.
Source →Added pagination support for Microsoft Graph API calls in SharePoint reader.
Source →Enhanced GitHub reader with selective file fetching and deduplication capabilities.
Source →New reader integration for LayoutIR document parsing and extraction.
Source →Trust layer integration for LlamaIndex agents enabling secure agent mesh operations.
Source →Replaced eval() with json.loads in FaissMapVectorStore persistence for improved security.
Source →Added pagination support for Microsoft Graph API calls.
Source →Added support for new OCI DataScience endpoint with streaming capabilities.
Source →Enhanced GitHubRepoReader with selective file fetching and deduplication features.
Source →Support for Azure SDK integration and updated package requirements.
Source →Added support for Claude Sonnet 4.6 and Claude Opus 4.6 with enhanced structured predict methods.
Source →Trust layer for LlamaIndex Agents providing enhanced agent coordination and management.
Source →New reader integration for LayoutIR with selective file fetching capabilities.
Source →Removed persistent_connection parameter support from IBM embeddings integration.
Source →Removed persistent_connection parameter support from IBM LLM integration.
Source →Integration of LayoutIR for document layout understanding and extraction.
Source →Added support for Claude Sonnet 4.6 model in Anthropic and Bedrock integrations.
Source →New document reader integration for LayoutIR format supporting structured document extraction.
Source →Selective file fetching and deduplication capabilities for GitHub repository reading.
Source →Trust layer for LlamaIndex agents providing secure agent mesh networking and orchestration.
Source →EvaporateExtractor now sandboxes LLM-generated code execution for improved security.
Source →Reader integration for document layout parsing.
Source →Reader integration for LayoutIR format supporting advanced document layout parsing.
Source →Trust layer for LlamaIndex agents enabling secure agent orchestration.
Source →Feature enabling extraction of structured data with custom schemas while preserving page-by-page granularity for auditable and actionable document insights.
Source →Shift toward page-level extraction granularity in LlamaExtract for better audit trails, actionability, and preservation of document context and source attribution.
Source →Integration for LLM analytics providing insights into LLM usage and performance in document processing workflows.
Source →Added config.meta_get to Python model parsing for metadata access in Python models.
Source →Added config.meta_get to Python model parsing for accessing metadata configuration.
Source →FaissMapVectorStore replaced unsafe eval() with json.loads() for JSON deserialization.
Source →Evolution from constrained batch processing to long-horizon agents with event triggers, persistent task backlogs, and autonomous document management capabilities operating over extended periods.
Source →Solar-pro3 model added to Upstage LLM integration.
Source →LangChain 1.x compatibility support added.
Source →Document chunking library integration for advanced node parsing.
Source →Custom base_url support added to Cohere LLM integration.
Source →Adaptive thinking support in Bedrock Converse integration.
Source →Claude Sonnet 4.6 and Opus 4.6 model support in Bedrock Converse integration.
Source →Claude Opus 4.6 model support added to Anthropic LLM integration.
Source →Claude Sonnet 4.6 model support added to Anthropic LLM integration.
Source →Callback handler for token-based cost governance and budget tracking.
Source →Integration with Chonkie library for advanced document chunking capabilities.
Source →New node parser integration based on Chonkie for advanced text chunking.
Source →Callback handler for cost governance and token budget management.
Source →Integration of Chonkie library for semantic-aware document chunking.
Source →Added support for Claude Opus 4.6 model in Anthropic and Bedrock integrations.
Source →New callback handler for cost governance and token budget tracking.
Source →Integration with Chonkie library for semantic-aware document chunking with improved token-based code splitting.
Source →Node parser integration for advanced document chunking.
Source →Callback handler for cost governance through token budget tracking.
Source →Integration of Chonkie node parser for advanced document parsing and chunking.
Source →Specialized Airflow knowledge integration with Claude Code, Cursor, VS Code, and 25+ compatible AI coding tools for local development workflows.
Source →Agent architecture differs fundamentally from data warehouses, requiring new design patterns for real-time action and autonomous operations.
Source →Natural language agent creation tool that generates appropriate agent workflows from plain-language descriptions of document processing tasks.
Source →Production-ready and generally available stable foundation for programmatically managing Astro at scale.
Source →Natural language agent builder that converts plain-language descriptions into deployed document processing agents for tasks like financial statement classification and resume screening.
Source →Natural language agent creation tool that converts plain-language descriptions into deployed document processing agent workflows without requiring manual coding.
Source →Production-ready and generally available API for programmatically managing Astro at scale, providing a stable foundation for automation and migration.
Source →Tool to generate document processing agent workflows from natural language descriptions, enabling deployment of agents for classification and structured data extraction in minutes.
Source →Natural language agent creation tool that generates appropriate agent workflows from plain-language descriptions for tasks like financial statement classification and resume section extraction.
Source →Production-ready and generally available stable API for programmatically managing Astro at scale.
Source →MCP server optimized for intelligent knowledge orchestration, superior metadata control, and seamless end-to-end data operations.
Source →The new Airbyte Knowledge MCP Server is optimized for intelligent knowledge orchestration, superior metadata control, and seamless end-to-end data operations.
Source →Optimized for intelligent knowledge orchestration, superior metadata control, and seamless end-to-end data operations.
Source →Public preview of data quality monitoring with event-driven monitoring capabilities that change when and how data is validated.
Source →Data quality monitoring in Astro Observe is now available in public preview, with event-driven monitoring capabilities that change when and how data is validated.
Source →Event-driven data quality monitoring now in public preview with powerful new capabilities for validating data.
Source →Unified interface for AI agents to access external data sources with fully-managed authentication module supporting OAuth, hosted agent connectors, and Entity Cache.
Source →New API version with redesigned llama-cloud SDKs for Python and TypeScript, featuring structured configuration objects replacing flat parameter lists for easier parsing option discovery.
Source →Unified interface for agents to access external data sources with fully-managed authentication module, hosted agent connectors, and Entity Cache.
Source →Voyage-4 models added to Voyage AI embeddings integration.
Source →Support for structured output JSON schema name sanitization for generic Pydantic models.
Source →Configurable search parameters for Qdrant vector store.
Source →Async client methods for OpenSearch vector store.
Source →Enhanced partition name parameter support in Milvus vector store.
Source →Hybrid search support added to VertexAI Vector Search.
Source →Volcengine MySQL vector store integration.
Source →Alibaba Cloud MySQL vector store integration.
Source →Apertis LLM provider integration with example documentation.
Source →Revamped YouRetriever integration with updated API compatibility.
Source →Reader integration for HuggingFace datasets library.
Source →Distributed data ingestion pipeline integration using Ray framework.
Source →Enhanced Qdrant vector store with configurable search parameters.
Source →Added async close and aclose methods to OpenSearch vector client.
Source →Enhanced Milvus vector store with partition name parameter support.
Source →Added hybrid search capability to VertexAI Vector Search integration.
Source →New vector store integration for Volcengine MySQL.
Source →New vector store integration for Alibaba Cloud MySQL.
Source →New Apertis LLM provider integration with example documentation.
Source →New tools integration enabling parallel web system operations and queries.
Source →Enhanced You.com retriever integration with updated API compatibility.
Source →New reader integration for loading datasets from HuggingFace datasets library.
Source →Distributed data ingestion pipeline using Ray framework for scalable processing.
Source →CodeSplitter enhancement with token-based splitting support for improved code chunking.
Source →Distributed data ingestion integration using Ray for parallel pipeline processing.
Source →Prototype without connectors and scale with one API for document processing.
Source →Removed ORM Collection mix-usage pattern in favor of MilvusClient for Milvus vector store.
Source →Replaced ChatMemoryBuffer with unified Memory component for improved consistency and flexibility.
Source →Revamped YouRetriever integration with updated API support.
Source →Added close and aclose methods for proper client lifecycle management.
Source →Fixed async integration for MongoDB vector store.
Source →Added hybrid search support for Vertex AI Vector Search.
Source →Added search parameters support for Qdrant vector store.
Source →Enhanced partition support and improved parameter naming in Milvus vector store.
Source →New vector store integration for Volcengine MySQL databases.
Source →New MySQL vector store integration for Alibaba Cloud.
Source →Added support for new Voyage AI embedding models including voyage-4 and multimodal-35.
Source →Enhanced integration supporting ARNs for Bedrock Embedding Models and improved Bedrock Converse API support with reasoning content.
Source →New tools integration for parallel web system interactions.
Source →New LLM integration for Apertis provider with example documentation.
Source →New reader for loading datasets from HuggingFace Datasets library.
Source →Integration for distributed data ingestion using Ray for parallel processing.
Source →API that allows users to prototype without connectors and scale with a single API.
Source →Removed ORM Collection mix-usage with MilvusClient in Milvus vector store for consistency.
Source →Deprecated gemini LLM integration in favor of google-genai integration.
Source →Replaced ChatMemoryBuffer with a new Memory abstraction for improved message and content block handling.
Source →Integration of Ray for distributed data ingestion through RayIngestionPipeline.
Source →Added hybrid search capabilities to VertexAI Vector Search integration.
Source →Tools integration for parallel web systems providing distributed web search capabilities.
Source →Reader integration for accessing patent data from PatentsView API.
Source →New reader integration for loading datasets from HuggingFace Hub.
Source →Revamped YouRetriever integration with enhanced search capabilities.
Source →Vector store integration for Alibaba Cloud MySQL with vector search capabilities.
Source →Vector store integration for Volcengine MySQL database backend.
Source →Code splitting support with token-based boundaries for more precise code chunking.
Source →Distributed data ingestion pipeline integration using Ray for parallel processing.
Source →Milvus vector store refactored to remove ORM Collection mix-usage with MilvusClient.
Source →Replaced ChatMemoryBuffer with improved Memory component for better chat history management and token counting.
Source →Hybrid search support for Vertex AI vector search.
Source →Revamped YouRetriever integration for web search capabilities.
Source →Reader integration for patent search and retrieval.
Source →Tools integration for parallel web searches.
Source →MySQL vector store integration for Volcengine.
Source →MySQL vector store integration for Alibaba Cloud.
Source →Reader integration for HuggingFace datasets.
Source →Distributed data ingestion integration using Ray framework.
Source →Hybrid search support for Vertex AI vector search combining dense and sparse retrieval.
Source →New LLM provider integration with Apertis.
Source →Vector store integration for Volcengine MySQL with vector search support.
Source →Tools integration for parallel web system searches and operations.
Source →Reader integration for loading datasets from HuggingFace Hub.
Source →CodeSplitter with support for token-based code splitting in addition to existing splitting strategies.
Source →Replacement of ChatMemoryBuffer with improved Memory component for better chat history management.
Source →Integration for distributed data ingestion using Ray, enabling scalable ingestion pipeline processing.
Source →API that allows prototyping without connectors and scaling with a single unified API.
Source →Airbyte joins the Agentic AI Foundation under the Linux Foundation to support open, interoperable agentic AI infrastructure and standards.
Source →Convergence of agents toward files and filesystems as primary interfaces for context management instead of complex tool ecosystems, reducing context loss and improving answer quality for smaller datasets.
Source →Chat with Your Schema in Agent Engine: AI-powered configuration that simplifies data integration, automates workflows, and enhances agent intelligence.
Source →AI-powered configuration in Agent Engine that simplifies data integration, automates workflows, and enhances agent intelligence.
Source →Shift from batch-based systems to real-time search, fetch, and write operations to enable AI agents to act on current data.
Source →Shift from batch data systems to Search, Fetch, and Write architecture enabling real-time, entity-centric agentic data operations.
Source →AI agents fail when built on batch data systems; agentic infrastructure requires real-time, entity-centric search, fetch, and write capabilities.
Source →Shift from batch data systems to search, fetch, and write patterns enabling real-time, entity-centric infrastructure for AI agents.
Source →GPT-5.2 and GPT-5.2 Pro model support added.
Source →Text-to-speech tool integration for Typecast service.
Source →AI Badgr OpenAI-compatible LLM integration.
Source →Flexible file_mode parameter for Google GenAI file handling.
Source →New tool integration for text-to-speech functionality via Typecast.
Source →New AI Badgr LLM integration providing OpenAI-compatible interface.
Source →New node parser for element-based document parsing and chunking.
Source →Fixed numpy array handling and improved persistence with JSON serialization.
Source →Updated FTS and GSI reference documentation for Couchbase vector store.
Source →Added missing filter operators and improved async tool spec support.
Source →Fixed nested metadata filters and added multimodal results support.
Source →Improved Ollama batch embedding with keep_alive parameter support.
Source →New tool integration providing text-to-speech features via Typecast.
Source →New node parser for structured element-based document parsing.
Source →Tools integration providing text-to-speech capabilities.
Source →Tools integration providing text-to-speech capabilities via Typecast.
Source →Node parser for extracting and organizing document elements with structured output.
Source →Tool integration for text-to-speech conversion.
Source →Tool integration adding text-to-speech features via Typecast.
Source →LLM integration for AI Badgr offering OpenAI-compatible API.
Source →New node parser for extracting and parsing HTML/document elements.
Source →Simplified LlamaParse with tier-based configuration (Fast, Cost Effective, Agentic, Agentic Plus), stable versions with long-term support, improved performance, and up to 50% cost reduction.
Source →Simplified four-tier configuration (Fast, Cost Effective, Agentic, Agentic Plus) with version pinning for production stability, automatic updates, and up to 50% cost reduction.
Source →AI-configured connections in Airbyte automatically set up data pipelines using intelligent schema detection and best-practice configurations.
Source →LlamaParse v2 introduces simplified tier-based pricing with four tiers (Fast, Cost Effective, Agentic, Agentic Plus); Agentic Plus offers a 50% cost reduction compared to previous pricing while maintaining comparable accuracy.
Source →Redesigned API with new SDKs for Python and TypeScript featuring simplified tier-based configuration (Fast, Cost Effective, Agentic, Agentic Plus), stable versions with long-term support, and improved performance at reduced pricing with up to 50% cost reduction.
Source →Improvements to Astro IDE that give the agent more context and give data engineers more control over agent responses.
Source →Automatically set up data pipelines using intelligent schema detection and best-practice configurations.
Source →Simplified tier-based configuration with four tiers (Fast, Cost Effective, Agentic, Agentic Plus), stable versions with long-term support, and improved performance at reduced pricing including 50% cost reduction on Agentic Plus.
Source →LlamaParse v2 introduces simplified tier-based pricing with up to 50% cost reduction in Agentic Plus tier while maintaining comparable accuracy to previous versions.
Source →Redesigned LlamaParse API with simplified tier-based configuration (Fast, Cost Effective, Agentic, Agentic Plus), stable versions with long-term support, improved performance, and up to 50% cost reduction.
Source →Enable AI agents to securely access, sync, and act on real-time data across tools and systems.
Source →Agent Connectors from Airbyte enable AI agents to securely access, sync, and act on real-time data across tools and systems.
Source →Virtual filesystem isolation tool for coding agents that enables sandboxed filesystem access while protecting real files from accidental deletion.
Source →Virtual filesystem isolation tool for secure coding agent execution, enabling sandboxed document access without unrestricted filesystem permissions.
Source →Prefect's decomposed durability decouples results from workflow identity, enabling cross-workflow caching and exactly-once semantics through composable primitives.
Source →Prefect's decomposed durability approach decouples results from workflow identity, enabling cross-workflow caching and exactly-once semantics through composable primitives.
Source →Remix Servers move workflow orchestration from client-side to server-side, enabling portability across Claude, Cursor, ChatGPT, and any MCP client.
Source →Server-side MCP implementation that moves workflow orchestration server-side, making expertise portable across Claude, Cursor, ChatGPT, and any MCP client.
Source →MCP workflow orchestration moved from client-side to server-side architecture, enabling portable workflows across multiple MCP clients (Claude, Cursor, ChatGPT) without requiring client-side execution.
Source →Remix servers enable MCP orchestration to move server-side, making workflows portable across Claude, Cursor, ChatGPT, and other MCP clients.
Source →AI-powered document separation tool that automatically splits concatenated documents into distinct sections based on content categories defined by users.
Source →AI-powered document separation tool that automatically segments bundled documents into distinct sections based on content categories.
Source →AI-powered document separation tool that automatically segments bundled documents into distinct sections based on content categories, now in public beta with async batch PDF processing capabilities.
Source →AI-powered document separation tool that automatically divides concatenated or bundled documents into distinct sections based on content categories.
Source →AI-powered document separation API in beta that automatically segments bundled documents into distinct sections based on content categories with intelligent classification.
Source →Helps teams connect real-time data to AI agents with production-ready infrastructure, faster integration, and reliable context management.
Source →Agent Blueprint helps teams connect real-time data to AI agents with production-ready infrastructure, faster integration, and reliable context management.
Source →Document intelligence tool for structured data extraction using custom schemas with page-level granularity, supporting repeating entities extraction from tables and lists.
Source →Structured data extraction tool with page-level granularity, custom schema support, and new PER_TABLE_ROW extraction target for intelligently extracting repeating entities from documents.
Source →Structured data extraction tool with page-level granularity and PER_TABLE_ROW extraction target for exhaustive extraction of repeating entities from documents.
Source →Tool that transforms messy spreadsheets into AI-ready data, extracting structured tables from complex .xlsx files with merged cells and formatting quirks into clean parquet datasets.
Source →Transforms messy spreadsheets into AI-ready data, extracting structured tables from complex .xlsx files with merged cells and formatting quirks into clean parquet datasets.
Source →Transforms messy spreadsheets (.xlsx files) into AI-ready parquet data by intelligently interpreting formatting, layout, and semantic relationships, with support for merged cells and complex structures.
Source →Beta tool for transforming messy spreadsheets into AI-ready data, extracting structured tables from complex .xlsx files with merged cells and formatting quirks into clean parquet datasets.
Source →Beta feature that transforms messy spreadsheets (.xlsx files) into AI-ready data by intelligently interpreting formatting, layout, and semantic relationships to produce clean parquet outputs with typed tables and metadata.
Source →Shift to Watcher execution mode for dbt workflows in Airflow, enabling up to 80% reduction in DAG runtimes while maintaining task-level observability.
Source →Platform for building, serving, and deploying document agents combining LlamaParse's document processing with Agent Workflows orchestration, featuring pre-built templates and one-command deployment.
Source →dbt Labs expanded the dbt Fusion Engine ecosystem with Microsoft Fabric integration to enable faster, more governed data transformations in Data Factory.
Source →Open preview platform for building, serving, and deploying document extraction agents, combining LlamaParse document processing with Agent Workflows orchestration.
Source →Open preview launch combining LlamaParse document processing with Agent Workflows orchestration for building, serving and deploying document extraction agents with pre-built templates and one-command deployment.
Source →dbt Labs expanded the dbt Fusion Engine ecosystem with Microsoft Fabric integration, bringing faster, more governed data transformations to Data Factory.
Source →Document extraction agent platform combining LlamaParse document processing with Agent Workflows orchestration for multi-step workflow deployment with templates like SEC Insights and Invoice Extraction.
Source →Document extraction agents in open preview combining LlamaParse document processing with Agent Workflows orchestration, featuring one-click deployment with pre-built templates for invoice processing, contract review, and claims handling.
Source →dbt Labs expanded the dbt Fusion Engine ecosystem with Microsoft Fabric integration, enabling faster and more governed data transformations in Data Factory.
Source →Email-triggered workflow integration enabling document processing workflows to be triggered by email events.
Source →Collaboration on enterprise document processing at scale, including webinars on document processing workflows.
Source →Simpler, faster, and more powerful way to transform documents in Unstructured.
Source →Simpler, faster, and more powerful way to transform documents in Unstructured with improved capabilities.
Source →Simpler, faster, and more powerful way to transform documents in Unstructured platform.
Source →dbt Labs integrated with Snowflake Intelligence, providing foundational data governance and trustworthiness for conversational AI capabilities.
Source →dbt is foundational to Snowflake Intelligence, a conversational AI solution that relies on dbt's data quality and governance capabilities.
Source →dbt is foundational to Snowflake Intelligence, providing conversational AI capabilities backed by trusted data infrastructure.
Source →dbt is foundational to Snowflake Intelligence, providing the data foundation for conversational AI capabilities.
Source →Model Context Protocol integration providing search_docs, grep_docs, and read_doc tools for LlamaIndex documentation access.
Source →