[Open Source Project] Fresh Data For AI

Open Sourced – https://github.com/cocoindex-io/cocoindex

We are thrilled to announce the open-source release of CocoIndex, the world’s first data indexing engine that supports both custom transformation logic and incremental updates.

CocoIndex is an ETL framework that prepares data for AI applications such as semantic search and retrieval-augmented generation (RAG). It offers a data-driven programming model that simplifies the creation and maintenance of data indexing pipelines, ensuring data freshness and consistency.

CocoIndex is now open source under the Apache License 2.0. This means the core functionality of CocoIndex is freely available for anyone to use, modify, and distribute. We believe that open sourcing CocoIndex will foster innovation, enable broader adoption, and create a vibrant community of contributors who can help shape its future. By choosing the Apache License 2.0, we’re ensuring that both individual developers and enterprises can confidently build upon and integrate CocoIndex into their projects while maintaining the flexibility to create proprietary extensions.

Key Features

  • Data Flow Programming: Build indexing pipelines by composing transformations like Lego blocks, with built-in state management and observability.

  • Support Custom Logic: Plug in your choice of chunking, embedding, and vector stores. Extend with custom transformations like deduplication and reconciliation.

  • Incremental Updates: Smart state management minimizes re-computation by tracking changes at the file level, with future support for chunk-level granularity.

  • Python SDK: Built with a Rust core 🦀 for performance, exposed through an intuitive Python binding for ease of use.

We are moving fast, and many more features and improvements are coming soon.
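
To make the idea of file-level incremental updates concrete, here is a minimal sketch in plain Python. It does not use the CocoIndex API: `IncrementalIndex`, `fingerprint`, and `chunk_by_paragraph` are hypothetical names invented for illustration, and the content-hash approach is an assumption about how change tracking could work, not a description of CocoIndex internals.

```python
import hashlib
from typing import Callable, Dict, List

# All names below are hypothetical; this is NOT the CocoIndex API.
Transform = Callable[[str], List[str]]


def fingerprint(content: str) -> str:
    """Content hash used to detect whether a file has changed."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def chunk_by_paragraph(text: str) -> List[str]:
    """Toy chunker: split on blank lines and drop empty chunks."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


class IncrementalIndex:
    """Keeps per-file state so unchanged files are never re-processed."""

    def __init__(self, transform: Transform) -> None:
        self.transform = transform
        self.fingerprints: Dict[str, str] = {}   # path -> content hash
        self.chunks: Dict[str, List[str]] = {}   # path -> transformed output

    def update(self, files: Dict[str, str]) -> List[str]:
        """Sync the index with `files`; return the paths that were recomputed."""
        recomputed = []
        for path, content in files.items():
            fp = fingerprint(content)
            if self.fingerprints.get(path) != fp:
                # New or modified file: run the (possibly expensive) transform.
                self.chunks[path] = self.transform(content)
                self.fingerprints[path] = fp
                recomputed.append(path)
        # Remove state for files that no longer exist in the source.
        for path in set(self.fingerprints) - set(files):
            del self.fingerprints[path]
            del self.chunks[path]
        return recomputed
```

In this sketch, calling `update` twice with the same files recomputes nothing the second time, and editing one file causes only that file to be re-chunked. The transform is passed in as a plain function, which loosely mirrors the "plug in your own chunking" idea from the feature list.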

Getting Started
For a detailed walkthrough, refer to our Quickstart Guide.

🤗 Community
We welcome community contributions of all kinds, whether code improvements, documentation updates, issue reports and feature requests on GitHub, or discussions in our Discord.

GitHub: Please give our repository a star 🤗.
Documentation: Check out our documentation for detailed guides and API reference.
Discord: Join discussions, seek support, and share your experiences on our Discord server.
Social Media: Follow us on Twitter and LinkedIn for updates.

We are committed to fostering an inclusive, welcoming, and supportive environment. Contributing to CocoIndex should feel collaborative, friendly, and enjoyable for everyone. Together, we can build better AI applications through robust data infrastructure.

Looking forward to seeing what you build with CocoIndex!
