In this demo, you’ll learn how to use Chroma with OpenAI and LangChain. Thanks to LangChain, the interface for working with different vector databases is remarkably consistent. In this section, you’ll focus on Chroma, but remember that you can readily substitute it with another supported database if you prefer.
Getting Started with Chroma
Chroma is an open-source vector database designed with developer productivity in mind. To install the necessary LangChain integration, return to your terminal and execute:
pip install langchain-chroma
Now, create a database and set up Chroma:
from langchain_chroma import Chroma
db = Chroma(
    embedding_function=embeddings_model,
)
You've initialized Chroma by providing an embedding model. Note that you can leave out the api_key attribute when creating an OpenAI embeddings model; it'll automatically fetch it from your environment, reading it from an OPENAI_API_KEY variable by default.
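If you haven't set that variable yet, a minimal sketch of doing so from Python looks like this. The key value below is a placeholder; in practice, export the variable in your shell or a .env file rather than hard-coding it in a notebook:

```python
import os

# Placeholder key for illustration only -- never commit a real key to code.
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"

# Libraries like langchain-openai read this variable when no api_key is passed.
print("OPENAI_API_KEY" in os.environ)
```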
By default, Chroma stores data in memory. However, that means your data will be lost when the app restarts. You'll configure Chroma to store your data on disk instead.
Also, you need to organize your data efficiently. Just as you'd use tables in SQL databases or collections in NoSQL databases, you specify a collection name in Chroma to group related data. Update your Chroma initialization code to include these improvements:
db = Chroma(
    collection_name="speech_collection",
    embedding_function=OpenAIEmbeddings(),
    persist_directory="./chroma_db",
)
With these changes, your data will be saved to disk and organized within the “speech_collection.”
Populating Chroma With Data
Next, insert data into your Chroma database. LangChain abstracts away the low-level details, so you’ll work with LangChain document objects to represent your data.
In a new cell, add the following code:
from uuid import uuid4
from langchain_core.documents import Document
document_1 = Document(
    page_content="20 tons of cocoa have been deposited at Warehouse AX749",
    metadata={"source": "messaging_api"},
    id=1,
)
document_2 = Document(
    page_content="The National Geographic Society has discovered a new species "
    "of aquatic animal, off the coast of Miami. They have been exploring at "
    "8000 miles deep in the Pacific Ocean. They believe there's a lot "
    "more to learn from the oceans.",
    metadata={"source": "news"},
    id=2,
)
document_3 = Document(
    page_content="Martin Luther King's speech, I Have a Dream, remains "
    "one of the world's greatest ever. Here's everything he said "
    "in 5 minutes.",
    metadata={"source": "website"},
    id=3,
)
document_4 = Document(
    page_content="For the first time in 1200 years, the Kalahari "
    "desert receives 200ml of rain.",
    metadata={"source": "tweet"},
    id=4,
)
document_5 = Document(
    page_content="New multi-modal learning content about AI is ready "
    "from Kodeco.",
    metadata={"source": "kodeco_rss_feed"},
    id=5,
)
documents = [
    document_1,
    document_2,
    document_3,
    document_4,
    document_5,
]
uuids = [str(uuid4()) for _ in range(len(documents))]
db.add_documents(ids=uuids, documents=documents)
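As a quick aside, uuid4() produces random identifiers that are, for practical purposes, collision-free, so every document gets a unique ID. A standalone sanity check (no Chroma required):

```python
from uuid import uuid4

# Generate IDs the same way as above and confirm they're unique,
# canonically formatted strings (8-4-4-4-12 hex digits, 36 chars).
ids = [str(uuid4()) for _ in range(5)]
print(len(ids), len(set(ids)))         # count vs. unique count
print(all(len(i) == 36 for i in ids))
```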
So far, so good. Now, here comes some of the beauty of working with vector data stores: the search capability. Traditional SQL or NoSQL databases demand you adhere to specific query syntax, but with vector databases, you interact using natural language — just like talking to a person!
Watch it in action. Execute this query in a new cell:
results = db.similarity_search(
    "What's the latest on the warehouse?",
)
for res in results:
    print(f"* {res.page_content}")
You used the similarity_search function to query your database. It returned:
* 20 tons of cocoa have been deposited at Warehouse AX749
* New multi-modal learning content about AI is ready from Kodeco.
* The National Geographic Society has discovered a new species of
aquatic animal, off the coast of Miami. They have been exploring
at 8000 miles deep in the Pacific Ocean. They believe there's
a lot more to learn from the oceans.
* For the first time in 1200 years, the Kalahari desert receives 200ml of rain.
You have stored five documents. When you ran a query, it returned four. However, only the first document directly related to your query. Do you need that many documents? Additionally, you might notice that the best matching result appears first, with the relevance decreasing for subsequent documents. To address this, you should limit the results to a handful at the top of the list and use the metadata to exclude documents and improve the search results.
results = db.similarity_search(
    "What's the latest on the warehouse?",
    k=2,
    filter={"source": "messaging_api"},
)
for res in results:
    print(f"* {res.page_content}")
This time, it returned only one document, which turned out to be the most relevant to the query:
* 20 tons of cocoa have been deposited at Warehouse AX749
Ranking Results With Similarity Scores
Chroma also offers the similarity_search_with_score() function, which not only returns relevant documents but also a similarity score for each. This score quantifies how closely a document’s embedding aligns with your query’s. You can use these scores to filter out less-relevant results or even incorporate them into your application’s logic.
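To build intuition for what such a score measures, here's a minimal, library-free sketch of cosine similarity between two embedding vectors. The tiny three-dimensional vectors are made up for illustration; real OpenAI embeddings have far more dimensions, and depending on Chroma's configured distance metric, the returned score may instead be a distance, where lower means closer:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vec = [0.2, 0.8, 0.1]    # made-up "query" embedding
doc_vec = [0.25, 0.75, 0.15]   # made-up "document" embedding

# Vectors pointing in nearly the same direction score close to 1.0.
print(round(cosine_similarity(query_vec, doc_vec), 3))
```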
results = db.similarity_search_with_score(
    "Where can I find tutorials on AI?",
    k=1,
    filter={"source": "kodeco_rss_feed"},
)
for res, score in results:
    print(f'''
    similarity_score: {score:.3f}
    content: {res.page_content}
    source: {res.metadata['source']}
    ''')
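One way to feed those scores into application logic, as mentioned above, is a simple cutoff filter. The (content, score) pairs and the threshold below are made up for illustration, and this sketch assumes distance-style scores where lower means more similar; flip the comparison if your scores are similarities:

```python
# Hypothetical (content, distance) pairs like those returned by
# similarity_search_with_score; lower distance = closer match (assumption).
results = [
    ("New multi-modal learning content about AI is ready from Kodeco.", 0.21),
    ("For the first time in 1200 years, the Kalahari desert "
     "receives 200ml of rain.", 0.93),
]

CUTOFF = 0.5  # made-up threshold; tune it against your own data
relevant = [content for content, distance in results if distance < CUTOFF]
print(relevant)
```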