Retrieval-Augmented Generation with LangChain

Nov 12 2024 · Python 3.12, LangChain 0.3.x, JupyterLab 4.2.4

Lesson 05: Evaluating & Optimizing RAG Systems

Assessing a RAG Pipeline Demo

In this demo, you’ll use DeepEval, a popular open-source LLM evaluation framework. It has a simple and intuitive set of APIs you’ll soon use to assess SportsBuddy. Open your Jupyter Lab instance with the following command:

jupyter lab

Install or upgrade DeepEval with pip:

pip install -U deepeval

You’ll start with the retrieval component. Create a new Python file called deepeval-sportsbuddy-test.py. Import the DeepEval classes for contextual precision, recall, and relevancy:

from deepeval import evaluate
from deepeval.test_case import LLMTestCase
from deepeval.metrics import (
  ContextualPrecisionMetric,
  ContextualRecallMetric,
  ContextualRelevancyMetric
)

contextual_precision = ContextualPrecisionMetric()
contextual_recall = ContextualRecallMetric()
contextual_relevancy = ContextualRelevancyMetric()

Next, create a test case. A DeepEval test case is as simple as creating an instance of LLMTestCase and passing in the content you want to evaluate. Because you’ll be evaluating SportsBuddy, open this lesson’s starter project in Jupyter Lab. There, you’ll see the question and response. Visit the 2024 Summer Olympics Wikipedia page to get the retrieval context relevant to the question. Back in your Python file, create the test case:

test_case = LLMTestCase(
  input="Which programmes were dropped from the 2024 Olympics?",
  # Adjacent string literals are joined, so the long answer can span lines.
  actual_output=(
    "Four events were dropped from weightlifting for the "
    "2024 Olympics. Additionally, in canoeing, two sprint events "
    "were replaced with two slalom events. The overall event "
    "total for canoeing remained at 16."
  ),
  expected_output="Four events were dropped from weightlifting.",
  retrieval_context=[
    "Four events were dropped from weightlifting."
  ]
)

An LLMTestCase requires your query, the RAG’s output, your expected output so DeepEval has a good reference point, and a retrieval context so DeepEval has a good idea of the kind of context your RAG uses to produce its answer. Pretty straightforward. Pass the test case to all three metrics for evaluation:

evaluate(
  test_cases=[test_case],
  metrics=[contextual_precision, contextual_recall, contextual_relevancy]
)

Run the script from your terminal:

python deepeval-sportsbuddy-test.py

======================================================================

Metrics Summary

  - ✅ Contextual Precision (score: 1.0, threshold: 0.5, strict: False, 
    evaluation model: gpt-4o, reason: The score is 1.00 because the 
    context directly answers the question by stating 'Four events 
    were dropped from weightlifting.' Great job!, error: None)
  - ✅ Contextual Recall (score: 1.0, threshold: 0.5, strict: False, 
    evaluation model: gpt-4o, reason: The score is 1.00 because the 
    expected output perfectly matches the content in the first node 
    of the retrieval context. Great job!, error: None)
  - ❌ Contextual Relevancy (score: 0.0, threshold: 0.5, strict: False, 
    evaluation model: gpt-4o, reason: The score is 0.00 because the
    context only mentions 'Four events were dropped from weightlifting' 
    without specifying which programmes or providing a comprehensive 
    list of dropped programmes from the 2024 Olympics., error: None)

For test case:

  - input: Which programmes were dropped from the 2024 Olympics?
  - actual output: Four events were dropped from weightlifting for 
    the 2024 Olympics. Additionally, in canoeing, two sprint events 
    were replaced with two slalom events. The overall event total 
    for canoeing remained at 16.
  - expected output: Four events were dropped from weightlifting.
  - context: None
  - retrieval context: ['Four events were dropped from weightlifting.']

======================================================================

Overall Metric Pass Rates

Contextual Precision: 100.00% pass rate
Contextual Recall: 100.00% pass rate
Contextual Relevancy: 0.00% pass rate

======================================================================

score: The overall score. It ranges from 0 to 1 and is affected by the threshold and strict parameters.
threshold: A float value that defaults to 0.5. Any score below it is a fail, and any value above it is a pass.
strict: A Boolean value that forces a binary score. That’s a 1 for pass or 0 for fail. When set to false, the score can range between 0 and 1. It’s false by default. When true, it overrides the threshold, setting it to 1.
evaluation model: Defaults to gpt-4o. This refers to the LLM DeepEval uses to evaluate the metric. You can specify your custom LLM if you wish.
reason: A reason for the given score.
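
To make the score, threshold, and strict fields in the summary concrete, here is a minimal sketch of how a raw score could turn into a pass or fail. It is not DeepEval's implementation: the `judge` function is a hypothetical helper, and treating a score exactly equal to the threshold as a pass is an assumption.

```python
def judge(raw_score: float, threshold: float = 0.5, strict: bool = False) -> tuple[float, bool]:
    """Return (reported_score, passed) for a single metric result.

    Hypothetical helper that mirrors the fields in the summary; not part of DeepEval.
    """
    if not 0.0 <= raw_score <= 1.0:
        raise ValueError("raw_score must be between 0 and 1")
    if strict:
        # Strict mode: the threshold becomes 1 and the score collapses to 0 or 1.
        threshold = 1.0
        reported = 1.0 if raw_score >= threshold else 0.0
    else:
        reported = raw_score
    # Assumption: a score equal to the threshold counts as a pass.
    return reported, reported >= threshold


print(judge(0.0))                 # the Contextual Relevancy result: fails at 0.5
print(judge(1.0))                 # the Contextual Precision result: passes
print(judge(0.8, strict=True))    # strict mode turns a partial score into a fail
```

In DeepEval itself you would normally set these when constructing a metric, for example through its `threshold` and `strict_mode` arguments, rather than post-processing scores by hand.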

From the results above, precision and recall were great, but contextual relevancy wasn’t. This means either the given context didn’t have enough depth for your RAG to give you a detailed response, or your question lacked some clarity. In this case, it might be both. The given context indeed has very little information about the question. And the question contains “programmes” when the right terminology should be “events.” This immediately gives a clue as to which part of your RAG could need more attention.
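
The 0.0 for Contextual Relevancy in the summary above is easier to read once you treat the score as a ratio. As a rough sketch, assuming the metric is the share of retrieval-context statements that an LLM judge marks as relevant to the input (which is how DeepEval's documentation describes it, though the judging itself happens inside the library), the arithmetic looks like this. The `ratio_score` helper and the hand-written verdict are hypothetical and only mirror the reason text in the output:

```python
def ratio_score(verdicts: list[bool]) -> float:
    """Fraction of judged statements marked relevant.

    Assumption: an empty list scores 0.0 instead of raising an error.
    """
    if not verdicts:
        return 0.0
    return sum(verdicts) / len(verdicts)


# Hand-written stand-in for the judge's verdict: the only retrieved statement
# was considered not relevant to the question as it was worded.
retrieval_verdicts = {
    "Four events were dropped from weightlifting.": False,
}

contextual_relevancy = ratio_score(list(retrieval_verdicts.values()))
print(contextual_relevancy)  # 0.0
```

With a single statement in the retrieval context, one negative verdict is enough to drive the score to zero, which is why adding more context or rewording the question can move this metric so sharply.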

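Next, evaluate the generated answer with the answer relevancy and faithfulness metrics, reusing the same test case:

from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric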
from deepeval.test_case import LLMTestCase
from deepeval import evaluate

answer_relevancy = AnswerRelevancyMetric()
faithfulness = FaithfulnessMetric()

evaluate(
  test_cases=[test_case],
  metrics=[answer_relevancy, faithfulness]
)

=====================================================================

Metrics Summary

  - ✅ Answer Relevancy (score: 0.6666666666666666, threshold: 0.5, 
    strict: False, evaluation model: gpt-4o, reason: The score is 0.67 
    because while the response contains relevant information, it veers 
    off-topic by discussing the overall event total for canoeing, 
    which does not directly answer the specific question about which 
    programmes were dropped from the 2024 Olympics., error: None)
  - ✅ Faithfulness (score: 1.0, threshold: 0.5, strict: False, evaluation 
    model: gpt-4o, reason: The score is 1.00 because there are no 
    contradictions, indicating a perfect alignment between the actual 
    output and the retrieval context. Great job maintaining accuracy!,
    error: None)

For test case:

  - input: Which programmes were dropped from the 2024 Olympics?
  - actual output: Four events were dropped from weightlifting for 
    the 2024 Olympics. Additionally, in canoeing, two sprint events 
    were replaced with two slalom events. The overall event total 
    for canoeing remained at 16.
  - expected output: Four events were dropped from weightlifting.
  - context: None
  - retrieval context: ['Four events were dropped from weightlifting.']

======================================================================

Overall Metric Pass Rates

Answer Relevancy: 100.00% pass rate
Faithfulness: 100.00% pass rate

======================================================================
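
The "Overall Metric Pass Rates" sections can be reproduced by hand, which helps once you evaluate more than one test case. The following is a minimal sketch, not DeepEval code: the `pass_rates` helper is hypothetical, the rows are copied from the two summaries above, and a score equal to the threshold is assumed to pass.

```python
from collections import defaultdict

# (metric name, score, threshold) copied from the two summaries above.
results = [
    ("Contextual Precision", 1.0, 0.5),
    ("Contextual Recall", 1.0, 0.5),
    ("Contextual Relevancy", 0.0, 0.5),
    ("Answer Relevancy", 0.6666666666666666, 0.5),
    ("Faithfulness", 1.0, 0.5),
]


def pass_rates(rows):
    """Map each metric name to the percentage of its results that passed."""
    outcomes = defaultdict(list)
    for name, score, threshold in rows:
        # Assumption: a score equal to the threshold counts as a pass.
        outcomes[name].append(score >= threshold)
    return {name: 100.0 * sum(flags) / len(flags) for name, flags in outcomes.items()}


for name, rate in pass_rates(results).items():
    print(f"{name}: {rate:.2f}% pass rate")
```

With one test case, every rate is either 0% or 100%. Adding more rows for the same metric name is what makes the percentages informative.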
