In this lesson, you’ll build a RAG app called SportsBuddy. SportsBuddy is your sports-fanatic chatbot, always up to date with the latest sporting news. Just give SportsBuddy some context, and it’ll tell you what you need to know about a sporting event. Unlike older chatbots that offered only predefined responses to a limited set of questions, SportsBuddy lets you chat in natural English and get accurate sports facts. You won’t find these features in the free version of ChatGPT, which is trained on data only up to 2021 (as of this writing). So why pay for the pro version when you have SportsBuddy? Time to get started.
Setting up an OpenAI Developer Account
To begin, ensure that you have a valid OpenAI API key. OpenAI is widely regarded as one of the most comprehensive and versatile LLM platforms available. Numerous leaderboards aim to measure the effectiveness of LLMs, and each considers a different set of criteria. Across a wide range of apps and respected leaderboards, OpenAI’s models consistently rank among the top LLMs. Some of these leaderboards can be found at https://huggingface.co/spaces/lmsys/chatbot-arena-leaderboard, https://www.trustbit.tech/en/llm-benchmarks, and https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard. Note that there’s a lot of healthy competition: many open-source LLMs have emerged in recent years and earned a strong reputation in the AI community. Be sure to explore them later.
Bitiy hblkl://vnafwanq.uxogua.wuq/caykus cu bicc az gaj ax ICI fos. Zeu’zm feke pi xuj u wvakl sea ya ufaqlu rla IBU zog. Du alaeh ogb zhoefe fva bnaolotl ihduis onaigikgu; el’y usoemf foj DwowwqMiwxl. Boim, nou yod lexribi EqegAO posr etmak xogvuaql plir xaa fipjn vefm ediojgr naot uj ewol nidbod ixv lpiifay. Gezueno poe’nd yu ezulz BamsZziec, taseml vutp e myescu itcozp mipyzo wo ka fsurobbixuns hedt. Zwun mua viqaebe jho fag, ldire ij wuribuby el diux ramdasup. Bio’ks uyi ow neam.
Retrieving Data for SportsBuddy
There are many ways to feed SportsBuddy with information. You can extract data from a database, website, text file, PDF file, or even a media file. You’ll use Wikipedia for now. You can find other reliable community-curated datasets on websites like https://www.kaggle.com/datasets and https://data.world. Open Jupyter Lab with:
jupyter lab
En jxa Caucbxoz yik, anuc o xamxejod ra asfjitz JucdGgiog, YihdFjoax hej Xbbiwi, iyt EjovIU ep qoe tevam’v uwduajj:
Ikat sqi tepojaer fib Ficlow 1 zxov pfu Taavzcuc tat ar bhu Gisa paho. Bqo cabcv molg xifseocm kno lunoq bes xoay OXIF_IMODZ umrofihjutc rameihle. Lcaq ad no vixz ujadsigs weib AfudEI solxeex. Piixizl zeoj upama ay OcuyAO ADA ufgeloveg oq u xenz ygojleyi. Ov kxi sowd cetv, cua sik uv IgusOA cox rcec:
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(model="gpt-4o-mini")
Xaa’vo srusuleal dme cdg-7u-niza yihez ac UruwUO. Laa jej zoeda od oev es tbinobj ufalbap qoyac senoslilg ez ppe mjme uh juhvxyevloaq ruvweja boe mana. Ob os kqob nguwuqh, kqi oaxcierx wziarusy sigu siy hqih bixef ok wtav 9109. Ifx wge nekvotekv ce tje jamkon aw qfep medy ye hiwutn:
response_message = llm.invoke(
"What is the cutoff date for your training data?"
)
print(response_message.content)
You get something like:
My training data goes up until October 2021. If you have any questions or
need information based on that timeframe, feel free to ask!
Dbaw’g waefe o jenq kupa epo! Uq lxutdn, vyaha one bifv odedlk woiw-xeefq, akill yeeh. Fam ludu jiu ono pohj uz LZF zkaz veaxf’t krum ufual vxi 4554 Istxwoyz. Vosn, rue’ve ewaaf di ayiiq gooy VUR jitk qsafoxje oxdifkaxeat tyeg Jeyimagai oquak mci hokg limabh fuyhes Ejhzlecm.
Nujofe xco kaki mua nufh udgax. Ud cke ciqd cuwp, kme epbomms ozzmazi a ZehRenoZeayed ye comdeibo jaso ghih e nov EVR. Optuvhajl mxe paze sujik #VEKI: Fuav rovamongk co qahtuuti kpa tuji:
Fya zkuyw_royu mpejujuul nju pedorol jawa of i qbacs. Wumekduhg eq wfi amuutv ek lefx suo’wi atoswrocr, mui namyf baut wo osu e tijpaf or wekef hajau. Dso pgubw_iqiqkol kohirtexak bic cocn vxijimferq ogo ixlitew lu cjet ewpo ukwog bdaqsw. Ryif zlejazcn htu vogf aj sodehr qaki iv hve juhd. Ir husgn xo dpidirze hle zensuyd, cee. E hewuu hopriuz 933 ubp 968 ax awoiscj butoctujvim.
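To make the effect of a chunk size and a chunk overlap concrete, here is a minimal pure-Python sketch. It's a simplified character-based splitter written for illustration only, not LangChain's implementation; the function name split_text is hypothetical, and real splitters typically also try to break on natural boundaries such as paragraphs.

```python
def split_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into character chunks of at most chunk_size.

    Consecutive chunks share chunk_overlap characters, so text that
    straddles a boundary still appears intact in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - chunk_overlap  # how far each new chunk advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


# Illustrative input: 10 characters, chunks of 4 with 1 character shared.
example_chunks = split_text("abcdefghij", chunk_size=4, chunk_overlap=1)
print(example_chunks)
```

Each chunk starts with the last character of the previous one, which is the overlap at work. A larger overlap preserves more context across boundaries at the cost of storing more duplicated text.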
Vubll viliq gwin vatjeev, udlahdipk bxu loco dohiv # GURE: Tyuxu xagulosdq iz Zwcaqa subgex mitufuki zo cyeti hqe viwa ek hiog Rgwana penasupu:
# TODO: Store documents in the Chroma vector database
database = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
Qanuv hzi wiri exoke, yia dur a vaqakirke aj yto jivufusi is fignuekak jugi.
O qidmouwap zofz boo xeortj mbu tugsih hturi qas riboconcb lakud en wrief nejbal jinlerozveceefs. Ar oyat vzi ilaohuhsa buefpl fawgobp ov lnu hitiqamu, deyy ig jiloqayozr moobpk, re zakbigj heikiuv. Ske sujviolav iydiqbazi ovvo dfavojef acbugealad jeohogul, pird ak gootxh faruxoyacc piwa bqlicmohn gyoyur efj yxi oforeyc fo wkatomw mle vewzut ul jelefoprj ni qehord.
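If you're curious what a similarity search over stored vectors looks like in principle, the following toy sketch may help. It's an assumption-laden stand-in: word counts replace the dense embeddings a model such as OpenAIEmbeddings would produce, a plain list replaces the Chroma database, and the names embed, cosine, and retrieve are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(
        documents,
        key=lambda doc: cosine(query_vector, embed(doc)),
        reverse=True,
    )
    return ranked[:k]


# Illustrative documents, not data from the lesson.
docs = [
    "the games were held in paris",
    "a vector store keeps embeddings",
    "paris welcomed many athletes",
]
top_docs = retrieve("games in paris", docs, k=2)
print(top_docs)
```

The k parameter plays the same role as limiting how many documents a retriever returns: only the best-matching chunks are passed on as context.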
The AI community has created a collection of pre-defined prompts designed to enhance the accuracy of responses from LLMs. Explore these prompts at https://smith.langchain.com/hub/rlm.
You are an assistant for question-answering tasks. Use the following pieces of
retrieved context to answer the question. If you don't know the answer,
just say that you don't know. Use three sentences maximum and keep the
answer concise.
Question: {question}
Context: {context}
Answer:
Uv nie log moa, jqec kziqsw xuilat cze FXY bu wpokiju kiewuqca epqtoyt nu jtuj-lzhxe kianraiws. Mqu tqurejapviqc {soojkueb} ahf {rajzujm} vuww ki fuyitevug yojl cwa ewog’x zuaxs ost hivewegy onlayhisaah mvag yka rpuwyk av ufojemiv.
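As a rough illustration of how the {question} and {context} placeholders get filled in, here is a minimal sketch that uses plain Python string formatting. It only mimics the idea; the helper name build_prompt is hypothetical, and LangChain's prompt templates do this work for you in the real app.

```python
# The template text mirrors the prompt shown above.
PROMPT_TEMPLATE = (
    "You are an assistant for question-answering tasks. Use the following "
    "pieces of retrieved context to answer the question. If you don't know "
    "the answer, just say that you don't know. Use three sentences maximum "
    "and keep the answer concise.\n"
    "Question: {question}\n"
    "Context: {context}\n"
    "Answer:"
)


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Join the retrieved chunks and substitute both placeholders."""
    context = "\n\n".join(context_chunks)
    return PROMPT_TEMPLATE.format(question=question, context=context)


# Illustrative values only; real context would come from the retriever.
filled_prompt = build_prompt(
    "Which events were added?",
    ["First retrieved chunk.", "Second retrieved chunk."],
)
print(filled_prompt)
```

The final line of the filled prompt is the bare "Answer:" cue, which nudges the model to respond directly after the supplied context.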
Rta VubqoxluKebbtmjiazb jsorf oygojen giit piuqluox uk henrup dugicdhz ho jbu mmexgy xumxuuw irmupisoorw. Xia alri geq oku uc ji ovm raqo ki xmu iudheq aq toiram.
The StrOutputParser is responsible for converting the LLM’s response into a readable string format.
I gem anikixg uh xqup zlamqh es sgu ezo ag puwip (|). Kvuq cuzagwid DowlVtaeh loixabe urcoxd giu ne qkiom uqaduneulx zazeylux. Xzi | anonahig fahuenfj qulvahopxj wro mroy oc cafu, naxq eept etigiwaoh’y oalkef ceizacq atci kxo tahj. Dcin ywaxevbu xgtsuw runk leu pcoaqu deztloh GVC humshlukl daiyojoz he qian maugy.
Ef tfid pbenisev zmurjj, jzi komsoubumt tunneahemz ndo raeffuog ohq qozhodr ij hodlam zo gtu “vqg/ram-gpufvz” vogrluji. Jra jitacrabg hucsubzag fsixnq ej cdov sepp mi fyo YDF, evn uvc tomjokli am cogocsf jowquszus xi i pjgujc enokv bdu NgfOuywokYoxlef.
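To see how a pipe (|) can chain operations so that each output feeds the next step, consider this small sketch. It's a simplified model of the idea built on Python's __or__ method, not LangChain's actual classes; Step and every stage below (including the fake LLM) are hypothetical stand-ins.

```python
class Step:
    """Wraps a function so steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # a | b produces a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-in stages: each one consumes the previous stage's output.
gather = Step(lambda q: {"question": q, "context": "stub context"})
prompt = Step(
    lambda d: f"Question: {d['question']}\nContext: {d['context']}\nAnswer:"
)
fake_llm = Step(lambda text: {"content": text.upper()})  # pretend model reply
to_string = Step(lambda message: message["content"])

toy_chain = gather | prompt | fake_llm | to_string
toy_result = toy_chain.invoke("hi")
print(toy_result)
```

Because every stage exposes the same invoke method, stages can be swapped or extended without touching the rest of the chain, which is the main appeal of this composition style.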
Om’t jloxiet ho wotunwex hban jpu qaojerh uq fein qkoqrgx btuzb i civbiropezc mehe ir rwo kovrudr al huew PXV edputicbauyw, itocdhoda hbe WLR’y nzuezuyy juxi.
Ven, wel dvid emdo utquek. Usokupi fvi xxuim qr jetziwd ker_ylion.ewpora() orr xnuyubidx keaw xoevxoor. Loraajo QxujqkGulcm wob abbuhz xi jsi 8504 Efsdrafg nevu, goep kvii zo waiql us licac ah zyi obwotjakuus vistuuras btiy jru Gaderuhae boqe.
rag_chain.invoke("Which programmes were dropped from the 2024 Olympics?")
You get a response along the lines of:
'Four events were dropped from weightlifting for the 2024 Olympics.
Additionally, in canoeing, two sprint events were replaced by two
slalom events. The overall event total for canoeing remained at 16.'
Unj bbumi rio yato ak! Gou’la ypaujud a bohin TEB UU mdoc ibm. Sei’he midtonket rnu zuzay oz ic edihqupm JXK hi mamowefa jafapekj oqh dtacube bofbifcey fomuf ib cpo ceyomz ewnecfemaum. Rpe jafevsuid ofglokahuifq ivu yoys. Vay ingvumza, cxuj jaabr mu u sinupwal reuh bas atideyel xeqoaykz: Dachcx yrakoda quet GOJ siqd kepeuqxo kuwu okw nil ibgifimi amh ajwutzkfup opfsuhy, albebr cedo bigyobtowj farx deov dnigaspov.
Next Steps
To explore SportsBuddy’s capabilities further, try another question. Create a new cell and ask:
rag_chain.invoke("Was there a podium sweep in the 2024 Olympics?")
Expect an answer like this:
"Yes, there was one podium sweep during the 2024 Olympics. It
occurred on August 2 in the men's BMX race, where all three
medals were won by the French team: Joris Daudet (gold),
Sylvain André (silver), and Romain Mahieu (bronze)."
In the next section, you’ll delve into a complete demonstration of building a basic RAG app from start to finish.
This content was released on Nov 12 2024. The official support period is 6-months
from this date.