This article is part of a series that introduces Generative AI and walks step by step, with code samples, through building a GenAI application: Part-1 (GenAI Introduction), Part-2 (Building GenAI App using Spring Boot and Amazon Bedrock) and Part-3 (Building GenAI App with Proprietary data).
GenAI Evolution
GenAI (aka Generative AI) is the latest buzzword across the technology space. The buzz resembles earlier technology milestones, the most recent being the smartphone era (iPhone, Android). Smartphones revolutionized how content is consumed, accessed and interacted with (billions of users experienced the Internet only because of smartphones!). Similarly, GenAI is introducing the next phase of content creation, processing and interaction.
AI (Artificial Intelligence) is not new; it has existed as a classic Computer Science concept (aka theory) for decades. But it took a huge amount of effort and time to make it feasible (technology, infrastructure and commercial viability) and turn it into a mainstream technology.
What is GenAI?
In simple words, Generative AI is a subset of AI / ML / Deep Learning that learns from existing data to generate new content (text, audio, video etc.), much like a human creator. For example, OpenAI's ChatGPT (GPT stands for Generative Pre-trained Transformer) can generate content (blogs / stories) on almost any topic, just like human content creators.
GenAI can be used for a wide range of use cases such as content generation (blogs, stories), language translation, sentiment analysis, question answering (chatbots), content summarisation and more.
How does GenAI work?
GenAI uses the Transformer architecture (a type of neural network), which takes a text sequence as input (aka Prompt) and produces another text sequence as output (aka Result / Response). Neural networks are loosely modelled on the human brain: many interconnected nodes work together to process a query. In the GenAI context these Transformers are called Models (e.g. LLM - Large Language Model). All these Models are pre-trained on vast amounts of data so that they can answer user queries.
For example, GPT-3, the model family behind the original ChatGPT, was reportedly trained on data drawn from about 45 terabytes of text from websites, books and Wikipedia. That is why these Models know much more than we humans do!
GenAI Data Store
How can Models handle such a vast amount of data and search it so fast? The answer is the Vector Database and Semantic or Similarity Search (an approach that searches by the meaning of the query text instead of keyword matching). Traditional databases store data in tables and columns. Vector DBs store data as multi-dimensional numeric vectors. For example, RGB colors form a 3-dimensional structure where [0, 255, 0] represents green. Practical Models use 1,000+ vector dimensions; more dimensions generally capture more detail and improve accuracy.
Because a Vector Database holds numeric representations, it does not match keywords; it applies a distance calculation (Euclidean distance, Cosine similarity, Dot product) across vectors to retrieve similar or nearby results. This helps Vector Databases respond with very low latency, even as the database grows.
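The three distance calculations mentioned above can be sketched in a few lines of Python. This is only an illustration: the `COLORS` dictionary is a hypothetical stand-in for a Vector Database (reusing the RGB example), and `nearest_color` is a made-up helper that scans every entry, whereas real Vector Databases typically rely on specialised indexes rather than a full scan.

```python
import math


def dot(a, b):
    """Dot product: sum of element-wise products."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Straight-line distance between two vectors (smaller = more similar)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (1.0 = same direction)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical toy "vector database": colour names stored as RGB vectors.
COLORS = {
    "green": [0, 255, 0],
    "red": [255, 0, 0],
    "blue": [0, 0, 255],
    "yellow": [255, 255, 0],
}


def nearest_color(query):
    """Return the stored colour whose vector is closest to the query vector."""
    return min(COLORS, key=lambda name: euclidean_distance(COLORS[name], query))


# A slightly off-green query still resolves to the closest stored colour.
print(nearest_color([10, 240, 20]))
```

Note that the query never has to match a stored value exactly; the closest vector wins, which is the essence of similarity search.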
GenAI Model Architecture
A GenAI Model is a functional component that processes queries and generates responses. User queries (aka Prompts) are written in natural language. To process such a query by its real meaning and relevance (i.e. Semantic Search), the query text is first converted into vectors (a numeric representation). These query vectors are then matched against the Vector Database to produce result vectors, which are converted back to words / phrases (the final Response).
Converting a query sentence to a numeric mapping (aka Vectors) involves processes such as Tokenization (breaking the query sentence into smaller Tokens, each roughly 4-5 characters) and Vector Embedding (converting tokens to a numeric mapping, i.e. a vector).
The concept of a Token is very important, because AI provider platforms calculate usage (and thus charges) based on the number of input and output Tokens, i.e. the size of the request to be processed and the amount of response content to be generated.
Model training is a very expensive process (in both time and cost). Real-life use cases often need a Model to augment its results with proprietary records. For example, a company may want its chatbot's responses to be based on proprietary company details (sales, customer records etc.). This provides a more personalized and context-driven experience.
RAG (Retrieval Augmented Generation) architecture helps Models leverage a Knowledge Base (external proprietary records) to produce more contextual Responses, without retraining the Model.
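To make the two steps concrete, here is a deliberately naive Python sketch. Everything in it is an assumption made for illustration: `tokenize` cuts text into fixed 4-character chunks (real tokenizers use learned sub-word vocabularies), and `embed_token` derives numbers from a hash, so unlike a real embedding it carries no meaning. It only shows the shape of the pipeline: text, then tokens, then vectors.

```python
import hashlib


def tokenize(text, size=4):
    """Naive tokenizer (assumption): fixed-size character chunks."""
    text = text.lower()
    return [text[i:i + size] for i in range(0, len(text), size)]


def embed_token(token, dims=8):
    """Hypothetical embedding: a deterministic pseudo-vector from a hash.

    Real embeddings are learned by a model so that similar meanings land
    close together; this stand-in does NOT have that property.
    """
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [byte / 255 for byte in digest[:dims]]


def embed_text(text, dims=8):
    """Average the token vectors into one fixed-length vector for the text."""
    vectors = [embed_token(token, dims) for token in tokenize(text)]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


tokens = tokenize("What is GenAI?")
print(tokens)
print(len(embed_text("What is GenAI?")))
```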
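As a rough illustration of token-based billing, the arithmetic looks like the sketch below. The per-1,000-token prices are hypothetical placeholders, not the rates of any actual provider; check your platform's pricing page for real figures.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_1k, output_price_per_1k):
    """Estimate the charge for one request from its token counts."""
    input_cost = (input_tokens / 1000) * input_price_per_1k
    output_cost = (output_tokens / 1000) * output_price_per_1k
    return input_cost + output_cost


# Hypothetical prices (assumption): 0.003 per 1K input tokens,
# 0.015 per 1K output tokens, in an arbitrary currency unit.
cost = estimate_cost(input_tokens=1200, output_tokens=400,
                     input_price_per_1k=0.003, output_price_per_1k=0.015)
print(f"Estimated cost: {cost:.4f}")
```

Output tokens are often priced higher than input tokens, so a short prompt that triggers a long response can cost more than a long prompt with a short answer.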
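The RAG flow can be sketched as "retrieve, then augment the prompt". The example below is a minimal sketch under stated assumptions: `KNOWLEDGE_BASE` is a made-up list of company records, retrieval uses simple word overlap as a stand-in for vector similarity search, and the final call to a Model is left out entirely; only the augmented prompt is built.

```python
# Hypothetical proprietary records (assumption, for illustration only).
KNOWLEDGE_BASE = [
    "Order 1001 was shipped on Monday and arrives in 3 days.",
    "Our support desk is open from 9am to 5pm on weekdays.",
    "Refunds are processed within 7 business days.",
]


def words(text):
    """Lower-case the text and split it into a set of words."""
    for mark in "?.,!":
        text = text.replace(mark, " ")
    return set(text.lower().split())


def retrieve(query, documents, top_k=1):
    """Rank documents by word overlap with the query.

    Word overlap is a stand-in here; a real system would compare
    embedding vectors in a Vector Database.
    """
    query_words = words(query)
    ranked = sorted(documents,
                    key=lambda doc: len(query_words & words(doc)),
                    reverse=True)
    return ranked[:top_k]


def build_prompt(query, documents):
    """Augment the user query with the retrieved context."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


print(build_prompt("When are refunds processed?", KNOWLEDGE_BASE))
```

The Model itself is unchanged; it simply receives a prompt that already contains the relevant proprietary record.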
Importance of GPU
Models require heavy numeric calculation and parallel processing. GPUs, with their parallel processing capabilities, make heavy computation such as image processing much faster. GPUs are therefore better aligned with this requirement than CPUs (which are optimized for sequential processing). This also makes it possible to scale infrastructure at lower cost, since adding GPUs is a cost-effective and scalable way to increase capacity.
GenAI Provider Platforms
All leading cloud providers, such as AWS (Amazon Bedrock), Google (Vertex AI), Azure / OpenAI etc., offer their own fully managed GenAI platform stack. All of them provide an API-based approach for integrating with applications. It is up to the application creator to choose a platform based on their preferences (cost, existing cloud platform, integration effort etc.).
Each of these platforms includes a wide range of built-in Models from different vendors (Titan by Amazon, Claude by Anthropic, Llama by Meta). For application development, pick existing Models that suit your use case and leverage a Knowledge Base to infuse (augment) external proprietary information and generate more context-driven Responses.
Hope this gives a fair understanding of GenAI. If you are ready to explore deeper and build your first GenAI application, here is another article: GenAI App using Spring Boot and Amazon Bedrock.