This article was written mainly by the creator of the From Beginner to Advanced LLM Developer Course, which contains 50+ hours on building and deploying RAG-based LLM applications.
The operational costs of large language models (LLMs) in production environments are rising, but they don’t have to. Techniques like Retrieval-Augmented Generation (RAG) and fine-tuning can significantly reduce costs for many use cases.
In this guide, we explore the practical cost implications of these techniques through a real-world example: an AI tutor chatbot. Optimizing such a deployment requires evaluating several key factors, including performance trade-offs, resource availability, and the feasibility of continuous improvement. Our case study addresses these factors, offering actionable strategies to enhance performance, discussing the critical factors driving these improvements, and guiding you on effectively applying these insights to your use case based on our findings.
RAG vs. Fine-Tuning
As the name suggests, Retrieval-Augmented Generation (RAG) is an approach that augments LLMs with external data so that model responses are grounded in factual, up-to-date sources, reducing the risk of outdated or hallucinated content. It combines a separate retrieval system with a language model, allowing the model to access real-time, relevant information from external sources during inference.
Fine-tuning, on the other hand, involves adapting a pre-trained model to a specific task or domain. By further training the model on a targeted, smaller dataset, fine-tuning updates the model’s internal parameters, enhancing its ability to recognize subtle patterns and internalize specialized domain knowledge.
Combining these two approaches ensures that your application can deliver domain-specific accuracy and the model remains up-to-date and contextually relevant.
You don’t always need both approaches; the choice depends on the requirements of your application. So, let’s explore how to select the right approach based on your specific needs.
Choosing the Right Approach: RAG vs. Fine-Tuning
The comparison below evaluates RAG, fine-tuning, and a hybrid of the two across key factors:
Dynamic Knowledge: RAG excels at real-time updates, while fine-tuning provides static knowledge. A hybrid approach combines dynamic retrieval with domain expertise.
Style/Format Control: Fine-tuning ensures consistent style, whereas RAG relies on retrieval quality. The hybrid model integrates style with relevant context.
Upfront Costs: RAG has lower initial costs, fine-tuning is more expensive, and the hybrid approach balances both with moderate costs.
Ongoing Maintenance: RAG requires frequent updates to external data sources, fine-tuning is low maintenance, and the hybrid model needs occasional updates to both the model and retrieval system.
Choosing between RAG, fine-tuning, or a hybrid approach is just the first step—each approach comes with a series of sub-decisions that shape scalability, cost, performance, functionality, and adaptability. From retrieval strategies to model optimization, every part of the pipeline has trade-offs.
You can learn how to break down these choices and how to test what works best for your use case in our From Beginner to Advanced LLM Developer course.
Case Study: AI Tutor System
Having explored the factors influencing the decision to use RAG and fine-tuning, let’s apply them to developing an AI Tutor. This chat-based system answers domain-specific questions related to AI and machine learning, utilizing a RAG approach to retrieve real-time data and ensure contextually relevant responses. For example, if you ask the AI Tutor to generate a sentence on LLM optimization, it will retrieve relevant, up-to-date contextual data through its RAG mechanism, integrate intricate technical details about optimization techniques, and produce a clear, engaging summary that highlights both the benefits and the impact of LLM optimization.
We integrate fine-tuning as a cost-effective enhancement within the RAG framework to optimize costs and improve profit margins while maintaining quality. By fine-tuning GPT-4o Mini using OpenAI’s managed fine-tuning, we optimize it for AI-related queries without the need for significant infrastructure or expertise. OpenAI’s solution eliminates the complexities of conventional fine-tuning, which would otherwise require extensive ML pipeline management.
This approach provides a streamlined, cost-efficient solution with minimal infrastructure overhead, enabling us to focus on improving response quality. By combining RAG with OpenAI’s fine-tuning, we achieve state-of-the-art performance at a fraction of the cost of scaling up to a full GPT-4o model.
Note: We have found this balance of optimal tools, techniques, models, and solutions after a lot of trial and error, and we are only presenting the most relevant techniques that balance cost vs. performance trade-offs.
Cost vs. Performance Trade-off
To illustrate the cost benefits of our approach, let’s compare different model configurations based on cost and performance:
Consider an AI Tutor handling 10,000 queries per day, with an average of 2,000 tokens per query:
GPT-4o → $1,500/month
Fine-tuned GPT-4o Mini → $180/month
This results in $1,320 monthly savings, demonstrating that fine-tuning within a RAG framework is a scalable and cost-efficient solution.
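As a rough sanity check on these figures, the arithmetic is easy to reproduce. The per-million-token prices below are assumptions based on OpenAI's published input-token pricing at the time of writing and ignore output tokens; plug in your own rates and traffic:

# Rough monthly cost estimate (assumed input-token prices; adjust to your actual rates)
QUERIES_PER_DAY = 10_000
TOKENS_PER_QUERY = 2_000
DAYS_PER_MONTH = 30

monthly_tokens = QUERIES_PER_DAY * TOKENS_PER_QUERY * DAYS_PER_MONTH  # 600M tokens

PRICE_PER_M_TOKENS = {
    "gpt-4o": 2.50,                  # assumed $ per 1M input tokens
    "fine-tuned gpt-4o-mini": 0.30,  # assumed $ per 1M input tokens
}

for model, price in PRICE_PER_M_TOKENS.items():
    cost = monthly_tokens / 1_000_000 * price
    print(f"{model}: ~${cost:,.0f}/month")
# gpt-4o: ~$1,500/month; fine-tuned gpt-4o-mini: ~$180/month, i.e. ~$1,320 saved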
For this case study, we use Google Colab, so you can easily follow along without relying on a specific machine.
We will begin by setting up the RAG pipeline and then fine-tune GPT-4o-mini using OpenAI. Finally, we will compare the results with GPT-4o, using o3-mini as a judge LLM to evaluate the impact of fine-tuning.
Setup and Environment
Required Libraries
Access the complete implementation in the Colab Notebook.
For this AI tutor, we use LlamaIndex, ChromaDB, and the OpenAI SDK for calling the LLMs.
Let’s start by installing the required libraries by running the following command:
!pip install -q llama-index==0.10.57 openai==1.59.8 chromadb==0.5.5 pydantic==2.10.5 jsonlines==4.0.0
We use ChromaDB because it’s open-source and easy to deploy both locally and in the cloud. It simplifies multimodal search for text, image, video, and audio, and it supports embedding storage with metadata and efficient retrieval, making it ideal for LLM applications. ChromaDB is optimized for index querying and for integrating custom data with LLMs, so retrieved contexts can be passed to a model for answer generation. We also use LlamaIndex, an open-source data orchestration framework, for embedding generation and retrieval from ChromaDB.
Setup API Keys
Next, we set the OpenAI API key in the environment, which is required to use OpenAI embedding models and LLMs.
import os
# Set the following API Keys in the Python environment. Will be used later.
os.environ["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>"
RAG
For RAG, the first step is configuring the embedding model. We use text-embedding-3-small, one of OpenAI's latest embedding models. We will also pre-populate ChromaDB with a vector store dump from Hugging Face (explained in more detail later) so we don't have to embed and store the data from scratch.
Configuring the Embedding Model
We use LlamaIndex to generate OpenAI embeddings. Configure the model by setting text-embedding-3-small as the model string. We set the embedding model in the Settings object; this way it becomes the global default throughout the application, simplifying configuration by reducing the need to specify the model in multiple places.
# Embedding Model
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core import Settings

# Configure the embedding model as the global default
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
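As a quick, optional sanity check, you can embed a short string with the globally configured model; text-embedding-3-small returns 1536-dimensional vectors:

# Sanity check: embed a short string with the globally configured model
sample_embedding = Settings.embed_model.get_text_embedding("What is retrieval-augmented generation?")
print(len(sample_embedding))  # 1536 dimensions for text-embedding-3-small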
Initialize the Vector Store
We initialize ChromaDB with a pre-built vector store specifically designed for AI tutor applications with RAG in mind. It is available on the Hugging Face Hub as jaiganesan/ai_tutor_knowledge.
It integrates framework documentation (LlamaIndex, Hugging Face Transformers, PEFT, and TRL), Towards AI blog posts for expert insights, and ArXiv research papers focused on RAG and AI techniques. Additionally, it features pre-processed embeddings for quick setup and smart chunking, splitting data into 800-token segments with tagging to preserve contextual integrity.
What sets this store apart is its structured metadata (title, URL, source, and content type) that enables precise filtering, along with targeted filtering to focus on key AI terms like “transformer,” “RAG,” and “fine-tuning.” Unlike many alternatives, it sources data directly from open-source GitHub repositories and ArXiv papers, ensuring high-quality, AI-specific knowledge.
To set it up, first download the vector store. The dataset is stored on Hugging Face as a JSON Lines (.jsonl) file containing structured entries (e.g., name, content, url, and source). It is processed into smaller chunks with metadata, embedded, stored in a Chroma collection, and exported as vectorstore.zip. This ZIP file contains the Chroma collection, including all chunks, metadata, and embeddings. The dataset is then uploaded to Hugging Face using Git.
from huggingface_hub import hf_hub_download

vectorstore = hf_hub_download(repo_id="jaiganesan/ai_tutor_knowledge", filename="vectorstore.zip", repo_type="dataset", local_dir="/content")
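Before ChromaDB can open the collection, the downloaded archive needs to be extracted. A minimal sketch, assuming vectorstore.zip unpacks into the ai_tutor_knowledge directory used in the next step (check the Colab notebook if your paths differ):

import zipfile

# Extract the persisted Chroma collection into the Colab working directory
# (assumes the zip contains the "ai_tutor_knowledge" persistence directory)
with zipfile.ZipFile("/content/vectorstore.zip", "r") as zf:
    zf.extractall("/content")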
Once the archive is extracted, initialize ChromaDB and load the ai_tutor_knowledge collection.
import chromadb
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.core import VectorStoreIndex
# Connect to the persisted ChromaDB collection
db = chromadb.PersistentClient(path="./ai_tutor_knowledge")
chroma_collection = db.get_or_create_collection("ai_tutor_knowledge")
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)

# Create the index on top of the existing vector store
vector_index = VectorStoreIndex.from_vector_store(vector_store)
Now, this information is ready to be retrieved by semantic search over the embedding space. That means whenever a new query comes in, we can embed it using the OpenAI embedding model configured earlier and search the ai_tutor_knowledge collection for similar chunks, which are then fed to an LLM to produce a response grounded in the retrieved data.
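For illustration, here is what the bare retrieval step looks like with LlamaIndex, before any LLM is involved (the query string is just an example):

# Retrieve the top-5 most similar chunks for a query (no LLM call yet)
retriever = vector_index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("How does retrieval-augmented generation work?")
for n in nodes:
    # Each result carries a similarity score, the chunk's metadata, and its text
    print(n.score, n.node.metadata.get("title"), n.node.get_content()[:120])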
Testing RAG
Now, let’s evaluate GPT-4o and GPT-4o-mini in our RAG setup to see how well each performs.
GPT-4o
First, we use the OpenAI integration from LlamaIndex to invoke the model. Here, we use gpt-4o and retrieve similar chunks from ChromaDB (with top_k = 5). This returns a query engine, which can later be used to generate a response from the model along with the similar chunks returned by ChromaDB.
from llama_index.llms.openai import OpenAI
# GPT-4o
llm_gpt_4o = OpenAI(temperature=1, model="gpt-4o")
# Query Engine
query_engine_0 = vector_index.as_query_engine(llm=llm_gpt_4o, similarity_top_k=5)
response_gpt_4o = query_engine_0.query("Compare the knowledge retention abilities of a RAG model versus a BERT-based model that has been extensively fine-tuned using PEFT techniques. How do their outputs differ when the knowledge source is removed?")
response_gpt_4o.response
Below is the output generated by GPT-4o:
The knowledge retention abilities of a RAG model and a BERT-based model fine-tuned with
PEFT techniques differ primarily in their reliance on external versus internalized knowledge.
A RAG model is designed to combine retrieval and generation, using a dense vector index for
accessing knowledge dynamically. This model excels in generating outputs that are specific
and diverse, particularly in knowledge-intensive tasks, because it can access real-time
information from the index, effectively "augmenting" its responses with external data.
On the other hand, a BERT-based model fine-tuned using PEFT techniques relies more on the
knowledge embedded in its parameters. When fine-tuned extensively, it can perform well on
specific tasks using the knowledge it has learned during training. However, its ability to
adapt to new or unseen data without external information is limited compared to the RAG model.
In scenarios where the knowledge source is removed, a RAG model would likely struggle more
since its effectiveness hinges on accessing external data. Conversely, the BERT-based model
would revert to generating responses based solely on its learned internal parameters, which
may lead to less specific or outdated outputs if the task requires current or detailed knowledge.
GPT-4o-mini
Next, we do the same using GPT-4o-mini:
from llama_index.llms.openai import OpenAI
# GPT-4o-mini
llm_gpt_4o_mini = OpenAI(temperature=1, model="gpt-4o-mini")
# Query Engine
query_engine_1 = vector_index.as_query_engine(llm=llm_gpt_4o_mini, similarity_top_k=5)
response_gpt_4o_mini = query_engine_1.query("Compare the knowledge retention abilities of a RAG model versus a BERT-based model that has been extensively fine-tuned using PEFT techniques. How do their outputs differ when the knowledge source is removed?")
response_gpt_4o_mini.response
Below is the output generated by GPT-4o-mini:
The knowledge retention abilities of a RAG model and a BERT-based model fine-tuned with
parameter-efficient fine-tuning (PEFT) techniques differ significantly, particularly
when the knowledge source is removed.
RAG models integrate a retrieval mechanism that utilizes non-parametric memory, such as
a dense vector index of external knowledge (e.g., Wikipedia), to enhance their outputs.
When the knowledge source is available, RAG models retrieve relevant information to
generate responses that are more specific, diverse, and factually accurate, which is
particularly beneficial for knowledge-intensive tasks.
In contrast, a BERT-based model, even one that has been well fine-tuned with PEFT
techniques, primarily relies on its internal parameters for knowledge. Without an
external knowledge source, it may struggle to provide accurate or contextually rich
responses, as it doesn't have the same direct access to external information that RAG
models utilize for generation. The outputs from the BERT model in the absence of
additional knowledge sources often lack the specificity and factual grounding present
in the RAG model's outputs, making them potentially less informative and more generic.
When comparing the two outputs generated by the query engine, GPT-4o provided a clearer and more accurate explanation of the query. It emphasized how RAG models rely heavily on external sources, while BERT-based models store knowledge within their parameters. GPT-4o-mini, on the other hand, was less precise and suggested that BERT-based models struggle more without external knowledge, which isn’t entirely accurate.
Since GPT-4o-mini is a smaller model compared to GPT-4o, this is to be expected. However, GPT-4o-mini also comes at a fraction of the cost of GPT-4o. What’s interesting is that for specific use cases, like an AI Tutor, we can fine-tune GPT-4o-mini to enhance its performance while keeping costs low. Using GPT-4o responses as a benchmark, we can fine-tune GPT-4o-mini to better align with its tone, style, and depth.
Let’s see how we can achieve that.
Fine-Tuning
Fine-tuning with OpenAI involves customizing pre-trained language models, such as GPT-4o-mini, by training them on carefully curated, domain-specific datasets. This process helps the model better understand context and enhances its ability to provide accurate and relevant responses. In our case, to fine-tune GPT-4o-mini, we created 100 Q&A pairs for training and 30 for evaluation, to mimic GPT-4o’s response style. The training data was carefully selected to cover a range of AI-related topics, ensuring that it included examples of in-depth, factual explanations to guide the model’s learning. This targeted data helps the model adopt a clearer and more precise writing style, allowing it to communicate complex AI concepts more effectively.
This data was formatted to fit the specific structure needed for fine-tuning. The training and evaluation dataset was initially sourced from various places like blogs, technical documentation, and papers, and then formatted into a conversational structure suitable for fine-tuning GPT models. It is explained in detail below:
Dataset Preparation
Download Raw Data
We have open-sourced the raw dataset on the Hugging Face Hub. First, we download it:
from huggingface_hub import hf_hub_download
file_path = hf_hub_download(
    repo_id="jaiganesan/GPT_4o_mini_Fine_tune",
    filename="question_answers_data_100.jsonl",
    repo_type="dataset",
    local_dir="/content"
)
Once downloaded, we will transform it to the desired format.
Data Transformation
OpenAI requires data in JSON Lines (.jsonl) format. Each line in the file should contain a JSON object with a “messages” key, which includes an array of message objects. Each message object has a “role” (system, user, or assistant) and “content”. A sample line looks like this:
{"messages": [{"role": "system", "content": "You are a teaching assistant for Machine Learning."}, {"role": "user", "content": "What is machine learning?"}, {"role": "assistant", "content": "Machine learning is..."}]}
We use the following script to transform the original data into the desired format:
from huggingface_hub import hf_hub_download
import os
import json
import jsonlines
from pprint import pprint
import tiktoken
from collections import defaultdict

format_errors = defaultdict(int)

def dataset_preparation(file_name):
    # Download the raw Q&A file from the Hugging Face Hub
    file_path = hf_hub_download(
        repo_id="jaiganesan/GPT_4o_mini_Fine_tune",
        filename=file_name,
        repo_type="dataset",
        local_dir="/content"
    )

    with open(file_path, "r") as file:
        data = [json.loads(line) for line in file]

    print("Total entries in the dataset:", len(data))
    print("-_" * 30)
    print(data[4])

    # Convert each Q&A pair into OpenAI's chat fine-tuning format
    output_data = []
    for entry in data:
        formatted_entry = {
            "messages": [
                {"role": "system", "content": "As AI Tutor, answer questions related to AI topics in an in-depth and factual manner."},
                {"role": "user", "content": entry['question']},
                {"role": "assistant", "content": entry['answer']}
            ]
        }
        output_data.append(formatted_entry)

    # Validate and analyze the output data
    validate_dataset(output_data)
    counting_no_tokens(output_data)
    print("-_" * 30)
    print(output_data[4])

    # Save the formatted dataset as JSONL
    base_file_name = os.path.splitext(file_name)[0]
    output_file_path = f'formatted_{base_file_name}.jsonl'
    with jsonlines.open(output_file_path, mode='w') as writer:
        writer.write_all(output_data)

    print(f"\nFormatted dataset has been saved to {output_file_path}.")
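Once the validation and token-counting helpers shown in the next sections are defined, formatting the raw training file is a single call:

# Produces formatted_question_answers_data_100.jsonl in the working directory
dataset_preparation("question_answers_data_100.jsonl")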
Data Validation
Next, we need to upload the data to OpenAI before we begin fine-tuning. A crucial step before doing that is to validate the data we have just formatted. We use the validate_dataset function for this purpose.
The validate_dataset function checks whether the dataset meets OpenAI’s formatting requirements for fine-tuning LLMs like GPT-4o-mini. It ensures that each example is a dictionary with a non-empty messages field, that every message has the required role and content keys and no unrecognized keys, and that each role is one of the allowed values. It also confirms that at least one message per example has the “assistant” role, which is vital for fine-tuning.
def validate_dataset(output_data):
    for ex in output_data:
        if not isinstance(ex, dict):
            format_errors["data_type"] += 1
            continue

        messages = ex.get("messages", None)
        if not messages:
            format_errors["missing_messages_list"] += 1
            continue

        for message in messages:
            if "role" not in message or "content" not in message:
                format_errors["message_missing_key"] += 1

            if any(k not in ("role", "content", "name", "function_call", "weight") for k in message):
                format_errors["message_unrecognized_key"] += 1

            if message.get("role", None) not in ("system", "user", "assistant", "function"):
                format_errors["unrecognized_role"] += 1

            content = message.get("content", None)
            function_call = message.get("function_call", None)
            if (not content and not function_call) or not isinstance(content, str):
                format_errors["missing_content"] += 1

        if not any(message.get("role", None) == "assistant" for message in messages):
            format_errors["example_missing_assistant_message"] += 1

    if format_errors:
        print("Found errors:")
        for k, v in format_errors.items():
            print(f"{k}: {v}")
    else:
        print("\nNo errors found in the formatted dataset\n")
Token Counting
We use tiktoken to calculate the number of tokens in our data. Since OpenAI charges by the number of tokens, this helps us estimate costs upfront. More details on fine-tuning costs are available on OpenAI's pricing page.
def counting_no_tokens(output_data):
    tokenizer = tiktoken.encoding_for_model("gpt-4o-mini")
    # Sum the tokens of every message's content across all formatted examples
    total_tokens = sum(len(tokenizer.encode(m["content"])) for entry in output_data for m in entry["messages"])
    print(f"Total tokens: {total_tokens}")
Upload Dataset to OpenAI
We initialize the OpenAI client and upload the training data.
from openai import OpenAI
import os
# Initialize OpenAI client with API key
client = OpenAI(api_key="your-api-key-here") # Replace with your API key
# Upload training file
fine_tune_file = client.files.create(
    file=open("formatted_question_answers_data_100.jsonl", "rb"),
    purpose="fine-tune"
)
Once uploaded, we use the helper function check_file_status to ensure the file has been successfully uploaded and processed. OpenAI runs its own validation and pre-processing steps before marking it as processed.
def check_file_status(file_id):
    file_info = client.files.retrieve(file_id)
    print(f"File status: {file_info.status}")
    return file_info.status == "processed"
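A simple way to use this helper is to poll until the file is marked as processed; a short sketch (small files usually only take a few seconds):

import time

# Poll until OpenAI finishes validating and processing the uploaded file
while not check_file_status(fine_tune_file.id):
    time.sleep(5)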
Create fine-tuning job
Once the file is processed, we initiate the fine-tuning job with specified hyperparameters:
# Start fine-tuning
result_job = client.fine_tuning.jobs.create(
    training_file=fine_tune_file.id,
    model="gpt-4o-mini-2024-07-18",
    hyperparameters={
        "n_epochs": 2,
        "batch_size": 1,
        "learning_rate_multiplier": 0.8
    }
)
This code initiates the fine-tuning process with specific hyperparameters:
n_epochs: Number of training epochs (2 in this case). An epoch is one full pass of the entire training dataset through the model, so two epochs mean the model sees the whole dataset twice. We chose two epochs to balance overfitting risk against compute costs.
batch_size: Number of training examples processed together (1 in this case). A smaller batch size, such as 1, processes one example at a time, which can be useful when working with limited hardware resources but may result in longer training times.
learning_rate_multiplier: Scales the base learning rate, controlling how quickly or slowly the model adjusts its parameters during training. A multiplier of 0.8 means the effective learning rate is 80% of the default (a 20% reduction), which trades slightly slower convergence for more stable training.
The following code monitors the fine-tuning job’s status by polling it every 60 seconds. Once the job succeeds, OpenAI hosts the fine-tuned model and returns a model ID that we can call serverlessly; we are only charged for the tokens consumed.
import time

while True:
    status = client.fine_tuning.jobs.retrieve(result_job.id)
    print(f"Status: {status.status}")
    if status.status in ["succeeded", "failed", "cancelled"]:
        if status.status == "succeeded":
            print(f"Fine-tuned model ID: {status.fine_tuned_model}")
        break
    time.sleep(60)
Testing Fine-tuned Model
To test the fine-tuned GPT-4o-mini, we use the same RAG vector index:
# Fine Tuned Model
from llama_index.llms.openai import OpenAI
llm_gpt_fine_tuned_model = OpenAI(temperature=1, model="id of your fine-tuned model")
# Query Engine
query_engine_2 = vector_index.as_query_engine(llm=llm_gpt_fine_tuned_model, similarity_top_k=5)
response_fine_tuned_model = query_engine_2.query("Compare the knowledge retention abilities of a RAG model versus a BERT-based model that has been extensively fine-tuned using PEFT techniques. How do their outputs differ when the knowledge source is removed?")
#Response
response_fine_tuned_model.response
This setup creates a query engine with the fine-tuned model, controlling randomness via a temperature of 1 and using top_k=5 for the similarity search.
Fine-tuned GPT-4o-mini Response:
The performance of a RAG model in terms of knowledge retention abilities shows a significant
improvement over a BERT-based model that is extensively fine-tuned using PEFT techniques,
especially in knowledge-intensive tasks. RAG models are designed to integrate both parametric
and non-parametric memory components, utilizing a dense vector index of external knowledge
(like Wikipedia) and enhancing their outputs with this retrieved information. This feature
allows RAG models to accurately recall and generate specific, diverse, and factual content,
resulting in superior performance on tasks requiring up-to-date knowledge.
BERT-based models, even when fine-tuned with PEFT techniques, tend to retain knowledge
primarily through their model parameters but cannot dynamically access external knowledge
bases. This lack of an external knowledge link means that BERT models face limitations in
accessing real-time or extensive domain knowledge, leading to outdated or generalized
generative outputs when the knowledge source is removed.
When the knowledge source is removed, outputs from a RAG model that relies on external
knowledge will likely be less factually accurate and decrease in diversity and specificity.
In contrast, a BERT-based model may still produce coherent text but its content quality is
significantly affected, leading to generic responses lacking in recent or domain-specific
insights. Thus, while BERT models can generate decent outputs from their training data, RAG
models excel because they combine stored knowledge with timely data, and without access to
this external source, their output will lack the necessary detail and accuracy expected in
knowledge-intensive contexts.
We can see that the fine-tuned RAG system compares RAG and BERT-based models fine-tuned with PEFT, focusing on how each model’s knowledge retention is affected without access to external data. Its response is much closer to GPT-4o’s in clarity and depth, at a fraction of the cost.
Model Response Comparison
To test the efficiency of our approach, we used o3-mini as a judge LLM to rate the responses from GPT-4o, GPT-4o-mini, and the fine-tuned GPT-4o-mini on two AI-related questions. While this method provided useful insights, more sophisticated methods should be employed in the future: automated metrics such as ROUGE, BLEU, and BERTScore can quantify similarity to expert-written answers, human evaluation offers more nuanced assessments, and testing on standardized AI/ML exam questions can further gauge knowledge transfer and learning outcomes.
The two AI-related questions were:
Knowledge retention differences between RAG and BERT-based models when external sources are removed.
Comparative analysis of parameter-efficient fine-tuning techniques.
The goal is to determine whether fine-tuning brings GPT-4o-mini’s performance closer to that of the larger, more powerful GPT-4o.
The evaluation focused on five key dimensions: Accuracy, Completeness, Clarity, Depth, and Overall Quality; each rated on a 1-10 scale. We then performed a comparative analysis to highlight the strengths and weaknesses of each model.
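For reference, below is a minimal sketch of what such an LLM-as-judge call can look like, reusing the OpenAI client initialized in the fine-tuning section. The rubric wording and output format here are illustrative assumptions, not the exact prompt used in this evaluation.

# Illustrative judge call: ask o3-mini to score one answer on the five dimensions
# (prompt wording is an assumption; adapt the rubric and parsing to your needs)
question = (
    "Compare the knowledge retention abilities of a RAG model versus a BERT-based model "
    "that has been extensively fine-tuned using PEFT techniques. How do their outputs "
    "differ when the knowledge source is removed?"
)
judge_prompt = (
    "You are grading an AI tutor's answer on a 1-10 scale.\n"
    f"Question: {question}\n"
    f"Answer: {response_fine_tuned_model.response}\n"
    "Rate the answer on Accuracy, Completeness, Clarity, Depth, and Overall Quality. "
    "Return one line per dimension in the form 'Dimension: score'."
)
judge_reply = client.chat.completions.create(
    model="o3-mini",
    messages=[{"role": "user", "content": judge_prompt}],
)
print(judge_reply.choices[0].message.content)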
Key Findings
Below are the detailed evaluation scores comparing the three model variants:
Both GPT-4o and the fine-tuned GPT-4o-mini achieved identical overall scores (8.0/10) and identical accuracy. The fine-tuned model outperformed GPT-4o in completeness (8.0 vs. 7.5), while GPT-4o maintained an advantage in depth (8.0 vs. 7.0). Both models scored the same in clarity (8.5).
These results indicate that fine-tuning improved the performance of the smaller GPT-4o-mini, bringing it closer to GPT-4o’s capabilities in several areas. There is still room for improvement on the depth criterion, suggesting that while fine-tuning is beneficial, further refinements are possible. Overall, targeted fine-tuning remains a cost-effective approach for deploying high-quality AI models in specialized domains.
Conclusion
Fine-tuning a language model for a specific task improves quality, consistency, and cost-efficiency but requires careful dataset preparation and tuning. RAG enhances this by adding real-time knowledge retrieval, making it ideal for tasks that need up-to-date information without retraining.
Combining fine-tuning with RAG allows cheaper models like GPT-4o Mini to match the performance of more expensive models, offering a scalable and cost-effective solution.
In a production environment, RAG requires a complex deployment strategy, and finding the right balance takes a lot of trial, error, and testing. To shorten your learning curve, we have created an extremely in-depth course that guides you through each stage of deploying a RAG application in a scalable production environment.
The From Beginner to Advanced LLM Developer Course teaches, through 50+ hours of content, everything we learned while deploying a RAG-based LLM application, so you can follow and implement a pre-optimized path.
For readers of Decoding ML, we recommend their course and offer a 15% discount with code Paul_15 (valid for all courses from the Towards AI Academy).
Images
If not otherwise stated, all images are created by the author.