LLM Latency, Throughput, and Scalability
AI, large language models | Intermediate
- 16 videos | 1h 43m 28s
- Earns a Badge
Latency, throughput, and scalability are critical factors in determining the performance of large language models (LLMs) in real-world applications. In this course, you will learn how to manage latency, throughput, and scalability in LLMs, key concepts that ensure your models perform efficiently even under heavy workloads. You will explore how to evaluate the throughput of different models to see how they cope with high-traffic situations, such as generating large amounts of content or processing vast datasets. You will also learn about scalability, which focuses on ensuring your LLM can expand and adapt as workloads grow, and discover how to identify and address scalability challenges when deploying large models in production environments, so your LLM can handle increasing demands without slowing down or losing accuracy. By the end of this course, you will have the skills to optimize latency, throughput, and scalability, enabling your models to excel in real-world applications.
WHAT YOU WILL LEARN
- Discover the key concepts covered in this course
- Define latency and its importance in real-world large language model (LLM) applications
- Define throughput and its importance in real-world LLM applications
- Define scalability and its importance in real-world LLM applications
- Measure latency of different LLMs in real-time processing environments
- Perform live measurements of inference latency on small and large models for a text generation task
- Identify the trade-offs between low latency and high accuracy in selecting LLMs for real-time applications
- Define the throughput of different LLMs and their ability to handle high-traffic applications
- Perform throughput analysis of LLMs processing large volumes of text in a distributed system
- Identify the scalability challenges that arise when deploying large models in production environments
- Evaluate an LLM based on low-latency requirements for time-sensitive applications
- Evaluate an LLM based on low-latency requirements for balancing performance
- Evaluate an LLM based on low-latency requirements for real-time demands
- Explore a use case where ethical concerns are evaluated, showing how to apply fairness and compliance measures when deploying LLMs
- Demonstrate how hyperparameters (e.g., learning rate, batch size) can influence the cost and efficiency of training an LLM
- Summarize the key concepts covered in this course
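The objectives above include measuring inference latency. As a rough illustration of the kind of measurement the course covers, here is a minimal, stdlib-only sketch: `fake_llm_generate` is a hypothetical stand-in for a real model call (it simply sleeps), and `measure_latency` times repeated calls with a warm-up pass, which is a common practice since first calls often pay one-time costs.

```python
import time
import statistics

def fake_llm_generate(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call (local inference or an API)."""
    time.sleep(0.01)  # simulate ~10 ms of inference work
    return prompt + " ..."

def measure_latency(generate, prompt, runs=5, warmup=1):
    """Time repeated calls and report simple latency statistics."""
    for _ in range(warmup):              # warm-up calls excluded (caches, lazy init)
        generate(prompt)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()      # monotonic, high-resolution timer
        generate(prompt)
        samples.append(time.perf_counter() - start)
    return {"mean_s": statistics.mean(samples), "max_s": max(samples)}

stats = measure_latency(fake_llm_generate, "Hello, world")
print(stats)
```

In practice you would swap the stub for your actual model or API client and report percentiles (p50/p95) over many runs, since tail latency usually matters more than the mean for real-time applications.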
IN THIS COURSE
- 2m 13s | In this video, we will discover the key concepts covered in this course.
- 7m 37s | In this video, learn how to define latency and its importance in real-world large language model (LLM) applications.
- 7m 24s | Upon completion of this video, you will be able to define throughput and its importance in real-world LLM applications.
- 7m 23s | After completing this video, you will be able to define scalability and its importance in real-world LLM applications.
- 7m 13s | In this video, we will measure latency of different LLMs in real-time processing environments.
- 6m 35s | During this video, discover how to perform live measurements of inference latency on small and large models for a text generation task.
- 6m 50s | After completing this video, you will be able to identify the trade-offs between low latency and high accuracy in selecting LLMs for real-time applications.
- 6m 37s | In this video, we will define the throughput of different LLMs and their ability to handle high-traffic applications.
- 7m 49s | Learn how to perform throughput analysis of LLMs processing large volumes of text in a distributed system.
- 6m 24s | Upon completion of this video, you will be able to identify the scalability challenges that arise when deploying large models in production environments.
- 7m 9s | After completing this video, you will be able to evaluate an LLM based on low-latency requirements for time-sensitive applications.
- 7m 19s | In this video, we will evaluate an LLM based on low-latency requirements for balancing performance.
- 6m 24s | Upon completion of this video, you will be able to evaluate an LLM based on low-latency requirements for real-time demands.
- 7m 17s | In this video, we will explore a use case where ethical concerns are evaluated, showing how to apply fairness and compliance measures when deploying LLMs.
- 7m 55s | In this video, we will demonstrate how hyperparameters (e.g., learning rate, batch size) can influence the cost and efficiency of training an LLM.
- 1m 21s | In this video, we will summarize the key concepts covered in this course.
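The course also covers throughput analysis of LLMs handling many requests. As a rough sketch of the underlying idea, the stdlib-only example below measures requests per second for a hypothetical model function (again a sleeping stub) under concurrent load, the simplest way to see how parallelism raises throughput when each request spends most of its time waiting on inference.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_llm_generate(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call."""
    time.sleep(0.02)  # simulate ~20 ms of per-request inference time
    return prompt.upper()

prompts = [f"request {i}" for i in range(20)]

start = time.perf_counter()
# Four workers process requests concurrently; a real deployment would size
# this against the serving backend's batching and GPU capacity.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(fake_llm_generate, prompts))
elapsed = time.perf_counter() - start

throughput = len(prompts) / elapsed
print(f"{len(results)} requests in {elapsed:.2f}s -> {throughput:.1f} req/s")
```

With four workers the 20 simulated requests complete in roughly a quarter of the serial time, illustrating the throughput gains the course's distributed-system analysis explores at larger scale.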
EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE
Skillsoft is providing you with the opportunity to earn a digital badge upon successful completion of some of our courses, which can be shared on any social network or business platform.
Digital badges are yours to keep, forever.