LLM Latency, Throughput, and Scalability

AI, large language models    |    Intermediate
  • 16 videos | 1h 43m 28s
  • Earns a Badge
Latency, throughput, and scalability are critical factors in determining the performance of large language models (LLMs) in real-world applications. In this course, you will learn how to manage throughput and scalability in LLMs so that your models perform efficiently even under heavy workloads. Explore how to evaluate the throughput of different models to see how they cope with high-traffic situations, such as generating large amounts of content or processing vast datasets. You'll also learn about scalability, which focuses on ensuring your LLM can expand and adapt as workloads grow. Discover how to identify and address scalability challenges when deploying large models in production environments, so your LLM can handle increasing demand without slowing down or losing accuracy. By the end of this course, you will have the skills to optimize both throughput and scalability, enabling your models to excel in real-world applications.

WHAT YOU WILL LEARN

  • Discover the key concepts covered in this course
  • Define latency and its importance in real-world large language model (LLM) applications
  • Define throughput and its importance in real-world large language model (LLM) applications
  • Define scalability and its importance in real-world large language model (LLM) applications
  • Measure latency of different large language models (LLMs) in real-time processing environments
  • Perform live measurements of inference latency on small and large models for a text generation task
  • Identify the trade-offs between low latency and high accuracy in selecting LLMs for real-time applications
  • Define the throughput of different LLMs and their ability to handle high-traffic applications
  • Perform throughput analysis of LLMs processing large volumes of text in a distributed system
  • Identify the scalability challenges that arise when deploying large models in production environments
  • Evaluate an LLM based on low-latency requirements for time-sensitive applications
  • Evaluate an LLM based on low-latency requirements for balancing performance
  • Evaluate an LLM based on low-latency requirements for real-time demands
  • Explore a use case where ethical concerns are evaluated, showing how to apply fairness and compliance measures when deploying LLMs
  • Demonstrate how hyperparameters (e.g., learning rate, batch size) can influence the cost and efficiency of training an LLM
  • Summarize the key concepts covered in this course

IN THIS COURSE

  • 2m 13s
    In this video, we will discover the key concepts covered in this course. FREE ACCESS
  • 7m 37s
    In this video, learn how to define latency and its importance in real-world large language model (LLM) applications. FREE ACCESS
  • 3. Large Language Model (LLM) Throughput
    7m 24s
    Upon completion of this video, you will be able to define throughput and its importance in real-world large language model (LLM) applications. FREE ACCESS
  • 4. Large Language Model (LLM) Scalability
    7m 23s
    After completing this video, you will be able to define scalability and its importance in real-world large language model (LLM) applications. FREE ACCESS
  • 5. Measuring LLM Latency
    7m 13s
    In this video, we will measure latency of different large language models (LLMs) in real-time processing environments. FREE ACCESS
  • 6. Measuring Large Model Latency
    6m 35s
    During this video, discover how to perform live measurements of inference latency on small and large models for a text generation task. FREE ACCESS
  • 7. Selecting Large Language Models
    6m 50s
    After completing this video, you will be able to identify the trade-offs between low latency and high accuracy in selecting LLMs for real-time applications. FREE ACCESS
  • 8. Large Language Model and High Traffic
    6m 37s
    In this video, we will define the throughput of different LLMs and their ability to handle high-traffic applications. FREE ACCESS
  • 9. Evaluating Large Language Models & Distributed Text
    7m 49s
    Learn how to perform throughput analysis of LLMs processing large volumes of text in a distributed system. FREE ACCESS
  • 10. Scaling Large Language Models
    6m 24s
    Upon completion of this video, you will be able to identify the scalability challenges that arise when deploying large models in production environments. FREE ACCESS
  • 11. Large Language Models & Time Sensitive Applications
    7m 9s
    After completing this video, you will be able to evaluate an LLM based on low-latency requirements for time-sensitive applications. FREE ACCESS
  • 12. Large Language Models & Performance Balancing
    7m 19s
    In this video, we will evaluate an LLM based on low-latency requirements for balancing performance. FREE ACCESS
  • 13. Large Language Models & Real-time Demands
    6m 24s
    Upon completion of this video, you will be able to evaluate an LLM based on low-latency requirements for real-time demands. FREE ACCESS
  • 14. Ethical Considerations in LLM Deployment
    7m 17s
    In this video, we will explore a use case where ethical concerns are evaluated, showing how to apply fairness and compliance measures when deploying LLMs. FREE ACCESS
  • 15. Optimizing LLM Hyperparameters to Reduce Costs
    7m 55s
    In this video, we will demonstrate how hyperparameters (e.g., learning rate, batch size) can influence the cost and efficiency of training an LLM. FREE ACCESS
  • 16. Course Summary
    1m 21s
    In this video, we will summarize the key concepts covered in this course. FREE ACCESS

EARN A DIGITAL BADGE WHEN YOU COMPLETE THIS COURSE

Skillsoft offers you the opportunity to earn a digital badge upon successful completion of some of our courses. You can share the badge on any social network or business platform.

Digital badges are yours to keep, forever.