Asmae Elazrak for cortecs


Streamline Your Batch Jobs: The Power of LLM Workers 🤖

Have you ever felt overwhelmed by the sheer volume of data you need to process or wished you could automate repetitive tasks effortlessly?

Imagine being able to summarize hundreds of research papers in minutes, extract critical insights from vast datasets, or streamline tedious workflows.
In this article, we’ll explore how Cortecs helps you unlock the full potential of large language models (LLMs) with ease, scalability, and cost-efficiency. Specifically, we’ll focus on how Cortecs simplifies handling batch jobs and massive data workloads, guiding you through everything from environment setup to seamless data processing at scale.

Let’s dive in and see how Cortecs can transform your AI journey.

Table of Contents 📚

  • What is Cortecs?
  • Setting Up Your Environment 🛠️
  • Batch Processing with Cortecs-py 🔄

What is Cortecs?

Cortecs is a platform that gives you on-demand access to powerful LLMs running on dedicated servers. This ensures maximum performance, reliability, and scalability for your AI tasks.

Cortecs also lets you manage LLM Workers for large-scale processing: tasks are offloaded to dedicated AI workers, giving you high throughput on massive datasets ⚡.

  • Dedicated Servers for Fast AI Processing: With Cortecs, you get exclusive access to dedicated servers, meaning faster, more efficient AI processing without the competition for resources 🚀.
  • Easy to Set Up and Use: Cortecs is designed for simplicity. It integrates seamlessly with your existing workflows, so you can start using LLMs right away with minimal setup.
  • Scalable and Cost-Effective: Cortecs scales with your needs, offering dynamic resource allocation that ensures you only pay for what you use💰, keeping costs low.

Setting Up Your Environment 🛠️

Before diving into batch processing, you'll need to set up your environment. First, register at Cortecs.ai and create your access credentials on your profile page 📋.

Profile page example from Cortecs interface

Once you have your credentials, set them as environment variables in your code:

import os

# Set the Cortecs API credentials as environment variables
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"
os.environ["CORTECS_CLIENT_ID"] = "your_cortecs_client_id"
os.environ["CORTECS_CLIENT_SECRET"] = "your_cortecs_client_secret"
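
💡 Tip: instead of hardcoding secrets in your script, you can keep them in a local .env file. A minimal sketch, assuming you also install the python-dotenv package (it is not required by Cortecs itself):

# Minimal sketch: load the same three variables from a local .env file
# (assumes: pip install python-dotenv; keep .env out of version control)
from dotenv import load_dotenv

load_dotenv()  # reads OPENAI_API_KEY, CORTECS_CLIENT_ID, CORTECS_CLIENT_SECRET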

You'll also need a few Python libraries to run the example below. Install them via pip:

!pip install langchain
!pip install langchain-community
!pip install cortecs-py
!pip install arxiv
!pip install pymupdf

Batch Processing with Cortecs-py 🔄

Cortecs-py is a lightweight Python wrapper for the Cortecs REST API. It provides you with the tools to dynamically manage your AI instances directly from your workflow, making batch processing seamless and efficient.

Combined with LangChain, a versatile framework for LLM workflows, it unlocks impressive efficiency and power.

Let’s explore a real-world example of using Cortecs-py for batch processing.

Step 1: Loading Documents 📄

After adding the necessary credentials and installing the required libraries, we’ll retrieve research papers from arXiv using the ArxivLoader with the query 'reasoning'.

from langchain_community.document_loaders import ArxivLoader
from cortecs_py.client import Cortecs
from cortecs_py.integrations.langchain import DedicatedLLM

# Initialize Cortecs client
cortecs = Cortecs()

# Load up to 40 full papers matching the query from arXiv
loader = ArxivLoader(
    query="reasoning",
    load_max_docs=40,
    get_full_documents=True,      # fetch full paper text, not just abstracts
    doc_content_chars_max=25000,  # truncate each paper to bound prompt size
    load_all_available_meta=False
)
docs = loader.load()
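
Before prompting, it's worth a quick sanity check on what the loader returned. A small sketch (ArxivLoader attaches paper metadata such as 'Title' to each document):

# Quick sanity check of the loaded corpus
print(f"Loaded {len(docs)} documents")
for doc in docs[:3]:
    # each document carries arXiv metadata, e.g. 'Title' and 'Published'
    print(doc.metadata.get("Title"), "-", len(doc.page_content), "chars")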

Step 2: Creating a Prompt 💬

Then, we’ll create a simple prompt that asks the model to explain the document content in plain language.

from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("{text}\n\nExplain to me like I'm five:")
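
To see exactly what the model will receive, you can render the template on a sample input:

# Render the template with sample text to inspect the final prompt
messages = prompt.format_messages(text="Transformers use attention to weigh context.")
print(messages[0].content)
# -> Transformers use attention to weigh context.
#
#    Explain to me like I'm five: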

Step 3: Batch Processing 🏭

With Cortecs-py, batch processing is straightforward. The DedicatedLLM class makes it even easier as it automatically takes care of starting and stopping your infrastructure.

# DedicatedLLM provisions a dedicated instance on entry and shuts it down on exit
with DedicatedLLM(client=cortecs, model_name='cortecs/phi-4-FP8-Dynamic') as llm:
    chain = prompt | llm

    print("Processing data batch-wise ...")
    summaries = chain.batch([{"text": doc.page_content} for doc in docs])

    # Print each summary, separated for readability
    for summary in summaries:
        print(summary.content + '-------\n\n\n')

💡 Remark: Don't forget to choose a model that supports the required context length for your use case. In this example, we are using the phi-4-FP8-Dynamic model.
You can explore the full range of models offered by Cortecs here.
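
For larger corpora, you can also bound how many requests run in parallel. LangChain's batch method accepts a max_concurrency option through its config argument; here's a sketch (used inside the same DedicatedLLM block, with a limit you'd tune to your instance):

# Bound parallelism so a very large batch doesn't overwhelm the instance;
# max_concurrency is a standard LangChain RunnableConfig option
summaries = chain.batch(
    [{"text": doc.page_content} for doc in docs],
    config={"max_concurrency": 8},
)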

Below is an example of the batch-processing output 📊:

LLM workers output

This simple pipeline condensed 224,200 input tokens into 12,900 output tokens in just 55 seconds, over 4,000 input tokens per second, demonstrating the efficiency of batch processing with dedicated inference.

Company Model Comparison

When comparing the cost of summarization tasks on Cortecs to alternatives such as Fireworks or general cloud-based services, Cortecs stands out for its cost efficiency and predictable pricing. That makes it an appealing option for companies looking to leverage AI without breaking the bank 🏦.

Ready to transform your workflows and elevate your AI projects?

Discover how Cortecs can help you unlock the power of Large Language Models (LLMs) while maintaining cost efficiency🚀.
