Leapcell

Python Performance Tips You Must Know


Comprehensive Guide to Python Code Performance Optimization

Python, as a dynamically typed interpreted language, may indeed have a slower execution speed compared to statically typed compiled languages like C. However, through certain techniques and strategies, we can significantly enhance the performance of Python code.

This article will explore how to optimize Python code to make it run faster and more efficiently. We will utilize Python's timeit module to accurately measure the execution time of the code.

Note: by default, timeit's Timer.timeit() executes the statement 1,000,000 times (number=1000000), which keeps the measurement stable and representative.

import timeit

def print_hi(name):
    print(f'Hi, {name}')

if __name__ == '__main__':
    # Time one million calls of print_hi('leapcell')
    t = timeit.Timer(setup='from __main__ import print_hi', stmt='print_hi("leapcell")')
    t.timeit()

How to Calculate the Running Time of a Python Script

In the time module, time.perf_counter() provides a high-precision timer suitable for measuring short time intervals. For example:

import time

# Record the start time of the program
start_time = time.perf_counter()

# Your code logic
#...

# Record the end time of the program
end_time = time.perf_counter()

# Calculate the running time of the program
run_time = end_time - start_time

print(f"Program running time: {run_time} seconds")

I. I/O-Intensive Operations

I/O-intensive operations are programs or tasks that spend most of their execution time waiting for input/output to complete. I/O operations include reading data from disk, writing data to disk, network communication, and so on. They usually involve hardware devices, so their speed is limited by hardware performance and I/O bandwidth.

Their characteristics are as follows:

  1. Waiting Time: When a program executes an I/O operation, it often needs to wait for data to be transferred from an external device to memory or from memory to an external device, which may cause the program's execution to be blocked.
  2. CPU Utilization Efficiency: Due to the waiting time of I/O operations, the CPU may be idle during this period, resulting in low CPU utilization.
  3. Performance Bottleneck: The speed of I/O operations often becomes the bottleneck of program performance, especially when the data volume is large or the transmission speed is slow.

For example, take print, an I/O-bound operation, and run it one million times:

import time
import timeit

def print_hi(name):
    print(f'Hi, {name}')
    return

if __name__ == '__main__':
    start_time = time.perf_counter()
    # Execute the print_hi('leapcell') method
    t = timeit.Timer(setup='from __main__ import print_hi', stmt='print_hi("leapcell")')
    t.timeit()
    end_time = time.perf_counter()
    run_time = end_time - start_time
    print(f"Program running time: {run_time} seconds")

The run takes about 3 seconds.

When the same method does no I/O at all, that is, when print_hi('xxxx') simply returns and the print() call is commented out, the program is significantly faster:

def print_hi(name):
    # print(f'Hi, {name}')
    return

Optimization Methods for I/O-Intensive Operations

When I/O is unavoidable, such as file reading and writing, the following methods can improve efficiency:

  1. Asynchronous I/O: Use an asynchronous programming model such as asyncio, which lets the program keep executing other tasks while it waits for I/O to complete, improving CPU utilization (see the sketch after this list).
  2. Buffering: Use a buffer to temporarily store data and reduce the frequency of I/O operations.
  3. Parallel Processing: Execute multiple I/O operations in parallel to improve the overall data processing speed.
  4. Optimize Data Structures: Select appropriate data structures to reduce the number of data reads and writes.
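
As a rough illustration of point 1, here is a minimal asyncio sketch; fake_io_task and its delays are hypothetical stand-ins for real I/O (a real program would await a network or file library), and the point is only that the two waits overlap instead of adding up.

import asyncio
import time

async def fake_io_task(name: str, delay: float) -> str:
    # asyncio.sleep stands in for a real I/O wait (network call, disk read, ...)
    await asyncio.sleep(delay)
    return f"{name} done"

async def main():
    start = time.perf_counter()
    # Both "I/O waits" run concurrently, so the total is about 1 second, not 2
    results = await asyncio.gather(
        fake_io_task("task-1", 1.0),
        fake_io_task("task-2", 1.0),
    )
    print(results, f"elapsed: {time.perf_counter() - start:.2f} seconds")

if __name__ == '__main__':
    asyncio.run(main())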

II. Use Comprehensions to Build Lists and Dictionaries

Since Python 2.7, list, dictionary, and set comprehensions have made building these data structures more concise and efficient.

1. Traditional Method

import time
import timeit

def fun1():
    result = []
    for i in range(100):
        result.append(i)

if __name__ == '__main__':
    start_time = time.perf_counter()
    t = timeit.Timer(setup='from __main__ import fun1', stmt='fun1()')
    t.timeit()
    end_time = time.perf_counter()
    run_time = end_time - start_time
    print(f"Program running time: {run_time} seconds")
    # Output result: Program running time: 3.363 seconds

2. Optimized with a Comprehension

Note: for brevity, the main-function timing code is omitted in the examples that follow.

def fun1():
    result = [i for i in range(100)]
    # Program running time: 2.094 seconds

As the comprehension version shows, it is not only more concise and easier to read but also faster, which makes comprehensions the preferred way to build lists in a loop.

III. Avoid String Concatenation, Use join()

join() is a string method that concatenates the elements of a sequence into a single string, optionally separated by a delimiter. Its advantages are:

  1. High Efficiency: join() is an efficient way to concatenate strings, especially when many strings are involved. It is usually faster than the + operator or % formatting, and it typically uses less memory than concatenating strings one by one.
  2. Conciseness: join() makes the code more concise and avoids repeated string concatenation operations.
  3. Flexibility: Any string can be specified as the delimiter, which provides great flexibility for string splicing.
  4. Wide Application: It can be used not only for strings but also for sequence types such as lists and tuples, as long as the elements can be converted into strings.

For example:

def fun1():
    obj = ['hello', 'this', 'is', 'leapcell', '!']
    s = ""
    for i in obj:
        s += i
    # Program running time: 0.35186 seconds

Using join() to achieve string concatenation:

def fun1():
    obj = ['hello', 'this', 'is', 'leapcell', '!']
    "".join(obj)
    # Program running time: 0.1822 seconds

Using join() reduces the execution time of the function from 0.35 seconds to 0.18 seconds.

IV. Use Map Instead of Loops

In many scenarios, a traditional for loop can be replaced by the more efficient map() function. map() is a built-in higher-order function that applies a given function to each element of an iterable such as a list, tuple, or string. Its main advantage is that it provides a more concise way to process data without writing an explicit loop.

Traditional Loop Method

def fun1():
    arr = ["hello", "this", "is", "leapcell", "!"]
    new = []
    for i in arr:
        new.append(i)
    # Program running time: 0.3067 seconds

Doing the Same Thing with the map() Function

def fun2(x):
    return x

def fun1():
    arr = ["hello", "this", "is", "leapcell", "!"]
    # Note: in Python 3, map() returns a lazy iterator, so fun2 is not
    # actually called until the result is consumed (e.g. with list()).
    map(fun2, arr)
    # Program running time: 0.1875 seconds

At first glance map() appears to take roughly half the time, but note the caveat above: because map() is lazy in Python 3, the measured version never actually calls fun2. To perform the same work, the iterator has to be consumed, for example with list(map(fun2, arr)); even then, map() is often still faster than an explicit append loop because the iteration itself runs in C.
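
For a fairer comparison, here is a small sketch (illustrative only; identity stands in for fun2, and no timings are claimed) in which both versions apply the function and build a list:

import timeit

def identity(x):
    return x

arr = ["hello", "this", "is", "leapcell", "!"]

def with_loop():
    new = []
    for i in arr:
        new.append(identity(i))
    return new

def with_map():
    # list() forces the lazy map iterator to do its work
    return list(map(identity, arr))

print("loop:", timeit.timeit(with_loop, number=1_000_000))
print("map :", timeit.timeit(with_map, number=1_000_000))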

V. Choose the Right Data Structure

Choosing the appropriate data structure is crucial for improving the execution efficiency of Python code. Various data structures are optimized for specific operations. A reasonable choice can accelerate the retrieval, addition, and removal of data, thereby enhancing the overall operation efficiency of the program.

For example, when testing whether an element is in a container, a set or dictionary looks elements up faster than a list, but the advantage only appears with a larger amount of data; for very small containers, a list can actually be faster.

Testing with a Small Amount of Data

def fun1():
    arr = ["hello", "this", "is", "leapcell", "!"]  # list
    'hello' in arr
    'my' in arr
    # Program running time: 0.1127 seconds

def fun1():
    arr = {"hello", "this", "is", "leapcell", "!"}  # set
    'hello' in arr
    'my' in arr
    # Program running time: 0.1702 seconds

Using numpy to Randomly Generate 100 Integers

import numpy as np

def fun1():
    nums = [i for i in np.random.randint(100, size=100)]  # list
    1 in nums
    # Program running time: 14.28 seconds

def fun1():
    nums = {i for i in np.random.randint(100, size=100)}  # set
    1 in nums
    # Program running time: 13.53 seconds

As the tests show, with a small amount of data a list can outperform a set, while with more data the hash-based lookup of a set (or dict) beats scanning a list.

If elements are added and removed frequently and in large numbers, a list is not efficient. In that case consider collections.deque, a double-ended queue that combines the characteristics of a stack and a queue and supports insertion and deletion at both ends with $O(1)$ complexity.

Usage of collections.deque

from collections import deque

def fun1():
    arr = deque()  # Create an empty deque
    for i in range(1000000):
        arr.append(i)
    # Program running time: 0.0558 seconds

def fun1():
    arr = []
    for i in range(1000000):
        arr.append(i)
    # Program running time: 0.06077 seconds
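
The advantage of deque shows most clearly at the left end, where a list has to shift every existing element. Here is a minimal sketch (my illustration, with arbitrary sizes) comparing appendleft() with list.insert(0, ...):

from collections import deque
import timeit

def fill_deque_left():
    d = deque()
    for i in range(10000):
        d.appendleft(i)   # O(1) at the left end

def fill_list_left():
    lst = []
    for i in range(10000):
        lst.insert(0, i)  # O(n): every existing element is shifted right

print("deque.appendleft:", timeit.timeit(fill_deque_left, number=100))
print("list.insert(0,x):", timeit.timeit(fill_list_left, number=100))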

List lookups are also time-consuming. When you need to look up certain elements in a list frequently, or access them in sorted order, the bisect module can keep the list sorted and perform a binary search in it, as sketched below, which improves lookup efficiency.
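
A minimal bisect sketch (my illustration, with made-up values): insort() keeps the list sorted as items arrive, and bisect_left() then finds an element with a binary search instead of a linear scan.

import bisect

data = []
for value in [42, 7, 19, 3, 25]:
    bisect.insort(data, value)        # insert while preserving sorted order

def contains(sorted_list, value):
    i = bisect.bisect_left(sorted_list, value)
    return i < len(sorted_list) and sorted_list[i] == value

print(data)                # [3, 7, 19, 25, 42]
print(contains(data, 19))  # True
print(contains(data, 20))  # False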

VI. Avoid Unnecessary Function Calls

In Python programming, reducing the number of function calls matters for performance. Every call adds overhead and may consume extra memory, slowing the program down. To improve performance, avoid unnecessary function calls and, where possible, combine several small operations into one, reducing execution time and resource consumption.
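
As a rough illustration (a made-up example, not from the original text): hoisting work out of per-element helper calls, or inlining a trivial helper, removes a Python-level call on every iteration.

def normalize(s):
    return s.strip().lower()

words = ["  Hello", "THIS  ", " is ", "Leapcell", "!"]

# One extra function call per element...
cleaned = []
for w in words:
    cleaned.append(normalize(w))

# ...versus inlining the two string-method calls, which avoids the
# per-iteration Python function call entirely.
cleaned_inline = [w.strip().lower() for w in words]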

VII. Avoid Unnecessary import

Although Python's import statement is relatively fast, each import involves finding the module, executing its code (if it has not been executed yet), and binding the module object in the current namespace. These steps take time and memory, so every unnecessary import simply adds overhead.
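
A related technique (my illustration, not from the original article; it assumes matplotlib is installed) is to defer an expensive import into the only function that needs it, so programs that never call that function do not pay the import cost at start-up.

import json  # lightweight and used throughout the module, so imported at the top

def plot_values(values):
    # Heavy dependency imported lazily: the cost is paid only on the first call,
    # and only by programs that actually plot something.
    import matplotlib.pyplot as plt
    plt.plot(values)
    plt.show()

if __name__ == '__main__':
    print(json.dumps({"values": [1, 2, 3]}))  # runs instantly, without importing matplotlib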

VIII. Avoid Using Global Variables

import math

size=10000
def fun1():
    for i in range(size):
        for j in range(size):
            z = math.sqrt(i) + math.sqrt(j) 
    # Program running time: 15.6336 seconds

Many programmers start out writing simple Python scripts and get used to putting values straight into global variables, as in the code above. However, because global and local variables are implemented differently (locals are accessed through a fast fixed-size array, globals through a dictionary lookup), code that runs in the global scope is much slower than the same code inside a function. Moving script statements into a function typically brings a 15%-30% speedup.

import math

def fun1():
    size = 10000
    for i in range(size):
        for j in range(size):
            z = math.sqrt(i) + math.sqrt(j)  
    # Program running time: 14.9319 seconds

IX. Avoid Module and Function Attribute Access

import math # Not recommended

def fun2(size: int):
    result = []
    for i in range(size):
        result.append(math.sqrt(i))
    return result

def fun1():
    size = 10000
    for _ in range(size):
        result = fun2(size) 
    # Program running time: 10.1597 seconds

Every use of the . (attribute access) operator triggers special methods such as __getattribute__() and __getattr__(), which perform dictionary lookups and therefore add overhead. Using a from ... import statement removes the module attribute access entirely.

from math import sqrt  # Recommended: import only the names you need

def fun2(size: int):
    result = []
    for i in range(size):
        result.append(sqrt(i))
    return result

def fun1():
    size = 10000
    for _ in range(size):
        result = fun2(size)
    # Program running time: 8.9682 seconds
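
Another common variation (my addition, not shown in the original) is to bind a bound method to a local name, so the attribute lookup for result.append happens once instead of once per iteration:

from math import sqrt

def fun2(size: int):
    result = []
    append = result.append  # one attribute lookup instead of one per loop iteration
    for i in range(size):
        append(sqrt(i))
    return result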

X. Reduce Calculations in Inner for Loops

import math

def fun1():
    size = 10000
    sqrt = math.sqrt
    for x in range(size):
        for y in range(size):
            z = sqrt(x) + sqrt(y) 
    # Program running time: 14.2634 seconds

In the code above, sqrt(x) sits inside the inner for loop and is recomputed on every iteration even though x does not change there, adding unnecessary overhead. Hoisting it into the outer loop removes that cost:

import math

def fun1():
    size = 10000
    sqrt = math.sqrt
    for x in range(size):
        sqrt_x = sqrt(x)
        for y in range(size):
            z = sqrt_x + sqrt(y)
    # Program running time: 8.4077 seconds

Leapcell: The Best Serverless Platform for Python App Hosting


Finally, I would like to introduce the best platform for deploying Python applications: Leapcell

1. Multi-Language Support

  • Develop with JavaScript, Python, Go, or Rust.

2. Deploy unlimited projects for free

  • Pay only for usage — no requests, no charges.

3. Unbeatable Cost Efficiency

  • Pay-as-you-go with no idle charges.
  • Example: $25 supports 6.94M requests at a 60ms average response time.

4. Streamlined Developer Experience

  • Intuitive UI for effortless setup.
  • Fully automated CI/CD pipelines and GitOps integration.
  • Real-time metrics and logging for actionable insights.

5. Effortless Scalability and High Performance

  • Auto-scaling to handle high concurrency with ease.
  • Zero operational overhead — just focus on building.


Explore more in the documentation!

Leapcell Twitter: https://x.com/LeapcellHQ
