Christopher Ambala

Python Strings In A Nutshell

Strings are enclosed within single (' '), double (" "), or triple (''' ''' or """ """) quotes.

single_quoted = 'This is a single-quoted string.'
double_quoted = "This is a double-quoted string."
triple_quoted = '''This is a triple-quoted string.'''

Immutability

Strings are immutable: they cannot be changed in place after they are created. For example, you can’t change a string by assigning to one of its positions, but you can always build a new one and assign it to the same name.
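
As a minimal sketch of that rule, the snippet below tries to assign to one position of a string, then builds a new string and rebinds the same name. The variable names and the value "Spam" are illustrative assumptions, not part of the original post.

```python
# Illustrative value; any string behaves the same way.
greeting = "Spam"

try:
    greeting[0] = "z"  # in-place assignment to a position is not allowed
    in_place_error = None
except TypeError as exc:
    in_place_error = exc  # strings do not support item assignment

# Build a new string from pieces of the old one and reuse the name.
greeting = "z" + greeting[1:]
print(greeting)
```

The original string object is never modified; the name simply refers to a new object afterwards.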

Concatenation and Slicing

Python allows you to concatenate strings using the + operator, making it easy to combine text elements.

first_name = "John"
last_name = "Doe"
full_name = first_name + " " + last_name  # full_name is "John Doe"

Indexing and slicing extract a single character or a substring by position:

text = "Hello, World!"
first_char = text[0]  # first_char is 'H'
substring = text[7:12]  # substring is 'World'

Every object in Python is classified as either immutable (unchangeable) or not. In terms of the core types, numbers, strings, and tuples are immutable; lists and dictionaries are not.
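
The contrast between the two groups can be sketched as follows. The names and values here are hypothetical and chosen only for illustration.

```python
# Mutable core types: lists and dictionaries can be changed in place.
numbers = [1, 2, 3]
numbers[0] = 99

settings = {"debug": False}
settings["debug"] = True

# Immutable core types: a tuple rejects the same kind of assignment.
point = (1, 2)
try:
    point[0] = 99
    tuple_error = None
except TypeError as exc:
    tuple_error = exc

print(numbers, settings, point)
```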

String Operations

>>> S = 'Spam' # Make a 4-character string and assign it to a name
>>> S[0] # The first item in S, indexing by zero-based position
'S'
>>> S[-1] # The last item from the end in S
'm'
>>> S[-2] # The second to last item from the end
'a'

Formally, a negative index is simply added to the string’s length, so the following two operations are equivalent:

>>> S[-1] # The last item in S
'm'
>>> S[len(S)-1] # Negative indexing, the hard way
'm'

Sequences also support a more general form of indexing known as slicing, which is a way to extract an entire section in a single step:

>>> S # A 4-character string
'Spam'
>>> S[1:3] # Slice of S from offsets 1 through 2 (not 3)
'pa'

The general form, X[I:J], means “give me everything in X from offset I up to but not including offset J.” The result is returned in a new object. The second of the preceding operations, for instance, gives us all the characters in string S from offsets 1 through 2 (that is, 1 through 3 - 1) as a new string. The effect is to slice or “parse out” the two characters in the middle.

In a slice, the left bound defaults to zero, and the right bound defaults to the length of the sequence being sliced. This leads to some common usage variations, such as S[1:] for everything past the first character and S[:-1] for everything but the last.

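The plus sign (+) means different things for different objects: addition for numbers, and concatenation for strings. This is a general property of Python called polymorphism: the meaning of an operation depends on the objects it is applied to.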
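
A small sketch of this behavior, using made-up values: the same + operator adds numbers and concatenates strings, while mixing the two types raises an error rather than guessing.

```python
number_sum = 1 + 2         # addition for numbers
text_sum = "Spam" + "xyz"  # concatenation for strings

# Mixing a string and a number is rejected instead of converted silently.
try:
    "Spam" + 42
    mixed_error = None
except TypeError as exc:
    mixed_error = exc

print(number_sum, text_sum)
```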

Methods In String

>>> S.find('pa') # Find the offset of a substring
1
>>> S
'Spam'
>>> S.replace('pa', 'XYZ') # Replace occurrences of a substring with another
'SXYZm'
>>> S
'Spam'

The slice defaults described earlier lead to these common variations:

>>> S[1:] # Everything past the first (1:len(S))
'pam'
>>> S # S itself hasn't changed
'Spam'
>>> S[0:3] # Everything but the last
'Spa'
>>> S[:3] # Same as S[0:3]
'Spa'
>>> S[:-1] # Everything but the last again, but simpler (0:-1)
'Spa'
>>> S[:] # All of S as a top-level copy (0:len(S))
'Spam'

Strings also support concatenation with the plus sign (joining two strings into a new string) and repetition (making a new string by repeating another):

>>> S
'Spam'
>>> S + 'xyz' # Concatenation
'Spamxyz'
>>> S # S is unchanged
'Spam'
>>> S * 8 # Repetition
'SpamSpamSpamSpamSpamSpamSpamSpam'
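
Other string methods split a string into substrings on a delimiter, perform case conversions, test the content of the string, and strip whitespace characters off the ends:
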
>>> line = 'aaa,bbb,ccccc,dd'
>>> line.split(',') # Split on a delimiter into a list of substrings
['aaa', 'bbb', 'ccccc', 'dd']
>>> S = 'spam'
>>> S.upper() # Upper- and lowercase conversions
'SPAM'
>>> S.isalpha() # Content tests: isalpha, isdigit, etc.
True
>>> line = 'aaa,bbb,ccccc,dd\n'
>>> line = line.rstrip() # Remove whitespace characters on the right side
>>> line
'aaa,bbb,ccccc,dd'
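
Because each method returns a new string (or list), the calls above can be chained. The following is a rough sketch that reuses the sample line from the examples; the names fields, shouted, and all_letters are introduced here for illustration only.

```python
line = "aaa,bbb,ccccc,dd\n"

# Strip the trailing newline first, then split the result on commas.
fields = line.rstrip().split(",")

# Apply a case conversion and a content test to every field.
shouted = [field.upper() for field in fields]
all_letters = all(field.isalpha() for field in fields)

print(fields, shouted, all_letters)
```

The order matters: rstrip runs first, so the last field does not keep the newline.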

Pattern Matching

Python's standard re module supports pattern matching on strings. It has calls for searching, splitting, and replacement that are analogous to the string methods, but because we can use patterns to specify substrings, we can be much more general:

>>> import re
>>> match = re.match('Hello[ \t]*(.*)world', 'Hello Python world')
>>> match.group(1)
'Python '

This example searches for a substring that begins with the word “Hello,” followed by zero or more tabs or spaces, followed by arbitrary characters to be saved as a matched group, terminated by the word “world.” The following pattern picks out three groups separated by slashes:

>>> match = re.match('/(.*)/(.*)/(.*)', '/usr/home/lumberjack')
>>> match.groups()
('usr', 'home', 'lumberjack')
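
The searching, splitting, and replacement calls mentioned above can be sketched with re.search, re.split, and re.sub. The patterns and the path below are illustrative choices, not taken from the original examples.

```python
import re

path = "/usr/home/lumberjack"

# Searching: find the first run of lowercase letters anywhere in the string.
first_word = re.search(r"[a-z]+", path).group()

# Splitting: break the string on one or more slashes.
# The leading slash produces an empty first element.
parts = re.split(r"/+", path)

# Replacement: substitute every slash with a dot.
dotted = re.sub(r"/", ".", path)

print(first_word, parts, dotted)
```

Unlike re.match, which only matches at the start of the string, re.search scans forward until the pattern matches.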
