Strings

Some of the sections below use The Python Shell.

>>> "Hello"  # This is a string
'Hello'
>>> message = "Hello"  # Store the string in a variable
>>> message            # Show the value in the variable
'Hello'

String videos

  • Python Tutorial for Beginners 2: Strings - Working with Textual Data

    • Storing strings in variables

    • Variable naming conventions

    • Escape character \

    • len() function

    • Accessing single characters with square brackets []

    • IndexError

    • Substrings with slicing

    • String methods

      • .lower()

      • .count()

      • .find()

      • .replace()

      • .format()

    • String concatenation

    • f-strings

    • dir()

    • help()

String Concatenation

Concatenation is a fancy word for sticking strings together. In Python, we use the addition operator (+) to do this:

>>> "Hello" + "World"
'HelloWorld'

Notice how it doesn’t insert a space character automatically. You have to explicitly add a space character.

>>> "Hello" + " " + "World"
'Hello World'

Or you can just add a space in one of the strings if you are able.

>>> "Hello " + "World"
'Hello World'

If strings are stored in variables, you can treat the variables like strings and concatenate them.

>>> first = "Dave"
>>> last = "Smith"
>>> first + last
'DaveSmith'
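
One catch worth knowing: the + operator only joins strings to other strings. The short sketch below adds a space between two variables and converts a number with str() before concatenating it. The age variable and its value are made up for this example.

```python
first = "Dave"
last = "Smith"
age = 42  # an integer, not a string (made-up value for illustration)

# Put a space between the two variables
full_name = first + " " + last
print(full_name)  # Dave Smith

# full_name + age would raise a TypeError, so convert the number first
label = full_name + " is " + str(age)
print(label)  # Dave Smith is 42
```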

Access a single character

Each character in a string is assigned an index location. The characters in "Hello" are assigned indices as shown below.

0  1  2  3  4   # index values
H  e  l  l  o   # each character in the string

Note

In Python, as in most programming languages, index values start at 0.

To access a single character at a specific index location, we can use bracket notation [].

>>> message = "Hello"
>>> message[0]
'H'
>>> message[1]
'e'
>>> message[4]
'o'
>>> message[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

Warning

If you try to access an index value that doesn’t exist in the string, you will get an IndexError.
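
If an index might be out of range, one possible approach is to catch the IndexError instead of letting the program crash. This is only a sketch: char_at is a hypothetical helper written for this example, not something built into Python.

```python
def char_at(text, index):
    """Return text[index], or None when index is out of range.

    char_at is a hypothetical helper, not a built-in function.
    """
    try:
        return text[index]
    except IndexError:
        return None


message = "Hello"
print(char_at(message, 4))  # o
print(char_at(message, 5))  # None
```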

Get characters from the end

Sometimes you need to grab the last or second-to-last character in a string. The problem is that, most of the time, you don’t know exactly how long the string is. There are a couple of approaches.

Use the len() function

Imagine we have a program that asks the user to enter a word. It would be impossible for the programmer writing the code to know exactly how long the word will be. For this, Python comes with a built-in length function, len().

>>> message = "Hello, World!"
>>> len(message)
13
>>> message[len(message)]  # essentially: message[13]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range

Python knows the length of the string is 13, using len(message). When we try to use that number to access the last character, there is an IndexError. Because index values start at 0, the last index value for this string is 12.

>>> message[len(message)-1]  # essentially: message[12]
'!'
>>> message[len(message)-2]  # message[11]
'd'
>>> my_string = "ABCDEFG"
>>> my_string[len(my_string)-1]
'G'
>>> my_string[len(my_string)-2]
'F'
>>> my_string[len(my_string)-3]
'E'

Negative Index

Because the pattern of [len(message)-1] is so common, the creators of Python have created a shortcut: a negative index counts backwards from the end of the string, so [-1] is the last character.

>>> my_string = "ABCDEFG"
>>> my_string[-1]
'G'
>>> my_string[-2]
'F'
>>> my_string[-3]
'E'
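
To see that the shortcut really matches the len() pattern, here is a small check. It assumes nothing beyond the my_string example above; the loop variable k is just a name chosen for this sketch.

```python
my_string = "ABCDEFG"

# For every k from 1 up to the length of the string,
# my_string[-k] is the same character as my_string[len(my_string) - k]
for k in range(1, len(my_string) + 1):
    assert my_string[-k] == my_string[len(my_string) - k]

print(my_string[-1])  # G
print(my_string[-7])  # A  (the first character, counted from the end)
# my_string[-8] would raise an IndexError, just like my_string[7]
```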

Substrings and slicing

We can get a substring by slicing. Slicing uses square-bracket notation with a start index and an end index, separated by a colon.

>>> message = "Hello"
>>> message[0:3]  # slice the first three characters
'Hel'
>>> message[1:4]
'ell'
>>> message[:3]
'Hel'
>>> message[2:]
'llo'

The first slice (message[0:3]) essentially means, get a slice of the string starting at index 0 up to (but not including) 3. Looking closely at the string and its index positions, we can see how the result would be Hel in this case:

|-------------- start from 0
|          |------- go up to, not including 3
|  0  1  2 | 3  4
|  H  e  l | l  o

There is another shortcut. If you leave out the first number (the "from" index), the slice starts at the beginning of the string.

>>> some_string = "This is fun"
>>> some_string[:4]
'This'

If you leave out the second number (the "to" index), the slice runs to the end of the string.

>>> some_string = "This is fun"
>>> some_string[8:]
'fun'
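
Slices also accept negative numbers, and unlike single-character access they do not raise an IndexError when a number is past the end of the string. A few examples, reusing some_string from above:

```python
some_string = "This is fun"

last_three = some_string[-3:]       # the last three characters
without_tail = some_string[:-4]     # everything except the last four
past_the_end = some_string[5:100]   # the end index is clamped, no IndexError
rejoined = some_string[:4] + some_string[4:]  # the two halves make the original

print(last_three)     # fun
print(without_tail)   # This is
print(past_the_end)   # is fun
print(rejoined)       # This is fun
```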

Slice by step

There is one more piece to Python slices: an optional third number, the step.

>>> alphabet = "abcdefghijklmnopqrstuvwxyz"
>>> alphabet[0:10]    # the default step is 1
'abcdefghij'
>>> alphabet[0:10:1]  # from 0 to (not including) 10, by 1
'abcdefghij'
>>> alphabet[0:10:2]  # from 0 to (not including) 10, by 2
'acegi'
>>> alphabet[0:10:3]  # from 0 to (not including) 10, by 3
'adgj'
>>> alphabet[::2]     # from the beginning to the end, by 2
'acegikmoqsuwy'
>>> alphabet[::-1]    # from the end to the beginning, by -1
'zyxwvutsrqponmlkjihgfedcba'
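
A common use of the [::-1] slice is checking whether a word is a palindrome. The function below is a minimal sketch; is_palindrome is a name made up for this example.

```python
def is_palindrome(word):
    """Return True if word reads the same forwards and backwards."""
    return word == word[::-1]


print(is_palindrome("racecar"))  # True
print(is_palindrome("python"))   # False
print("abcdef"[1::2])            # bdf  (every second character, starting at index 1)
```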

Convert to lower/upper case

Three important string methods are .lower(), .upper(), and .capitalize(). Each one returns a new string and leaves the original unchanged.

>>> my_string = "HelLO WoRLD!"
>>> my_string.lower()
'hello world!'
>>> my_string  # Notice! The original string was not changed
'HelLO WoRLD!'
>>> my_string.upper()
'HELLO WORLD!'
>>> my_string.capitalize()
'Hello world!'
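
One place these methods come in handy is comparing text without caring about case. In this sketch, answer stands in for something a user typed; the value is made up.

```python
answer = "YeS"  # made-up user input

# Compare a lower-cased copy so the match ignores upper/lower case
if answer.lower() == "yes":
    result = "confirmed"
else:
    result = "not confirmed"

print(result)  # confirmed
print(answer)  # YeS  (the original string is unchanged)
```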

Replace characters or substrings

The .replace() method returns a copy of the string in which every occurrence of the first argument (if it is found) is replaced with the second argument.

>>> song = "The Long and Winding Road"
>>> song.replace("Long", "Short")
'The Short and Winding Road'
>>> song
'The Long and Winding Road'

Notice how the original string is not altered. Strings cannot be changed in place, so to keep the result, just re-assign the variable to the new string:

>>> song = "The Long and Winding Road"
>>> song = song.replace("Long", "Short")
>>> song
'The Short and Winding Road'
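
When the substring appears more than once, .replace() swaps every occurrence. An optional third argument limits how many are replaced. The lyric string here is made up for illustration.

```python
lyric = "la la la"  # made-up example string

replaced_all = lyric.replace("la", "da")       # every occurrence
replaced_one = lyric.replace("la", "da", 1)    # only the first occurrence
not_found = lyric.replace("xyz", "da")         # nothing to replace
no_spaces = lyric.replace(" ", "")             # replace with an empty string to delete

print(replaced_all)  # da da da
print(replaced_one)  # da la la
print(not_found)     # la la la
print(no_spaces)     # lalala
```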

Split a string into a list

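Splitting is very useful for coding problems like those in the Canadian Computing Competition (CCC). Called with no argument, .split() splits on whitespace; otherwise it splits on the string you pass in. Notice how .split() removes the character you are splitting on. Examples are done in The Python Shell.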

>>> some_string = "There's someone on the wing! Some THING!"
>>> some_string.split()
["There's", 'someone', 'on', 'the', 'wing!', 'Some', 'THING!']
>>> some_string.split("e")
['Th', 'r', "'s som", 'on', ' on th', ' wing! Som', ' THING!']
>>> some_string.split("!")
["There's someone on the wing", ' Some THING', '']

You can also split on a separator that is more than one character long. In this example, I am splitting on comma-space ", ".

>>> str_of_nums = "45, 123, 77, 323, 56"
>>> str_of_nums.split(", ")
['45', '123', '77', '323', '56']
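
The pieces you get back from .split() are still strings. If you need numbers, convert each piece with int(). The sketch below also shows .join(), which goes the other way and glues a list of strings back together.

```python
str_of_nums = "45, 123, 77, 323, 56"

pieces = str_of_nums.split(", ")  # a list of strings

# Convert each string to an integer
numbers = []
for piece in pieces:
    numbers.append(int(piece))

print(numbers)       # [45, 123, 77, 323, 56]
print(sum(numbers))  # 624

# .join() is the opposite of .split()
joined = " | ".join(pieces)
print(joined)        # 45 | 123 | 77 | 323 | 56
```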

Parsing

Sometimes you will have a string that contains valuable data as well as some unimportant things like formatting. The goal is to extract the important data from the string and leave the rest. This is called parsing the string.

Imagine we had a string like:

"x: 24, y: 35, z: 72"

We want to:

  • extract the 24 and place it in a variable called x.

  • extract the 35 and place it in a variable called y.

  • extract the 72 and place it in a variable called z.

For the sake of simplicity, we can assume that in every case, we will only get two-digit values for the numbers in our string. If that were not the case, we could make use of the str.index() method, or use regular expressions (advanced).

To extract the important data, we use string slicing.

formatted_info = "x: 24, y: 35, z: 72"
# index           0123456789111111111
#                           012345678

x = int(formatted_info[3:5])
y = int(formatted_info[10:12])
z = int(formatted_info[17:19])

print(x)  # 24
print(y)  # 35
print(z)  # 72
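
If the numbers are not always two digits, fixed slice positions stop working. One alternative (a sketch, not the only way) is to lean on .split() instead: split on ", " to get each pair, then split each pair on ": ". The values dictionary and the sample string are made up for this example.

```python
formatted_info = "x: 7, y: 135, z: 72"  # values of different lengths

values = {}
for pair in formatted_info.split(", "):  # "x: 7", then "y: 135", then "z: 72"
    name, number = pair.split(": ")      # e.g. "x" and "7"
    values[name] = int(number)

x = values["x"]
y = values["y"]
z = values["z"]

print(x)  # 7
print(y)  # 135
print(z)  # 72
```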

Aligning output

We can left-adjust (<), right-adjust (>) and center (^) our values. Here is a brief example:

x = 0

# Left-justify
print('L {:<20} R'.format(x))
# Center
print('L {:^20} R'.format(x))
# Right-justify
print('L {:>20} R'.format(x))

The output of these examples is:

L 0                    R
L          0           R
L                    0 R

Pretty cool. We told Python to reserve a field 20 characters wide for the value, and depending on the symbol we specified, we were able to change the justification of the value within that field.

You can even specify a fill character to use instead of spaces. It goes right before the alignment symbol.

print('{:=<20}'.format('hello'))
print('{:_^20}'.format('hello'))
print('{:.>20}'.format('hello'))

The output of this example is:

hello===============
_______hello________
...............hello

Credit: TAKE CONTROL OF YOUR PYTHON PRINT() STATEMENTS: PART 3

Aligning with f-strings

x = 0
print(f"L {x:<20} R")
print(f"L {x:^20} R")
print(f"L {x:>20} R")
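
Alignment is most useful when printing several rows so that the columns line up. Here is a small sketch with made-up data: names are left-justified in 10 characters and counts are right-justified in 5.

```python
rows = [("apple", 3), ("kiwi", 12), ("banana", 150)]  # made-up sample data

lines = []
for name, count in rows:
    # name: left-justified, width 10; count: right-justified, width 5
    lines.append(f"{name:<10}|{count:>5}")

for line in lines:
    print(line)
# apple     |    3
# kiwi      |   12
# banana    |  150
```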

Useful string methods for alignment

  • str.ljust(): Left-justify

  • str.rjust(): Right-justify

  • str.center(): Center text

  • str.zfill(): Fill with zeros on the left

>>> "hello".ljust(10)
'hello     '
>>> "hello".ljust(10, ".")
'hello.....'
>>> "hello".rjust(10)
'     hello'
>>> "hello".rjust(10, "-")
'-----hello'
>>> "hello".center(10)
'  hello   '
>>> "hello".center(10, "-")
'--hello---'
>>> str(99).zfill(5)
'00099'
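
A few edge cases are worth trying yourself. In the sketch below, .ljust() and .rjust() give the same results as the < and > format symbols, a width smaller than the string leaves it untouched, and .zfill() keeps a leading minus sign in front of the zeros.

```python
word = "hello"

same_left = word.ljust(10, ".") == f"{word:.<10}"
same_right = word.rjust(10, "-") == f"{word:->10}"
too_narrow = word.ljust(3)      # width is smaller than the string
negative = "-42".zfill(5)       # the sign stays at the front

print(same_left)   # True
print(same_right)  # True
print(too_narrow)  # hello
print(negative)    # -0042
```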