Strings

String Syntax

Python has a built-in string class named str with many handy features. String literals can be enclosed in either double or single quotes. To include a quote character literally inside a string that uses the same quote style, escape it with a backslash (\' or \"). This convention is common across many languages.

A double-quoted string literal can contain single quotes without any issue (e.g. "hello, 'he said'"), and likewise a single-quoted string can contain double quotes (e.g. 'he said "hello"'). A string literal can span multiple lines, but there must be a backslash at the end of each line to escape the newline. String literals inside triple quotes, """ or ''', can span multiple lines of text without backslashes.
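The quoting rules above can be sketched as follows. This is a minimal illustration; the variable names are made up for the example and do not come from the original text.

```python
# Either quote style works; pick the one that avoids escaping.
double_quoted = "I didn't do it"       # single quote inside double quotes
single_quoted = 'he said "hello"'      # double quotes inside single quotes
escaped = 'I didn\'t do it'            # backslash escapes the inner quote

# A backslash at the end of a line continues the literal without adding a newline.
continued = "one long \
line"

# Triple quotes keep the line break as part of the string.
triple = """first line
second line"""

print(double_quoted == escaped)   # True
print(single_quoted)              # he said "hello"
print(continued)                  # one long line
print(triple.count("\n"))         # 1
```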

Python strings are “immutable”, which means they cannot be changed after they are created. Since strings can’t be changed, we construct new strings as we go to represent computed values. For example, the expression ("hello" + "there") takes the two strings "hello" and "there" and builds a new string "hellothere".
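A short sketch of what immutability means in practice. The variable names are illustrative only.

```python
greeting = "hello"

# Concatenation builds a new string; the operands are left untouched.
combined = greeting + "there"
print(combined)   # hellothere
print(greeting)   # hello

# Assigning to a character position is not allowed on an immutable string.
try:
    greeting[0] = "J"
except TypeError as err:
    print("TypeError:", err)

# To get a changed version, build a new string from pieces of the old one.
changed = "J" + greeting[1:]
print(changed)    # Jello
```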

Strings support several built-in operations. Characters in a string can be accessed using the standard [ ] syntax, and like Java and C++, Python uses zero-based indexing, so if s is "hello", s[1] is "e". If the index is out of bounds for the string, Python raises an IndexError. The handy “slice” syntax (below) also works to extract any substring from a string. The len(string) function returns the length of a string. The [ ] syntax and the len() function actually work on any sequence type (strings, lists, etc.). Python tries to make its operations work consistently across different types.
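As a small illustration of indexing and len(), here is a sketch using a hypothetical variable named word. It also shows the same operations applied to a list.

```python
word = "hello"

print(word[1])      # e  (index 0 is 'h')
print(len(word))    # 5

# Indexing past the end raises IndexError.
try:
    word[10]
except IndexError as err:
    print("IndexError:", err)

# The same [ ] and len() work on other sequence types, such as lists.
letters = ["h", "e", "l", "l", "o"]
print(letters[1])    # e
print(len(letters))  # 5
```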

The + operator can concatenate two strings. Notice in the code below that variables are not pre-declared — just assign to them and go.

a = "Bioinformatics"
print ("The value of a is \t",a)
  • The string is stored in the variable a. Passing a to print() after the comma prints its value following the label text; the \t escape inserts a tab between them.

s="Bioinformatics is not too difficult to learn"
# String functions as char arrays
x = s[0:5]
print ("The value of x is ", x)
x = s[:8]
print ("The value of x is now ",x)
  • A string is saved in the variable s. The slice x = s[0:5] then sets x to the first 5 characters of s (positions 0 through 4). Note that in Python numbering begins at zero.
  • If the starting position of a slice is omitted, it defaults to 0. In the example x = s[:8], x holds the first 8 characters of the string.

  • Other useful operations are available through built-in functions. For example, the len() function returns the length of the string:
x = len(s) 
print ("Try: The length of s (",s,") is ",x)

The str() function converts values to string form so they can be combined with other strings. The + operator does not convert numbers to strings automatically.

In this example, the value for pi is converted to a string and then printed.

pi = 3.14
# text = 'The value of pi is ' + pi    ## no, raises TypeError (cannot add str and float)
text = 'The value of pi is ' + str(pi) ## yes
print(text)

String Methods

A method is like a function, but it runs “on” an object. If the variable s is a string, then the code s.lower() runs the lower() method on that string object and returns the result (this idea of a method running on an object is one of the basic ideas that make up Object Oriented Programming, OOP). Here are some of the most common string methods:

Method | Function
--- | ---
s.lower(), s.upper() | returns the lowercase or uppercase version of the string
s.strip() | returns a string with whitespace removed from the start and end
s.isalpha(), s.isdigit(), s.isspace() | tests if all the string chars are in the various character classes
s.startswith('other'), s.endswith('other') | tests if the string starts or ends with the given other string
s.find('other') | searches for the given other string (not a regular expression) within s, and returns the first index where it begins or -1 if not found
s.replace('old', 'new') | returns a string where all occurrences of 'old' have been replaced by 'new'
s.split('delim') | returns a list of substrings separated by the given delimiter. The delimiter is not a regular expression, it's just text. 'aaa,bbb,ccc'.split(',') -> ['aaa', 'bbb', 'ccc']. As a convenient special case, s.split() (with no arguments) splits on all whitespace chars.
s.join(list) | opposite of split(), joins the elements in the given list together using the string as the delimiter, e.g. '---'.join(['aaa', 'bbb', 'ccc']) -> aaa---bbb---ccc
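The methods listed above can be tried out on a small example. This is a sketch only; the sample string and variable names are invented for illustration.

```python
raw_line = "  Gene1,Gene2,Gene3  "

cleaned = raw_line.strip()                 # whitespace removed from both ends
print(cleaned.lower())                     # gene1,gene2,gene3
print(cleaned.upper())                     # GENE1,GENE2,GENE3
print(cleaned.startswith("Gene"))          # True
print(cleaned.endswith("Gene3"))           # True
print(cleaned.find("Gene2"))               # 6
print(cleaned.find("Gene9"))               # -1 (not found)
print("abc".isalpha(), "123".isdigit())    # True True

parts = cleaned.split(",")                 # split on a literal comma
print(parts)                               # ['Gene1', 'Gene2', 'Gene3']
print("---".join(parts))                   # Gene1---Gene2---Gene3
print(cleaned.replace("Gene", "g"))        # g1,g2,g3
print("a  b \t c".split())                 # ['a', 'b', 'c'] (any whitespace)
```

Each call returns a new value and leaves raw_line and cleaned unchanged, since strings are immutable.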

A Google search for “python str” should lead you to the official python.org documentation, which lists all the str methods.

Modifiers

A prefix on a string literal can change the way it is interpreted. A “raw” string literal is prefixed by an r and passes all the chars through without special treatment of backslashes, so r'x\nx' evaluates to the length-4 string x\nx (a backslash followed by n, not a newline). A u prefix marks a Unicode string literal; in Python 3 every str is already Unicode, so the prefix is accepted but has no effect.

raw = r'\nthis and that' 
print (raw)

Using the r prefix makes Python read the contents within the quotes literally. In this example, although the statement contains the newline code (\n), print() outputs the two characters \n rather than moving to a new line.
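To make the difference concrete, the following sketch compares a normal literal with its raw counterpart. The variable named normal is added for illustration.

```python
normal = '\nthis and that'    # \n is a single newline character
raw = r'\nthis and that'      # \n stays as backslash + n

print(len(normal))            # 14
print(len(raw))               # 15
print(raw[0], raw[1])         # \ n

# A raw literal is equivalent to doubling each backslash in a normal literal.
print(raw == '\\nthis and that')   # True
```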

Triple quotes allow a string literal to span multiple lines; the line breaks are kept in the string and printed as written.

multi = """It was the best of times.
It was the worst of times."""
print (multi)

% Format my numbers!

The % operator takes a printf-type format string on the left (%d is int, %s is string, %f is floating point), and the matching values in a tuple on the right (a tuple is made of values separated by commas, typically grouped inside parentheses):

text = ("%d little pigs come out or I'll %s and %s and %s" %(3, 'huff', 'puff', 'blow down')) 
print(text)

The % codes inside the format string act as placeholders. They are filled in, in order, with the values from the tuple that follows the % operator outside the quotation marks.
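A brief sketch of the three format codes mentioned above. The variable names are illustrative, and the %.2f precision specifier is an addition not covered in the text; it limits the output to two decimal places.

```python
count = 3
name = "pi"
value = 3.14159

# A single value does not need to be wrapped in a tuple.
line1 = "%d little pigs" % count
# Several values are passed as a tuple, matched to placeholders in order.
line2 = "%s is about %f" % (name, value)
# %.2f rounds the float to two decimal places.
line3 = "%s is about %.2f" % (name, value)

print(line1)   # 3 little pigs
print(line2)   # pi is about 3.141590
print(line3)   # pi is about 3.14
```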